Founded in 2019 as part of the Arabesque group, Arabesque AI uses its proprietary artificial intelligence engine to analyze and predict financial market behavior, using this insight to build a wide range of customizable investment strategies. What sets Arabesque AI apart from other firms applying AI to financial contexts is the scope of its ambitions: Rather than narrowing its focus to particular markets, styles, or asset classes, Arabesque AI uses artificial intelligence and analytics to extract general principles that can be applied elsewhere. For example, a system that can sift through the noise in European equity market data should be able to do the same in US equity markets, provided the system is built with the generic ability to reliably identify predictive relationships in financial data. “It’s about finding signals in the noise and then using those signals to find the solution to a wider problem,” says Nikolaos Kaplis, CTO at Arabesque AI.
With the group expanding its services to meet rapidly increasing demand for AI technology solutions in financial markets, Arabesque AI aims to provide its offerings to a broad spectrum of customers across the world. When scaling up its product, Arabesque AI quickly understood that elastic scaling would be crucial to realizing its ambitions. So Arabesque AI turned to Google Cloud as the backbone of its platform.
“The rate at which our scope was growing would create capacity and maintenance issues with an on-premises or hybrid infrastructure. Going forward, provisioning new servers and keeping them running would take up too many resources,” explains Nikolaos. “With Google Cloud, we could minimize infrastructure management while simultaneously maximizing our scalability.”
Supporting company growth with a limitless cloud infrastructure
When Arabesque AI first launched, it did so with a hybrid (both on-premises and cloud) infrastructure model, using its own technology and as many open source tools as possible. “We wanted to avoid vendor lock-in and dependencies as much as we could,” says Nikolaos. “We never want to be in a position where we advise our clients on a particular course of action that we can’t fully understand or control.”
Core services ran on-premises, with the ability to expand into the cloud for large tasks such as training a new algorithm. But early on, it became clear that Arabesque AI’s plans would be better served by fully embracing cloud computing. “We had the beginnings of a cloud presence, but aligning it with our on-premises servers was very painful,” says Nikolaos. “It was clear to us that the hybrid solution we had wasn’t going to work in the long term.” Capacity needs started to grow quickly as more data sources and datasets were acquired and new algorithms were trained and deployed. Arabesque AI’s researchers were spending substantial amounts of time on DevOps management that could have been spent more effectively on AI research, their core business task.
A scalable decision engine for financial portfolios
In late 2019, Arabesque AI participated in the Google Cloud for Startups program, graduating in March 2020. During the program, it benefited from access and support from Google Cloud experts and the financial freedom to experiment with the platform and explore all of its functionalities. Arabesque AI was impressed with the managed services available on Google Cloud, its ease of use, and its commitment to open source technology.
Shortly afterwards, Arabesque AI migrated its entire infrastructure to Google Cloud. Built on Kubernetes, the open source container orchestration system originally developed at Google, the platform is now orchestrated by Google Kubernetes Engine (GKE), which manages much of the infrastructure overhead. Over time, Arabesque AI increasingly used Cloud Run to further reduce the burden of infrastructure management.
“Google Kubernetes Engine has been great for us because it makes autoscaling and deploying in multiple regions very, very easy,” says Matthias Baetens, Senior Associate and Software Engineer at Arabesque AI. “Using Cloud Run in tandem helps us to automate even more of the infrastructure management and be more productive within our resource constraints as a small team.”
The platform is split into several parts: First, Arabesque AI takes in data over several pipelines from third-party sources. This allows it to customize investment portfolios to clients’ needs, one of the core features of its platform. One such data source is Arabesque S-Ray, a sister company within the Arabesque Group, which collects environmental, social, and corporate governance (ESG) metrics on thousands of companies across 75 countries. Next, Arabesque AI uses a combination of Cloud Functions, Cloud Run, Pub/Sub, and GKE to load these datasets into Cloud Storage. From there, ESG metrics and other market data can be easily accessed, processed, and analyzed using BigQuery, which functions as Arabesque AI’s scalable data warehouse and enables users to run fast queries on up to petabytes of data with zero operational overhead.
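The ingestion flow described above (a message announces new data, a handler stores the batch, and the warehouse answers queries over it) can be sketched in a few lines of Python. This is only an illustration of the pattern, not Arabesque AI's implementation: the queue, the dictionary, and SQLite are in-memory stand-ins for Pub/Sub, Cloud Storage, and BigQuery, and every name in it (`esg_feed`, `esg_scores`, the sample companies and scores) is hypothetical.

```python
import json
import sqlite3
from collections import deque

# In-memory stand-ins (assumptions, not the real services):
# a deque for a Pub/Sub topic, a dict for a Cloud Storage bucket,
# and SQLite for the BigQuery data warehouse.
topic = deque()
bucket = {}


def publish(source, rows):
    """Announce that a (hypothetical) third-party source delivered a batch."""
    topic.append(json.dumps({"source": source, "rows": rows}))


def handle_next_message():
    """Consume one message and store its batch in the bucket as JSON lines."""
    message = json.loads(topic.popleft())
    object_name = f"raw/{message['source']}.jsonl"
    bucket[object_name] = "\n".join(json.dumps(row) for row in message["rows"])
    return object_name


def load_into_warehouse(conn, object_name):
    """Load one stored object into the (hypothetical) esg_scores table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS esg_scores "
        "(company TEXT, country TEXT, score REAL)"
    )
    rows = [json.loads(line) for line in bucket[object_name].splitlines()]
    conn.executemany(
        "INSERT INTO esg_scores VALUES (:company, :country, :score)", rows
    )
    return len(rows)


def average_score_by_country(conn):
    """Run an analytical query over the loaded data."""
    cursor = conn.execute(
        "SELECT country, AVG(score) FROM esg_scores "
        "GROUP BY country ORDER BY country"
    )
    return dict(cursor.fetchall())


def run_demo():
    """Push one made-up batch through all three stages."""
    conn = sqlite3.connect(":memory:")
    publish("esg_feed", [
        {"company": "ExampleCo", "country": "DE", "score": 60.0},
        {"company": "SampleCorp", "country": "DE", "score": 80.0},
        {"company": "DemoInc", "country": "US", "score": 50.0},
    ])
    object_name = handle_next_message()
    load_into_warehouse(conn, object_name)
    return average_score_by_country(conn)
```

Keeping the three stages separate mirrors the decoupling the managed services provide: each stage can fail, retry, or scale without the others needing to know.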
This data forms the basis on which Arabesque AI’s proprietary artificial intelligence engine operates, running on GKE nodes. Periodically, whenever a new model has to be trained, Arabesque AI needs to scale up to use thousands of cores within a few days and then scale back down again. Preemptible node pools within GKE have made that process extremely cost-effective and easy to manage. The AI engine creates new signals from the market information stored in BigQuery, often swelling the data to several times its previous size, and this combination of proprietary signals and existing market data goes into constructing new strategies. “BigQuery has proved extremely useful in our journey into the land of serverless data at scale,” shares Matthias, “particularly for leveraging the Global Database of Events, Language, and Tone (GDELT) dataset in our Knowledge Graph efforts and the ability to cope with streaming insert and large-scale analytics.”
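Preemptible nodes can be reclaimed at any time, so workloads that run on them are usually written to save progress and resume after an interruption. The source does not describe how Arabesque AI handles this; the following is a minimal sketch of the general checkpoint-and-resume pattern, with preemption simulated by a random draw. The function names, the `Preempted` exception, and the probabilities are all assumptions made for illustration.

```python
import random


class Preempted(Exception):
    """Raised when the simulated node is reclaimed mid-training."""


def train_epochs(checkpoint, total_epochs, rng, preempt_probability):
    """Advance training from the checkpoint, saving progress after each epoch."""
    while checkpoint["epoch"] < total_epochs:
        if rng.random() < preempt_probability:
            # The node disappears; completed epochs stay in the checkpoint.
            raise Preempted()
        # Stand-in for one real training epoch plus a checkpoint write.
        checkpoint["epoch"] += 1


def train_with_restarts(total_epochs, preempt_probability=0.3, seed=0,
                        max_restarts=1000):
    """Restart training after every preemption until all epochs are done."""
    rng = random.Random(seed)
    checkpoint = {"epoch": 0}  # in practice this would live in durable storage
    restarts = 0
    while True:
        try:
            train_epochs(checkpoint, total_epochs, rng, preempt_probability)
            return checkpoint["epoch"], restarts
        except Preempted:
            restarts += 1
            if restarts > max_restarts:
                raise RuntimeError("too many preemptions, giving up")
```

Calling `train_with_restarts(20)` always finishes with 20 completed epochs; only the number of restarts depends on how often the simulated node is reclaimed.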
Cloud Endpoints delivers the final product, a collection of investment strategies and analytics, to clients. Clients who wish to build their own custom investment strategies can do so using Arabesque AI’s web app, built with App Engine and Cloud Identity for simplified, secure identity and access management. Clients who wish to customize their own dashboard can do so by accessing the product via APIs. Finally, the entire process is logged and monitored using the Google Cloud operations suite to ensure that any problems or bugs are dealt with swiftly. Orchestration relies on Argo, Cloud Tasks, and Cloud Scheduler, while the team keeps an eye on Workflows.
“As a growing company, research is a core business objective for us. With Google Cloud, most of our resources go toward research and only minimal effort is spent on operations. That’s the opposite of what we had before and means we can improve our platform at a much faster rate.”
—Nikolaos Kaplis, CTO, Arabesque AI
Soon after completing the migration in early 2020, Arabesque AI worked closely with Google Cloud Premier Partner DoiT International to further optimize the infrastructure. “DoiT International really helped to simplify the billing process for us. The team is also a great resource to call on when we need extra support. DoiT International’s engineers provided us with excellent guidance on the migration to BigQuery, for example, saving us two days that we would have spent working out a solution by ourselves,” says Nikolaos.
Focusing on what matters with Google Cloud
As a growing company, resource efficiency is key to Arabesque AI, so every gain is important, whether it’s computational time or human resources. Migrating to Google Cloud has meant that it can redirect its team to focus on the objectives that really matter to Arabesque AI and its customers.
“As a growing company, research is a core business objective for us. With Google Cloud, most of our resources go toward research and only minimal effort is spent on operations,” says Nikolaos. “That’s the opposite of what we had before, and means we can improve our platform at a much faster rate.”
Arabesque AI has also been able to scale at a pace it couldn’t have hoped to achieve with its previous on-premises infrastructure. With its data on Google Cloud, the company has increased its ability to stream and analyze data tenfold, leading to a large increase in coverage. Notably, it has done so while also reducing costs by taking advantage of the elasticity of Google Cloud, scaling up and down as needed. By using preemptible instances in GKE and paying only for what it uses with Cloud Run and Cloud Functions, Arabesque AI has reduced server costs by around 75%.
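To see why elasticity lowers costs, it can help to compare a fleet sized for peak load that runs all month with a small always-on baseline plus short bursts on cheaper preemptible capacity. The sketch below does that arithmetic. All inputs in the usage note are hypothetical and are not Arabesque AI's figures or Google Cloud's prices; the example is not meant to reproduce the 75% figure quoted above.

```python
def compare_costs(peak_cores, baseline_cores, burst_hours, hours_in_month,
                  on_demand_price, preemptible_price):
    """Return (fixed_cost, elastic_cost, saving_fraction) for one month.

    fixed_cost:   peak capacity provisioned for the whole month.
    elastic_cost: baseline capacity all month, plus the extra cores
                  needed at peak for burst_hours at the preemptible price.
    Prices are per core-hour and purely illustrative.
    """
    fixed_cost = peak_cores * hours_in_month * on_demand_price
    baseline_cost = baseline_cores * hours_in_month * on_demand_price
    burst_cost = (peak_cores - baseline_cores) * burst_hours * preemptible_price
    elastic_cost = baseline_cost + burst_cost
    saving_fraction = 1 - elastic_cost / fixed_cost
    return fixed_cost, elastic_cost, saving_fraction
```

For example, `compare_costs(100, 10, 72, 720, 1.0, 0.5)` models a three-day training burst in a 30-day month. The saving grows as the burst gets shorter relative to the month and as the gap between peak and baseline widens.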
A key driver of Arabesque AI’s move to Google Cloud was the ability to use open source technology, starting with Kubernetes. While this focus on open source has helped the company avoid being dependent on any one platform and prevented vendor lock-in issues, it has also had a beneficial effect for the team. “The more we used open source tools and technology such as Kubernetes, the easier it was to hire and train new talent,” says Nikolaos. “We more than doubled in size in less than a year, and the time needed to train our people went down by half. We couldn’t have achieved that without being able to use the technology we wanted to on Google Cloud. It’s a very developer-friendly environment.”
With its new infrastructure in place, Arabesque AI’s priority is to welcome its first customers on board, including DWS, one of the world’s largest asset managers. In fact, Arabesque AI is working closely with DWS, with the goal of bringing AI-driven investment products to market. This joint product development work makes use of Google Cloud.
As well as growing its customer base, the company is also looking at new ways to serve its clients. Arabesque AI is hard at work developing Knowledge Graph, a project that aims to quickly extract insights from unstructured information such as news or social media feeds. The plan is for it to run on top of Dataflow, a fully managed data processing service that can react to large streams of events in real time, and to leverage the models that Arabesque AI has built internally.
“The goal is to take unstructured text, a news article for example, and automatically extract everything that is relevant for our engine,” says Nikolaos. “We’re building on existing natural language processing algorithms to create natural language understanding. If there’s a news story about a merger between two companies, for example, we want to identify exactly which companies they are and what the relationship is between them. We’ll be able to help our system find the signals necessary to improve our predictions.”
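The merger example in the quote above amounts to extracting a (company, relationship, company) triple from free text. The toy sketch below shows the shape of that output using a fixed company list and keyword matching. It is a deliberately naive stand-in, not the natural language understanding Arabesque AI is building, and the company names, cue words, and relation label are all hypothetical.

```python
import re

# Hypothetical list of companies the extractor knows about.
KNOWN_COMPANIES = ["Acme Corp", "Globex", "Initech"]

# Hypothetical cue words suggesting a merger or acquisition.
MERGER_CUES = re.compile(
    r"\b(merge[sd]?|merger|merging|acquire[sd]?|acquisition)\b",
    re.IGNORECASE,
)


def extract_merger(text):
    """Return (company_a, relation, company_b) or None.

    A triple is produced only when at least two known companies and a
    merger cue word appear in the text. Companies are ordered by where
    they first appear.
    """
    mentioned = [
        company for company in KNOWN_COMPANIES
        if re.search(r"\b" + re.escape(company) + r"\b", text)
    ]
    if len(mentioned) < 2 or not MERGER_CUES.search(text):
        return None
    mentioned.sort(key=text.index)
    return (mentioned[0], "MERGER_OR_ACQUISITION", mentioned[1])
```

A real system would need to resolve aliases, handle companies it has never seen, and tell apart who is acquiring whom, which is where learned models take over from keyword rules.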