Written by David Steiner, Principal Engineer, Validus Risk Management Ltd.
Written by Fergus Strangways-Dixon, Lead Platform Engineer, Validus Risk Management Ltd.
Nihilson Gnanadason, Senior Solutions Architect – AWS
Increased market uncertainty over the past few years has demonstrated the need for market participants to dynamically understand the risk and impact of market movements on their portfolios. To get an accurate view, market data needs to be updated frequently and in a timely manner.
Validus Risk Management is a leading independent, technology-enabled financial services firm providing advice and solutions focused on alternative, private, and illiquid assets to institutional investors and fund managers. Validus delivers effective and efficient risk management solutions to clients around the world, mitigating market risks such as foreign currency, inflation, and interest rate exposures through its award-winning technology platform, RiskView.
Historically, Validus’ risk analysis has leveraged data from Bloomberg, an AWS Partner and global leader in business and financial information, news, and insights. This data was delivered in a request/response manner via Bloomberg’s REST API for cloud-native market data.
While Bloomberg’s REST API for cloud-native access to licensed data addresses many use cases, Validus’ risk management requirements have evolved to require streaming data.
To meet this streaming need while retaining the benefits of the cloud, Amazon Web Services (AWS) offers delivery of B-PIPE, Bloomberg’s high-performance, real-time market data feed. The AWS connectivity model uses Bloomberg’s Open API (BLPAPI), which provides consistency and resiliency.
B-PIPE provides real-time access to market data in an event-driven architecture via AWS PrivateLink, with comprehensive normalization and intelligence, including access to 35 million instruments, 330+ exchanges, and 80 billion ticks per day.
Traditionally, integrating market data feeds can take months and require large enterprise infrastructures with operationally complex components such as Apache Kafka. With the combination of AWS managed services and B-PIPE’s easy setup and integration through AWS PrivateLink, this is no longer the case.
In this post, we explore how Validus leveraged B-PIPE and AWS PrivateLink to implement a proof-of-concept (PoC) feed in one week, allowing the team to scale to production load. The proposed solution propagates real-time updates from Bloomberg all the way to the RiskView platform’s React frontend.
Architecture overview
The design in Figure 1 shows how Validus combined an AWS serverless solution with B-PIPE, delivered through AWS PrivateLink, to quickly implement real-time market data that meets both business and operational requirements.
Technology selection
The architecture and technology were selected to meet the following requirements:
- To power event-driven use cases, ingested data must be streamed to other services. For the current use case, limiting updates to one per second per ticker is acceptable.
- Market data must be readily available for five days and immediately queryable, as calculations and front-end graphs depend on it.
- Ad-hoc historical queries require data to be retained long-term.
Other deciding factors include infrastructure and operational costs, speed of development, and scalability for future use cases. These factors led us to choose a managed, serverless AWS service.
At the core of Validus’ solution is Amazon DynamoDB, which also functions as a streaming service using DynamoDB Streams. Real-time communication with the front end (another feature that has traditionally been difficult to implement and scale) is achieved using API Gateway’s WebSocket API.
A combination of AWS Lambda and AWS Fargate is used for computing.
Figure 1 – Top-level architecture.
Market data capture service
Bloomberg offers client libraries for B-PIPE in several popular programming languages, including C++, C#, Java, and Python. The client library allows you to establish persistent connections through AWS PrivateLink.
Because the connection to Bloomberg is long-running, highly ephemeral compute such as AWS Lambda is not suitable for this service. AWS Fargate provides a good middle ground: long-lived containers without the overhead of managing servers.
The ingestion service is a configurable component that allows you to tailor writes to DynamoDB according to your requirements. The flexible computing provided by AWS Fargate and B-PIPE’s high-frequency capabilities allows you to scale this to meet future requirements. Currently, this is set to one aggregate update per second per ticker that includes some aggregate metrics from all ticks, such as the minimum, maximum, and average values.
Aggregated ticks are stored in DynamoDB. There are approximately 3,000 tickers related to Validus’ trading population. This results in up to 3,000 writes per second.
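The per-second aggregation performed by the ingestion service can be sketched as follows. This is a minimal illustration, not Validus’ actual code: the class name, field names, and flush cadence are assumptions, with `n` and `t` mirroring the item layout shown in Figure 2.

```python
from collections import defaultdict

class TickAggregator:
    """Collapses raw ticks into one summary per ticker per second."""

    def __init__(self):
        self._ticks = defaultdict(list)  # ticker -> prices seen this second

    def on_tick(self, ticker, price):
        self._ticks[ticker].append(price)

    def flush(self, timestamp):
        """Emit one aggregate item per ticker and reset the buffers."""
        items = []
        for ticker, prices in self._ticks.items():
            items.append({
                "n": ticker,                       # partition key: ticker name
                "t": timestamp,                    # sort key: epoch second
                "min": min(prices),
                "max": max(prices),
                "avg": sum(prices) / len(prices),
                "last": prices[-1],
            })
        self._ticks.clear()
        return items

agg = TickAggregator()
for p in (1.0841, 1.0845, 1.0839):
    agg.on_tick("EURUSD Curncy", p)
batch = agg.flush(1700000000)  # one item: min 1.0839, max 1.0845
```

A real agent would call `flush` from a once-per-second timer and write the returned batch to DynamoDB.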
Amazon DynamoDB as a streaming service
DynamoDB and DynamoDB Streams provide cost-effective, low-maintenance short-term storage and streaming solutions. The DynamoDB model is very simple because the short-term storage access patterns are well known and the data is simple.
The ticker (such as the EUR/USD spot rate) is a natural partition key, and the timestamp is a good sort key. Data only needs to be stored for five days, and DynamoDB’s time-to-live (TTL) feature is a convenient way to clean it up.
Figure 2 – DynamoDB ticker item with ticker name (n) and timestamp
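Building such an item with its five-day expiry can be sketched as below. The `n` and `t` attribute names follow Figure 2; the other attribute names and the helper itself are illustrative, and the table’s TTL configuration must point at whatever expiry attribute is chosen.

```python
import time

FIVE_DAYS = 5 * 24 * 60 * 60  # 432,000 seconds

def make_ticker_item(ticker, timestamp, min_p, max_p, avg_p, now=None):
    """Build a DynamoDB item keyed on ticker (n) and timestamp (t),
    with a TTL attribute so DynamoDB expires it after five days."""
    now = int(time.time()) if now is None else now
    return {
        "n": ticker,             # partition key
        "t": timestamp,          # sort key
        "min": min_p,
        "max": max_p,
        "avg": avg_p,
        "ttl": now + FIVE_DAYS,  # epoch seconds; table TTL points at this attribute
    }

item = make_ticker_item("EURUSD Curncy", 1700000000, 1.0839, 1.0845, 1.0841,
                        now=1700000000)
```

DynamoDB deletes expired items in the background at no extra cost, which removes the need for a scheduled cleanup job.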
DynamoDB offers two options for streaming change data capture (CDC): Kinesis Data Streams for DynamoDB and DynamoDB Streams. The number of shards and consumers per shard are important aspects of streaming performance. DynamoDB Streams supports two consumers per shard, which was sufficient for our throughput requirements.
For higher throughput requirements, consider using Kinesis Data Streams, which supports up to 5 consumer processes per shard or up to 20 concurrent consumers per shard with enhanced fanout.
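A Lambda consumer of the stream receives records in the documented DynamoDB Streams format, with typed attribute-value wrappers such as `{"S": ...}` and `{"N": ...}`. A minimal sketch of unwrapping them into plain Python values (handling only string and number types; the function name is illustrative):

```python
def parse_stream_records(event):
    """Extract new images from a DynamoDB Streams Lambda event,
    converting attribute-value wrappers into plain Python values."""
    def unwrap(av):
        (kind, value), = av.items()
        return float(value) if kind == "N" else value

    items = []
    for record in event.get("Records", []):
        image = record.get("dynamodb", {}).get("NewImage")
        if image:  # absent for REMOVE events
            items.append({k: unwrap(v) for k, v in image.items()})
    return items

# Shape of a stream event as delivered to Lambda (abbreviated).
sample_event = {
    "Records": [{
        "eventName": "INSERT",
        "dynamodb": {"NewImage": {
            "n": {"S": "EURUSD Curncy"},
            "t": {"N": "1700000000"},
            "avg": {"N": "1.0841"},
        }},
    }]
}
rows = parse_stream_records(sample_event)
```

A production consumer would also inspect `eventName` to distinguish inserts from updates and removals.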
If queries are not required, or if a bespoke time-series database is already in place, a dedicated streaming solution can be used instead. The industry-standard streaming platform, Apache Kafka, would certainly be appropriate here, but it comes with higher operational overhead.
Amazon Kinesis is a great alternative high-performance managed streaming service with convenient integration with Amazon Simple Storage Service (Amazon S3) and other services through Amazon Kinesis Data Firehose.
Long-term storage
To facilitate the third requirement, data is also stored in Amazon S3 and queried through Amazon Athena. Amazon S3 and Amazon Athena support efficient data formats such as Parquet and Avro, giving you flexibility while keeping costs low.
Most streaming services have great integration with Amazon S3, and minimal work is required to start storing your data long-term.
Broadcaster service
The Broadcaster is a core service responsible for providing updates to front-end consumers. It has three input sources:
- Semi-static internal data, such as transactions, obtained from existing services.
- Market data, retrieved from the market data feed.
- Subscribers, stored in DynamoDB by an AWS Lambda function that processes WebSocket API subscription messages.
The relevant values are recalculated each time new market data arrives. The service then examines the list of subscribers to determine which of them are interested in the update. The subscriber list is stored in DynamoDB, keyed by connection ID, with a set of ticker names.
Figure 3 – DynamoDB item representing a subscriber.
Validus aims to support hundreds of active subscriptions, for which regular full scans of this data are sufficient. To support more connections, the broadcaster service would need to maintain a cache of this data, kept up to date by new subscription events from DynamoDB.
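With subscribers keyed by connection ID and holding a set of ticker names (Figure 3), the full-scan matching step amounts to a simple membership test, sketched here with in-memory data standing in for the DynamoDB scan result:

```python
def interested_connections(subscribers, ticker):
    """Full scan of the subscriber data: return the connection IDs
    whose ticker set contains the updated ticker."""
    return [conn_id for conn_id, tickers in subscribers.items()
            if ticker in tickers]

# Stand-in for the scanned subscriber table.
subscribers = {
    "conn-1": {"EURUSD Curncy", "GBPUSD Curncy"},
    "conn-2": {"USDJPY Curncy"},
}
matches = interested_connections(subscribers, "EURUSD Curncy")
```

At hundreds of subscribers this linear scan is negligible; it is only at much larger counts that an inverted index (ticker to connection IDs) would pay off.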
Updates are pushed to the front end by posting them to each connection through API Gateway’s WebSocket API.
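A sketch of that fan-out, using a stand-in for boto3’s ApiGatewayManagementApi client: the real `post_to_connection` call raises `GoneException` for closed connections, which a production version would catch in order to prune stale subscribers from DynamoDB.

```python
import json

def broadcast(client, connection_ids, payload, gone_exc=Exception):
    """Post a JSON payload to each WebSocket connection; return the
    connection IDs that are gone so they can be removed from storage."""
    data = json.dumps(payload).encode()
    stale = []
    for conn_id in connection_ids:
        try:
            client.post_to_connection(ConnectionId=conn_id, Data=data)
        except gone_exc:
            stale.append(conn_id)
    return stale

class FakeClient:
    """Stand-in for boto3's ApiGatewayManagementApi client."""
    def __init__(self, closed):
        self.closed, self.sent = closed, []
    def post_to_connection(self, ConnectionId, Data):
        if ConnectionId in self.closed:
            raise RuntimeError("connection gone")
        self.sent.append(ConnectionId)

client = FakeClient(closed={"conn-2"})
stale = broadcast(client, ["conn-1", "conn-2"],
                  {"n": "EURUSD Curncy", "avg": 1.0841},
                  gone_exc=RuntimeError)
```

Returning the stale IDs rather than deleting inline keeps the broadcast loop fast; cleanup can happen in a separate batch write.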
If a GraphQL-based solution is preferred, you can also consider AWS AppSync instead of API Gateway. This supports real-time GraphQL subscriptions and eliminates the need to manually manage a list of subscribers.
Performance indicators
The volume and velocity of data required at each processing stage can significantly influence design decisions. Ingesting 3,000 tickers, with each ticker’s real-time flow aggregated into a single event of approximately 200 bytes per second, equates to 36 MB per minute, or 51.84 GB per day.
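The arithmetic behind those figures is straightforward:

```python
TICKERS = 3_000
BYTES_PER_EVENT = 200  # one aggregated update per ticker per second

bytes_per_second = TICKERS * BYTES_PER_EVENT          # 600,000 B/s
mb_per_minute = bytes_per_second * 60 / 1_000_000     # 36.0 MB/min
gb_per_day = bytes_per_second * 86_400 / 1_000_000_000  # 51.84 GB/day
```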
The Python data ingestion agent hosted in Amazon Elastic Container Service (Amazon ECS) on AWS Fargate can process more than 300 tickers per vCPU, depending on the activity in your feed.
These agents insert aggregated batches into DynamoDB at the end of every second, with an insert latency of 50-100 ms. Peak data flow benchmarking was assisted by Datadog, which provided real-time container metrics while the number of tickers assigned to each agent was increased. Using a custom Datadog query, Validus measured the average vCPU required per 100 tickers.
A multi-vCPU central broadcaster service written in Kotlin, hosted in Amazon ECS on AWS Fargate, consolidates the DynamoDB stream shards and broadcasts to a peak of 500 consumers within one second of the write to DynamoDB. End-to-end P99 latency is under three seconds.
Development environment
Bloomberg sets up two AWS PrivateLink endpoints: one for development and one for production. You can connect to the development endpoint by setting up a remote development environment within the same Amazon Virtual Private Cloud (Amazon VPC), or by connecting from a local device using a Client VPN endpoint associated with that VPC.
To ensure reproducible and reviewable infrastructure, we highly recommend using an Infrastructure as Code (IaC) tool. The AWS Cloud Development Kit (AWS CDK) is well suited for this; other popular options include Terraform and AWS CloudFormation.
When the service is ready for production, only minor adjustments to the development CDK configuration are needed to accommodate the higher load, which significantly speeds up deployment. It also enables quick, traceable experimentation and benchmarking of the infrastructure, since a change is as simple as merging a pull request and letting the CI/CD pipeline deploy it.
Summary
Access to real-time market data is no longer a luxury for large companies. With Bloomberg’s cloud-native real-time market data solution and managed AWS services, a single developer can create a PoC feed and run it end-to-end in days without the support of an infrastructure team.
Because the entire infrastructure is defined in AWS CDK and the services used scale well, it is straightforward to promote the development feed to production using the production endpoint provided by Bloomberg.
The result is a cost-effective, high-performance feed capable of powering the front end with thousands of updates per second. Once the performance characteristics are well understood, costs can be reduced further by reserving compute and DynamoDB capacity.
In this post, we explored how Validus built a real-time market data integration for Bloomberg using serverless managed services on AWS. With this implementation, Validus will accelerate the integration of market data for its customers.
To learn more about Validus, please visit our website.
Read more about Bloomberg B-PIPE on AWS here.
Bloomberg – AWS Partner Spotlight
Bloomberg is an AWS Partner and a global leader in business and financial information, news, and insights.
Contact Bloomberg | Partner Overview