The Billion Data Point Challenge: Building a Query Engine for High Cardinality Time Series Data

December 10, 2018

Table of Contents

Uber, like most large technology companies, relies extensively on metrics to effectively monitor its entire stack. From low-level system metrics, such as memory utilization of a host, to high-level business metrics, including the number of Uber Eats orders in a particular city, they allow our engineers to gain insight into how our services are operating on a daily basis. As our dimensionality and usage of metrics increases, common solutions like Prometheus and Graphite become difficult to manage and sometimes cease to work.

Due to a lack of available solutions, we decided to build an in-house, open source metrics platform, named M3, that could handle the scale of our metrics. A major component of the M3 platform is its query engine, which we built from the ground up and have been using internally for several years. As of November 2018, our metrics query engine handles around 2,500 queries per second (Figure 1), about 8.5 billion data points per second (Figure 2), and approximately 35 Gbps (Figure 3).

These numbers have been constantly trending upwards at a rate much higher than Uber’s organic growth due to the increased adoption of metrics across various parts of our stack.

Source: uber.com

Tags :

comments powered by Disqus

M3: Uber’s Open Source Large-Scale Metrics Platform for Prometheus

M3, Uber’s open source metrics platform for Prometheus, facilitates scalable and configurable multi-tenant storage for large-scale metrics. To facilitate the growth of Uber’s global operations, we need to be able to quickly store and access billions of metrics on our back-end systems at any given time. As part of our robust and scalable metrics infrastructure, we built M3, a metrics platform that has been in use at Uber for several years now.

Early thoughts on Uber’s flying car from an aviation engineer

Cross shard transactions at 10 million requests per second

Dropbox stores petabytes of metadata to support user-facing features and to power our production infrastructure. The primary system we use to store this metadata is named Edgestore and is described in a previous blog post, (Re)Introducing Edgestore. In simple terms, Edgestore is a service and abstraction over thousands of MySQL nodes that provides users with strongly consistent, transactional reads and writes at low latency.

The Billion Data Point Challenge: Building a Query Engine for High Cardinality Time Series Data

Tags :

Share :

Related Posts

M3: Uber’s Open Source Large-Scale Metrics Platform for Prometheus

Early thoughts on Uber’s flying car from an aviation engineer

Cross shard transactions at 10 million requests per second