Multi-tenant, horizontally scalable
Prometheus as a Service
@tom_wilkie
November 2016
Weave Cortex
https://github.com/weaveworks/cortex
Prometheus and Kubernetes: A Perfect Match
Prometheus and Kubernetes: Deploying
Prometheus and Kubernetes: Monitoring Your Applications
Prometheus and Kubernetes: Monitoring Your Infrastructure
https://www.weave.works/blog/
Design
requirements:
1. API compatible with Prometheus
2. easy to operate and manage
3. tens of thousands of users, tens of
millions of samples/s
4. cost effective to run
5. reuse as much of Prometheus as possible
… so we can sell it
Aim: build proof of concept as
quickly as possible
16/06 started design doc
26/07 launch jobs
25/08 give talk at PromCon!
… make it robust …
09/11 talk at KubeCon
http://goo.gl/prdUYV
Architecture diagram:
Your DC: Retriever, scraping your jobs
Weave Cloud: Frontend / Authenticator -> Distributors -> Ingesters -> DynamoDB, S3
Is a vanilla OSS Prometheus. Does service discovery, scraping and
relabelling.
Configured to send samples to Weave Cloud:
remote_write:
  url: https://cloud.weave.works/api/prom/push
  basic_auth:
    password: <redacted>
PRs upstream for the generic write path: #1930, #1957, #1987
Retriever
• Uses consistent hashing to assign
timeseries to Ingesters
• Input to hash is (user ID, metric
name)
• Tokens stored in Consul
• Also currently handles queries
Distributor
http://goo.gl/U9u1U2
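The hashing scheme above can be sketched roughly as follows. This is a minimal illustration, not the Cortex implementation: the `Ring` class, the `token_for` helper, the choice of SHA-256 truncated to 32 bits, and the hard-coded token values are all assumptions made for the example, and the tokens live in a Python dict here rather than in Consul.

```python
import bisect
import hashlib
from typing import Dict, List


def token_for(user_id: str, metric_name: str) -> int:
    """Hash (user ID, metric name) to a 32-bit position on the ring.

    The hash function is an assumption; the slides do not name one.
    """
    digest = hashlib.sha256(f"{user_id}:{metric_name}".encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


class Ring:
    """Toy hash ring (hypothetical); a real deployment would read tokens from Consul."""

    def __init__(self, tokens_by_ingester: Dict[str, List[int]]) -> None:
        # Flatten to (token, ingester) pairs sorted by token.
        self._entries = sorted(
            (token, ingester)
            for ingester, tokens in tokens_by_ingester.items()
            for token in tokens
        )
        self._tokens = [token for token, _ in self._entries]

    def ingester_for(self, user_id: str, metric_name: str) -> str:
        # Pick the first token at or after the key's hash, wrapping around.
        key = token_for(user_id, metric_name)
        index = bisect.bisect_left(self._tokens, key) % len(self._entries)
        return self._entries[index][1]


ring = Ring({
    "ingester-1": [0x20000000, 0xA0000000],
    "ingester-2": [0x60000000, 0xE0000000],
})
owner = ring.ingester_for("user-1", "http_requests_total")
```

Because the hash input is only the user ID and metric name, every series of a given metric for a given user lands on the same ingester, whatever its other labels are.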
• Heavily modified MemorySeriesStorage
• Uses the same chunk format as Prometheus
• Keeps everything in memory (for up to an hour)
• Also keeps an in-memory inverted index for queries
• Flushes chunks to S3 and indexes them in DynamoDB
Ingester
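To illustrate what an in-memory inverted index for queries might look like, here is a small sketch. It is not the ingester's actual data structure: the `InvertedIndex` class, its method names, and the string series IDs are hypothetical, and only equality matchers are handled.

```python
from collections import defaultdict
from typing import Dict, Set, Tuple


class InvertedIndex:
    """Maps each (label name, label value) pair to the series that carry it."""

    def __init__(self) -> None:
        self._postings: Dict[Tuple[str, str], Set[str]] = defaultdict(set)

    def add(self, series_id: str, labels: Dict[str, str]) -> None:
        # Register the series under every one of its label pairs.
        for name, value in labels.items():
            self._postings[(name, value)].add(series_id)

    def lookup(self, matchers: Dict[str, str]) -> Set[str]:
        """Return the series whose labels equal every (name, value) matcher."""
        sets = [self._postings.get(pair, set()) for pair in matchers.items()]
        if not sets:
            return set()
        return set.intersection(*sets)


index = InvertedIndex()
index.add("s1", {"__name__": "http_requests_total", "job": "api", "code": "200"})
index.add("s2", {"__name__": "http_requests_total", "job": "api", "code": "500"})
index.add("s3", {"__name__": "http_requests_total", "job": "web", "code": "200"})
matches = index.lookup({"job": "api", "code": "200"})
```

A query with several equality matchers becomes an intersection of posting sets, so only series matching all of them are returned (here, `s1`).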
External inverted index maintained in DynamoDB, chunks stored in S3
Item in DynamoDB looks like:
{
  hash key: "{user ID}:{metric name}:{hour}",
  range key: "{label name}:{label value}:{chunk ID}",
  metric: ...,
  from, through: ...,
  ID: ...,
}
DynamoDB S3
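The key layout above could be generated along these lines. This is a sketch under assumptions the slides do not confirm: that `{hour}` means whole hours since the Unix epoch, that timestamps are in seconds, and that a chunk gets one index entry per label for each hour it overlaps. The `index_keys` function and its parameters are hypothetical names.

```python
from typing import Dict, Iterator, Tuple

SECONDS_PER_HOUR = 3600


def index_keys(
    user_id: str,
    metric_name: str,
    labels: Dict[str, str],
    chunk_id: str,
    from_ts: int,
    through_ts: int,
) -> Iterator[Tuple[str, str]]:
    """Yield (hash key, range key) pairs for one chunk.

    Assumes timestamps in seconds and hour buckets counted from the epoch.
    """
    first_hour = from_ts // SECONDS_PER_HOUR
    last_hour = through_ts // SECONDS_PER_HOUR
    for hour in range(first_hour, last_hour + 1):
        hash_key = f"{user_id}:{metric_name}:{hour}"
        for name, value in sorted(labels.items()):
            yield hash_key, f"{name}:{value}:{chunk_id}"


keys = list(
    index_keys(
        user_id="user-1",
        metric_name="http_requests_total",
        labels={"job": "api", "code": "200"},
        chunk_id="chunk-42",
        from_ts=36100,     # falls in hour bucket 10
        through_ts=39605,  # falls in hour bucket 11
    )
)
```

With this layout, a query for one metric over a time range only needs to read the hash keys for that user, metric, and the hours in range, and can narrow by label using a range-key prefix such as `job:api:`.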
Demo
Evaluation
Why not just run my own Prometheus?
Lots left to do…
Features:
• Recording rules
• Alerting & Alertmanager
Reliability:
• Replication between
ingesters, commit log etc
• Ingester lifecycle
• Separate query service?
Performance:
• Query parallelisation
• Background chunk
coalescing
Code:
• Code cleanup
• Upstream appropriate
changes
Questions?
Sign up at https://cloud.weave.works/
$ kubectl -n kube-system apply -f \
    'https://cloud.weave.works/k8s/cortex.yaml?t=...'
https://github.com/weaveworks/cortex
Try It Out!
We’re hiring!
London, Berlin, San Francisco

Weave Cortex: Multi-tenant, horizontally scalable Prometheus as a Service