Deep Dive into Cortex Metrics — Part I

Pavan Kumar · Nerd For Tech · Nov 21, 2021 · 6 min read


Are your applications running on Kubernetes? Are they highly scalable, and are you happy with the way they work? Wait a minute, how are you monitoring them? Ahh, Prometheus, right? Cool. But did you ever wonder how scalable and highly available your Prometheus cluster is? Before that, here is a mail from your boss asking you to find out the number of http_requests your website received last Xmas, or, to put it the Indian way, your boss wants to know the number of customers who visited your website (total number of http_requests) last Sankranthi (a year ago). Now you try accessing your Prometheus / Grafana servers and realize that the metrics are no longer there. What do you tell your boss now? Well, before this situation actually arises, let us try to fix it by using Cortex. Cortex is a CNCF incubation project that provides horizontally scalable, highly available, multi-tenant, long-term storage for Prometheus. Cortex is primarily used as a remote write destination for Prometheus, exposing a Prometheus-compatible query API. As mentioned in the Cortex documentation:

  • Horizontally scalable: Cortex can run across multiple machines in a cluster, exceeding the throughput and storage of a single machine. This enables you to send the metrics from multiple Prometheus servers to a single Cortex cluster and run “globally aggregated” queries across all data in a single place.
  • Highly available: When run in a cluster, Cortex can replicate data between machines. This allows you to survive machine failure without gaps in your graphs.
  • Multi-tenant: Cortex can isolate data and queries from multiple different independent Prometheus sources in a single cluster, allowing untrusted parties to share the same cluster.
  • Long-term storage: Cortex supports S3, GCS, Swift, and Microsoft Azure for long-term storage of metric data. This allows you to durably store data for longer than the lifetime of any single machine, and use this data for long-term capacity planning.
Image Credits: Cortex

What is the entire story all about? (TLDR)

  1. Getting to know Cortex Components.
  2. Implementing HA-Prometheus with Cortex, Prometheus Operator, and AWS S3 (Object Store) (Part II).

Story Resources

  1. GitHub Link: https://github.com/pavan-kumar-99/medium-manifests
  2. GitHub Branch: cortex

Understanding Cortex Components

In this first part of the article, let us start by understanding the components present in Cortex. In Part II, we will work towards building a Cortex cluster. So let us demystify each component in detail and understand its usage with a very useful architecture diagram from the official Cortex website.

Cortex Architecture

a) Prometheus

Prometheus instances are responsible for scraping samples from various targets and pushing them to Cortex using the Prometheus remote write API. Cortex requires that each HTTP request bears a header specifying a tenant ID for the request (the string fake is used by default). Request authentication and authorization are handled by an external reverse proxy.

Prometheus configuration for Remote Push
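Concretely, this boils down to a remote_write section in prometheus.yml. Below is a minimal sketch, assuming a placeholder distributor URL and a hypothetical tenant ID tenant-1; Cortex reads the tenant from the X-Scope-OrgID header, and the exact push path may differ per Cortex version.

```yaml
# prometheus.yml (sketch): push scraped samples to a Cortex distributor via remote write.
# The URL and the tenant ID below are illustrative placeholders.
remote_write:
  - url: http://cortex-distributor.cortex.svc.cluster.local/api/v1/push
    headers:
      X-Scope-OrgID: tenant-1        # tenant ID header read by Cortex (falls back to "fake" when omitted)
    queue_config:
      max_samples_per_send: 1000     # batch size per remote-write request
```

With this in place, every sample Prometheus scrapes is also shipped to Cortex as it arrives.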

b) Block Storage

The blocks storage is a Cortex storage engine based on the Prometheus TSDB: it stores each tenant’s time series in their own TSDB, which writes the series out to on-disk blocks (defaulting to a 2h block range period). The supported backends for the blocks storage are:

  • Amazon S3
  • Google Cloud Storage (GCS)
  • Microsoft Azure Storage
  • OpenStack Swift
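As a rough sketch of how the blocks storage is pointed at one of these backends, the snippet below uses S3; the field names follow the Cortex blocks storage configuration, while the endpoint, bucket name, and credentials are placeholder assumptions.

```yaml
# Cortex configuration (sketch): blocks storage backed by an S3 bucket.
# Endpoint, bucket name, and credentials are illustrative placeholders.
blocks_storage:
  backend: s3
  s3:
    endpoint: s3.us-east-1.amazonaws.com
    bucket_name: cortex-blocks-demo
    access_key_id: <AWS_ACCESS_KEY_ID>
    secret_access_key: <AWS_SECRET_ACCESS_KEY>
```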

Storage Flow Architecture

b.1) The store-gateway is responsible for querying blocks and is used by the querier at query time. The store-gateway is required when running the blocks storage.

b.2) The compactor is responsible for merging and deduplicating smaller blocks into larger ones, in order to reduce the number of blocks stored in the long-term storage for a given tenant and to query them more efficiently. It also keeps the bucket index updated and, for this reason, it is a required component.

c) Distributor

The distributor service is responsible for handling incoming samples from Prometheus. It is the first stop in the write path for series samples. Once the distributor receives samples from Prometheus, each sample is validated for correctness and checked against the configured tenant limits, falling back to the defaults in case limits have not been overridden for the specific tenant. Valid samples are then split into batches and sent to multiple ingesters in parallel. Distributors are stateless and can be scaled up and down as needed.
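The per-tenant validation limits the distributor enforces can be tuned in the limits block of the Cortex configuration; the sketch below uses arbitrary example values, not recommendations.

```yaml
# Cortex configuration (sketch): default per-tenant limits enforced on the write path.
# The numbers are arbitrary examples.
limits:
  ingestion_rate: 25000              # samples per second accepted per tenant
  ingestion_burst_size: 50000        # allowed burst on top of the ingestion rate
  max_label_names_per_series: 30     # reject series carrying too many labels
```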

d) Ingester

The ingester service is responsible for writing incoming series to a long-term storage backend on the write path and returning in-memory series samples for queries on the read path.

Incoming series are not immediately written to the storage but kept in memory and periodically flushed to the storage (by default, 12 hours for the chunks storage and 2 hours for the blocks storage). For this reason, the queriers may need to fetch samples both from ingesters and long-term storage while executing a query on the read path. Ingesters are semi-stateful.
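For the blocks storage, the 2-hour flush interval comes from the TSDB block range that each ingester maintains per tenant; a sketch of the related settings (the directory and values are illustrative) follows.

```yaml
# Cortex configuration (sketch): TSDB settings that govern when ingesters cut and ship blocks.
blocks_storage:
  tsdb:
    dir: /data/tsdb                  # local directory holding the per-tenant TSDBs
    block_ranges_period: [2h]        # in-memory series are compacted into an on-disk block every 2h
    ship_interval: 1m                # how often finished blocks are uploaded to the object store
    retention_period: 6h             # how long shipped blocks stay on local disk
```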

e) Querier

The querier service handles queries using the PromQL query language.

Queriers fetch series samples both from the ingesters and long-term storage: the ingesters hold the in-memory series which have not yet been flushed to the long-term storage. Because of the replication factor, it is possible that the querier may receive duplicated samples; to resolve this, for a given time series the querier internally deduplicates samples with the same exact timestamp.

Queriers are stateless and can be scaled up and down as needed.

f) Query Frontend

The query frontend is an optional service providing the querier’s API endpoints, and it can be used to accelerate the read path. When the query frontend is in place, incoming query requests should be directed to it instead of the queriers. The querier service is still required within the cluster in order to execute the actual queries.
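A common way to use the frontend is to split long range queries by day and cache the partial results; a sketch of that configuration (values are illustrative) is shown below.

```yaml
# Cortex configuration (sketch): query-frontend splitting and results caching.
query_range:
  align_queries_with_step: true      # align query start/end with the step for better cacheability
  split_queries_by_interval: 24h     # break long range queries into one-day partial queries
  cache_results: true                # cache partial results (a results cache backend must be configured)
```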

g) Ruler

The ruler is an optional service executing PromQL queries for recording rules and alerts. The ruler requires a database storing the recording rules and alerts for each tenant.
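The rules themselves use the standard Prometheus rule-file format; the group below is a made-up illustration of what the ruler could evaluate for a tenant.

```yaml
# Example rule group in standard Prometheus rule format (names and thresholds are illustrative).
groups:
  - name: example-rules
    rules:
      - record: job:http_requests_total:rate5m            # recording rule evaluated by the ruler
        expr: sum by (job) (rate(http_requests_total[5m]))
      - alert: HighErrorRate                               # alerting rule handed over to Alertmanager
        expr: sum(rate(http_requests_total{status=~"5.."}[5m])) > 10
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "HTTP 5xx rate has been above 10 req/s for 10 minutes"
```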

h) Alertmanager

The alertmanager is an optional service responsible for accepting alert notifications from the ruler, deduplicating and grouping them, and routing them to the correct notification channel, such as email, PagerDuty or OpsGenie.

High Availability Tracker

The distributor features a High Availability (HA) Tracker. When enabled, the distributor deduplicates incoming samples from redundant Prometheus servers. This allows you to have multiple HA replicas of the same Prometheus servers, writing the same series to Cortex, and then deduplicate these series in the Cortex distributor.

The HA Tracker requires a key-value (KV) store to coordinate which replica is currently elected. The distributor will only accept samples from the current leader. Samples that carry only one or neither of the two labels (replica and cluster) are accepted by default and never deduplicated.

The supported KV stores for the HA tracker are:

  • Consul
  • etcd
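End to end, the setup is roughly the following (label values, namespaces, and the etcd endpoint are illustrative): each Prometheus replica advertises the same cluster label and a unique __replica__ label via external_labels, and the distributor’s HA tracker is pointed at a KV store.

```yaml
# Prometheus side (sketch): both replicas send the same cluster label, each with its own replica label.
global:
  external_labels:
    cluster: prod-cluster
    __replica__: prometheus-replica-0   # set to prometheus-replica-1 on the second replica
```

```yaml
# Cortex side (sketch): enable the HA tracker on the distributor and point it at etcd.
distributor:
  ha_tracker:
    enable_ha_tracker: true
    kvstore:
      store: etcd
      etcd:
        endpoints:
          - http://etcd.cortex.svc.cluster.local:2379
```

With this in place, the distributor elects one replica per cluster label and drops the samples from the other replica, so each series is stored only once.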

Conclusion

These are the various components present in Cortex. In Part II of this article, I explain how to set up HA-Prometheus with remote write to a Cortex instance, with Cortex then pushing the data to an object store (S3 bucket) for retention.

Until next time…..


Pavan Kumar (Nerd For Tech): Senior Cloud DevOps Engineer || CKA | CKS | CSA | CRO | AWS | ISTIO | AZURE | GCP | DEVOPS. LinkedIn: https://www.linkedin.com/in/pavankumar1999/