Prometheus: Storage, aggregation and query with M3
This document is a getting started guide to using M3DB as remote storage for Prometheus.
M3 Coordinator configuration
To write to a remote M3DB cluster, the simplest configuration is to run `m3coordinator` as a sidecar alongside Prometheus.
Start by downloading the config template. Update the `namespaces` and the `client` section for a new cluster to match your cluster’s configuration.

You’ll need to specify the static IPs or hostnames of your M3DB seed nodes, and the name and retention values of the namespace you set up. You can leave the namespace storage metrics type as `unaggregated`, since by default a cluster that receives all Prometheus metrics unaggregated is required. In the future you might also want to aggregate and downsample metrics for longer retention; you can come back and update the config once you’ve set up those clusters. You can read more about our aggregation functionality here.
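Before pointing the coordinator at the cluster, it can be worth double-checking that the namespace exists with the retention you expect. A minimal sketch, assuming a coordinator admin API (embedded or standalone) is reachable on its default port 7201; the host below is a placeholder:

```bash
# List the namespaces registered with the cluster; "default" should appear
# with the 48h retention configured during cluster setup.
curl http://<M3_COORDINATOR_HOST>:7201/api/v1/services/m3db/namespace
```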
It should look something like:
```yaml
listenAddress: 0.0.0.0:7201

logging:
  level: info

metrics:
  scope:
    prefix: "coordinator"
  prometheus:
    handlerPath: /metrics
    listenAddress: 0.0.0.0:7203 # until https://github.com/m3db/m3/issues/682 is resolved
  sanitization: prometheus
  samplingRate: 1.0
  extended: none

tagOptions:
  idScheme: quoted

clusters:
  - namespaces:
      # We created a namespace called "default" and set its retention to 48h.
      - namespace: default
        retention: 48h
        type: unaggregated
    client:
      config:
        service:
          env: default_env
          zone: embedded
          service: m3db
          cacheDir: /var/lib/m3kv
          etcdClusters:
            - zone: embedded
              endpoints:
                # We have five M3DB nodes but only three are seed nodes, they are listed here.
                - M3DB_NODE_01_STATIC_IP_ADDRESS:2379
                - M3DB_NODE_02_STATIC_IP_ADDRESS:2379
                - M3DB_NODE_03_STATIC_IP_ADDRESS:2379
      writeConsistencyLevel: majority
      readConsistencyLevel: unstrict_majority
      writeTimeout: 10s
      fetchTimeout: 15s
      connectTimeout: 20s
      writeRetry:
        initialBackoff: 500ms
        backoffFactor: 3
        maxRetries: 2
        jitter: true
      fetchRetry:
        initialBackoff: 500ms
        backoffFactor: 2
        maxRetries: 3
        jitter: true
      backgroundHealthCheckFailLimit: 4
      backgroundHealthCheckFailThrottleFactor: 0.5
```
Now start the process up:
```bash
m3coordinator -f <config-name.yml>
```
Or, use the docker container:
```bash
docker pull quay.io/m3db/m3coordinator:latest
docker run -p 7201:7201 --name m3coordinator -v <config-name.yml>:/etc/m3coordinator/m3coordinator.yml quay.io/m3db/m3coordinator:latest
```
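Once the coordinator is running, a quick sanity check helps before touching Prometheus. This is a sketch under the assumption that the `/health` endpoint and the Prometheus-compatible query API are enabled at their defaults:

```bash
# Should return a small JSON payload if the coordinator is serving.
curl http://localhost:7201/health

# Issue a PromQL instant query through the coordinator; an empty result is
# fine at this point since nothing has been written yet.
curl -G 'http://localhost:7201/api/v1/query' --data-urlencode 'query=up'
```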
Prometheus configuration
Add the `m3coordinator` sidecar remote read/write endpoints to your Prometheus configuration, something like:
```yaml
remote_read:
  - url: "http://localhost:7201/api/v1/prom/remote/read"
    # To test reading even when local Prometheus has the data
    read_recent: true

remote_write:
  - url: "http://localhost:7201/api/v1/prom/remote/write"
```
Also, we recommend adding `M3DB` and `M3Coordinator`/`M3Query` to your list of jobs under `scrape_configs` so that you can monitor them using Prometheus. With this scraping setup, you can also use our pre-configured M3DB Grafana dashboard.
```yaml
- job_name: 'm3db'
  static_configs:
    - targets: ['<M3DB_HOST_NAME_1>:7203', '<M3DB_HOST_NAME_2>:7203', '<M3DB_HOST_NAME_3>:7203']

- job_name: 'm3coordinator'
  static_configs:
    - targets: ['<M3COORDINATOR_HOST_NAME_1>:7203']
```
NOTE: If you are running M3DB with embedded M3Coordinator, you should only have one job. We recommend just calling this job `m3`. For example:
```yaml
- job_name: 'm3'
  static_configs:
    - targets: ['<HOST_NAME>:7203']
```
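Once the jobs are in place, you can confirm the targets are being scraped by querying Prometheus’s HTTP API for their `up` series. A hypothetical spot check, assuming Prometheus on its default port 9090 and the job names used above:

```bash
# Every scraped M3 target should report up == 1.
curl -G 'http://localhost:9090/api/v1/query' --data-urlencode 'query=up{job=~"m3.*"}'
```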
Querying With Grafana
When using the Prometheus integration with Grafana, there are two different ways you can query your metrics. The first option is to configure Grafana to query Prometheus directly by following these instructions.

Alternatively, you can configure Grafana to read metrics directly from `m3coordinator`, in which case you will bypass Prometheus entirely and use M3’s PromQL engine instead. To set this up, follow the same instructions from the previous step, but set the url to: `http://<M3_COORDINATOR_HOST_NAME>:7201`.
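If you prefer provisioning over clicking through the UI, the same datasource can be registered through Grafana’s HTTP API. A sketch assuming a local Grafana with default admin credentials; the datasource name is arbitrary:

```bash
# Register the coordinator as a Prometheus-type datasource in Grafana.
curl -X POST http://admin:admin@localhost:3000/api/datasources \
  -H 'Content-Type: application/json' \
  -d '{
        "name": "M3",
        "type": "prometheus",
        "url": "http://<M3_COORDINATOR_HOST_NAME>:7201",
        "access": "proxy"
      }'
```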