M3DB, a distributed time-series database designed for high performance and scalability, is commonly used to power monitoring systems like Prometheus. To enhance scalability and performance, deploying M3Coordinator and M3DB nodes as separate clusters is a widely recommended practice.
In our previous post, we explored how M3DB works, the components it involves, and what the overall architecture looks like, so check it out if you want to dive deeper. In this blog, we'll walk through the deployment steps and examples for setting up separate M3Coordinator and M3DB nodes.
Why Separate M3Coordinator and M3DB Nodes?
The first question that arises is why we need separate M3Coordinator and M3DB nodes at all. Separating them brings several benefits:
- Enhanced Scalability: The query and storage layers can scale independently.
- Improved Performance: Offloading query processing to M3Coordinator reduces the load on the M3DB nodes.
- Fault Isolation: Coordinator or query-related problems have less impact on the storage cluster.
Additionally, combining M3Coordinator and M3DB into a single component within a Kubernetes cluster can lead to challenges in scaling for production environments. Scaling such a combined deployment requires increasing the replicas of both the coordinator and storage layers simultaneously, which not only raises resource usage but also significantly increases costs due to higher disk utilization.
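One practical upside: with separate deployments, scaling the query layer never touches the storage layer. As a rough sketch (using the m3-coordinator Deployment name from the manifests later in this guide), scaling the coordinator is a single command:

kubectl scale deployment m3-coordinator --replicas=3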
Let's start setting up M3DB!
Prerequisites
- Kubernetes cluster with sufficient resources, or minikube installed locally (a sample minikube command is shown below).
- Familiarity with kubectl for managing Kubernetes resources.
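If you are trying this on minikube, make sure the VM has some headroom. For example (the exact numbers are just a suggestion, adjust them to your machine):

minikube start --cpus=4 --memory=8192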
Step-by-Step Deployment Guide
Step 1: Deploy the M3DB Operator
To install the M3DB Operator, which manages M3DB deployments, run the command below:
kubectl apply -f https://raw.githubusercontent.com/m3db/m3db-operator/v0.14.0/bundle.yaml
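Before moving on, it's worth confirming the operator pod is running. A quick check (searching across all namespaces, since the namespace depends on the bundle):

kubectl get pods -A | grep m3db-operator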
Step 2: Deploy Etcd Cluster
An etcd cluster is necessary in an M3DB setup because it acts as the metadata store and provides the following essential features:
- Cluster Metadata Management
- Service Discovery
- Leader Election between M3DB Nodes
- High Availability and Fault Tolerance
To install etcd into your cluster, use the command below:
kubectl apply -f https://raw.githubusercontent.com/m3db/m3db-operator/v0.14.0/example/etcd/etcd-basic.yaml
The above command creates three etcd nodes in the cluster.
Please note that if you get an image-related error while running the above command, you can find compatible image versions here.
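To confirm the etcd cluster is healthy before continuing, you can list the pods and probe an endpoint. This sketch assumes a v3.4+ etcd image, where etcdctl defaults to the v3 API:

kubectl get pods | grep etcd
kubectl exec etcd-0 -- etcdctl endpoint health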
Step 3: Deploy M3DB Coordinator
The M3Coordinator acts as a bridge between Prometheus (or any other query client) and M3DB. It handles query processing, data aggregation, and routing: queries are processed by the M3Coordinator and sent to the relevant M3DB nodes.
apiVersion: v1
kind: ConfigMap
metadata:
  name: m3-coordinator-config
data:
  m3coordinator.yml: |
    listenAddress: 0.0.0.0:7201
    clusters:
      - namespaces:
          - namespace: "<NAMESPACE NAME>"
            retention: <RETENTION PERIOD>
            type: unaggregated
        client:
          config:
            service:
              env: default/m3db-cluster
              zone: embedded
              service: m3db
              cacheDir: /var/lib/m3kv
              etcdClusters:
                - zone: embedded
                  endpoints:
                    - http://etcd-0.etcd:2379
                    - http://etcd-1.etcd:2379
                    - http://etcd-2.etcd:2379
    metrics:
      scope:
        prefix: "coordinator"
      prometheus:
        handlerPath: /metrics
        listenAddress: 0.0.0.0:7203
      sanitization: prometheus
      samplingRate: 1.0
      extended: none
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: m3-coordinator
spec:
  replicas: 1
  selector:
    matchLabels:
      app: m3coordinator
  template:
    metadata:
      labels:
        app: m3coordinator
    spec:
      containers:
        - name: m3-coordinator
          image: quay.io/m3db/m3coordinator:v1.5.0
          ports:
            - containerPort: 7201
          volumeMounts:
            - mountPath: /etc/m3coordinator/m3coordinator.yml
              name: coordinator-config
              subPath: m3coordinator.yml
            - mountPath: /var/lib/m3kv
              name: m3kv-cache
          env:
            - name: ETCD_ENDPOINTS
              value: "http://etcd-0.etcd:2379,http://etcd-1.etcd:2379,http://etcd-2.etcd:2379"
      volumes:
        - name: coordinator-config
          configMap:
            name: m3-coordinator-config
        - name: m3kv-cache
          persistentVolumeClaim:
            claimName: m3-coordinator-cache
---
apiVersion: v1
kind: Service
metadata:
  name: m3-coordinator
spec:
  ports:
    - port: 7201
      targetPort: 7201
      name: http
    - port: 7203
      targetPort: 7203
      name: prom-http
  selector:
    app: m3coordinator
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: m3-coordinator-cache
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
Apply the configuration:
kubectl apply -f m3db-coordinator.yaml
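You can verify the rollout using the labels and names defined in the manifest above:

kubectl get pods -l app=m3coordinator
kubectl get svc m3-coordinator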
Please make sure that the <NAMESPACE NAME> defined in the above config is the same in both the m3db-coordinator and m3db-cluster YAML configurations.
Please note that this config exposes the coordinator's own internal metrics on port 7203, which can be used to monitor the coordinator itself. You can also add a custom prefix to all the internal metrics exposed by m3coordinator to customize them as per your needs.
You can also exec into the m3coordinator pod and view these metrics with a curl request:
curl <service-name>.<K8s-Namespace>.svc.cluster.local:7203/metrics
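If curl isn't available inside the pod, an alternative is to port-forward the coordinator Service and query it from your machine (assuming the m3-coordinator Service defined above):

kubectl port-forward svc/m3-coordinator 7203:7203 &
curl -s http://localhost:7203/metrics | head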
metrics:
  scope:
    prefix: "coordinator"
  prometheus:
    handlerPath: /metrics
    listenAddress: 0.0.0.0:7203
Step 4: Deploy the M3DB Cluster
M3DB nodes are responsible for data storage.
Example YAML for M3DB Cluster:
apiVersion: operator.m3db.io/v1alpha1
kind: M3DBCluster
metadata:
  name: m3db-cluster
spec:
  image: quay.io/m3db/m3dbnode:latest
  replicationFactor: 1
  numberOfShards: 1
  etcdEndpoints:
    - http://etcd-0.etcd:2379
    - http://etcd-1.etcd:2379
    - http://etcd-2.etcd:2379
  isolationGroups:
    - name: <NAME OF GROUP>
      numInstances: <NUMBER OF NODES YOU HAVE IN K8S CLUSTER>
  podIdentityConfig:
    sources: []
  namespaces:
    - name: <NAMESPACE NAME>
      options:
        retentionOptions:
          retentionPeriod: "<RETENTION PERIOD>"
          blockSize: "2h"
          bufferFuture: "10m"
          bufferPast: "10m"
  dataDirVolumeClaimTemplate:
    metadata:
      name: m3db-data
    spec:
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 1Gi
        limits:
          storage: 1Gi
  externalCoordinator:
    serviceEndpoint: "<M3COORDINATOR ENDPOINT>:7201"
    selector:
      app: m3coordinator
In the selector section, add the labels attached to your m3coordinator Deployment; the cluster selects the coordinator based on these labels. In <M3COORDINATOR ENDPOINT>, add the service endpoint of the m3coordinator deployed above. In Kubernetes, a Service can be referred to as <service-name>.<K8s-Namespace>.svc.cluster.local.
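For example, if the m3-coordinator Service from Step 3 lives in the default namespace, the section would look like this (a sketch based on the manifests above):

externalCoordinator:
  serviceEndpoint: "m3-coordinator.default.svc.cluster.local:7201"
  selector:
    app: m3coordinator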
Apply the configuration:
kubectl apply -f m3db-cluster.yaml
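The operator will then create the M3DB storage pods and register the placement in etcd. You can watch the cluster come up with the commands below (resource and pod names follow the metadata.name set above):

kubectl get m3dbcluster m3db-cluster
kubectl get pods | grep m3db-cluster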
Step 5: Configure Prometheus to send metrics to the M3Coordinator
Here, Prometheus scrapes the coordinator's own internal metrics (exposed on port 7203) and uses the coordinator's remote write and read endpoints (port 7201) to store metrics in, and query metrics from, the M3DB database.
To learn more about Prometheus and how to set it up, you can refer here.
scrape_configs:
  - job_name: 'm3coordinator'
    static_configs:
      - targets: ['<service-name>.<K8s-Namespace>.svc.cluster.local:7203']

remote_write:
  - url: "http://<service-name>.<K8s-Namespace>.svc.cluster.local:7201/api/v1/prom/remote/write"

remote_read:
  - url: "http://<service-name>.<K8s-Namespace>.svc.cluster.local:7201/api/v1/prom/read"
You can view the metrics in Prometheus by port-forwarding it to localhost.
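For example, assuming your Prometheus Service is named prometheus-server (the actual name depends on how you installed Prometheus):

kubectl port-forward svc/prometheus-server 9090:9090

Then open http://localhost:9090 in your browser.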
Conclusion
Keeping M3Coordinator and M3DB nodes separate is a best practice for managing large-scale time-series workloads and production-grade systems. This architecture enables optimized performance, independent scaling, and better fault isolation. By following the steps in this guide, you can set up and operate a robust M3 deployment tailored to your needs.
🚩 Our Recent Posts
- M3DB: The Ultimate Remote Storage Solution for Prometheus - Architecture and Key Components Explained
- How to Create a Systemd Service in Linux?
- How to Keep 99.9% Uptime for Applications Using Monitoring & Alerting?
I'm a DevOps Engineer with 3 years of experience, passionate about building scalable and automated infrastructure. I write about Kubernetes, cloud automation, cost optimization, and DevOps tooling, aiming to simplify complex concepts with real-world insights. Outside of work, I enjoy exploring new DevOps tools, reading tech blogs, and playing badminton.