Posted on 23 September 2021.
MongoDB is a popular document store. Developers like it because it lets them move fast; SREs like it for its scalability and tooling. Percona, a company specializing in database solutions, offers Percona Server for MongoDB, a MongoDB distribution with several additional features.
Among those, we have:
- Encryption at rest
- Improved monitoring
- Integration with Kubernetes
- Other integrations for authentication, auditing, logs …
We are going to see how to deploy multiple secure and reliable MongoDB clusters in Kubernetes.
Theory of operations
Thankfully, one of the features of Percona Server for MongoDB is its integration with Kubernetes: Percona provides an operator. One of the resources managed by the Percona operator is the PerconaServerMongoDB custom resource, which describes a MongoDB cluster.
So, first, we install the operator, then we deploy our first Kubernetes-managed PerconaServerMongoDB.
Installing the operator
```shell
kubectl apply -f https://raw.githubusercontent.com/percona/percona-server-mongodb-operator/main/deploy/bundle.yaml
```
As the filename suggests, it’s a bundle that installs the operator. It deploys the operator in a Kubernetes Deployment along with its CRDs, a Role, a ServiceAccount, and a RoleBinding.
Note that in our production deployment, we use the provided Helm chart rather than this installation.
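As a sketch, a chart-based installation looks roughly like this (repository URL and chart name taken from Percona's percona-helm-charts project; the release name and namespace are our own choices):

```shell
# Add Percona's Helm repository and install the operator chart.
helm repo add percona https://percona.github.io/percona-helm-charts/
helm repo update
helm install psmdb-operator percona/psmdb-operator --namespace my-namespace
```

The chart exposes the same operator, but with values you can version and review alongside the rest of your infrastructure.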
Our first PerconaServerMongoDB
Getting our first PerconaServerMongoDB up and running is as simple as running
```shell
cat <<EOF | kubectl apply -f -
apiVersion: psmdb.percona.com/v1
kind: PerconaServerMongoDB
metadata:
  name: my-cluster-name
spec:
  image: percona/percona-server-mongodb:4.4.6-8
  replsets:
  - name: rs0
    size: 3
    volumeSpec:
      persistentVolumeClaim:
        resources:
          requests:
            storage: 1Gi
EOF
```
We deploy a MongoDB replica set named rs0 with 3 replicas (size: 3), not to be confused with a Kubernetes ReplicaSet.
In further detail, the PerconaServerMongoDB operator will create a StatefulSet running 3 mongod Pods. mongod is the daemon that manages a single member of a replica set.
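To check the result, something along these lines should work (the StatefulSet and label names follow the operator's naming conventions at the time of writing; treat them as assumptions):

```shell
# The custom resource reports the cluster's overall state.
kubectl get psmdb my-cluster-name

# The operator creates one StatefulSet per replica set, here my-cluster-name-rs0.
kubectl get statefulset my-cluster-name-rs0
kubectl get pods -l app.kubernetes.io/instance=my-cluster-name
```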
For one of our projects, we wanted to deploy multiple instances of an application, each requiring its own MongoDB instance.
We chose to deploy each instance of the application along with its MongoDB instance in a separate Kubernetes namespace.
You need an operator instance per namespace
Soon we discovered that the operator only manages resources within its own namespace. The operator can be configured to watch another namespace, but not all of the cluster's namespaces.
We need to deploy one operator per namespace: an easy fix.
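As a sketch, the namespace an operator instance watches is driven by the WATCH_NAMESPACE environment variable on the operator Deployment (variable name as used by the operator at the time of writing; the namespace value below is hypothetical):

```yaml
# Excerpt of the operator Deployment: WATCH_NAMESPACE selects the
# namespace the operator manages; empty means its own namespace.
spec:
  template:
    spec:
      containers:
      - name: percona-server-mongodb-operator
        env:
        - name: WATCH_NAMESPACE
          value: app-instance-1   # hypothetical namespace name
```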
About the kubectl.kubernetes.io/last-applied-configuration annotation
As we are deploying several instances, we have a pretty big infrastructure. We chose to manage our application delivery with ArgoCD.
ArgoCD runs kubectl apply for you, but we ended up in a situation where the kubectl.kubernetes.io/last-applied-configuration annotation couldn't be set correctly because the resource specification is too long; the kube-apiserver tells us:
metadata.annotations: Too long: must have at most 262144 bytes
Our solution was to add the argocd.argoproj.io/sync-options: Replace=true annotation on our PerconaServerMongoDBs, so ArgoCD uses kubectl replace instead of kubectl apply. The Percona MongoDB operator handled that perfectly: kubectl replace doesn't recreate the whole MongoDB cluster; the operator notices the changes to the resources and updates them accordingly.
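The annotation goes on the resource's metadata; a minimal sketch, reusing the cluster name from the example above:

```yaml
apiVersion: psmdb.percona.com/v1
kind: PerconaServerMongoDB
metadata:
  name: my-cluster-name
  annotations:
    # Tell ArgoCD to sync this resource with kubectl replace.
    argocd.argoproj.io/sync-options: Replace=true
spec:
  # ... cluster specification unchanged ...
```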
Google HMAC Keys
You can configure your
PerconaServerMongoDB instances to schedule backups and tell them where to store the backups.
In our case, the destination was a Google Cloud Storage bucket, reached through its S3-compatible API.
You can do so pretty easily by adding the following snippet to your PerconaServerMongoDB spec:
```yaml
backup:
  enabled: true
  storages:
    backup:
      type: s3
      s3:
        bucket: S3-BACKUP-BUCKET-NAME-HERE
        credentialsSecret: my-cluster-name-backup-bucket
        endpointUrl: https://storage.googleapis.com
```
The my-cluster-name-backup-bucket Secret would hold:
```yaml
apiVersion: v1
kind: Secret
metadata:
  name: my-cluster-name-backup-bucket
type: Opaque
stringData:
  AWS_ACCESS_KEY_ID: GOOGXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
  AWS_SECRET_ACCESS_KEY: kVPWt2JJHRPyjtix1/p/M6TeIOKwdc7SwUCSRlIW
```
It took us some time to find the command / API calls to create the Google Cloud equivalent of AWS access keys: HMAC keys.
You create them with:
```shell
gsutil hmac create <user>@<project>.iam.gserviceaccount.com
```
Bonus: once you have those HMAC keys, you can use them with any S3 client that lets you specify a custom endpoint.
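For instance, a sketch using the AWS CLI against Google Cloud Storage (the credentials and bucket name are placeholders; --endpoint-url points the client at GCS's S3-compatible API):

```shell
# Placeholder HMAC credentials; use the output of `gsutil hmac create`.
export AWS_ACCESS_KEY_ID=GOOGXXXXXXXXXXXXXXXXXXXX
export AWS_SECRET_ACCESS_KEY=kVPWt2JJHRPyjtix1/p/M6TeIOKwdc7SwUCSRlIW

# List the backup bucket through the S3-compatible endpoint.
aws s3 ls s3://S3-BACKUP-BUCKET-NAME-HERE --endpoint-url https://storage.googleapis.com
```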
Working through our requirements, we ended up making two modest contributions:
- The first one adds support for tolerations to the Percona MongoDB Helm chart. In our setup, we deploy our MongoDB instances on dedicated nodes and therefore rely on tolerations, which weren't supported.
- The second one adds to the Percona MongoDB Helm chart the ability to configure the ServiceAccount name used to execute backups. This is needed when you want to run backups with different service accounts on different MongoDB instances.
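With those two contributions, the chart's values could look roughly like this (a sketch: the nesting follows the psmdb-db chart's replset structure, but treat the exact keys and the taint/service-account values as illustrative):

```yaml
replsets:
- name: rs0
  size: 3
  # Tolerate the taint on our dedicated database nodes.
  tolerations:
  - key: dedicated          # hypothetical taint key
    operator: Equal
    value: mongodb
    effect: NoSchedule
backup:
  enabled: true
  # Run backup jobs under a per-instance service account.
  serviceAccountName: my-backup-service-account
```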
The first one has already been merged and released!
We successfully deployed multiple reliable and secure MongoDB clusters in Kubernetes.
Percona provides a secure and manageable MongoDB distribution. They took care of integrating well in the Kubernetes ecosystem with a high-quality operator and Helm charts.
During our multi-instance deployment, we faced several challenges, and we have shared some of them here. We were also happy to contribute two small improvements to the psmdb-db Helm chart. Other than that, we were impressed with the quality of the product, its tooling, documentation, and operability.
We could also discuss monitoring and scaling our Percona MongoDB instances in Kubernetes, but that will be left for another post.