How to persist data from MongoDB running in a Kubernetes cluster

Posted December 7, 2020 5.8k views
Tags: MongoDB, Docker, Kubernetes, DigitalOcean Managed Kubernetes

Hello all! I’m still very much learning the basics of Kubernetes, so forgive this question.

I have an application deployed to a DigitalOcean Kubernetes cluster. One of the deployments in the cluster is a MongoDB database using the mongo image from Docker Hub.

I’m trying to figure out how I can persist the data such that if I restart the mongo deployment / pod the data is not lost.... Trying to google this is making me feel all kinds of dumb. I’m guessing the solution has something to do with block storage volumes? Any nudge in the right direction would be beyond appreciated!

1 comment
  • @Daxcor’s comment should point you in the right direction. I’m sure there are tutorials out there specifically geared towards Mongo+Kubernetes.

    I also recommend looking at an alternative approach: Let someone else manage your database for you. I’m sure there are plenty of reasonably priced MongoDB SaaS providers who can keep your data healthy and backed up, and you won’t have to worry about deploying anything.

    I’m not sure about Mongo, but DynamoDB and Azure Storage are so cheap, why bother with running it yourself? :-)

2 answers

I feel your pain. You need to read up on persistent volumes (PV) and persistent volume claims (PVC).

Block storage is something DO sells you.
A PV is the connection from the block storage to Kubernetes.
A PVC is a claim on that storage, in whole or in part, that a pod uses so the storage is usable by an application.

You can also keep storage on a node (droplet), but if the node goes away so does your data. Block storage -> PV -> PVC is the best way.
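
As a rough illustration of that chain, here is a minimal sketch: a claim against DO's `do-block-storage` storage class, and a pod that mounts the claim at Mongo's default data directory. The names `mongo-data`, `mongo` and `data` are made up for the example, and the 1Gi size is arbitrary.

```yaml
# Hypothetical names throughout; adjust them to your app.
# The PVC asks the storage class for a block storage volume.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mongo-data
spec:
  storageClassName: do-block-storage
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
---
# The pod references the claim and mounts it where mongod writes its data.
apiVersion: v1
kind: Pod
metadata:
  name: mongo
spec:
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: mongo-data
  containers:
    - name: mongo
      image: mongo
      volumeMounts:
        - name: data
          mountPath: /data/db
```

With this in place, deleting and recreating the pod should leave the data intact, because the data lives on the volume behind the claim rather than in the container.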

Good luck.

  • Thank you so much for this it was incredibly helpful! I finally got round to having a play this morning and I’ve got it set up and working (kind of).

    I clearly still have a gap in my understanding though and I’d be really grateful for any advice.

    I set up the block storage, a PVC and a StatefulSet for my MongoDB, and it all appears to be working great. Now if I delete and restart the pod running my mongo image, my data is persisted!

    I wanted to see what would happen if I destroyed the kubernetes cluster entirely, made a new one and applied all the same k8s manifests. My assumption was that because the name of my PVC remained the same it would connect to the same block storage as the original cluster.

    What actually happened was that a new block storage volume was created and I can now see both in my volumes tab in the DO dashboard.

    In a real-world scenario where a cluster was destroyed, how would one go about restoring data from the old volume?

    My mongo db and pvc:

    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      name: tickets-mongo-depl
      labels:
        app: tickets-mongo
    spec:
      replicas: 1
      serviceName: 'tickets-mongo'
      selector:
        matchLabels:
          app: tickets-mongo
      template:
        metadata:
          labels:
            app: tickets-mongo
        spec:
          volumes:
            - name: tickets-mongo-storage
              persistentVolumeClaim:
                claimName: tickets-db-bs-claim
          containers:
            - name: tickets-mongo
              image: mongo
              volumeMounts:
                - mountPath: '/data/db'
                  name: tickets-mongo-storage
    ---
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: tickets-db-bs-claim
    spec:
      storageClassName: do-block-storage
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 1Gi
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: tickets-mongo-srv
    spec:
      type: ClusterIP
      selector:
        app: tickets-mongo
      ports:
        - name: tickets-db
          protocol: TCP
          port: 27017
          targetPort: 27017

    Again, massively grateful for the help!

DigitalOcean provides a “storage class” that automates provisioning block storage for your cluster. That is why your new cluster got a new volume: the PVC name alone does not tie the claim to an existing volume, so the storage class simply provisioned a fresh one. I do not know this for certain, but I believe you can tell the cluster not to auto-provision. I am pretty sure you can manually attach the DO block storage volume from the previous cluster to the new cluster and recreate the PV from that existing volume. This would never be an automated procedure. You would have to dig into the docs further and learn more than just creating deployment YAML files. I haven’t done this yet. There is a great Kubernetes Slack workspace with a dedicated DigitalOcean channel, and some DO staff hang out there and help.
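
For what it's worth, here is roughly what that manual route might look like. Treat it as an untested sketch under assumptions, not a verified recipe. It assumes the cluster uses DigitalOcean's CSI driver under the name `dobs.csi.digitalocean.com` and that the driver honours a `com.digitalocean.csi/noformat` attribute to skip formatting; both are as I recall them from the driver's examples, so check its docs. The `volumeHandle` value is a placeholder for the old volume's ID from the DO dashboard or API, and the PV name `tickets-db-pv` is made up. The claim name and size come from the manifests above.

```yaml
# Pre-created PV that points at the EXISTING block storage volume.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: tickets-db-pv            # hypothetical name
  annotations:
    # Assumption: marks the PV as belonging to the DO CSI driver.
    pv.kubernetes.io/provisioned-by: dobs.csi.digitalocean.com
spec:
  storageClassName: do-block-storage
  # Retain keeps the DO volume if the claim is deleted later.
  persistentVolumeReclaimPolicy: Retain
  capacity:
    storage: 1Gi                 # should match the existing volume's size
  accessModes:
    - ReadWriteOnce
  csi:
    driver: dobs.csi.digitalocean.com
    fsType: ext4
    volumeHandle: REPLACE-WITH-EXISTING-VOLUME-ID   # placeholder
    volumeAttributes:
      # Assumption: tells the driver not to reformat the volume.
      com.digitalocean.csi/noformat: "true"
---
# Same claim name as before, but bound explicitly to the PV above
# so no new volume gets provisioned.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: tickets-db-bs-claim
spec:
  storageClassName: do-block-storage
  volumeName: tickets-db-pv
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
```

The idea is to apply the PV and PVC before the StatefulSet, so the claim is already bound to the old volume when the mongo pod starts. The `volumeName` field and the `Retain` reclaim policy are standard Kubernetes; everything specific to DigitalOcean here should be double-checked against their documentation.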