Question

How do I prepare a K8s cluster for production?

I’m planning to test K8s on DO by running a few backend applications (API, workers, and crons) and frontend applications on it. I’ve prepared Helm charts for deployment, CI to build images (currently without a deploy step), and an ingress configuration.

I’m wondering what else is recommended to configure or set up on the cluster before running my applications in production. I want to keep it as simple as possible, but it should still be secure and reliable enough.

Thanks in advance!



John Kwiatkoski
DigitalOcean Employee
September 29, 2020
Accepted Answer

Hi there,

I would make sure you have the following in place before moving to production. Note that this list is not exhaustive and some items may not apply to you, but these are the ones I often see forgotten:

  • Fault tolerance of your application. Identify any single points of failure and remove them where possible.
  • Certificate management for any available and exposed services.
  • Backup of the cluster objects. Data loss can happen anywhere and on any platform. Ensure you have a backup and restore procedure.
  • Maintenance. Identify any potential maintenance activities, work out what they would look like, and ensure they can be done on a managed service.
  • Familiarity with DO API tokens and permissions. If a developer leaves, are you familiar with how to revoke the API token associated with their kubeconfig?
  • Upgrades. Get your upgrade cadence planned out. DOKS replaces nodes on upgrade, so make sure the rescheduling of those workloads doesn't temporarily overload the other nodes. Plan out your capacity.
  • Set requests and limits wherever possible. This is really helpful for capacity planning, and good requests and limits can prevent the node overload described in the previous bullet.
  • Storage. Verify your storage configuration prior to going GA. Ensure your PV mount points match where data is actually being written in the container. I have seen users accidentally mount the PV to the wrong directory and then lose data, because the data was written to ephemeral storage and lost upon the next pod restart.
  • Scaling plan. Make sure you know how to deal with increased load on your cluster, whether that's using an HPA for pods or cluster autoscaling for nodes.
  • CI/CD. Sounds like you already version control your deployments.
  • Portability. How easy would it be for you to pivot cloud providers if a catastrophic failure or natural disaster strikes a DC? Being able to quickly shift your application to a different region or even a different provider can save a lot of headache down the road.
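
To make the upgrade and requests points above a bit more concrete, here is a rough Python sketch of a capacity check: it sums the CPU and memory requests of your pods and compares them with what is left when the largest node is unavailable. Everything in it is an assumption for illustration: the `Node` and `Pod` classes, the function names, and the example numbers are hypothetical, and it assumes only one node is out at a time. It also only compares totals, so passing the check is necessary but not sufficient (real scheduling has to fit each pod on a single node).

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    cpu_millicores: int  # allocatable CPU on the node (hypothetical value)
    memory_mib: int      # allocatable memory on the node (hypothetical value)


@dataclass
class Pod:
    name: str
    cpu_request: int     # CPU request in millicores
    memory_request: int  # memory request in MiB


def headroom_after_node_loss(nodes, pods):
    """Return (cpu, memory) left over when the largest node is unavailable.

    Negative values mean the summed requests no longer fit on the
    remaining nodes. Only totals are compared; per-node packing is ignored.
    """
    if not nodes:
        raise ValueError("at least one node is required")
    remaining_cpu = sum(n.cpu_millicores for n in nodes) - max(
        n.cpu_millicores for n in nodes
    )
    remaining_mem = sum(n.memory_mib for n in nodes) - max(
        n.memory_mib for n in nodes
    )
    requested_cpu = sum(p.cpu_request for p in pods)
    requested_mem = sum(p.memory_request for p in pods)
    return remaining_cpu - requested_cpu, remaining_mem - requested_mem


def fits_during_upgrade(nodes, pods):
    """True if the summed requests fit while one (the largest) node is out."""
    cpu, mem = headroom_after_node_loss(nodes, pods)
    return cpu >= 0 and mem >= 0


# Hypothetical three-node pool and workload, for illustration only.
EXAMPLE_NODES = [Node(f"pool-{i}", 1900, 3000) for i in range(3)]
EXAMPLE_PODS = [
    Pod("api-1", 500, 512),
    Pod("api-2", 500, 512),
    Pod("worker-1", 750, 1024),
    Pod("worker-2", 750, 1024),
    Pod("frontend", 250, 256),
]
```

With these made-up numbers, `fits_during_upgrade(EXAMPLE_NODES, EXAMPLE_PODS)` is true, while the same workload on only two of the nodes is not. A check like this only works if requests are actually set on your workloads, which is one more reason to set them.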

Hope this helps a bit!

Regards,

John
