We are running GPU nodes in our cluster and are seeing slow startup times for ML pods. One possible solution is to use Nydus or Stargz (stargz-snapshotter) for fast container startup through lazy image pulling, and Dragonfly or Spegel for P2P layer caching. However, none of these work in DOKS (DigitalOcean Kubernetes), because they require modifying the container runtime configuration (containerd/config.toml), which is not accessible on managed nodes. What would be the best approach to implementing these solutions in a managed cluster?
https://github.com/dragonflyoss/dragonfly
Hi there,
As far as I am aware, managed DigitalOcean Kubernetes (DOKS) does not let you modify the containerd configuration on the nodes.
Since these tools all require changes at the runtime layer, the best next step is to reach out to DigitalOcean support and ask whether there is a workaround, whether support for alternative snapshotters is planned, or whether this can be submitted as a feature request.
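If you do open a ticket, it may help to include the container runtime version your nodes report. That value is readable through the Kubernetes API even when config.toml is not. Here is a minimal sketch, assuming you have the JSON from `kubectl get nodes -o json` loaded as a dict; the function name, node name, and version string in the sample are made up for illustration.

```python
from typing import Dict


def runtime_versions(node_list: dict) -> Dict[str, str]:
    """Map each node name to the container runtime version it reports.

    Expects a dict shaped like the output of `kubectl get nodes -o json`.
    Nodes without a reported runtime version are mapped to "unknown".
    """
    versions = {}
    for node in node_list.get("items", []):
        name = node["metadata"]["name"]
        info = node.get("status", {}).get("nodeInfo", {})
        versions[name] = info.get("containerRuntimeVersion", "unknown")
    return versions


# Hypothetical sample; real node names and versions will differ.
sample = {
    "items": [
        {
            "metadata": {"name": "gpu-pool-a1"},
            "status": {
                "nodeInfo": {"containerRuntimeVersion": "containerd://1.6.20"}
            },
        }
    ]
}

print(runtime_versions(sample))
```

With a real cluster you would replace `sample` with something like `json.load(sys.stdin)` and pipe the kubectl output in. This only reads what the nodes report; it does not change anything on them.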