© 2026 DigitalOcean, LLC.
Product updates

NVIDIA GTC 2026 Confirmed It: The Inference Era Is Here

By Meghan Grady

Head of Marketing & Communications

  • Published: March 27, 2026

Last week at NVIDIA GTC 2026, one message was clear: AI has moved beyond the training era and into the era of production inference. The conversation was no longer just about building faster chips and smarter models; it was about what it takes to run AI at scale with the latency, reliability, and economics real products demand. Reuters called it an “inference boom,” and even the CPU became part of the conversation again as inference workloads push the industry to optimize the full system, not just the accelerator.

That shift matters because inference is where AI becomes a business. Training ushered in this wave of AI innovation; inference is what turns that innovation into real products and real customer experiences. It is where cost per token, time to first token, orchestration, and uptime start to matter just as much as model quality.

GTC made it clear that the industry is moving beyond chips to the broader infrastructure required to support AI-native companies. As inference becomes the operational layer of AI, the conversation has moved toward a cohesive system spanning chips, platforms, models, and applications, which maps directly to what customers are asking us for today. Rather than making isolated infrastructure decisions, businesses are looking for ways to run AI in production that manage latency, improve token economics, and reduce operational complexity. This need is especially pressing as AI agents evolve from a new application pattern into a core infrastructure requirement, demanding fast, secure systems capable of supporting constant activity and real-world workloads at scale.

That is the backdrop for what we announced with NVIDIA last week and the vision for the DigitalOcean Agentic Inference Cloud. Across infrastructure, platform, and deployment, the focus was the same: help AI builders move from experimentation to production with less friction. We introduced a new Richmond data center purpose-built for AI inference, featuring NVIDIA HGX B300 systems and a 400 Gbps non-blocking RDMA fabric for demanding reasoning and agentic workloads. We’re bringing NVIDIA Dynamo 1.0 to DigitalOcean Kubernetes and expanding model access with new options optimized for reasoning, long-context, multimodal, and agentic use cases. And we’re making it easier to build and deploy always-on agents through NVIDIA NemoClaw and the NVIDIA Agent Toolkit, in two ways: seamless deployment of agents and models from build.nvidia.com to DigitalOcean Serverless Inference, and a 1-Click Droplet that simplifies and shortens setup for NVIDIA NemoClaw.

We have already begun to see the momentum firsthand. When OpenClaw took off, our team moved quickly to make it easier for builders to put it to work in production. Since then, DigitalOcean has seen more than 43,000 OpenClaw deployments, with strong adoption from teams building always-on assistants and agentic applications.

If these themes resonate with you, I hope you’ll join us at DigitalOcean Deploy on April 28, 2026, in San Francisco. We’re bringing together leaders from NVIDIA, VAST Data, vLLM, Arcee AI, Character.AI, Workato, and more to share practical lessons on what it takes to run AI inference at scale, from real-world architecture and performance to economics and operational efficiency.


