Hippocratic AI

How Hippocratic AI Holds a 99.9% Safety Score Across 10 Million Patient Calls on DigitalOcean's AI-Native Cloud

At the heart of the patient experience is a hardware and software stack you can trust. I'm completely convinced all systems will fail and all nodes fail. What matters is having partners when things break, and how fast you can get back up. And that has been a great experience with DigitalOcean.

Debajyoti Datta

Hippocratic AI, Co-Founder

Hippocratic AI builds generative AI agents that call patients, walk them through post-surgery recovery plans, check in on chronic disease management, and help close care gaps that would otherwise go unaddressed. The company’s Polaris constellation architecture uses a primary model to lead each patient conversation while more than twenty specialized support models run concurrently alongside it, reducing hallucination, surfacing clinical evidence, and cross-checking the primary model’s output for safety. With more than 180 million patient interactions to date across chronic disease management, medication adherence, care gap closure, and clinical scheduling, Hippocratic AI is operating at a scale where the line between infrastructure performance and patient safety disappears. Working in close collaboration with DigitalOcean on NVIDIA architecture, Hippocratic AI has sustained a 99.9% clinical safety score across more than 10 million real patient calls while doubling production inference throughput at the latency required for live clinical conversations.

A medication adherence call that drops mid-sentence is not a user experience flaw. It is a clinical interruption. The production stack that DigitalOcean engineered, running on NVIDIA Hopper and Blackwell Ultra hardware and informed by Hippocratic AI’s clinical requirements, has delivered.

The results Hippocratic AI achieved on DigitalOcean’s AI-Native Cloud:

  • 2x production inference throughput: Enabled by platform-level inference optimizations using NVIDIA H200 and B300 GPUs.

  • 40% reduction in end-to-end P99 latency: Achieved by combining DigitalOcean’s infrastructure with Hippocratic AI’s model-level optimizations.

  • 2x reduction in prefill latency: On long-context clinical sessions compared to prior-generation stateless serving configurations.

  • ~30% higher per-node throughput: Driven by hardware-aware scheduling, quantization methods such as NVFP4 on NVIDIA B300 GPU nodes, and custom kernels.
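
For readers less familiar with the P99 figure in the list above: P99 latency is the value that 99% of requests come in under, so it describes the slowest calls rather than the typical one. The sketch below uses the nearest-rank definition on made-up sample values. The function name and the numbers are illustrative assumptions, not Hippocratic AI's measurement method or data.

```python
def percentile_nearest_rank(samples_ms, pct):
    """Return the pct-th percentile (integer pct, 1-100) by the nearest-rank method."""
    if not samples_ms:
        raise ValueError("need at least one sample")
    if not 1 <= pct <= 100:
        raise ValueError("pct must be between 1 and 100")
    ordered = sorted(samples_ms)
    # Integer form of ceil(pct * n / 100), which avoids floating-point rounding.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[rank - 1]


# Hypothetical latencies: 98 fast responses and 2 slow ones.
hypothetical_ms = [300] * 98 + [2000] * 2
median_ms = percentile_nearest_rank(hypothetical_ms, 50)  # typical call
p99_ms = percentile_nearest_rank(hypothetical_ms, 99)     # tail call
```

In this made-up sample the median looks healthy while the P99 exposes the slow tail, which is why tail latency is the figure that matters for a live voice conversation.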

Choosing a Cloud Partner for Healthcare AI

Hippocratic AI’s Polaris system orchestrates a constellation of 22 specialized large language models totaling 4.2 trillion parameters. These models run real-time voice and text interactions with patients, where each conversation demands sub-second responsiveness and zero tolerance for mid-session failure. The system has sustained a 99.9% clinical safety score and an average patient satisfaction rating of 8.95 out of 10 across more than 10 million real patient calls, validated by more than 7,500 clinical staff.
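
The constellation pattern can be pictured with a small sketch: a primary model drafts a reply, several support checks review the draft concurrently, and the reply is released only if every check passes. Everything here is a hypothetical stand-in (the checker functions, the fallback message, and the string rules). The source does not describe how Polaris is implemented, and real support models would be separate model calls rather than string tests.

```python
import asyncio

FALLBACK = "Let me connect you with a member of your care team."


async def check_medication(draft: str) -> bool:
    await asyncio.sleep(0)  # stand-in for a call to a support model
    return "double your dose" not in draft.lower()


async def check_escalation(draft: str) -> bool:
    await asyncio.sleep(0)  # stand-in for a call to a support model
    return "ignore chest pain" not in draft.lower()


SUPPORT_CHECKS = [check_medication, check_escalation]


async def primary_model(patient_utterance: str) -> str:
    await asyncio.sleep(0)  # stand-in for the primary model
    return f"Thanks for sharing that. You said: {patient_utterance}"


async def respond(patient_utterance: str, primary=primary_model) -> str:
    draft = await primary(patient_utterance)
    # Run all support checks on the draft at the same time.
    results = await asyncio.gather(*(check(draft) for check in SUPPORT_CHECKS))
    # Release the draft only if every check approved it.
    return draft if all(results) else FALLBACK
```

Calling `asyncio.run(respond("I feel fine"))` returns the primary model's draft, while a draft that trips any check is replaced by the fallback message. This sketch runs the checks after the draft is complete, whereas the system described above runs its support models alongside the primary model during the conversation.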

To maintain a median time-to-first-token of 400 milliseconds at production scale, Hippocratic AI operates on the latest GPU hardware available. The company runs a multi-cloud infrastructure internally, with different model architectures requiring different GPU types.

“NVIDIA has incredible hardware, Hopper, Blackwell GPUs, and DigitalOcean has been literally one of our main partners to get to this hardware the fastest,” says Debajyoti Datta, Co-Founder, Hippocratic AI.

Hippocratic AI had been looking for cloud partners to help it scale, and DigitalOcean proved to be one of the fastest paths to the newest NVIDIA hardware. DigitalOcean provided early access to NVIDIA HGX™ B300 GPU nodes and immediate access to NVIDIA H200 nodes, along with hands-on engineering support on a platform tuned for sustained inference workloads. Hippocratic AI’s team rolled production workloads onto the NVIDIA GPUs through DigitalOcean, and the partnership deepened from there.

“The collaboration is based on the fact that we have to be on the latest hardware with the best inference stack,” Datta says.

Developing an Inference Stack for Patient Safety

Over the past year, DigitalOcean worked in close collaboration with Hippocratic AI and NVIDIA to optimize every layer of the inference path. Informed by Hippocratic AI’s real-world production requirements, and with early access to NVIDIA HGX™ B300 GPUs and deep architectural support across Hopper and Blackwell, DigitalOcean engineered its AI-Native Cloud to meet this bar. The result: hardware-aware scheduling, optimized inference runtimes for sustained high-concurrency workloads, and platform-level support for FP8 and NVFP4 quantization, custom MoE kernels, KV-cache optimization, and a cache-aware routing architecture that maximizes KV-cache hit rate and context reuse across long-horizon clinical sessions.
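
To illustrate the idea behind cache-aware routing, the sketch below sends every request from the same session to the same replica, so the KV cache built earlier in a long conversation can be reused rather than recomputed. It uses rendezvous (highest-random-weight) hashing, a general-purpose technique chosen here for illustration. The source does not say how DigitalOcean's router works, and the session and replica names are hypothetical.

```python
import hashlib


def _score(session_id: str, replica: str) -> int:
    """Deterministic pseudo-random weight for a (session, replica) pair."""
    digest = hashlib.sha256(f"{session_id}|{replica}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def pick_replica(session_id: str, replicas: list[str]) -> str:
    """Pick the replica with the highest weight for this session."""
    if not replicas:
        raise ValueError("no replicas available")
    return max(replicas, key=lambda replica: _score(session_id, replica))


replicas = ["gpu-node-a", "gpu-node-b", "gpu-node-c"]  # hypothetical names
first = pick_replica("session-123", replicas)
second = pick_replica("session-123", replicas)  # same session, same replica
```

A useful property of this scheme is that removing one replica only moves the sessions that were assigned to it; every other session keeps its replica and therefore its warm cache. A production router would likely also weigh load and actual prefix overlap, which this sketch ignores.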

The combined result on long-context clinical sessions is approximately 30% higher per-node throughput and a 2x reduction in prefill latency compared to a prior-generation stateless serving configuration. These gains compound the production improvements Hippocratic AI announced at DigitalOcean Deploy in April 2026, where the company reported a 2x improvement in production inference throughput and a 40% reduction in end-to-end P99 latency.
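
As a rough illustration of what a per-node throughput gain means for fleet size, the sketch below computes how many nodes a fixed number of concurrent sessions would need before and after a 30% uplift. The baseline of 100 sessions per node and the 10,000-session target are invented for the arithmetic, and treating throughput as proportional to sessions per node is a simplifying assumption.

```python
def nodes_needed(concurrent_sessions: int, sessions_per_node: int) -> int:
    """Smallest node count that can hold the given number of sessions."""
    if sessions_per_node <= 0:
        raise ValueError("sessions_per_node must be positive")
    # Ceiling division using integers only.
    return -(-concurrent_sessions // sessions_per_node)


BASELINE_PER_NODE = 100                              # hypothetical baseline
UPLIFTED_PER_NODE = BASELINE_PER_NODE * 130 // 100   # 30% higher per-node throughput
TARGET_SESSIONS = 10_000                             # hypothetical target

nodes_before = nodes_needed(TARGET_SESSIONS, BASELINE_PER_NODE)
nodes_after = nodes_needed(TARGET_SESSIONS, UPLIFTED_PER_NODE)
```

Under these made-up inputs the same load fits on 77 nodes instead of 100. Read the other way, the same fleet can carry about 30% more concurrent sessions.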

For Hippocratic AI, these are not abstract benchmarks. Meeting the latency target means the system can generate thinking tokens mid-call, a capability made possible by the 2x throughput improvement from the updated hardware and software stack. It also means more concurrent patient sessions at the same quality level, and the ability to scale from a pilot to a population.

“The demands of safety-critical AI workloads are fundamentally different from consumer applications. DigitalOcean and Hippocratic AI are demonstrating how tightly integrated infrastructure and inference optimization, built on NVIDIA H200 and B300 hardware, can deliver both performance and reliability at scale,” explains Dave Salvator, Director of Accelerated Computing Products, NVIDIA.

Hippocratic AI is also one of the first production customers operating on NVIDIA HGX™ B300 hardware through DigitalOcean’s collaboration with NVIDIA. For workloads where every token affects clinical experience, NVIDIA Blackwell Ultra unlocks a step-change in capacity per node. It allows Hippocratic AI to support more concurrent sessions at the same latency targets and to extend context windows on long clinical conversations. The NVIDIA B300 GPU nodes also enable newer quantization methods like NVFP4, delivering measurable differences in both throughput and latency that translate directly into patient experience.
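
For a sense of what 4-bit block quantization does to model weights, the sketch below rounds each value to the nearest number representable in a 4-bit E2M1 float after scaling each 16-element block by its largest magnitude. It is a simplified illustration of the general approach, not NVIDIA's implementation: the block scale is kept as an ordinary Python float here, whereas NVFP4 as publicly described stores block scales in FP8 and adds a per-tensor scale. All function names are hypothetical.

```python
# Magnitudes representable by a 4-bit E2M1 float (sign handled separately).
E2M1_MAGNITUDES = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)
BLOCK_SIZE = 16  # values that share one scale factor


def quantize_block(block):
    """Return (scale, codes), where each code is a signed E2M1 value."""
    amax = max(abs(x) for x in block)
    # Map the largest magnitude in the block onto 6.0, the largest E2M1 value.
    scale = amax / 6.0 if amax > 0 else 1.0
    codes = []
    for x in block:
        target = abs(x) / scale
        magnitude = min(E2M1_MAGNITUDES, key=lambda m: abs(target - m))
        codes.append(-magnitude if x < 0 else magnitude)
    return scale, codes


def dequantize_block(scale, codes):
    return [scale * code for code in codes]


def fake_quantize(values):
    """Quantize then dequantize, block by block, to expose the rounding error."""
    out = []
    for start in range(0, len(values), BLOCK_SIZE):
        scale, codes = quantize_block(values[start:start + BLOCK_SIZE])
        out.extend(dequantize_block(scale, codes))
    return out
```

Values that already sit on the 4-bit grid survive unchanged, while a value such as 2.4 in a block whose maximum is 6.0 is rounded to 2.0. Storing fewer bits per weight is what allows more model and more context to fit on each GPU, at the cost of this kind of rounding.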

The hardware matters because Hippocratic AI’s conversations are not short chatbot interactions. A chronic disease management call can run 45 minutes. If a technical failure occurs at minute 44, that entire session and the clinical value it carried are lost. Reliable infrastructure is what separates a demo from a deployed healthcare product.

When Things Break, Your AI Cloud Matters

Datta is candid about the realities of operating at this scale. GPU infrastructure is evolving. Driver updates and node interruptions are constants, and the speed of recovery is what counts.

“What I deeply care about is when it breaks, do we have reliable partners with whom we can scale? The DigitalOcean team has been great with us,” Datta says.

DigitalOcean provides hands-on support for driver updates and node maintenance, with rapid replacement when hardware goes down.

When Hippocratic AI has specific questions about the inference stack, the hardware upgrades, or the timeline for new node availability, DigitalOcean’s engineering team works through them directly.

“Even little things, like the support for our nodes, when we need very specific questions about the inference stack and the hardware. DigitalOcean has been great in that regard,” Datta says.

Paddy Srinivasan, DigitalOcean’s CEO, and the company’s leadership team have engaged directly with Hippocratic AI on infrastructure decisions and scaling needs.

180 Million Interactions and Growing

Hippocratic AI has now processed more than 180 million patient interactions across its clinical workflows. Behind that number are patients who received a timely check-in about their medication, or a follow-up call after surgery that caught a worsening symptom before it became an emergency room visit. Hippocratic AI’s agents have helped surface critical care gaps for patients managing chronic conditions, and the team internally emphasizes that this happens hundreds of times every day. The infrastructure that keeps those conversations running without interruption is inseparable from the clinical outcomes they produce.

“What Hippocratic AI has built in healthcare AI is remarkable — hundreds of millions of real patient interactions across some of the most complex and sensitive moments in people’s lives. Delivering that at 99.9% clinical safety is what production AI looks like when it matters most. This is what purpose-built inference delivers, and it’s what our AI-Native Cloud makes possible. Hippocratic AI’s results are the proof,” Srinivasan says.

The company continues to expand its footprint on DigitalOcean, adding nodes regularly. It plans to adopt newer hardware releases and GPUs as they become available through the platform.

The significance of this work extends beyond infrastructure metrics. Hippocratic AI is simultaneously a technology company and a healthcare company. Its engineers publish in top AI research journals. Its clinical staff evaluate every model output against real patient safety standards. The result is a system where infrastructure and clinical validation serve a single outcome: keeping patients safe.

What began as a way to staff the calls that health systems couldn’t has become something larger: a foundation for delivering clinical care at the scale of entire populations, built on infrastructure that grows with ambition.

“I’m very bullish on generative AI applications everywhere, and healthcare presents so many great opportunities. I think this is such a great time to build,” says Datta.
