Building production-ready retrieval-augmented generation (RAG) systems is complex and time-consuming, and often requires months of engineering effort. Developers and enterprises struggle to ingest diverse data sources, structure content for semantic search, and maintain accurate, verifiable answers.
Enhancements to DigitalOcean Gradient™ AI Knowledge Bases, now in public preview, are designed to solve this problem. The code-first approach lets developers create, manage, and query knowledge bases entirely from code, giving them full control over ingestion, chunking, embedding, and retrieval without having to manage the underlying infrastructure.
Many existing solutions let developers create a basic knowledge base, but the result is often hard to scale, customize, or integrate into production workflows. The improvements address this by providing a code-first, developer-focused toolkit that handles the full knowledge base lifecycle. Developers can ingest data from files, Dropbox, and web crawlers, control chunking and embedding strategies, and run natural language queries that return citation-backed answers, optionally narrowed by metadata filters. Documented APIs and SDKs let developers manage all of this entirely in code.
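To make "control chunking" concrete, here is a minimal Python sketch of one common strategy: fixed-size character windows with overlap. It is a generic illustration, not the Knowledge Bases API; the function name `chunk_text` and its parameters are hypothetical, and the chunking strategies the product actually offers may work differently.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that overlap.

    Hypothetical helper for illustration only; not part of any
    DigitalOcean SDK.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in the range [0, chunk_size)")

    step = chunk_size - overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text, so the
        # tail is not repeated as a smaller, fully redundant chunk.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    for chunk in chunk_text("abcdefghij", chunk_size=4, overlap=2):
        print(chunk)
```

Overlap trades some storage for context: a sentence that straddles a chunk boundary still appears whole in at least one chunk, which tends to help retrieval quality.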
The public preview highlights the essential tools developers need to build and manage knowledge bases effectively:
Direct API access: Query knowledge bases directly, without needing an agent, for full control when integrating into apps or RAG pipelines.
Customizable ingestion: Ingest content from supported sources such as files, web crawlers, Dropbox, and JSON datasets. Supports structured data, sitemap crawling, and accurate parsing of complex PDFs.
Flexible chunking and embedding: Choose the chunking strategy that fits your content and select from high-performance embedding models, including a multilingual model. Intelligent defaults allow for quick setup.
Advanced retrieval and citations: Run queries with exact-page citations, metadata filters, and hybrid search.
Developer-first tooling: Fully code-driven SDK and API functions make creation and integration seamless.
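The retrieval features listed above, hybrid search and metadata filters, can be illustrated with a small Python sketch. The announcement does not say how hybrid search is implemented, so this example assumes reciprocal rank fusion (RRF), one widely used way to merge a keyword ranking with a vector-similarity ranking. All function names, document IDs, and metadata fields here are hypothetical and are not part of the Knowledge Bases API.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of document IDs into one ranking.

    Each document scores 1 / (k + rank) for every list it appears in;
    higher totals rank first. Ties are broken by document ID.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


def filter_by_metadata(doc_ids: list[str],
                       metadata: dict[str, dict[str, str]],
                       required: dict[str, str]) -> list[str]:
    """Keep only documents whose metadata matches every required field."""
    return [
        doc_id for doc_id in doc_ids
        if all(metadata.get(doc_id, {}).get(key) == value
               for key, value in required.items())
    ]


if __name__ == "__main__":
    # Made-up rankings standing in for keyword and vector search results.
    keyword_ranking = ["a", "b", "c"]
    vector_ranking = ["b", "c", "a"]
    # Made-up metadata; a "page" field is the kind of detail a citation could use.
    metadata = {
        "a": {"lang": "en", "page": "3"},
        "b": {"lang": "de", "page": "7"},
        "c": {"lang": "en", "page": "12"},
    }

    fused = reciprocal_rank_fusion([keyword_ranking, vector_ranking])
    english_only = filter_by_metadata(fused, metadata, {"lang": "en"})
    print(fused)
    print(english_only)
```

RRF uses only rank positions, so keyword and vector scores never need to be normalized against each other; that makes it a convenient default when the two retrievers produce scores on different scales.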
The improvements are available in public preview. Start building smarter AI applications faster by managing your knowledge bases entirely in code. Explore the API documentation to start experimenting, and see how quickly you can turn your data into actionable, context-rich answers.
To explore the Knowledge Base improvements, enable the public preview on your Feature Preview page in the DigitalOcean Cloud Console. Once you’ve opted in, access will be granted within approximately 10–15 minutes.