The Convex RAG component enables developers to build retrieval-augmented generation systems with vector search and document embeddings in their apps.
npm install @convex-dev/rag
A component for semantic search, usually used to look up context for LLMs. Use with an Agent for Retrieval-Augmented Generation (RAG).
Key Features
Add Content: Add or replace content with text chunks and embeddings.
Semantic Search: Vector-based search using configurable embedding models.
Namespaces: Organize content into namespaces for per-user search.
Custom Filtering: Filter content with custom indexed fields.
Importance Weighting: Weight content by providing a 0 to 1 "importance".
Chunk Context: Get surrounding chunks for better context.
Graceful Migrations: Migrate content or whole namespaces without disruption.
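To make the "Chunk Context" feature concrete, here is a minimal sketch of expanding a matched chunk with its neighbors. The helper `withChunkContext` and its parameters are hypothetical names used for illustration only; they are not the component's API, and the component's own behavior may differ.

```typescript
// Hypothetical helper, not part of @convex-dev/rag: returns the matched chunk
// plus up to `before` chunks preceding it and `after` chunks following it.
function withChunkContext(
  orderedChunks: string[],
  hitIndex: number,
  before: number,
  after: number,
): string[] {
  if (hitIndex < 0 || hitIndex >= orderedChunks.length) {
    throw new RangeError(`hitIndex ${hitIndex} is out of bounds`);
  }
  // Clamp the window to the document boundaries.
  const start = Math.max(0, hitIndex - before);
  const end = Math.min(orderedChunks.length, hitIndex + after + 1); // slice end is exclusive
  return orderedChunks.slice(start, end);
}

const sections = ["Intro", "Setup", "Usage", "Limits", "FAQ"];
// The search matched "Usage" (index 2); include one chunk on each side.
console.log(withChunkContext(sections, 2, 1, 1).join(" | "));
// prints "Setup | Usage | Limits"
```

Returning neighbors in document order lets the LLM see a sentence or paragraph that was split across chunk boundaries.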
The @convex-dev/rag component provides vector search and document embedding APIs that work directly with Convex functions. You can store documents, generate embeddings, and retrieve relevant context for your chatbot responses using the built-in similarity search.
This component eliminates the need for separate vector databases by integrating embeddings directly into Convex. It handles document chunking, embedding generation, and semantic search through simple function calls in your Convex backend.
The RAG component lets you index your documents and perform semantic searches to find relevant context before LLM queries. It supports real-time updates to your knowledge base and provides ranked results based on vector similarity.
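Ranking by vector similarity can be pictured with a short, self-contained sketch. This assumes cosine similarity as the metric and uses toy 2-dimensional embeddings; the type `EmbeddedDoc` and the functions below are illustrative names, not the component's internals.

```typescript
// Illustrative only: a brute-force ranking over in-memory embeddings.
type EmbeddedDoc = { id: string; embedding: number[] };

function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same dimension");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // Treat a zero vector as having no similarity to anything.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

function rankBySimilarity(query: number[], docs: EmbeddedDoc[], limit: number) {
  return docs
    .map((doc) => ({ id: doc.id, score: cosineSimilarity(query, doc.embedding) }))
    .sort((x, y) => y.score - x.score) // highest score first
    .slice(0, limit);
}

const toyDocs: EmbeddedDoc[] = [
  { id: "a", embedding: [1, 0] },
  { id: "b", embedding: [0, 1] },
  { id: "c", embedding: [1, 1] },
];
console.log(rankBySimilarity([1, 0], toyDocs, 2).map((r) => r.id).join(", "));
// prints "a, c"
```

A real vector index avoids scanning every document, but the ordering principle is the same: results closest to the query embedding come first.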
The @convex-dev/rag component supports OpenAI's text embedding models and other popular embedding providers through configurable adapters. You can specify which model to use when initializing the RAG system in your Convex functions.
The Convex RAG component automatically handles incremental updates through Convex's reactive data layer. When documents change, only the affected embeddings are regenerated, and the vector index updates in real time without full re-indexing.
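One common way to regenerate only the affected embeddings is to compare a hash of each document's content against the hash stored at the last indexing run. The sketch below illustrates that idea under stated assumptions: the hash function (an FNV-1a-style hash over UTF-16 code units) and the helper `keysNeedingReembedding` are hypothetical, and this is not necessarily how the component detects changes.

```typescript
// Illustrative 32-bit FNV-1a-style hash over UTF-16 code units.
function contentHash(text: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // keep the value an unsigned 32-bit integer
  }
  return hash;
}

// Hypothetical helper: returns the keys whose content is new or has changed
// since the hashes in `storedHashes` were recorded.
function keysNeedingReembedding(
  storedHashes: Map<string, number>,
  incoming: Map<string, string>,
): string[] {
  const changed: string[] = [];
  incoming.forEach((text, key) => {
    if (storedHashes.get(key) !== contentHash(text)) changed.push(key);
  });
  return changed;
}

const previouslyIndexed = new Map<string, number>([
  ["intro", contentHash("Hello")],
  ["faq", contentHash("Old answer")],
]);
const latest = new Map<string, string>([
  ["intro", "Hello"], // unchanged, so it is skipped
  ["faq", "New answer"], // edited
  ["pricing", "Free tier"], // newly added
]);
console.log(keysNeedingReembedding(previouslyIndexed, latest).join(", "));
// prints "faq, pricing"
```

Skipping unchanged content saves embedding API calls, which are usually the slowest and most expensive part of indexing.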
The @convex-dev/rag component provides configurable chunking strategies including fixed-size, semantic, and custom splitting methods. You can adjust chunk size, overlap, and splitting logic based on your document types and use case requirements.
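As an illustration of how chunk size and overlap interact, here is a minimal fixed-size splitter. The function `chunkText` is a hypothetical, character-based sketch written for this example; it is not the component's chunker, which may split on different boundaries.

```typescript
// Hypothetical fixed-size chunker: each chunk holds up to `chunkSize`
// characters and shares `overlap` characters with the previous chunk.
function chunkText(text: string, chunkSize: number, overlap: number): string[] {
  if (chunkSize <= 0) {
    throw new RangeError("chunkSize must be positive");
  }
  if (overlap < 0 || overlap >= chunkSize) {
    throw new RangeError("overlap must be at least 0 and less than chunkSize");
  }
  const step = chunkSize - overlap; // how far the window advances each time
  const pieces: string[] = [];
  for (let start = 0; start < text.length; start += step) {
    pieces.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // this chunk reached the end
  }
  return pieces;
}

console.log(chunkText("abcdefghij", 4, 1).join(" | "));
// prints "abcd | defg | ghij"
```

A larger overlap reduces the chance that a relevant sentence is cut in half, at the cost of storing and embedding more text.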
The Convex RAG component works with streaming LLM APIs by supplying the retrieved context up front. You can use the search results to augment your prompts before streaming responses through Convex actions or HTTP endpoints.
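Augmenting a prompt with retrieved context might look like the sketch below. The `RetrievedChunk` shape, the `buildAugmentedPrompt` helper, and the prompt wording are all assumptions made for illustration; the actual search results returned by the component may be structured differently.

```typescript
// Hypothetical result shape for this example only.
type RetrievedChunk = { text: string; score: number };

// Builds a prompt that places the highest-scoring chunks ahead of the question.
function buildAugmentedPrompt(
  question: string,
  retrieved: RetrievedChunk[],
  maxChunks = 3,
): string {
  const contextBlock = [...retrieved] // copy so the caller's array is not reordered
    .sort((a, b) => b.score - a.score)
    .slice(0, maxChunks)
    .map((chunk, i) => `[${i + 1}] ${chunk.text}`)
    .join("\n");
  return [
    "Answer the question using only the context below.",
    "",
    "Context:",
    contextBlock,
    "",
    `Question: ${question}`,
  ].join("\n");
}

const prompt = buildAugmentedPrompt("How do I reset my password?", [
  { text: "Billing runs on the first of each month.", score: 0.41 },
  { text: "Passwords can be reset from the account settings page.", score: 0.92 },
]);
console.log(prompt);
```

Because the full prompt is assembled before the LLM call begins, the same string can be passed to a streaming or non-streaming completion API without changes.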