RAG

The Convex RAG component lets developers build retrieval-augmented generation systems with vector search and document embeddings in their apps.

Installation

npm install @convex-dev/rag

About RAG

A component for semantic search, usually used to look up context for LLMs. Use with an Agent for Retrieval-Augmented Generation (RAG).

Key Features
Add Content: Add or replace content with text chunks and embeddings.
Semantic Search: Vector-based search using configurable embedding models.
Namespaces: Organize content into namespaces for per-user search.
Custom Filtering: Filter content with custom indexed fields.
Importance Weighting: Weight content by providing a 0 to 1 "importance".
Chunk Context: Get surrounding chunks for better context.
Graceful Migrations: Migrate content or whole namespaces without disruption.
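
As an illustration of the Chunk Context idea in the list above, here is a minimal sketch of returning a matching chunk together with its neighbors. The helper name withSurroundingChunks and its parameters are hypothetical and are not taken from the @convex-dev/rag API. It assumes the chunks of one document are kept in order.

```typescript
// Hypothetical helper, not part of @convex-dev/rag.
// Returns the matching chunk plus up to `before` chunks ahead of it and
// `after` chunks following it, clamped to the document bounds.
function withSurroundingChunks(
  chunks: string[],
  hitIndex: number,
  before: number,
  after: number,
): string[] {
  if (hitIndex < 0 || hitIndex >= chunks.length) {
    throw new RangeError(`hitIndex ${hitIndex} is outside the document`);
  }
  const start = Math.max(0, hitIndex - before);
  // The end index passed to slice is exclusive, hence the + 1.
  const end = Math.min(chunks.length, hitIndex + after + 1);
  return chunks.slice(start, end);
}

const docChunks = ["Intro", "Setup", "Usage", "Limits", "FAQ"];
const surrounding = withSurroundingChunks(docChunks, 2, 1, 1);
console.log(surrounding.join(" | ")); // prints "Setup | Usage | Limits"
```

Passing neighboring chunks to the LLM tends to help when a match starts or ends mid-thought, at the cost of a longer prompt.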

Benefits

Use cases

How to implement RAG with Convex for an AI chatbot

The @convex-dev/rag component provides vector search and document embedding APIs that work directly with Convex functions. You can store documents, generate embeddings, and retrieve relevant context for your chatbot responses using the built-in similarity search.
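
To make "similarity search" concrete, the following is a conceptual sketch of ranking stored chunks by cosine similarity within one namespace. It is not the component's implementation or API. The StoredChunk type, the searchNamespace function, and the toy three-dimensional vectors are all assumptions made for illustration; real embeddings come from an embedding model and have far more dimensions.

```typescript
// Conceptual sketch only; names and data are hypothetical.
interface StoredChunk {
  namespace: string;
  text: string;
  embedding: number[];
}

function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Embedding dimensions differ");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  // A zero vector has no direction, so treat it as matching nothing.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Keep only chunks in the namespace, score them, and return the best `limit`.
function searchNamespace(
  store: StoredChunk[],
  namespace: string,
  queryEmbedding: number[],
  limit: number,
): { text: string; score: number }[] {
  return store
    .filter((chunk) => chunk.namespace === namespace)
    .map((chunk) => ({
      text: chunk.text,
      score: cosineSimilarity(queryEmbedding, chunk.embedding),
    }))
    .sort((x, y) => y.score - x.score)
    .slice(0, limit);
}

const chunkStore: StoredChunk[] = [
  { namespace: "user1", text: "Refunds are processed after the return is received.", embedding: [1, 0, 0] },
  { namespace: "user1", text: "The office is closed on Sundays.", embedding: [0, 1, 0] },
  { namespace: "user1", text: "Refund requests need an order number.", embedding: [0.8, 0.2, 0] },
  { namespace: "user2", text: "Private note for another user.", embedding: [1, 0, 0] },
];

// Pretend this vector is the embedding of "How do refunds work?".
const hits = searchNamespace(chunkStore, "user1", [0.9, 0.1, 0], 2);
for (const hit of hits) console.log(hit.score.toFixed(3), hit.text);
```

Filtering by namespace before scoring is what keeps one user's content out of another user's results in this sketch.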

Retrieval-augmented generation setup with a vector database

This component eliminates the need for separate vector databases by integrating embeddings directly into Convex. It handles document chunking, embedding generation, and semantic search through simple function calls in your Convex backend.

Building knowledge base search for LLM applications

The RAG component lets you index your documents and perform semantic searches to find relevant context before LLM queries. It supports real-time updates to your knowledge base and provides ranked results based on vector similarity.

Frequently asked questions

What embedding models does the Convex RAG component support?

The @convex-dev/rag component supports OpenAI's text embedding models and other popular embedding providers through configurable adapters. You can specify which model to use when initializing the RAG system in your Convex functions.

How does RAG handle document updates and re-indexing?

The Convex RAG component automatically handles incremental updates through Convex's reactive data layer. When documents change, only the affected embeddings are regenerated, and the vector index updates in real-time without full re-indexing.
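
One common way to avoid regenerating embeddings for unchanged documents is to compare a content hash before doing any work. The sketch below shows that general technique, not the component's internals. The DocumentIndex class, the upsert method, and the FNV-1a hash are assumptions chosen to keep the example dependency-free.

```typescript
// Conceptual sketch only; not how @convex-dev/rag is implemented.
// FNV-1a (32-bit) stands in for whatever content hash a real system uses.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return (hash >>> 0).toString(16);
}

type UpsertResult = "created" | "replaced" | "unchanged";

class DocumentIndex {
  private hashes = new Map<string, string>();
  // Counts how often the expensive path (chunking plus embedding) would run.
  public embedCalls = 0;

  upsert(key: string, text: string): UpsertResult {
    const newHash = fnv1a(text);
    const oldHash = this.hashes.get(key);
    if (oldHash === newHash) return "unchanged"; // skip re-embedding
    this.embedCalls++; // placeholder for re-chunking and re-embedding
    this.hashes.set(key, newHash);
    return oldHash === undefined ? "created" : "replaced";
  }
}

const docIndex = new DocumentIndex();
const first = docIndex.upsert("pricing", "Plans start with a free tier.");
const second = docIndex.upsert("pricing", "Plans start with a free tier.");
const third = docIndex.upsert("pricing", "Plans start with a free tier and a trial.");
console.log(first, second, third, docIndex.embedCalls);
```

Keying documents by a stable identifier is what lets a later write be recognized as a replacement rather than a new entry.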

Can I customize the document chunking strategy for RAG?

Yes, the @convex-dev/rag component provides configurable chunking strategies including fixed-size, semantic, and custom splitting methods. You can adjust chunk size, overlap, and splitting logic based on your document types and use case requirements.
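
For a sense of what chunk size and overlap mean in practice, here is a minimal sketch of fixed-size chunking with overlap. The chunkText function and its parameters are hypothetical and do not reflect the component's own chunker or option names. It splits on character counts for simplicity, whereas practical chunkers usually prefer paragraph or sentence boundaries.

```typescript
// Hypothetical fixed-size chunker, not the chunker shipped with @convex-dev/rag.
// Consecutive chunks share `overlap` characters so that text cut at a
// boundary still appears intact in at least one chunk.
function chunkText(text: string, chunkSize: number, overlap: number): string[] {
  if (chunkSize <= 0) throw new RangeError("chunkSize must be positive");
  if (overlap < 0 || overlap >= chunkSize) {
    throw new RangeError("overlap must be at least 0 and less than chunkSize");
  }
  const chunks: string[] = [];
  const step = chunkSize - overlap;
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    // Stop once a chunk reaches the end, to avoid a trailing fragment
    // that only repeats the overlap.
    if (start + chunkSize >= text.length) break;
  }
  return chunks;
}

console.log(chunkText("abcdefghij", 4, 1)); // ["abcd", "defg", "ghij"]
```

Larger overlap improves the odds that a relevant passage lands whole in one chunk, but it also increases the number of embeddings to store and search.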

Does the RAG component work with streaming LLM responses?

Yes. The Convex RAG component returns the retrieved context before generation starts, so it works with streaming LLM APIs. Use the search results to augment your prompt, then stream the response through Convex actions or HTTP endpoints.
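
The prompt augmentation step can be sketched as follows. This is an illustrative helper rather than the component's API: the SearchResult shape, the buildAugmentedPrompt function, the minScore default, and the prompt wording are all assumptions. The resulting string would then be sent to whichever streaming LLM client you use.

```typescript
// Hypothetical helper, not part of @convex-dev/rag.
interface SearchResult {
  text: string;
  score: number;
}

// Builds a prompt that places sufficiently relevant context ahead of the question.
function buildAugmentedPrompt(
  question: string,
  results: SearchResult[],
  minScore = 0.5,
): string {
  const contextBlock = results
    .filter((result) => result.score >= minScore)
    .map((result, i) => `[${i + 1}] ${result.text}`)
    .join("\n");
  // With no usable context, fall back to the bare question.
  if (contextBlock === "") return `Question: ${question}`;
  return [
    "Answer using only the context below.",
    "",
    "Context:",
    contextBlock,
    "",
    `Question: ${question}`,
  ].join("\n");
}

const augmented = buildAugmentedPrompt(
  "How do I keep each user's documents separate?",
  [
    { text: "Namespaces scope search per user.", score: 0.82 },
    { text: "Unrelated release note.", score: 0.21 },
  ],
);
console.log(augmented);
```

Because the context is assembled before the model is called, streaming only affects how the answer is delivered, not how retrieval works.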
