RAG
Retrieval-augmented generation with pgvector — multi-tenant vector store + generic use cases for indexing, retrieval, and stats.
The @app/rag package wraps a pgvector table and the @app/ai embedding
provider into four small use cases the rest of the codebase can call:
EmbedAndStoreUseCase, RetrieveRelevantContextUseCase,
DeleteEmbeddingsBySourceUseCase, GetEmbeddingStatsUseCase.
It is opt-in. The boilerplate ships with a stock postgres:16-alpine
image; pgvector and the migration only matter if you actually use the
package.
When to install
Install @app/rag when you want to:
- Embed domain text (notes, descriptions, transcripts) and search it later by semantic similarity.
- Inject "relevant context" snippets into an LLM prompt at turn time (classic RAG pattern — pairs cleanly with the assistant package).
- Power an admin observability surface that shows per-org index size.
If you only need streaming chat completions, skip this — @app/ai is
enough.
Switching to pgvector
In docker-compose.dev.yml, swap the Postgres image:
postgres:
# image: postgres:16-alpine # default
image: pgvector/pgvector:pg16 # required for @app/ragThe data directory is compatible — no volume reset needed.
Migration
Copy the reference SQL from packages/rag/prisma/migrations/0001_rag_init/
into your apps/server/prisma/migrations/ directory (with a fresh
timestamp), then:
cd apps/server && bun prisma migrate devAdd the matching Embedding model to schema.prisma — the snippet is in
the package README.
Wiring example
import { OpenAiEmbeddingProvider } from '@app/ai';
import {
PgvectorEmbeddingStore,
EmbedAndStoreUseCase,
RetrieveRelevantContextUseCase,
} from '@app/rag';
const embeddings = new OpenAiEmbeddingProvider({
apiKey: env.OPENAI_API_KEY,
model: 'text-embedding-3-small',
expectedDimension: 1536,
});
const store = new PgvectorEmbeddingStore(prisma);
const indexer = new EmbedAndStoreUseCase(embeddings, store);
await indexer.execute({
organizationId: 'org_123',
sourceType: 'note',
sourceId: 'note_42',
snippet: 'Quarterly review with Acme Corp scheduled for May 12.',
});
const retriever = new RetrieveRelevantContextUseCase(embeddings, store);
const hits = await retriever.execute({
organizationId: 'org_123',
query: 'what meetings do I have with Acme?',
topK: 5,
});Multi-tenant safety
organizationId is required on every read and write. The pgvector store
filters by it on every SELECT. Add an integration test on the consumer
side that inserts overlapping rows across two orgs and asserts no leakage
— the package can't enforce this without your DB, so it's your contract
to keep.
Vector dimension
The migration ships vector(1536) to match text-embedding-3-small. To
use a larger model (e.g. text-embedding-3-large → 3072), edit the
migration column width, recreate the IVFFlat index, and pass
{ dimension: 3072 } to the store + expectedDimension: 3072 to the
embedding provider so a future model swap fails loud rather than
corrupting the index.
Listener pattern
Pair EmbedAndStoreUseCase with your domain events to keep the index
fresh. When a note/task/etc. is created or updated, build a snippet
string and call indexer.execute(...). When it's deleted, call
DeleteEmbeddingsBySourceUseCase so stale rows don't keep showing up in
search.
The assistant package (coming next) provides a generator for these listeners.