Assistant

Transport-agnostic LLM assistant primitives — tool shape, persona builder, OpenAI tool-calling loop. Pairs with @app/mcp and @app/rag.

The @app/assistant package is the transport-agnostic core of the boilerplate's AI story:

  • Tool shape: AssistantTool<I, O, Ctx> with a Zod input schema and a structured ToolResult ({ ok, summary, payload } / { ok, error }).
  • Tool registry: bridges to the Vercel AI SDK's ToolSet so the same tools register with generateText({ tools }) in-process.
  • Persona builder: composePersona({ baseRules, toolCatalog?, glossary? }) stitches a system prompt without imposing tone or content.
  • OpenAI completer: OpenAiAssistantCompleter drains streamText for text-only completions and runs generateText with stopWhen: stepCountIs for the bounded multi-hop tool loop, surfacing per-tool traces.
  • Run-turn orchestrator: runAssistantTurn(completer, registry, input) returns { text, promptTokens, completionTokens, toolCalls }.
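
To make the "bounded multi-hop tool loop" idea concrete, here is a minimal, dependency-free sketch of the control flow. It is not the package's implementation (which, per the list above, delegates to generateText with stopWhen: stepCountIs). The names ToolOutcome, ModelStep, FakeModel, and runBoundedLoop are hypothetical, and the model is a stub so the loop can be read in isolation.

```typescript
// Hypothetical names throughout; this is not the @app/assistant API.
type ToolOutcome = { ok: boolean; message: string };
type ToolFn = (input: unknown) => Promise<ToolOutcome>;

// What the model asks for next: a final answer or one more tool call.
type ModelStep =
  | { kind: 'text'; text: string }
  | { kind: 'tool'; name: string; input: unknown };

interface ToolTrace {
  name: string;
  ok: boolean;
  result: ToolOutcome;
}

// Stub model: sees the traces collected so far and decides the next step.
type FakeModel = (traces: ToolTrace[]) => ModelStep;

async function runBoundedLoop(
  model: FakeModel,
  tools: Map<string, ToolFn>,
  maxSteps: number,
): Promise<{ text: string; toolCalls: ToolTrace[] }> {
  const toolCalls: ToolTrace[] = [];

  for (let step = 0; step < maxSteps; step++) {
    const next = model(toolCalls);

    // A text answer ends the turn immediately.
    if (next.kind === 'text') {
      return { text: next.text, toolCalls };
    }

    // Otherwise run the requested tool and record a trace, even on failure.
    const tool = tools.get(next.name);
    const result: ToolOutcome = tool
      ? await tool(next.input)
      : { ok: false, message: `unknown tool: ${next.name}` };
    toolCalls.push({ name: next.name, ok: result.ok, result });
  }

  // Step budget exhausted: return what was collected instead of looping forever.
  return { text: '', toolCalls };
}
```

The step budget is what keeps a misbehaving model from calling tools indefinitely; the traces are returned either way so the caller can log or display them.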

Why a separate package from @app/ai?

@app/ai is a thin port + adapter layer for streamText / embed / transcribe — used by the existing /api/v1/ai/chat endpoint that streams plain text. @app/assistant adds the missing pieces a real assistant needs: usage metrics, structured tool-call traces, and a bounded multi-step tool loop. Keeping them separate means a consumer that only wants chat streaming doesn't carry the tool-loop machinery.

Tool shape

import { z } from 'zod';
import type { AssistantTool, ToolContext } from '@app/assistant';

interface MyCtx extends ToolContext {
  userId: string;
  organizationId: string;
}

const Schema = z.object({ id: z.string().min(1) });

export const archiveNoteTool: AssistantTool<z.infer<typeof Schema>, void, MyCtx> = {
  name: 'archive_note',
  description: 'Archive a note by id.',
  inputSchema: Schema,
  execute: async (input, ctx) => {
    try {
      // archiveNote is your own domain function; it is not part of @app/assistant.
      await archiveNote({ id: input.id, by: ctx.userId, in: ctx.organizationId });
      return { ok: true, summary: `Archived note ${input.id}.` };
    } catch (err) {
      return { ok: false, error: err instanceof Error ? err.message : 'unknown' };
    }
  },
};
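
Because a ToolResult is either { ok, summary, payload } or { ok, error }, callers can branch on ok and let TypeScript narrow the rest. The sketch below re-declares the type locally, inferred from that description; the type actually exported by @app/assistant may differ in detail, and describeResult is a hypothetical helper.

```typescript
// Local re-declaration inferred from the "{ ok, summary, payload } / { ok, error }"
// description; the exported type in @app/assistant may differ in detail.
type ToolResult<O> =
  | { ok: true; summary: string; payload?: O }
  | { ok: false; error: string };

// Hypothetical helper: turn a tool result into a single log line.
function describeResult<O>(toolName: string, result: ToolResult<O>): string {
  if (result.ok) {
    // Narrowed to the success branch: summary (and payload) are available here.
    return `${toolName}: ${result.summary}`;
  }
  // Narrowed to the failure branch: only error is available here.
  return `${toolName} failed: ${result.error}`;
}
```

Returning failures as data rather than throwing, as archiveNoteTool does above, means a failed tool call can be handled as a value by whatever runs the tool.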

The same shape works across two transports today:

  • In-process via runAssistantTurn — pass the tool to a ToolRegistry and the AI SDK runs execute() for you on each LLM tool call.
  • MCP via @app/mcp — see packages/mcp/src/tool-adapter.ts. The adapter derives JSON Schema from the Zod schema using z.toJSONSchema() and bridges execute(input, ctx) to MCP's handler(args) shape.

The plop generator (bun gen mcp:tool) emits the unified AssistantTool shape — see MCP for the wired example.
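
The bridge from execute(input, ctx) to a single-argument handler(args) can be pictured as a closure that binds the context once. The following is a simplified sketch, not the code in packages/mcp/src/tool-adapter.ts: MiniTool, Outcome, and toHandler are hypothetical names, and a plain parse function stands in for the Zod schema so the example has no dependencies.

```typescript
// Simplified stand-ins; the real adapter lives in packages/mcp/src/tool-adapter.ts.
type Outcome = { ok: true; summary: string } | { ok: false; error: string };

interface MiniTool<I, Ctx> {
  name: string;
  // Stand-in for a Zod schema's parse(): returns typed input or throws.
  parse: (raw: unknown) => I;
  execute: (input: I, ctx: Ctx) => Promise<Outcome>;
}

// Bind a context once, then expose the handler(args) shape.
function toHandler<I, Ctx>(
  tool: MiniTool<I, Ctx>,
  ctx: Ctx,
): (args: unknown) => Promise<Outcome> {
  return async (args) => {
    try {
      const input = tool.parse(args);
      return await tool.execute(input, ctx);
    } catch (err) {
      // Validation failures (and anything execute throws) become structured errors.
      return { ok: false, error: err instanceof Error ? err.message : 'invalid input' };
    }
  };
}
```

With this shape the tool body never needs to know which transport invoked it; only the adapter changes.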

Composing a turn

import {
  composePersona,
  runAssistantTurn,
  ToolRegistry,
  OpenAiAssistantCompleter,
} from '@app/assistant';

const registry = new ToolRegistry<MyCtx>([archiveNoteTool, /* ... */]);
const completer = new OpenAiAssistantCompleter({ apiKey: env.OPENAI_API_KEY });

const systemPrompt = composePersona({
  baseRules: 'You are a helpful productivity assistant. Be brief.',
  toolCatalog: registry.buildCatalogSummary(),
});

const result = await runAssistantTurn(completer, registry, {
  systemPrompt,
  messages: [{ role: 'user', content: 'archive note abc123' }],
  ctx: { userId: 'u_42', organizationId: 'org_1' },
  model: 'gpt-5.4-mini',
});

console.log(result.text);          // final reply
console.log(result.toolCalls);     // [{ name, ok, result }]
console.log(result.promptTokens);  // usage for cost accounting
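
For cost accounting, the token counts can be multiplied by per-token rates. This is a minimal sketch: estimateCostUsd and the Rates shape are hypothetical, and the rates you pass in must come from your provider's current pricing, which is not part of this package.

```typescript
interface Usage {
  promptTokens: number;
  completionTokens: number;
}

// Hypothetical shape: USD per one million tokens, supplied by the caller.
interface Rates {
  inputPerMillion: number;
  outputPerMillion: number;
}

// Estimate the cost of one turn from its token usage.
function estimateCostUsd(usage: Usage, rates: Rates): number {
  const input = usage.promptTokens * rates.inputPerMillion;
  const output = usage.completionTokens * rates.outputPerMillion;
  return (input + output) / 1_000_000;
}
```

Keeping the rates outside the function means the same helper works for any model once you supply that model's pricing.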

RAG integration

Pair with @app/rag to inject relevant snippets at turn time:

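import { RetrieveRelevantContextUseCase } from '@app/rag';

// `retriever` is a RetrieveRelevantContextUseCase instance constructed elsewhere in your app;
// `ctx` and `lastUserMessage` come from the current request.
const hits = await retriever.execute({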
  organizationId: ctx.organizationId,
  query: lastUserMessage,
  topK: 5,
});

const systemPrompt = composePersona({
  baseRules: 'You are ...',
  toolCatalog: registry.buildCatalogSummary(),
  glossary: hits.map((h) => `- ${h.snippet}`).join('\n'),
});

The retrieval block is just a markdown bullet list injected as the "glossary" section — the LLM treats it as additional context.
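
In practice it can help to normalize, de-duplicate, and cap the snippets before joining them. The sketch below is one possible approach, not part of @app/rag: formatGlossary and the maxChars budget are hypothetical, and Hit declares only the snippet field used in the example above.

```typescript
// Only the field used in the example above; real hits may carry more.
interface Hit {
  snippet: string;
}

// Build a markdown bullet list from hits, skipping duplicates and
// stopping once the character budget would be exceeded.
function formatGlossary(hits: Hit[], maxChars = 2000): string {
  const seen = new Set<string>();
  const lines: string[] = [];
  let used = 0;

  for (const hit of hits) {
    // Collapse whitespace so multi-line snippets stay on one bullet.
    const text = hit.snippet.replace(/\s+/g, ' ').trim();
    if (text === '' || seen.has(text)) continue;

    const line = `- ${text}`;
    if (used + line.length > maxChars) break;

    seen.add(text);
    lines.push(line);
    used += line.length + 1; // +1 accounts for the joining newline
  }

  return lines.join('\n');
}
```

The result can be passed as the glossary argument in place of the inline hits.map(...) expression.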

What's NOT in the package

  • Conversation history persistence: bring your own (Prisma table, Redis stream, in-memory Map). The turn loop is stateless; you feed it the message tail you want the LLM to see.
  • Domain-event listeners for re-indexing: those are consumer-specific (when a note changes, what snippet do you build?). The plop generator will scaffold these in a follow-up.
  • Persona content: composePersona is a string assembler. The voice, the rules, and the language are yours to write.

This is intentional — the package is the LLM glue, not the assistant.
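
As an illustration of the "bring your own" history option, here is a minimal in-memory Map store that hands back a message tail. It is a sketch under stated assumptions: InMemoryHistory and ChatMessage are hypothetical names, and a production setup would more likely use one of the persistent options listed above.

```typescript
interface ChatMessage {
  role: 'user' | 'assistant';
  content: string;
}

// Hypothetical in-memory store keyed by conversation id.
class InMemoryHistory {
  private readonly byConversation = new Map<string, ChatMessage[]>();

  append(conversationId: string, message: ChatMessage): void {
    const existing = this.byConversation.get(conversationId) ?? [];
    existing.push(message);
    this.byConversation.set(conversationId, existing);
  }

  // Return the last `limit` messages: the tail handed to the next turn.
  tail(conversationId: string, limit: number): ChatMessage[] {
    const all = this.byConversation.get(conversationId) ?? [];
    // slice(-0) would return everything, so guard non-positive limits.
    return limit > 0 ? all.slice(-limit) : [];
  }
}
```

Before each turn you would append the new user message and pass the tail as messages; after the turn you would append the assistant's reply.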
