Observability

OpenTelemetry, Sentry, Prometheus metrics, structured logging — and the gotchas baked into the bootstrap order.

The boilerplate ships three telemetry surfaces: traces (OTel), errors (Sentry), metrics (Prometheus), plus structured logs (pino). Each is opt-in: leave its config empty and the adapter no-ops without breaking boot.

Logging — pino

The logger lives in apps/server/src/infrastructure/logger/. It uses pino, with pretty output in dev and JSON in prod. Every controller gets a child logger via Awilix. For HTTP requests, middleware attaches a correlationId that is propagated to every downstream log line for that request.

logger.info({ userId, orgId }, 'created project');
logger.warn({ err }, 'Stripe webhook signature mismatch');

Set LOG_LEVEL=debug to see Prisma queries and event-bus dispatch.
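The child-logger propagation described above can be sketched without any dependencies. MiniLogger is a hypothetical stand-in that only mirrors the shape of pino's child(bindings) call; the field names in the output and the hard-coded ID are assumptions for illustration, not the boilerplate's actual implementation.

```typescript
// Hypothetical stand-in for a pino-style logger: child() returns a logger
// whose bindings are merged into every line it writes.
type Fields = Record<string, unknown>;

class MiniLogger {
  private readonly sink: (line: string) => void;
  private readonly bindings: Fields;

  constructor(sink: (line: string) => void, bindings: Fields = {}) {
    this.sink = sink;
    this.bindings = bindings;
  }

  child(extra: Fields): MiniLogger {
    return new MiniLogger(this.sink, { ...this.bindings, ...extra });
  }

  info(fields: Fields, msg: string): void {
    this.sink(JSON.stringify({ level: 'info', ...this.bindings, ...fields, msg }));
  }
}

// Collect lines in memory so the result is easy to inspect.
const logLines: string[] = [];
const rootLogger = new MiniLogger((line) => {
  logLines.push(line);
});

// What request middleware would do once per request (the ID value is made up).
const requestLogger = rootLogger.child({ correlationId: 'req-123' });

// Downstream code logs as usual; correlationId rides along on every line.
requestLogger.info({ userId: 'u1', orgId: 'o1' }, 'created project');
```

Because the binding happens once in middleware, handlers never need to pass the ID around by hand.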

Tracing — OpenTelemetry

apps/server/src/otel-init.ts initializes @opentelemetry/sdk-node with auto-instrumentations for Express, HTTP, Postgres, ioredis, BullMQ. Spans export via OTLP to OTEL_EXPORTER_OTLP_ENDPOINT. Leave the endpoint unset and traces still build in-process — they just don't ship anywhere.

The bootstrap-order gotcha

OTel must be imported before Express, Prisma, and Redis load — otherwise the auto-instrumentations attach to functions that have already been resolved, and most spans are silently dropped.

The first import in apps/server/src/index.ts:

import './otel-init.js';   // MUST be first.
import express from 'express';
// ...

ESM hoists imports above any in-body code, so you can't fix this by calling startOtel() later in the file. Keep ./otel-init.js as the first import — don't move it, don't sandwich it between others.
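A simplified illustration of why load order matters. Real auto-instrumentations hook into module loading rather than patching a plain object, so fakeDb and instrument are hypothetical stand-ins; the sketch only shows the underlying effect, where a reference captured before patching never reaches the wrapper.

```typescript
// Stand-in for a library module such as a database client (hypothetical).
const fakeDb = {
  query: (sql: string): string => `rows for ${sql}`,
};

const recordedSpans: string[] = [];

// Stand-in for an auto-instrumentation: it swaps the module's function for a
// wrapper that records a span before delegating to the original.
function instrument(target: typeof fakeDb): void {
  const original = target.query;
  target.query = (sql: string): string => {
    recordedSpans.push(`db.query ${sql}`);
    return original(sql);
  };
}

// Code that loaded BEFORE instrumentation already holds the old function.
const earlyQuery = fakeDb.query;

instrument(fakeDb);

// Code that loads AFTER instrumentation picks up the wrapper.
const lateQuery = fakeDb.query;

earlyQuery('SELECT 1'); // bypasses the wrapper: no span recorded
lateQuery('SELECT 2'); // goes through the wrapper: span recorded
```

Only the second call leaves a span behind, and nothing signals that the first one was missed, which is why the import has to come first.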

OTEL_SERVICE_NAME defaults to mern-saas-server. Override per environment.

Errors — Sentry

The server uses @sentry/bun (not @sentry/node — the runtime is Bun). The client uses @sentry/nextjs with instrumentation-client.ts and instrumentation.ts.

SENTRY_DSN=https://<public-key>@<sentry-host>/...
SENTRY_TRACES_SAMPLE_RATE=0.1
APP_VERSION=

Leave SENTRY_DSN empty to disable error reporting entirely.

The capture gotcha

logger.error(...) does not auto-capture to Sentry. It only writes a structured log line. For errors that should page someone, use the helper:

import { captureError } from '@/infrastructure/observability/capture-error.js';

try {
  await provider.charge(card);
} catch (err) {
  captureError(err, { userId, orgId, paymentId });
  throw err;
}

captureError(err, ctx) adds the context as Sentry tags and extra data, then forwards the error to the SDK. In test and dev, where SENTRY_DSN is unset, it's a no-op.
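One way such a helper could be structured, as a minimal sketch rather than the actual capture-error.js. SentryLike and ScopeLike are hypothetical interfaces covering only the SDK calls used here (withScope, setTag, setExtra, captureException), and createCaptureError is an assumed factory name; injecting the SDK keeps the sketch free of dependencies.

```typescript
type ErrorContext = Record<string, string | number | undefined>;

// Minimal subset of the Sentry SDK surface this sketch relies on (assumed).
interface ScopeLike {
  setTag(key: string, value: string): void;
  setExtra(key: string, value: unknown): void;
}

interface SentryLike {
  withScope(callback: (scope: ScopeLike) => void): void;
  captureException(err: unknown): void;
}

function createCaptureError(sentry: SentryLike, dsn: string | undefined) {
  return function captureError(err: unknown, ctx: ErrorContext = {}): void {
    // No DSN configured (test and dev): do nothing.
    if (!dsn) return;

    // Use a temporary scope so the context applies to this event only.
    sentry.withScope((scope) => {
      for (const [key, value] of Object.entries(ctx)) {
        if (value === undefined) continue; // skip context that was not provided
        scope.setTag(key, String(value)); // tags are string-valued
        scope.setExtra(key, value); // extra keeps the raw value
      }
      sentry.captureException(err);
    });
  };
}
```

The helper only reports; rethrowing stays with the caller, as in the try/catch above.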

Metrics — Prometheus

apps/server/src/infrastructure/http/metrics.ts exposes a /metrics endpoint when METRICS_ENABLED=true, which is the default. The built-in metrics include:

  • HTTP request count + latency histogram, labelled by route + method + status.
  • Process metrics (CPU, memory, event loop lag).
  • BullMQ counters per queue (waiting, active, completed, failed).
  • Postgres pool gauges via Prisma's metrics extension.

Scrape it with Prometheus or any compatible agent (Grafana Agent, Datadog, etc.). The endpoint is unauthenticated — keep it on a private network.
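To make the scrape output concrete, here is a dependency-free sketch of a labelled request counter rendered in the Prometheus text exposition format. The metric name http_requests_total and the helper functions are hypothetical; the real endpoint presumably relies on a metrics library, and label values are assumed not to need escaping.

```typescript
// Hypothetical in-memory counter keyed by its label set.
type HttpLabels = { method: string; route: string; status: number };

const requestCounts = new Map<string, number>();

function labelKey(labels: HttpLabels): string {
  return `method="${labels.method}",route="${labels.route}",status="${labels.status}"`;
}

function recordRequest(labels: HttpLabels): void {
  const key = labelKey(labels);
  requestCounts.set(key, (requestCounts.get(key) ?? 0) + 1);
}

// Text exposition format: HELP and TYPE comment lines, then one sample line
// per distinct label set.
function renderMetrics(): string {
  const out = [
    '# HELP http_requests_total Total HTTP requests.',
    '# TYPE http_requests_total counter',
  ];
  requestCounts.forEach((count, key) => {
    out.push(`http_requests_total{${key}} ${count}`);
  });
  return out.join('\n') + '\n';
}

recordRequest({ method: 'GET', route: '/projects/:id', status: 200 });
recordRequest({ method: 'GET', route: '/projects/:id', status: 200 });
recordRequest({ method: 'POST', route: '/projects', status: 500 });

const metricsBody = renderMetrics();
```

Each distinct combination of route, method, and status becomes its own time series on the Prometheus side.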

Correlation IDs

Every request gets an x-correlation-id (incoming if present, generated otherwise). It's:

  • Attached to every log line for the request via the child logger.
  • Returned in the response header so clients can echo it in bug reports.
  • Set as a Sentry tag and an OTel span attribute.

When a user reports a problem, ask for the correlation ID; from there you can pivot to the logs, the traces, and the Sentry event.
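The "incoming if present, generated otherwise" rule can be sketched as a small pure function. resolveCorrelationId and IncomingHeaders are hypothetical names, and the generator is injected to keep the sketch dependency-free; in a server it could be something like randomUUID from node:crypto.

```typescript
const CORRELATION_HEADER = 'x-correlation-id';

// Node lowercases incoming header names, so a lowercase lookup is enough.
type IncomingHeaders = Record<string, string | string[] | undefined>;

function resolveCorrelationId(headers: IncomingHeaders, generate: () => string): string {
  const raw = headers[CORRELATION_HEADER];
  // A header sent more than once arrives as an array; take the first value.
  const incoming = Array.isArray(raw) ? raw[0] : raw;
  const trimmed = incoming?.trim();
  // Reuse the caller's ID when one was sent, otherwise mint a new one.
  return trimmed ? trimmed : generate();
}

// Build the header to echo back so clients can quote the ID in bug reports.
function correlationResponseHeader(id: string): Record<string, string> {
  return { [CORRELATION_HEADER]: id };
}
```

The same resolved value is what would then be bound to the child logger, the Sentry tag, and the span attribute.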

Suggested dashboards

The chart views you'll want in Grafana or an equivalent tool:

  • p50 / p95 / p99 latency by route.
  • 4xx / 5xx rate by route.
  • BullMQ waiting + failed by queue.
  • Active subscriptions count (queried from Postgres, not from a metric — billing is the source of truth).
  • Auth attempts per minute, segmented by the IP and email rate limiters (a spike is a brute-force signal).

Health check

GET /health is unauthenticated, doesn't touch the DB, and returns { status: 'ok' }. Use it as the liveness probe. For readiness (which depends on DB + Redis), use /ready.
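The liveness and readiness split could look roughly like this. It is a sketch under assumptions: the /ready response shape, the 503 status, and the DependencyCheck type are not specified in this page, and the actual DB and Redis probes are left to the caller.

```typescript
// A check resolves when the dependency is reachable and rejects otherwise.
type DependencyCheck = () => Promise<void>;

type ReadinessResult = {
  httpStatus: 200 | 503;
  body: { status: 'ok' | 'unavailable'; checks: Record<string, 'ok' | 'fail'> };
};

// Liveness: no I/O at all, so it only proves the process can answer.
function liveness(): { status: 'ok' } {
  return { status: 'ok' };
}

// Readiness: run every dependency check in parallel and report each outcome.
async function readiness(checks: Record<string, DependencyCheck>): Promise<ReadinessResult> {
  const outcomes: Record<string, 'ok' | 'fail'> = {};
  await Promise.all(
    Object.entries(checks).map(async ([dependency, check]) => {
      try {
        await check();
        outcomes[dependency] = 'ok';
      } catch {
        outcomes[dependency] = 'fail';
      }
    }),
  );
  const allOk = Object.values(outcomes).every((outcome) => outcome === 'ok');
  return {
    httpStatus: allOk ? 200 : 503,
    body: { status: allOk ? 'ok' : 'unavailable', checks: outcomes },
  };
}
```

Keeping liveness free of dependency checks means a database outage takes the instance out of rotation through /ready without triggering restarts through /health.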
