Observability
OpenTelemetry, Sentry, Prometheus metrics, structured logging — and the gotchas baked into the bootstrap order.
The boilerplate ships three telemetry surfaces: traces (OTel), errors (Sentry), metrics (Prometheus), plus structured logs (pino). Each is opt-in: leave its config empty and the adapter no-ops without breaking boot.
Logging — pino
apps/server/src/infrastructure/logger/. Pino with pretty output in dev, JSON in prod. Every controller gets a child logger via Awilix; HTTP requests get a correlationId attached by middleware, propagated to every downstream log line for that request.
logger.info({ userId, orgId }, 'created project');
logger.warn({ err }, 'Stripe webhook signature mismatch');
Set LOG_LEVEL=debug to see Prisma queries and event-bus dispatch.
Tracing — OpenTelemetry
apps/server/src/otel-init.ts initializes @opentelemetry/sdk-node with auto-instrumentations for Express, HTTP, Postgres, ioredis, BullMQ. Spans export via OTLP to OTEL_EXPORTER_OTLP_ENDPOINT. Leave the endpoint unset and traces still build in-process — they just don't ship anywhere.
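The opt-in behaviour of the exporter can be sketched as a pure function over the environment. This is only an illustration: `resolveTracingConfig` and the `TracingConfig` shape are hypothetical names that do not come from otel-init.ts. Only the two environment variable names and the default service name are taken from this page.

```typescript
// Hypothetical config shape; the real otel-init.ts may organise this differently.
interface TracingConfig {
  serviceName: string;
  endpoint: string | undefined; // undefined: spans are built but never exported
  exportEnabled: boolean;
}

type Env = Record<string, string | undefined>;

function resolveTracingConfig(env: Env): TracingConfig {
  // Treat empty or whitespace-only values the same as unset.
  const endpoint = env.OTEL_EXPORTER_OTLP_ENDPOINT?.trim() || undefined;
  const serviceName = env.OTEL_SERVICE_NAME?.trim() || 'mern-saas-server';
  return { serviceName, endpoint, exportEnabled: endpoint !== undefined };
}
```

Keeping the decision in one function makes the "unset means no export" rule easy to unit test without starting the SDK.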
The bootstrap-order gotcha
OTel must be imported before Express, Prisma, and Redis load — otherwise the auto-instrumentations attach to functions that have already been resolved, and most spans are silently dropped.
The first import in apps/server/src/index.ts:
import './otel-init.js'; // MUST be first.
import express from 'express';
// ...
ESM hoists imports above any in-body code, so you can't fix this by calling startOtel() later in the file. Keep ./otel-init.js as the first import: don't move it, and don't sandwich it between other imports.
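A simplified model of why load order matters: an instrumentation that wraps a function cannot reach code that already holds a reference to the unwrapped one. The `db` object and `instrument` function below are hypothetical stand-ins, not OpenTelemetry APIs, and real auto-instrumentation hooks module loading rather than a plain object.

```typescript
type Query = (sql: string) => string;

// Stand-in for a library module such as a database client (hypothetical).
const db: { query: Query } = {
  query: (sql) => `rows for ${sql}`,
};

const spans: string[] = [];

// Stand-in for an auto-instrumentation: it replaces the export with a wrapper
// that records a span before delegating to the original function.
function instrument(target: { query: Query }): void {
  const original = target.query;
  target.query = (sql) => {
    spans.push(`db.query ${sql}`);
    return original(sql);
  };
}

// Application code that loaded BEFORE instrumentation keeps the raw function.
const earlyQuery = db.query;

instrument(db);

// Application code that loads AFTER instrumentation sees the wrapper.
const lateQuery = db.query;

earlyQuery('SELECT 1'); // bypasses the wrapper, no span recorded
lateQuery('SELECT 2'); // goes through the wrapper, span recorded
```

After both calls, `spans` contains a single entry for `SELECT 2`. The early call produced no error and no warning, which is why misordered imports are easy to miss.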
OTEL_SERVICE_NAME defaults to mern-saas-server. Override per environment.
Errors — Sentry
The server uses @sentry/bun (not @sentry/node — the runtime is Bun). The client uses @sentry/nextjs with instrumentation-client.ts and instrumentation.ts.
SENTRY_DSN=https://<public-key>@<sentry-host>/...
SENTRY_TRACES_SAMPLE_RATE=0.1
APP_VERSION=
Leave SENTRY_DSN empty to disable error reporting entirely.
The capture gotcha
logger.error(...) does not auto-capture to Sentry. It only writes a structured log line. For errors that should page someone, use the helper:
import { captureError } from '@/infrastructure/observability/capture-error.js';
try {
  await provider.charge(card);
} catch (err) {
  captureError(err, { userId, orgId, paymentId });
  throw err;
}
captureError(err, ctx) adds the context as Sentry tags + extra, then forwards the error to the SDK. In test and dev (no SENTRY_DSN) it's a no-op.
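One way such a helper could be structured is sketched below. It is not the contents of capture-error.js: `ErrorReporter`, `ErrorContext`, and `makeCaptureError` are hypothetical names, and the reporter is injected so the sketch runs without the Sentry SDK. With the SDK, the reporter would typically wrap `Sentry.captureException(err, { tags, extra })`.

```typescript
// Hypothetical reporter interface standing in for the Sentry SDK.
interface ErrorReporter {
  capture(
    err: unknown,
    scope: { tags: Record<string, string>; extra: Record<string, unknown> },
  ): void;
}

type ErrorContext = Record<string, string | number | undefined>;

// Pass `undefined` when no DSN is configured; the returned helper then does nothing.
function makeCaptureError(reporter: ErrorReporter | undefined) {
  return function captureErrorSketch(err: unknown, ctx: ErrorContext = {}): void {
    if (!reporter) return; // no DSN configured: no-op

    // Tags must be strings, so stringify defined values and skip undefined ones.
    const tags: Record<string, string> = {};
    for (const key of Object.keys(ctx)) {
      const value = ctx[key];
      if (value !== undefined) tags[key] = String(value);
    }

    // Keep the untouched context as extra data alongside the tags.
    reporter.capture(err, { tags, extra: { ...ctx } });
  };
}
```

Building the helper from an optional reporter keeps call sites identical in every environment, so application code never has to check whether error reporting is enabled.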
Metrics — Prometheus
apps/server/src/infrastructure/http/metrics.ts exposes a /metrics endpoint when METRICS_ENABLED=true, which is the default. Default metrics include:
- HTTP request count + latency histogram, labelled by route + method + status.
- Process metrics (CPU, memory, event loop lag).
- BullMQ counters per queue (waiting, active, completed, failed).
- Postgres pool gauges via Prisma's metrics extension.
Scrape it with Prometheus or any compatible agent (Grafana Agent, Datadog, etc.). The endpoint is unauthenticated — keep it on a private network.
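For orientation, a scrape of /metrics returns plain text in the Prometheus exposition format. The sketch below renders a request counter in that format using a small in-memory class. The metric name `http_requests_total` and the `RequestCounter` class are assumptions for illustration; the boilerplate's actual metric names and implementation may differ.

```typescript
interface RequestLabels {
  route: string;
  method: string;
  status: number;
}

// Escape label values as the text exposition format requires.
function escapeLabel(value: string): string {
  return value.replace(/\\/g, '\\\\').replace(/"/g, '\\"').replace(/\n/g, '\\n');
}

class RequestCounter {
  private counts: Record<string, number> = {};

  // Increment the series identified by this label combination.
  inc(labels: RequestLabels): void {
    const key = [
      `method="${escapeLabel(labels.method)}"`,
      `route="${escapeLabel(labels.route)}"`,
      `status="${labels.status}"`,
    ].join(',');
    this.counts[key] = (this.counts[key] ?? 0) + 1;
  }

  // Render one line per series, preceded by the TYPE comment.
  render(name = 'http_requests_total'): string {
    const lines = [`# TYPE ${name} counter`];
    for (const key of Object.keys(this.counts)) {
      lines.push(`${name}{${key}} ${this.counts[key]}`);
    }
    return lines.join('\n') + '\n';
  }
}
```

Each distinct combination of route, method, and status becomes its own time series, so labelling by the route pattern rather than the raw URL keeps cardinality bounded.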
Correlation IDs
Every request gets an x-correlation-id (incoming if present, generated otherwise). It's:
- Attached to every log line for the request via the child logger.
- Returned in the response header so clients can echo it in bug reports.
- Set as a Sentry tag and an OTel span attribute.
When a user reports a problem, ask for the correlation id; you can pivot from there to logs, traces, and the Sentry event.
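The "incoming if present, generated otherwise" rule can be sketched as a small function. `resolveCorrelationId`, `IncomingHeaders`, and `randomId` are hypothetical names, not the boilerplate's middleware, and the sketch assumes header names have already been lowercased, as Node's HTTP server does.

```typescript
type IncomingHeaders = Record<string, string | string[] | undefined>;

const CORRELATION_HEADER = 'x-correlation-id';

// Fallback generator for this sketch only; a real service would more likely use a UUID.
function randomId(): string {
  return Math.random().toString(36).slice(2, 12);
}

function resolveCorrelationId(
  headers: IncomingHeaders,
  generate: () => string = randomId,
): string {
  const raw = headers[CORRELATION_HEADER];
  // A repeated header arrives as an array; take the first value.
  const incoming = Array.isArray(raw) ? raw[0] : raw;
  const trimmed = incoming?.trim();
  // Reuse a non-empty incoming id, otherwise mint a new one.
  return trimmed ? trimmed : generate();
}
```

The resolved id would then be bound to the request's child logger, written back to the response header, and set on the Sentry scope and the active span. Accepting the generator as a parameter keeps the function deterministic in tests.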
Suggested dashboards
Chart views worth building in Grafana or an equivalent tool:
- p50 / p95 / p99 latency by route.
- 4xx / 5xx rate by route.
- BullMQ waiting + failed by queue.
- Active subscriptions count (queried from Postgres, not from a metric — billing is the source of truth).
- Auth attempts per minute, segmented by the IP and email rate limiters (a brute-force signal).
Health check
GET /health is unauthenticated, doesn't touch the DB, and returns { status: 'ok' }. Use it as the liveness probe. For readiness (which depends on DB + Redis), use /ready.
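The difference between the two probes can be sketched as follows. Only the `{ status: 'ok' }` liveness body comes from this page. The readiness body shape, the 503 status code, and the function names are assumptions for illustration.

```typescript
interface ProbeResult {
  httpStatus: number;
  body: { status: 'ok' | 'unavailable'; failing?: string[] };
}

// Liveness: answers from the process alone, with no dependency checks.
function livenessProbe(): ProbeResult {
  return { httpStatus: 200, body: { status: 'ok' } };
}

// Readiness: takes the outcome of each dependency check (for example a DB ping
// and a Redis ping) and reports 503 if any of them failed.
function readinessProbe(checks: Record<string, boolean>): ProbeResult {
  const failing = Object.keys(checks).filter((dep) => !checks[dep]);
  return failing.length === 0
    ? { httpStatus: 200, body: { status: 'ok' } }
    : { httpStatus: 503, body: { status: 'unavailable', failing } };
}
```

Keeping liveness free of dependency checks means a database outage takes the instance out of rotation through the readiness probe instead of triggering a restart loop.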