Node.js Performance - Full Stack Reference Map
// Metrics · profiling · load testing · tracing · database · frontend · production checklist
36 total topics
01 Metrics - What To Measure - Latency, throughput, event-loop lag, RED, USE, and SLO numbers
📊 Percentile Latency · p50/p95/p99
Measure tail latency: averages hide the slow users.
request start
-> response finish
-> store duration
-> p50 / p95 / p99
e.g.: perf_hooks, process.hrtime.bigint, prom-client histogram
🔢 RED Method · service health
Track Rate, Errors, and Duration for every Node service.
R = requests/sec
E = 5xx errors
D = latency percentiles
e.g.: Express middleware, Fastify hooks, Prometheus RED dashboard
🖥️ USE Method · process
Measure Utilization, Saturation, and Errors inside the Node process.
CPU utilization
heap/rss memory
event-loop lag
process errors
e.g.: process.memoryUsage, process.cpuUsage, monitorEventLoopDelay
🔄 Event Loop Lag · lag
Detect main-thread blocking, the most common Node latency killer.
set timer
-> expected time
-> actual time
-> lag = drift
e.g.: perf_hooks monitorEventLoopDelay, timer drift detector
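The timer-drift detector above can be sketched like this; the drift arithmetic is split into its own function so it is easy to test, and both function names are mine:

```javascript
// Schedule a timer for intervalMs; any extra delay beyond that is lag
// caused by work blocking the event loop.
function startLagMonitor(intervalMs, onLag) {
  let expected = Date.now() + intervalMs;
  const timer = setInterval(() => {
    const now = Date.now();
    onLag(lagFromDrift(expected, now)); // lag = how late the timer fired
    expected = now + intervalMs;
  }, intervalMs);
  timer.unref(); // don't keep the process alive just for monitoring
  return timer;
}

// Pure drift computation: actual fire time minus expected fire time.
function lagFromDrift(expectedAt, firedAt) {
  return Math.max(0, firedAt - expectedAt);
}
```

perf_hooks' `monitorEventLoopDelay` does the same job with a proper histogram; the hand-rolled version is mainly useful for alerting on a single threshold.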
🧮 Throughput Window · rps
Track requests per second over a rolling time window.
record request time
-> remove old samples
-> count/window = RPS
e.g.: rolling counters, Prometheus counters, autocannon RPS
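A rolling-counter sketch of the steps above, with an injectable clock (my addition) so the window logic can be tested deterministically:

```javascript
// Keep timestamps for the last windowMs and divide by the window length
// to get requests per second.
class ThroughputWindow {
  constructor(windowMs = 1000, now = Date.now) {
    this.windowMs = windowMs;
    this.now = now; // injectable clock, handy for tests
    this.samples = [];
  }
  record() {
    this.samples.push(this.now());
  }
  rps() {
    const cutoff = this.now() - this.windowMs;
    // Remove samples that fell out of the window.
    while (this.samples.length && this.samples[0] < cutoff) this.samples.shift();
    return this.samples.length / (this.windowMs / 1000);
  }
}
```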
02 Profiling - CPU, Memory, Async - Find exactly where time and memory go inside Node.js
🏥 Clinic Flame · cpu
Generate flamegraphs under load and find hot functions.
start app with clinic
-> apply load
-> open flamegraph
-> optimize wide bars
e.g.: clinic flame, autocannon, Chrome DevTools
🎯 perf_hooks Timing · built-in
Use built-in precision timers to measure exact code paths.
performance.mark(start)
-> run work
-> performance.measure(name)
e.g.: perf_hooks PerformanceObserver, performance.timerify
🔍 V8 CPU Profiler · v8
Use Node’s built-in V8 sampling profiler.
node --prof app.js
-> generate isolate log
-> node --prof-process
e.g.: node --prof, --prof-process, chrome://inspect
📸 Heap Snapshots · memory
Compare heap snapshots to find retained objects and leaks.
snapshot A
-> simulate traffic
-> snapshot B
-> compare retained objects
e.g.: v8.writeHeapSnapshot, Chrome DevTools memory tab
🧠 Memory Leak Guard · leaks
Watch heap trend and alert when memory climbs continuously.
sample heap
-> keep rolling window
-> detect upward slope
e.g.: process.memoryUsage, memwatch-next, heapdump
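A leak-guard sketch of the slope detection above. Real detectors usually fit a regression slope; flagging a strictly increasing window keeps the sketch short, and the class name and window size are mine:

```javascript
// Keep a rolling window of heapUsed samples and flag a possible leak when
// every sample in a full window is higher than the one before it.
class HeapTrend {
  constructor(windowSize = 10) {
    this.windowSize = windowSize;
    this.samples = [];
  }
  sample(heapUsed = process.memoryUsage().heapUsed) {
    this.samples.push(heapUsed);
    if (this.samples.length > this.windowSize) this.samples.shift();
  }
  looksLikeLeak() {
    if (this.samples.length < this.windowSize) return false; // not enough data
    return this.samples.every((v, i) => i === 0 || v > this.samples[i - 1]);
  }
}
```

Call `sample()` on an interval and alert (or take a heap snapshot) when `looksLikeLeak()` flips true; GC sawtooth patterns will break strict monotonicity, which is the point.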
🧵 Async Bottlenecks · await
Find slow promise chains, sequential awaits, and stuck async resources.
sequential await
-> waterfall latency
-> parallelize safe calls
e.g.: async_hooks, Promise.all, OpenTelemetry spans
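The waterfall effect above, sketched with hypothetical delayed tasks standing in for independent I/O calls:

```javascript
// A stand-in for an I/O call that takes ms milliseconds.
const delay = (ms, value) => new Promise((res) => setTimeout(res, ms, value));

async function sequential() {
  const user = await delay(50, 'user');     // ~50ms
  const orders = await delay(50, 'orders'); // +50ms -> ~100ms total
  return [user, orders];
}

async function parallel() {
  // Both timers start immediately; total ~= max(50, 50) ~= 50ms.
  // Only safe when the calls do not depend on each other's results.
  return Promise.all([delay(50, 'user'), delay(50, 'orders')]);
}
```

Tracing (async_hooks or OpenTelemetry spans) is how you find these waterfalls in the first place: sequential spans stack end to end instead of overlapping.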
03 Load Testing - Generate controlled traffic and read the latency/RPS curve
🚀 Autocannon API Test · load
Fast local HTTP benchmarking for Node APIs.
warm app
-> run load
-> inspect p99 + RPS
-> change one variable
e.g.: autocannon, Fastify benchmarks, Express comparison
📈 k6 Scenario Test · k6
Script realistic user journeys with thresholds.
virtual users
-> scenario steps
-> thresholds
-> pass/fail CI
e.g.: k6, thresholds, staged load, CI performance gates
⚡ Spike Test · spike
Find how the service behaves when traffic jumps suddenly.
baseline
-> sudden high load
-> recovery
-> watch errors
e.g.: k6 ramping-vus, autocannon bursts, queue depth
🛁 Soak Test · soak
Run moderate traffic for a long time to reveal leaks and drift.
steady traffic
-> hours
-> heap/RSS/latency trend
e.g.: k6 soak, heap snapshots, event-loop lag trend
📉 Capacity Curve · capacity
Find the point where more RPS causes latency to explode.
increase concurrency
-> plot RPS + p99
-> find knee of curve
e.g.: autocannon matrix, Grafana dashboard, SLO capacity planning
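Finding the knee can be sketched as a small analysis over the measurement matrix; the 2x degradation factor is an illustrative threshold, not a standard:

```javascript
// Given (concurrency, p99) points from a load-test matrix, return the last
// point before p99 degrades sharply relative to the previous step.
function findKnee(points, factor = 2) {
  for (let i = 1; i < points.length; i++) {
    if (points[i].p99 > points[i - 1].p99 * factor) return points[i - 1];
  }
  return points[points.length - 1]; // no knee observed in this range
}
```

The returned point is your safe capacity for SLO planning; everything past it buys RPS at the cost of exploding tail latency.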
04 Tracing & Instrumentation - Connect request latency to spans, logs, metrics, and downstream calls
🧭 OpenTelemetry Setup · otel
Instrument Node services with traces and context propagation.
request
-> root span
-> db/cache/http spans
-> exporter
e.g.: @opentelemetry/sdk-node, Jaeger, Tempo, Honeycomb
✍️ Manual Spans · spans
Add custom spans around business operations.
start span
-> do work
-> add attributes
-> end span
e.g.: trace.getTracer, span attributes, error recording
🪵 Structured Logs · logs
Log JSON with request IDs so performance issues can be searched.
request id
-> log fields
-> trace id
-> searchable event
e.g.: pino, Winston, AsyncLocalStorage request context
📡 Prometheus Metrics · metrics
Expose process, RED, and custom metrics from Node.
/metrics
-> scrape
-> Prometheus
-> Grafana dashboard
e.g.: prom-client, default metrics, histograms
🧵 Request Context · context
Preserve request IDs across async calls.
incoming request
-> AsyncLocalStorage
-> logs/spans use context
e.g.: AsyncLocalStorage, correlation IDs, traceparent
05 Database Performance - Node database pools, slow queries, N+1, Redis, and caching
🐘 Postgres Pool Sizing · pg
Size DB pools per Node process without exhausting Postgres.
pods * workers * poolMax
<= database connection budget
e.g.: pg Pool, RDS max_connections, pgbouncer
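The budget inequality above as a small calculator; the `reserved` headroom for admin and replication connections is an illustrative number:

```javascript
// Total client-side connection demand must stay under the database's
// connection budget (max_connections minus a reserve).
function poolBudget({ pods, workersPerPod, poolMax, dbMaxConnections, reserved = 10 }) {
  const demand = pods * workersPerPod * poolMax;
  const budget = dbMaxConnections - reserved;
  return { demand, budget, fits: demand <= budget };
}
```

When `fits` is false, shrink `poolMax` or put pgbouncer in front rather than raising `max_connections`; Postgres pays a per-connection cost.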
🐢 Slow Query Wrapper · query
Time every query and log slow SQL automatically.
wrap pool.query
-> measure duration
-> warn if slow
-> run EXPLAIN
e.g.: pg query wrapper, EXPLAIN ANALYZE, Prisma middleware
🔁 N+1 Detection · orm
Detect list queries that trigger one query per row.
1 list query
+ N child queries
= latency explosion
e.g.: DataLoader, joins, Prisma include, GraphQL resolvers
🔴 Redis Pipelining · redis
Batch many Redis commands into one network round-trip.
100 GETs
without pipeline = 100 RTT
with pipeline = 1 RTT
e.g.: ioredis pipeline, mget, Redis cluster
📦 Cache-Aside TTL · cache
Use Redis cache-aside with TTL and stampede protection.
check cache
-> miss? dedupe inflight
-> fetch DB
-> setex
e.g.: ioredis, cache-manager, lru-cache, stampede protection
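The check/dedupe/fetch/setex flow above, sketched with an in-memory Map standing in for Redis (swap the Map for ioredis GET/SETEX in production); the stampede protection is the `inFlight` map:

```javascript
const cache = new Map();    // key -> { value, expiresAt }
const inFlight = new Map(); // key -> Promise for a fetch already in progress

async function cached(key, ttlMs, fetcher) {
  const hit = cache.get(key);
  if (hit && hit.expiresAt > Date.now()) return hit.value;

  // Stampede protection: concurrent misses share one fetch.
  if (inFlight.has(key)) return inFlight.get(key);

  const promise = (async () => {
    try {
      const value = await fetcher();
      cache.set(key, { value, expiresAt: Date.now() + ttlMs });
      return value;
    } finally {
      inFlight.delete(key);
    }
  })();
  inFlight.set(key, promise);
  return promise;
}
```

Note the in-process map only dedupes within one Node process; across a fleet you still need a distributed lock or probabilistic early expiry.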
06 Frontend Performance From Node - Browser performance controlled by Node servers, headers, and tests
🎭 Playwright Perf Test · browser
Use Node scripts to measure real browser navigation metrics.
launch browser
-> goto page
-> read navigation/resource timings
e.g.: Playwright, navigation timing, LCP/CLS checks
📨 HTTP Cache Headers · headers
Set Cache-Control, ETag, Vary, and compression from Node.
request
-> etag/cache-control
-> 304 or cached response
e.g.: compression, express.static, ETag, stale-while-revalidate
🌊 Streaming Responses · streams
Stream large responses instead of building huge payloads in memory.
read chunks
-> res.write
-> backpressure
-> res.end
e.g.: Node streams, pipeline, server-sent events, CSV export
⚙️ Fast JSON Serialization · json
Use schemas to serialize hot JSON responses faster.
schema
-> compiled serializer
-> res.end(serialized)
e.g.: fast-json-stringify, Fastify schemas
🗜️ Compression Strategy · gzip/brotli
Compress text responses while avoiding waste on tiny payloads.
JSON/HTML/CSS
-> gzip/brotli
-> fewer bytes
-> lower latency
e.g.: compression middleware, CDN Brotli, threshold tuning
07 Production Checklist - The combined Node performance playbook before and after launch
⚙️ Process Config · runtime
Tune cluster, heap, NODE_ENV, and libuv thread pool.
NODE_ENV=production
heap limit
threadpool
cluster workers
e.g.: PM2 cluster, Node cluster, UV_THREADPOOL_SIZE
👥 Cluster Workers · scale
Run one Node process per CPU core for CPU isolation and throughput.
primary
-> fork N workers
-> each worker handles requests
e.g.: cluster module, PM2 cluster mode, Kubernetes replicas
🚦 Backpressure · safety
Protect Node from accepting more work than it can finish.
queue depth high
-> reject/rate-limit
-> recover before timeout storm
e.g.: Bottleneck, p-limit, BullMQ concurrency, HTTP 429
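A p-limit-style sketch of the reject-when-deep idea above; the queue depth at which to shed load is workload-specific, and mapping the rejection to HTTP 429 happens at the caller:

```javascript
// Concurrency gate: run at most maxConcurrent tasks, queue up to maxQueue
// more, and reject everything beyond that instead of accepting unbounded work.
function createLimiter(maxConcurrent, maxQueue) {
  let active = 0;
  const queue = [];

  const next = () => {
    if (active >= maxConcurrent || queue.length === 0) return;
    active++;
    const { task, resolve, reject } = queue.shift();
    task().then(resolve, reject).finally(() => {
      active--;
      next();
    });
  };

  return (task) =>
    new Promise((resolve, reject) => {
      if (queue.length >= maxQueue) {
        reject(new Error('overloaded')); // caller maps this to HTTP 429
        return;
      }
      queue.push({ task, resolve, reject });
      next();
    });
}
```

Rejecting fast keeps queue wait time bounded, which is what prevents the timeout storm: a client that gets a quick 429 can back off, one that waits 30s just times out and retries.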
🗺️ Decision Table · diagnosis
Map symptoms to measurements and fixes.
high p99 -> event-loop lag?
high CPU -> flamegraph
slow DB -> query timing
e.g.: Grafana, traces, logs, flamegraphs, slow query logs
🛠️ Tool Stack · tools
Recommended production Node performance toolkit.
measure
-> profile
-> load test
-> trace
-> optimize
-> verify
e.g.: clinic.js, autocannon, k6, OpenTelemetry, prom-client, Grafana
Optimize in this order: measure first, profile the bottleneck, change one thing, load test again, then ship with dashboards and alerts.