Database Connection Pooling vs Direct Connections
Context
Prism's backend plugins need to connect to data stores (PostgreSQL, Redis, ClickHouse, etc.). Each plugin must decide: use connection pooling or direct connections per request?
The Tradeoff
Connection Pooling:
- Pre-established connections reused across requests
- Lower latency (no TCP handshake + auth per request)
- Fixed resource usage (pool size limits)
- Complexity: pool management, health checks, stale connection handling
Direct Connections:
- New connection per request
- Higher latency (TCP + TLS + auth overhead: ~5-50ms)
- Unbounded resource usage (connections scale with request rate)
- Simplicity: no pool management needed
Why This Matters at Scale
From Netflix's experience at 8M QPS:
- Connection churn kills performance at scale
- PostgreSQL max_connections: typically 100-200 (too low for high concurrency)
- Redis benefits from persistent connections (pipelining, reduced latency)
- But: connection pools can become bottlenecks if undersized
Decision
Use connection pooling by default for all backends, with backend-specific tuning:
Connection Pool Strategy Matrix
Backend | Pool Type | Pool Size Formula | Rationale |
---|---|---|---|
PostgreSQL | Shared pool per namespace | max(10, RPS / 100) | Expensive connections, limited max_connections |
Redis | Shared pool per namespace | max(5, RPS / 1000) | Cheap connections, benefits from pipelining |
Kafka | Producer pool | max(3, num_partitions / 10) | Producers are heavyweight, batching preferred |
ClickHouse | Shared pool per namespace | max(5, RPS / 200) | Query-heavy, benefits from persistent HTTP/2 |
NATS | Single persistent connection | 1 per namespace | Multiplexing over single connection |
Object Storage (S3/MinIO) | No pool (HTTP client reuse) | N/A | HTTP client handles pooling internally |
Pool Configuration
# Per-backend pool settings
backends:
postgres:
pool:
min_size: 10
max_size: 100
idle_timeout: 300s # Close idle connections after 5 min
max_lifetime: 1800s # Recycle connections after 30 min
connection_timeout: 5s
health_check_interval: 30s
redis:
pool:
min_size: 5
max_size: 50
idle_timeout: 600s # Redis connections are cheap to keep alive
max_lifetime: 3600s
connection_timeout: 2s
health_check_interval: 60s