RFC-054: Netflix-Inspired Platform Enhancements for Prism
Status: Proposed
Author: Platform Team
Created: 2025-11-14
Updated: 2025-11-14
Abstract
This RFC proposes platform-level enhancements to Prism inspired by Netflix's Data Gateway architecture and operational learnings. Netflix's decade-plus experience running a data abstraction layer at global scale provides valuable lessons for making Prism more resilient, cost-efficient, and developer-friendly.
Key Enhancements:
- Platform-Level Resilience - Circuit breakers, adaptive load shedding, backpressure at proxy layer
- Automated Data Lifecycle Management - TTL policies, automated cleanup, cost tracking per namespace
- Multi-Modal Access Patterns - Same data accessible through different pattern abstractions
- Capability-Aware Routing - Intelligent backend selection based on operation requirements
- Standardized Observability - Platform-wide metrics, tracing, and cost attribution
Netflix demonstrated that a well-designed platform layer can shield applications from database complexity while maintaining extreme reliability (99.99%+ uptime) at massive scale (petabytes of data, billions of operations/day).
Motivation
Netflix's Data Gateway Success
Netflix's Data Gateway platform provides several specialized abstractions:
- Key-Value DAL: Simple CRUD operations with multi-backend support (Cassandra, EVCache, DynamoDB)
- TimeSeries: Temporal event data with time-based queries (Cassandra, Elasticsearch)
- Distributed Counter: Counting with configurable accuracy/latency trade-offs (EVCache, Cassandra)
- Write-Ahead Log: Durable event logging with guaranteed ordering (already in RFC-051)
Key Characteristics:
- Abstraction Platform - Common infrastructure for multiple data access patterns
- Resilience First - Circuit breakers, load shedding, backpressure built into platform
- Data Lifecycle - TTL, cleanup, and cost management as first-class concerns
- Operational Excellence - Automated capacity planning, incident learning, best practice enforcement
Current Prism Architecture
Prism has similar pattern abstractions (RFC-014, RFC-046):
- KeyValue Pattern - Key-value storage with backend slots
- PubSub/Queue Patterns - Messaging with producer/consumer abstractions
- Multicast Registry - Identity registration with selective multicast
- Graph Pattern - Graph traversal and algorithms (RFC-053)
- Session Store - Distributed session management
What's Missing:
- Platform-level resilience - Patterns implement their own circuit breaking
- Standardized data lifecycle - Each pattern handles TTL differently
- Cost visibility - No unified cost tracking across patterns
- Multi-modal access - Can't access same data through different patterns
- Capability negotiation - Limited backend selection intelligence
Goals
- Add Platform-Level Resilience - Circuit breakers, load shedding, backpressure in proxy
- Automate Data Lifecycle - TTL policies, cleanup automation, cost tracking
- Enable Multi-Modal Access - Same backend data through multiple pattern views
- Intelligent Backend Selection - Route operations to optimal backend based on capabilities
- Standardize Observability - Unified metrics, tracing, cost attribution
Non-Goals
- Replacing existing pattern designs (build on top of RFC-014, RFC-046)
- Changing protobuf APIs (additive enhancements only)
- Requiring application code changes (platform improvements are transparent)
Platform-Level Resilience
Problem: Resilience Currently Per-Pattern
Current State: Each pattern implements its own resilience:
// patterns/keyvalue/pattern.go
func (kv *KeyValuePattern) Store(ctx context.Context, req *StoreRequest) (*StoreResponse, error) {
    // Pattern-specific circuit breaker
    if kv.circuitBreaker.IsOpen() {
        return nil, errors.New("circuit breaker open")
    }
    // Pattern-specific retry logic wraps the backend call
    if err := kv.backend.Set(ctx, req.Key, req.Value); err != nil {
        return nil, err
    }
    return &StoreResponse{}, nil
}
Problems:
- Duplicate circuit breaker implementations across patterns
- Inconsistent behavior (different timeout/retry configs)
- No cross-pattern coordination (all patterns fail independently)
- Difficult to configure (must set per-pattern)
Solution: Platform-Level Resilience Layer
Architecture:
┌─────────────────────────────────────────────────────────┐
│ Client Application │
└────────────────────┬────────────────────────────────────┘
│ gRPC request
▼
┌─────────────────────────────────────────────────────────┐
│ Prism Proxy (Rust) │
│ ┌───────────────────────────────────────────────────┐ │
│ │ Platform Resilience Layer (NEW) │ │
│ │ │ │
│ │ 1. Circuit Breaker (per backend + global) │ │
│ │ 2. Adaptive Load Shedding (based on latency) │ │
│ │ 3. Backpressure (queue depth monitoring) │ │
│ │ 4. Request Hedging (parallel requests) │ │
│ │ 5. Timeout Management (operation-specific) │ │
│ └───────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌───────────────────────────────────────────────────┐ │
│ │ Pattern Layer (KeyValue, PubSub, etc.) │ │
│ │ - Resilience handled by platform │ │
│ │ - Focus on business logic only │ │
│ └───────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────┘
Circuit Breaker Configuration
Per-Backend Circuit Breakers:
namespaces:
- name: user-sessions
pattern: keyvalue
backend:
type: redis
# Platform-level resilience (NEW)
resilience:
circuit_breaker:
enabled: true
error_threshold: 0.5 # Open after 50% error rate
min_requests: 100 # Minimum requests before opening
timeout: 30s # Stay open for 30s
half_open_requests: 10 # Test with 10 requests when half-open
adaptive_load_shedding:
enabled: true
target_latency_p99: 100ms # Target P99 latency
shed_probability: 0.5 # Start shedding at 50% over target
min_accept_rate: 0.1 # Always accept 10% of requests
backpressure:
enabled: true
queue_depth_threshold: 1000 # Start rejecting if queue > 1000
queue_latency_threshold: 5s # Reject if queue time > 5s
timeouts:
operation_timeout: 5s # Per-operation timeout
connection_timeout: 1s # Backend connection timeout
Global Circuit Breaker:
# Proxy-level configuration
proxy:
global_resilience:
circuit_breaker:
enabled: true
# Open if ANY backend has >80% error rate
global_error_threshold: 0.8
# Fail fast for all namespaces
fail_fast_on_global_open: true
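For illustration, a minimal sketch of how the per-backend breaker configured above could work inside the proxy, assuming a simple rolling window of request outcomes (type and field names are illustrative, not the shipped API):
// Hypothetical sketch: rolling-window circuit breaker matching the config above.
// A production version would use atomics/sharded counters instead of &mut self.
use std::time::{Duration, Instant};

#[derive(Clone, Copy, PartialEq)]
enum BreakerState { Closed, Open, HalfOpen }

struct CircuitBreaker {
    state: BreakerState,
    error_threshold: f64,   // e.g. 0.5 -> open at a 50% error rate
    min_requests: u64,      // minimum sample size before opening
    open_timeout: Duration, // how long to stay open (30s above)
    half_open_budget: u32,  // probe requests allowed while half-open
    window_total: u64,
    window_errors: u64,
    opened_at: Option<Instant>,
    half_open_in_flight: u32,
}

impl CircuitBreaker {
    fn allow_request(&mut self) -> bool {
        match self.state {
            BreakerState::Closed => true,
            BreakerState::Open => {
                // After the timeout, move to half-open and let a few probes through
                if self.opened_at.map_or(false, |t| t.elapsed() >= self.open_timeout) {
                    self.state = BreakerState::HalfOpen;
                    self.half_open_in_flight = 0;
                    true
                } else {
                    false
                }
            }
            BreakerState::HalfOpen => {
                if self.half_open_in_flight < self.half_open_budget {
                    self.half_open_in_flight += 1;
                    true
                } else {
                    false
                }
            }
        }
    }

    fn record_result(&mut self, success: bool) {
        self.window_total += 1;
        if !success {
            self.window_errors += 1;
        }
        let error_rate = self.window_errors as f64 / self.window_total as f64;
        match self.state {
            BreakerState::HalfOpen => {
                // Probe results decide whether to close again or re-open
                if success {
                    self.state = BreakerState::Closed;
                    self.window_total = 0;
                    self.window_errors = 0;
                } else {
                    self.state = BreakerState::Open;
                    self.opened_at = Some(Instant::now());
                }
            }
            BreakerState::Closed
                if self.window_total >= self.min_requests
                    && error_rate >= self.error_threshold =>
            {
                self.state = BreakerState::Open;
                self.opened_at = Some(Instant::now());
            }
            _ => {}
        }
    }
}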
Adaptive Load Shedding (Netflix Pattern)
Problem: Traffic spikes saturate backend, causing cascading failures.
Netflix Solution: Shed load based on latency feedback loops.
Implementation:
// pkg/proxy/resilience/load_shedding.rs
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::time::Duration;

pub struct AdaptiveLoadShedder {
    target_latency: Duration,
    current_latency: Arc<AtomicU64>, // smoothed recent latency in nanoseconds (P99 proxy)
}

impl AdaptiveLoadShedder {
    pub fn should_accept_request(&self) -> bool {
        let current_p99 = Duration::from_nanos(
            self.current_latency.load(Ordering::Relaxed)
        );
        if current_p99 <= self.target_latency {
            return true; // Fast path: latency good, accept all
        }
        // Latency over target: start shedding
        let overage_ratio = current_p99.as_millis() as f64
            / self.target_latency.as_millis() as f64;
        // Shed probability increases with overage:
        // 2× target latency = 50% shed, 3× = 67% shed, etc.
        let shed_prob = 1.0 - (1.0 / overage_ratio);
        // Random accept/reject based on shed probability
        rand::random::<f64>() > shed_prob
    }

    pub fn update_latency(&self, sample: Duration) {
        // Exponential moving average so a single slow request does not trigger shedding
        let prev = self.current_latency.load(Ordering::Relaxed);
        let sample_ns = sample.as_nanos() as u64;
        let ema = if prev == 0 { sample_ns } else { (prev * 9 + sample_ns) / 10 };
        self.current_latency.store(ema, Ordering::Relaxed);
    }
}
Client Experience:
Normal conditions (P99 < 100ms):
- All requests accepted ✅
Moderate load (P99 = 200ms, 2× target):
- 50% of requests shed ⚠️
- Error: 429 Too Many Requests (shed_reason: "latency_overload")
High load (P99 = 500ms, 5× target):
- 80% of requests shed 🚨
- Backend protected from saturation
- Error: 429 Too Many Requests (shed_reason: "latency_overload")
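On the gRPC path, the 429 above surfaces as RESOURCE_EXHAUSTED. A minimal sketch of how the proxy might attach the shed reason, assuming the tonic gRPC stack (the metadata key name is illustrative):
// Hypothetical sketch: returning a shed decision as a gRPC status (tonic assumed).
use tonic::{metadata::MetadataValue, Status};

fn shed_status() -> Status {
    let mut status = Status::resource_exhausted("request shed: latency over target");
    // Clients can inspect this key to distinguish shedding from genuine quota errors
    status
        .metadata_mut()
        .insert("shed-reason", MetadataValue::from_static("latency_overload"));
    status
}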
Backpressure Monitoring
Problem: Proxy queue builds up, requests timeout before execution.
Solution: Monitor queue depth and latency, reject early.
// pkg/proxy/resilience/backpressure.rs
use std::collections::VecDeque;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};
use std::time::Duration;

// Error is the proxy's error type (QueueFull / QueueLatencyTooHigh variants, defined elsewhere)
pub struct BackpressureMonitor {
    queue_depth: Arc<AtomicUsize>,
    max_queue_depth: usize,
    max_queue_wait: Duration, // queue_latency_threshold from the namespace config
    queue_wait_times: Arc<Mutex<VecDeque<Duration>>>, // last 100 observed waits
}

impl BackpressureMonitor {
    pub fn check_accept_request(&self) -> Result<(), Error> {
        // Check queue depth
        let depth = self.queue_depth.load(Ordering::Relaxed);
        if depth > self.max_queue_depth {
            return Err(Error::QueueFull {
                current_depth: depth,
                max_depth: self.max_queue_depth,
            });
        }
        // Check queue latency (median of the recorded waits)
        let wait_times = self.queue_wait_times.lock().unwrap();
        let median_wait = calculate_median(&wait_times);
        if median_wait > self.max_queue_wait {
            return Err(Error::QueueLatencyTooHigh {
                median_wait,
                threshold: self.max_queue_wait,
            });
        }
        Ok(())
    }
}

fn calculate_median(waits: &VecDeque<Duration>) -> Duration {
    // Median over the sliding window of recent queue waits
    let mut sorted: Vec<Duration> = waits.iter().copied().collect();
    sorted.sort();
    sorted.get(sorted.len() / 2).copied().unwrap_or_default()
}
Metrics:
# Queue depth (target: <1000)
prism_queue_depth{namespace="user-sessions"} 850
# Queue wait time (target: <1s)
prism_queue_wait_seconds{namespace="user-sessions", quantile="0.5"} 0.2
prism_queue_wait_seconds{namespace="user-sessions", quantile="0.99"} 3.5
# Rejection rate due to backpressure
rate(prism_requests_rejected_total{reason="queue_full"}[5m]) 120/sec
Automated Data Lifecycle Management
Problem: Manual Data Cleanup
Current State: Applications must implement their own cleanup:
# Application code (bad - manual cleanup)
client.keyvalue.store(key="session:123", value=data)
# Remember to clean up later... (often forgotten)
schedule_cleanup("session:123", after=3600) # 1 hour
Problems:
- Developers forget to set TTL
- Data accumulates indefinitely
- Storage costs balloon
- Performance degrades (scan operations slow)
Netflix's Approach: TTL as First-Class Concern
Key Insight: Data cleanup should be automatic and declarative, not manual.
Netflix Strategies:
- Default TTL policies - Every key has a TTL unless explicitly persistent
- Snapshot Janitors - Automated cleanup of old snapshots
- Cost attribution - Track storage costs per namespace
- Automated tiering - Move old data to cheaper storage
Solution: Declarative Data Lifecycle Policies
Configuration:
namespaces:
- name: user-sessions
pattern: keyvalue
# Data lifecycle policies (NEW)
lifecycle:
default_ttl: 3600 # 1 hour default for all keys
# Pattern-specific overrides
ttl_policies:
- key_pattern: "session:*"
ttl: 3600 # 1 hour for sessions
- key_pattern: "cache:*"
ttl: 300 # 5 min for cache
- key_pattern: "user:*:profile"
ttl: 86400 # 1 day for profiles
- key_pattern: "config:*"
ttl: -1 # Persistent for config
# Automated cleanup
cleanup:
enabled: true
scan_interval: 3600 # Scan every hour
batch_size: 1000 # Delete 1000 keys per batch
# Cost tracking
cost_tracking:
enabled: true
storage_cost_per_gb_month: 0.023 # $0.023/GB/month
operation_cost_per_million: 0.40 # $0.40/million ops
# Tiered storage (Netflix pattern)
tiering:
enabled: true
hot_tier:
backend: redis # Hot: Redis
max_age: 86400 # 1 day
warm_tier:
backend: postgres # Warm: Postgres
max_age: 604800 # 7 days
cold_tier:
backend: s3 # Cold: S3
retention: 2592000 # 30 days, then delete
Client API (TTL is automatic):
# Application code (good - TTL automatic)
client.keyvalue.store(
key="session:123",
value=data
# No TTL needed - uses lifecycle policy "session:*" -> 1 hour
)
# Explicit TTL override if needed
client.keyvalue.store(
key="session:123",
value=data,
ttl=7200 # 2 hours (overrides policy)
)
Automated Tiering Example
Scenario: Session data accessed frequently for 1 day, occasionally for 7 days, rarely after.
Configuration:
lifecycle:
tiering:
enabled: true
# Day 0-1: Hot tier (Redis)
hot_tier:
backend: redis
max_age: 86400
replication: 3
# Day 1-7: Warm tier (Postgres)
warm_tier:
backend: postgres
max_age: 604800
# Day 7-30: Cold tier (S3)
cold_tier:
backend: s3
retention: 2592000
storage_class: GLACIER # Cheap archival
# Day 30+: Delete
delete_after: 2592000
Data Flow:
Day 0: Write → Redis (hot tier)
- P99 latency: 1ms
- Cost: $0.10/GB/month
Day 1: Background job moves to Postgres (warm tier)
- P99 latency: 10ms
- Cost: $0.05/GB/month
Day 7: Background job moves to S3 (cold tier)
- P99 latency: 100ms
- Cost: $0.004/GB/month (Glacier)
Day 30: Automated deletion
- Cost: $0
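The background tiering job can reduce each record to an age-based decision against this configuration; a small illustrative sketch of that decision (names are not the final API):
// Hypothetical sketch: decide where a record belongs based on age and the tiering config.
#[derive(Debug, PartialEq, Clone, Copy)]
enum Tier { Hot, Warm, Cold }

#[derive(Debug, PartialEq)]
enum TierAction {
    Keep,         // already in the right tier
    MoveTo(Tier), // migrate to a colder tier
    Delete,       // past retention
}

struct TieringConfig {
    hot_max_age: u64,    // 86_400    (1 day)
    warm_max_age: u64,   // 604_800   (7 days)
    cold_retention: u64, // 2_592_000 (30 days)
}

fn tier_for_age(age_secs: u64, current: Tier, cfg: &TieringConfig) -> TierAction {
    // Work out the target tier purely from age...
    let target = if age_secs < cfg.hot_max_age {
        Tier::Hot
    } else if age_secs < cfg.warm_max_age {
        Tier::Warm
    } else if age_secs < cfg.cold_retention {
        Tier::Cold
    } else {
        return TierAction::Delete;
    };
    // ...then only emit migration work if the record is not already there.
    if target == current { TierAction::Keep } else { TierAction::MoveTo(target) }
}

// tier_for_age(2 * 86_400, Tier::Hot, &cfg)   -> MoveTo(Warm)
// tier_for_age(45 * 86_400, Tier::Cold, &cfg) -> Delete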
Cost Tracking and Attribution
Metrics:
# Storage cost per namespace
prism_storage_cost_usd_per_month{namespace="user-sessions"} 1250.00
# Operation cost per namespace
prism_operation_cost_usd_per_month{namespace="user-sessions"} 350.00
# Total cost breakdown
prism_total_cost_usd_per_month{
namespace="user-sessions",
cost_type="storage"
} 1250.00
prism_total_cost_usd_per_month{
namespace="user-sessions",
cost_type="operations"
} 350.00
# Storage by tier
prism_storage_bytes{namespace="user-sessions", tier="hot"} 5.0e+11 # 500 GB
prism_storage_bytes{namespace="user-sessions", tier="warm"} 2.0e+12 # 2 TB
prism_storage_bytes{namespace="user-sessions", tier="cold"} 1.0e+13 # 10 TB
Cost Dashboard:
Namespace: user-sessions
├── Storage Costs: $1,250/month
│ ├── Hot (Redis): $500/month (500GB)
│ ├── Warm (Postgres): $250/month (2TB)
│ └── Cold (S3 Glacier): $500/month (10TB)
├── Operation Costs: $350/month
│ ├── Reads: $200/month (500M/month)
│ └── Writes: $150/month (375M/month)
└── Total: $1,600/month
Recommendations:
⚠️ Consider increasing hot_tier TTL to reduce tiering overhead
✅ 96% of reads served from the hot tier (healthy cache hit rate)
⚠️ Cold tier has 85% unused data → reduce retention to 15 days
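The reported figures are simple arithmetic over the tracked gauges and the configured rates. A sketch of the attribution math (rates vary by tier and backend in practice; the values below are the defaults from the lifecycle config above):
// Hypothetical sketch: monthly cost attribution for one namespace.
struct CostRates {
    storage_per_gb_month: f64, // e.g. 0.023
    ops_per_million: f64,      // e.g. 0.40
}

fn monthly_cost_usd(stored_bytes: u64, ops_this_month: u64, rates: &CostRates) -> (f64, f64) {
    let storage_gb = stored_bytes as f64 / 1e9;
    let storage_cost = storage_gb * rates.storage_per_gb_month;
    let ops_cost = (ops_this_month as f64 / 1e6) * rates.ops_per_million;
    (storage_cost, ops_cost)
}

// 875M operations at $0.40/million is the $350/month shown above:
// monthly_cost_usd(500_000_000_000, 875_000_000, &rates) -> (11.5, 350.0)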
Multi-Modal Access Patterns
Problem: Single Access Pattern Per Backend
Current State: Each namespace has one pattern.
# Can ONLY access as KeyValue
namespaces:
- name: user-profiles
pattern: keyvalue
backend:
type: postgres
Limitations:
- Same data can't be queried differently
- Must duplicate data to support multiple access patterns
- No way to run analytics on operational data
Netflix's Approach: Multiple Abstractions Over Same Data
Example: TimeSeries data accessible as:
- TimeSeries Pattern - Time-range queries for recent events
- Counter Pattern - Aggregated counts with different accuracy modes
- Analytics - Export to Parquet on S3 for Spark jobs
Solution: Multi-Modal Namespace Configuration
Configuration:
namespaces:
- name: user-events
# Primary pattern (write path)
primary_pattern: timeseries
backend:
type: postgres
table: user_events
# Secondary patterns (read views) - NEW
secondary_patterns:
# View 1: Counter aggregation
- pattern: counter
mode: eventually_consistent_global
aggregation:
group_by: [user_id, event_type]
window: 3600 # 1 hour windows
# View 2: KeyValue lookup
- pattern: keyvalue
key_template: "user:{user_id}:events"
materialized_view: true # Precompute for fast lookup
# View 3: Analytics export
- pattern: analytics_export
destination: s3
format: parquet
partition_by: [date, event_type]
schedule: "0 2 * * *" # Daily at 2 AM
Client Usage:
# Write via primary pattern (TimeSeries)
client.timeseries.write(
namespace="user-events",
timestamp=now(),
event_type="play_video",
user_id="user:123",
data={"video_id": "v456", "position": 120}
)
# Read via Counter pattern (aggregated)
count = client.counter.get(
namespace="user-events",
key="user:123",
event_type="play_video",
time_range="last_24h"
)
print(f"Videos played: {count}") # 15
# Read via KeyValue pattern (latest events)
events = client.keyvalue.retrieve(
namespace="user-events",
key="user:user:123:events"
)
print(f"Last 10 events: {events[:10]}")
# Analytics via S3 export
# spark.read.parquet("s3://prism-analytics/user-events/date=2025-11-14/")
Benefits:
- Write once, read many ways
- No data duplication
- Pattern-specific optimizations
- Cost-effective (single storage, multiple views)
Multi-Modal Architecture
┌────────────────────────────────────────────────────────┐
│ Client Applications │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
│ │TimeSeries│ │ Counter │ │ KeyValue │ │
│ │ Client │ │ Client │ │ Client │ │
│ └──────────┘ └──────────┘ └──────────┘ │
└────────┬────────────┬──────────────┬──────────────────┘
│ │ │
│ write │ read │ read
▼ ▼ ▼
┌────────────────────────────────────────────────────────┐
│ Prism Proxy (Rust) │
│ ┌──────────────────────────────────────────────────┐ │
│ │ Multi-Modal Router (NEW) │ │
│ │ - Routes to primary pattern for writes │ │
│ │ - Routes to secondary patterns for reads │ │
│ │ - Maintains consistency guarantees │ │
│  │  └──────────────────────────────────────────────────┘  │
└────────────────────┬───────────────────────────────────┘
│
▼
┌───────────────────────┐
│ PostgreSQL Backend │
│ ┌─────────────────┐ │
│ │ user_events │ │ ← Single table
│ │ (primary) │ │
│ └─────────────────┘ │
│ ┌─────────────────┐ │
│ │ Materialized │ │ ← Counter view
│ │ View: counts │ │
│ └─────────────────┘ │
│ ┌─────────────────┐ │
│ │ Materialized │ │ ← KeyValue view
│ │ View: latest │ │
│ └─────────────────┘ │
└───────────────────────┘
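A sketch of the routing decision the multi-modal router makes for this namespace: writes only ever go through the primary pattern, while reads arriving under a secondary pattern are served from its view (types are illustrative):
// Hypothetical sketch: multi-modal routing decision.
#[derive(PartialEq, Clone, Copy, Debug)]
enum Pattern { TimeSeries, Counter, KeyValue, AnalyticsExport }

#[derive(Debug)]
enum Target {
    PrimaryTable,              // e.g. the user_events table
    MaterializedView(Pattern), // a precomputed read view
}

struct MultiModalNamespace {
    primary: Pattern,
    secondary: Vec<Pattern>,
}

fn route(ns: &MultiModalNamespace, requested: Pattern, is_write: bool) -> Option<Target> {
    if is_write {
        // Writes only ever go through the primary pattern
        return (requested == ns.primary).then_some(Target::PrimaryTable);
    }
    if requested == ns.primary {
        return Some(Target::PrimaryTable);
    }
    // Reads through a secondary pattern hit its materialized view
    ns.secondary
        .contains(&requested)
        .then_some(Target::MaterializedView(requested))
}

// route(&ns, Pattern::Counter, false) -> Some(MaterializedView(Counter))
// route(&ns, Pattern::Counter, true)  -> None (writes must use the primary pattern)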
Capability-Aware Backend Selection
Problem: Static Backend Assignment
Current State: Backend chosen at namespace creation.
namespaces:
- name: user-sessions
pattern: keyvalue
backend:
type: redis # Fixed choice
Limitations:
- Can't optimize per-operation (read vs write)
- Can't leverage multiple backends' strengths
- Can't adapt to changing workload
Netflix's Approach: Intelligent Backend Selection
Example: EVCache for hot reads, Cassandra for durability.
Solution: Capability-Aware Routing
Configuration:
namespaces:
- name: user-sessions
pattern: keyvalue
# Multi-backend configuration (NEW)
backends:
# Hot read cache
- name: cache
type: redis
capabilities: [fast_read, ttl]
priority: 1 # Try first for reads
# Durable storage
- name: primary
type: postgres
capabilities: [durable_write, transactions, scan]
priority: 2 # Fallback for all ops
# Write-through to both
write_strategy: write_through
# Routing rules (NEW)
routing:
# Reads: Try cache first, fallback to primary
read_operations:
- try: cache
on_miss: primary
# Writes: Write to both (durability + cache)
write_operations:
- targets: [primary, cache]
consistency: write_through
# Scans: Only primary supports this
scan_operations:
- try: primary
Operation Routing:
# Read operation
value = client.keyvalue.retrieve(key="session:123")
# 1. Try Redis (cache) → Hit (1ms) ✅
# 2. If miss, query Postgres (primary) → (10ms)
# 3. Populate Redis for next read
# Write operation
client.keyvalue.store(key="session:123", value=data)
# 1. Write to Postgres (primary, durable) → (10ms)
# 2. Write to Redis (cache) → (1ms)
# 3. Acknowledge after both succeed
# Scan operation
keys = client.keyvalue.scan(prefix="session:")
# 1. Route to Postgres (only backend with scan) → (50ms)
Backend Capability Matrix
| Backend | Capabilities | Best For |
|---|---|---|
| Redis | fast_read, fast_write, ttl, pubsub | Hot data, caching, real-time |
| Postgres | durable_write, transactions, scan, complex_query | Durable data, ACID, analytics |
| S3 | cheap_storage, large_objects | Archival, backups, bulk data |
| Kafka | ordered_stream, replay, high_throughput | Events, WAL, streaming |
| Neptune | graph_traversal, complex_relationships | Connected data, recommendations |
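The routing logic below assumes backends advertise their capabilities as data. One possible shape for those types, shown here so the sketch that follows is self-contained (all names are illustrative):
// Hypothetical sketch: capability taxonomy and backend metadata used by the router below.
use std::collections::HashSet;
use std::time::Duration;

#[derive(PartialEq, Eq, Hash, Clone, Copy)]
pub enum Capability {
    FastRead, FastWrite, DurableWrite, Transactions, Scan, Ttl, PubSub,
}

#[derive(Clone, Copy)]
pub enum OperationKind { Get, Set, Scan }

pub struct Operation {
    pub kind: OperationKind,
}

impl Operation {
    pub fn required_capabilities(&self) -> HashSet<Capability> {
        match self.kind {
            OperationKind::Get => HashSet::new(), // any configured backend can serve a point read
            OperationKind::Set => HashSet::from([Capability::DurableWrite]),
            OperationKind::Scan => HashSet::from([Capability::Scan]),
        }
    }
}

/// Thin handle onto the platform resilience layer's breaker state.
pub struct CircuitBreakerHandle {
    pub open: bool,
}

impl CircuitBreakerHandle {
    pub fn is_open(&self) -> bool {
        self.open
    }
}

pub struct Backend {
    pub name: String,
    pub priority: u8, // lower number = preferred, as in the YAML above
    pub capabilities: HashSet<Capability>,
    pub p99_latency: Duration, // fed by the observability layer
    pub circuit_breaker: CircuitBreakerHandle,
}

impl Backend {
    pub fn has_capabilities(&self, required: &HashSet<Capability>) -> bool {
        required.is_subset(&self.capabilities)
    }
    /// True when the backend implements the operation natively rather than emulating it.
    pub fn has_native_support(&self, op: &Operation) -> bool {
        self.has_capabilities(&op.required_capabilities())
    }
}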
Routing Intelligence:
// pkg/proxy/routing/capability_router.rs
use std::time::Duration;

pub struct CapabilityRouter {
    backends: Vec<Backend>,
}

impl CapabilityRouter {
    pub fn select_backend(&self, operation: &Operation) -> Option<&Backend> {
        let required_caps = operation.required_capabilities();
        // Find backends that satisfy all required capabilities
        let capable: Vec<&Backend> = self.backends.iter()
            .filter(|b| b.has_capabilities(&required_caps))
            .collect();
        if capable.is_empty() {
            return None; // No backend can handle this operation
        }
        // Select the best backend based on:
        // 1. Priority (admin-configured, lower number = preferred)
        // 2. Current health (circuit breaker state)
        // 3. Latency characteristics
        capable.into_iter()
            .max_by_key(|b| self.score_backend(b, operation))
    }

    fn score_backend(&self, backend: &Backend, op: &Operation) -> i32 {
        // Lower priority number means "try first", so invert it for scoring
        let mut score = (10 - backend.priority as i32) * 100;
        // Penalty for high latency
        if backend.p99_latency > Duration::from_millis(100) {
            score -= 50;
        }
        // Heavy penalty when the circuit breaker is open
        if backend.circuit_breaker.is_open() {
            score -= 200;
        }
        // Bonus for native capability support
        if backend.has_native_support(op) {
            score += 30;
        }
        score
    }
}
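Continuing the sketch, a scan request resolves to the Postgres backend because only it advertises the Scan capability; the cache is filtered out before scoring even runs:
// Hypothetical usage of the capability router sketched above.
fn route_scan(router: &CapabilityRouter) -> Option<String> {
    let op = Operation { kind: OperationKind::Scan };
    router.select_backend(&op).map(|b| b.name.clone())
    // -> Some("primary") with the two-backend namespace shown earlier
}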
Standardized Observability
Problem: Inconsistent Metrics Across Patterns
Current State: Each pattern exports different metrics.
# KeyValue pattern metrics
prism_keyvalue_operations_total
prism_keyvalue_latency_seconds
# PubSub pattern metrics (different naming)
prism_pubsub_messages_total
prism_pubsub_publish_duration_seconds
# Inconsistent labels, naming conventions
Netflix's Approach: Platform-Wide Observability
Key Insight: Standardize metrics, tracing, and logging across all abstractions.
Solution: Standardized Observability Schema
Metrics Schema:
# Standard metrics for ALL patterns
observability:
metrics:
# Request metrics (standard across all patterns)
- name: prism_requests_total
type: counter
labels: [namespace, pattern, operation, status]
- name: prism_request_duration_seconds
type: histogram
labels: [namespace, pattern, operation]
buckets: [0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1.0, 5.0]
# Backend metrics (standard for all backends)
- name: prism_backend_requests_total
type: counter
labels: [namespace, backend_type, operation, status]
- name: prism_backend_duration_seconds
type: histogram
labels: [namespace, backend_type, operation]
# Cost metrics (NEW)
- name: prism_storage_bytes
type: gauge
labels: [namespace, tier, backend_type]
- name: prism_storage_cost_usd_per_month
type: gauge
labels: [namespace, tier]
- name: prism_operation_cost_usd_per_month
type: gauge
labels: [namespace, operation_type]
# Resilience metrics (NEW)
- name: prism_circuit_breaker_state
type: gauge
labels: [namespace, backend_type]
values: {open: 0, half_open: 1, closed: 2}
- name: prism_requests_shed_total
type: counter
labels: [namespace, reason]
- name: prism_queue_depth
type: gauge
labels: [namespace, queue_type]
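If these are exposed through a Prometheus registry, the standard families can be registered once and shared by every pattern so labels stay consistent. A sketch assuming the Rust prometheus crate (one possible choice, not mandated by this RFC):
// Hypothetical sketch: registering the standard request metrics (prometheus crate assumed).
use prometheus::{register_histogram_vec, register_int_counter_vec, HistogramVec, IntCounterVec};

pub struct StandardMetrics {
    pub requests_total: IntCounterVec,
    pub request_duration: HistogramVec,
}

impl StandardMetrics {
    pub fn new() -> prometheus::Result<Self> {
        Ok(Self {
            requests_total: register_int_counter_vec!(
                "prism_requests_total",
                "Requests handled by the proxy, across all patterns",
                &["namespace", "pattern", "operation", "status"]
            )?,
            request_duration: register_histogram_vec!(
                "prism_request_duration_seconds",
                "End-to-end request latency",
                &["namespace", "pattern", "operation"],
                vec![0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1.0, 5.0]
            )?,
        })
    }
}

// Every pattern records through the same families:
// metrics.requests_total.with_label_values(&["user-sessions", "keyvalue", "store", "ok"]).inc();
// metrics.request_duration.with_label_values(&["user-sessions", "keyvalue", "store"]).observe(0.0012);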
Distributed Tracing (Standard):
Trace: Store session:123
├─ Span: ProxyService.Store [15ms]
│ ├─ Span: ResilienceLayer.check [0.1ms]
│ │ ├─ Tag: circuit_breaker_state=closed
│ │ ├─ Tag: shed_probability=0.0
│ │ └─ Tag: queue_depth=150
│ ├─ Span: LifecycleManager.apply_ttl [0.2ms]
│ │ ├─ Tag: ttl_policy=session:*
│ │ └─ Tag: computed_ttl=3600
│ ├─ Span: CapabilityRouter.select_backend [0.1ms]
│ │ ├─ Tag: backends_evaluated=2
│ │ └─ Tag: selected_backend=redis
│ ├─ Span: KeyValuePattern.store [10ms]
│ │ ├─ Span: RedisBackend.SET [5ms]
│ │ │ ├─ Tag: backend_type=redis
│ │ │ ├─ Tag: operation=SET
│ │ │ └─ Tag: bytes_written=1024
│ │ └─ Span: PostgresBackend.INSERT [5ms]
│ │ ├─ Tag: backend_type=postgres
│ │ ├─ Tag: operation=INSERT
│ │ └─ Tag: bytes_written=1024
│ └─ Span: CostTracking.record [0.1ms]
│ ├─ Tag: operation_cost_usd=0.0000004
│ └─ Tag: storage_cost_usd_per_gb=0.023
└─ Total: 15ms, Cost: $0.0000004
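With the tracing crate (again one possible choice), the standard tags map directly onto span fields recorded as each layer runs; a minimal sketch:
// Hypothetical sketch: emitting the standard span hierarchy with the tracing crate.
use tracing::info_span;

fn store_with_spans(namespace: &str, key: &str) {
    let root = info_span!("ProxyService.Store", namespace, key);
    let _root_guard = root.enter();
    {
        // Platform layers add child spans carrying the standard tags
        let span = info_span!(
            "ResilienceLayer.check",
            circuit_breaker_state = "closed",
            shed_probability = 0.0,
            queue_depth = 150_u64
        );
        let _guard = span.enter();
        // ... admission checks run here ...
    }
    {
        let span = info_span!("CapabilityRouter.select_backend", selected_backend = "redis");
        let _guard = span.enter();
        // ... backend selection runs here ...
    }
}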
Log Structure (Standard):
{
"timestamp": "2025-11-14T10:30:00.123Z",
"level": "info",
"message": "KeyValue store operation succeeded",
"namespace": "user-sessions",
"pattern": "keyvalue",
"operation": "store",
"key": "session:123",
"backend_type": "redis",
"duration_ms": 1.2,
"bytes_written": 1024,
"ttl_seconds": 3600,
"cost_usd": 0.0000004,
"trace_id": "abc123",
"span_id": "def456",
"circuit_breaker_state": "closed",
"queue_depth": 150
}
Implementation Roadmap
Phase 1: Platform Resilience (4 weeks)
Week 1-2: Circuit Breaker Layer
- Implement per-backend circuit breakers in Rust proxy
- Add configuration schema for circuit breaker policies
- Integration tests with backend failures
- Metrics: prism_circuit_breaker_state, prism_circuit_breaker_trips_total
Week 3-4: Load Shedding + Backpressure
- Implement adaptive load shedding based on latency
- Add queue depth monitoring and backpressure
- Load testing to validate shed thresholds
- Metrics: prism_requests_shed_total, prism_queue_depth
Deliverables:
- Circuit breaker middleware in Rust proxy
- Load shedding configuration per namespace
- Grafana dashboard for resilience metrics
Phase 2: Data Lifecycle Management (4 weeks)
Week 1-2: TTL Policies
- Add lifecycle configuration schema
- Implement default TTL enforcement in patterns
- Pattern-specific TTL override logic
- Metrics: prism_keys_expired_total, prism_ttl_violations_total
Week 3-4: Automated Tiering
- Background job for tiering (hot → warm → cold)
- S3 integration for cold storage
- Cost tracking per tier
- Metrics: prism_storage_bytes{tier}, prism_tier_migrations_total
Deliverables:
- Lifecycle policy configuration
- Automated tiering job
- Cost tracking dashboard
Phase 3: Multi-Modal Access (6 weeks)
Week 1-2: Multi-Modal Router
- Design multi-modal namespace configuration
- Implement pattern routing logic
- Add secondary pattern support to proxy
Week 3-4: Materialized Views
- Implement materialized view generation (Postgres)
- Background refresh for views
- Consistency guarantees
Week 5-6: Analytics Export
- S3 Parquet export job
- Scheduled export (daily/hourly)
- Partition management
Deliverables:
- Multi-modal namespace configuration
- Materialized view support
- Parquet export to S3
Phase 4: Capability-Aware Routing (4 weeks)
Week 1-2: Capability System
- Define backend capability taxonomy
- Implement capability detection for backends
- Scoring function for backend selection
Week 3-4: Dynamic Routing
- Implement operation-specific routing
- Add routing rules configuration
- Load testing for multi-backend namespaces
Deliverables:
- Capability-aware router
- Multi-backend namespace support
- Routing rules configuration
Phase 5: Standardized Observability (3 weeks)
Week 1: Metrics Standardization
- Migrate all patterns to standard metrics schema
- Add cost metrics to patterns
- Update Prometheus recording rules
Week 2: Tracing Enhancement
- Add standard span tags across patterns
- Implement cost tracking in spans
- Integration with OpenTelemetry
Week 3: Dashboards + Alerts
- Unified Grafana dashboards
- Cost tracking dashboard
- Standard alert rules for resilience
Deliverables:
- Standard metrics across all patterns
- Unified observability dashboards
- Alert runbooks
Configuration Examples
Example 1: Production KeyValue with Full Resilience
namespaces:
- name: user-sessions-prod
pattern: keyvalue
# Backends
backends:
- name: cache
type: redis
address: redis-cluster.prod:6379
priority: 1
- name: primary
type: postgres
address: postgres.prod:5432
database: sessions
priority: 2
# Routing
routing:
read_operations:
- try: cache
on_miss: primary
write_operations:
- targets: [primary, cache]
consistency: write_through
# Resilience (NEW)
resilience:
circuit_breaker:
enabled: true
error_threshold: 0.5
min_requests: 100
timeout: 30s
adaptive_load_shedding:
enabled: true
target_latency_p99: 100ms
shed_probability: 0.5
backpressure:
enabled: true
queue_depth_threshold: 1000
# Lifecycle (NEW)
lifecycle:
default_ttl: 3600
ttl_policies:
- key_pattern: "session:*"
ttl: 3600
- key_pattern: "user:*:profile"
ttl: 86400
cleanup:
enabled: true
scan_interval: 3600
cost_tracking:
enabled: true
storage_cost_per_gb_month: 0.023
tiering:
enabled: true
hot_tier:
backend: cache
max_age: 86400
warm_tier:
backend: primary
retention: 604800
Example 2: Multi-Modal TimeSeries + Counter
namespaces:
- name: user-events-multi
# Primary pattern (writes)
primary_pattern: timeseries
backend:
type: postgres
table: user_events
# Secondary patterns (reads) - NEW
secondary_patterns:
# Counter aggregation view
- pattern: counter
mode: eventually_consistent_global
aggregation:
group_by: [user_id, event_type]
window: 3600
refresh_interval: 300
# KeyValue lookup view
- pattern: keyvalue
key_template: "user:{user_id}:events"
materialized_view: true
refresh: continuous
# Analytics export
- pattern: analytics_export
destination: s3
bucket: prism-analytics
format: parquet
partition_by: [date, event_type]
schedule: "0 2 * * *"
# Lifecycle
lifecycle:
default_ttl: 2592000 # 30 days
tiering:
enabled: true
hot_tier:
backend: postgres
max_age: 86400 # 1 day
cold_tier:
backend: s3
retention: 2592000 # 30 days
storage_class: GLACIER
Comparison to Netflix
| Feature | Netflix Data Gateway | Prism (Current) | Prism (After RFC-054) |
|---|---|---|---|
| Platform Resilience | Circuit breakers, load shedding | Per-pattern resilience | ✅ Platform-level |
| Data Lifecycle | TTL, Snapshot Janitors | Manual cleanup | ✅ Automated policies |
| Multi-Modal Access | KV, TimeSeries, Counter on same data | Single pattern/namespace | ✅ Secondary patterns |
| Cost Tracking | Per-namespace attribution | No tracking | ✅ Built-in |
| Backend Selection | Static configuration | Static configuration | ✅ Capability-aware |
| Observability | Standardized metrics | Pattern-specific | ✅ Standardized |
Success Criteria
- Resilience: 99.99% uptime across all namespaces (Netflix-level SLO)
- Cost Reduction: 30% storage cost reduction via automated tiering
- Developer Experience: Zero manual TTL management required
- Performance: P99 latency <100ms with load shedding
- Observability: Unified dashboards for all patterns
Related Documents
- RFC-014: Layered Data Access Patterns - Pattern architecture
- RFC-046: Consolidated Pattern Protocols - Pattern APIs
- RFC-051: Write-Ahead Log Pattern - WAL (already Netflix-inspired)
- Netflix Data Gateway Summary - Netflix lessons learned
- Netflix Abstractions - Netflix's pattern catalog
Open Questions
- Lifecycle Defaults: Should all namespaces have default TTL, or opt-in?
- Cost Attribution: Track at namespace or per-key granularity?
- Multi-Modal Consistency: Eventual or strong consistency between pattern views?
- Backend Failover: Automatic or manual failover when circuit breaker opens?
- Tiering Performance: Acceptable latency increase for warm/cold tier reads?
References
Netflix Documentation
- Netflix Data Gateway Summary
- Netflix Abstractions Overview
- Netflix Key Use Cases
- Netflix Write-Ahead Log
External References
- Netflix Tech Blog: Building a Resilient Data Platform
- Circuit Breaker Pattern (Martin Fowler)
- Adaptive Load Shedding (Google SRE Book)
Revision History
- 2025-11-14: Initial draft inspired by Netflix Data Gateway architecture