Status: DraftAuthor: SystemCreated: Oct 8, 2025

RFC-007: Cache Strategies for Data Layer

Status: Draft Author: System Created: 2025-10-08 Updated: 2025-10-08

Abstract

Caching is fundamental to achieving low-latency, high-throughput data access. This RFC defines standard cache strategies implemented in Prism's data layer, focusing on look-aside (cache-aside) and write-through patterns for common use cases like table readers and object storage metadata caching.

By standardizing cache strategies at the proxy level, applications benefit from transparent caching without implementing cache logic in every service. Prism manages cache consistency, expiration (ADR-031), and invalidation automatically based on declarative configuration.

Motivation

Why Cache Strategies Matter

Performance: Sub-millisecond responses for cached data vs. 10-100ms database queries
Cost Reduction: Fewer database queries reduce compute and I/O costs
Scalability: Cache absorbs read traffic, allowing databases to scale independently
Availability: Cached data remains available during backend outages (stale reads)
Consistency: Different strategies offer different consistency guarantees

Real-World Scenarios

Table Readers: Frequently accessed reference tables (countries, categories, product catalogs)
Object Metadata: File metadata from object storage (size, content-type, ETag)
User Profiles: High-read, low-write data accessed on every request
Configuration Data: Application settings queried repeatedly
Computed Results: Expensive aggregations or ML model outputs

Goals

Define standard cache strategies with clear consistency semantics
Implement look-aside and write-through patterns for common use cases
Support cache warmup, invalidation, and expiration
Provide configuration-driven cache behavior (no code changes required)
Enable observability into cache hit rates, latency, and consistency

Non-Goals

Custom Cache Logic: Not implementing application-specific cache policies
Distributed Cache Coordination: Not solving distributed cache coherence (use Redis cluster instead)
Cache-Aside in Clients: Clients use Prism APIs; caching is transparent

Cache Strategies Overview

Strategy Comparison

Strategy	Read Path	Write Path	Consistency	Use Case
Look-Aside	Check cache first	Write to DB only	Eventual	Read-heavy, tolerate stale reads
Write-Through	Check cache first	Write to cache + DB	Strong	Read-heavy, require fresh reads
Write-Back	Check cache first	Write to cache only	Weak	Write-heavy, tolerate data loss
Refresh-Ahead	Check cache first	Background refresh	Eventual	Predictable access patterns
Read-Through	Cache or fetch	Write to DB only	Eventual	Simplify read logic

This RFC focuses on Look-Aside and Write-Through as they cover 90% of use cases.

Look-Aside (Cache-Aside) Pattern

Overview

Look-aside is the most common caching pattern:

Read: Check cache; if miss, fetch from DB, store in cache
Write: Update DB; optionally invalidate cache

Pros:

Simple to reason about
Cache failures don't affect writes
Flexible invalidation strategies

Cons:

Cache misses cause latency spikes
Potential for stale reads
Thundering herd on cold cache

Architecture Diagram

Protobuf Configuration

message LookAsideCacheConfig {
  // Cache backend (redis, in-memory)
  string cache_backend = 1;

  // Cache key prefix
  string key_prefix = 2;

  // TTL for cached entries (see ADR-031)
  int64 ttl_seconds = 3 [default = 300];

  // Invalidation strategy
  enum InvalidationStrategy {
    INVALIDATE_ON_WRITE = 0;  // Delete cache entry on write
    NO_INVALIDATION = 1;       // Rely on TTL expiration
    BACKGROUND_REFRESH = 2;    // Refresh cache asynchronously
  }
  InvalidationStrategy invalidation = 4;

  // Warmup strategy
  bool enable_warmup = 5;
  string warmup_query = 6;  // SQL query to pre-populate cache

  // Thundering herd prevention
  bool enable_locking = 7;   // Lock during cache fill to prevent duplicate fetches
  int64 lock_timeout_ms = 8 [default = 1000];
}

Namespace Configuration Example

namespaces:
  - name: user-profiles
    backend: postgres
    pattern: keyvalue
    cache:
      strategy: look_aside
      cache_backend: redis
      key_prefix: "users:"
      ttl_seconds: 300
      invalidation: INVALIDATE_ON_WRITE
      enable_locking: true

  - name: product-catalog
    backend: postgres
    pattern: keyvalue
    cache:
      strategy: look_aside
      cache_backend: redis
      key_prefix: "products:"
      ttl_seconds: 3600  # 1 hour
      invalidation: NO_INVALIDATION  # Read-only catalog
      enable_warmup: true
      warmup_query: "SELECT id, data FROM products WHERE active=true"

Rust Implementation

pub struct LookAsideCache {
    cache: RedisBackend,
    database: PostgresBackend,
    config: LookAsideCacheConfig,
}

impl LookAsideCache {
    pub async fn get(&self, key: &str) -> Result<Option<Bytes>> {
        let cache_key = format!("{}{}", self.config.key_prefix, key);

        // Step 1: Check cache
        if let Some(data) = self.cache.get(&cache_key).await? {
            metrics::increment_counter!("cache_hits", "namespace" => &self.config.namespace);
            return Ok(Some(data));
        }

        metrics::increment_counter!("cache_misses", "namespace" => &self.config.namespace);

        // Step 2: Thundering herd prevention
        if self.config.enable_locking {
            let lock_key = format!("{}:lock", cache_key);

            // Try to acquire lock
            if !self.cache.set_nx(&lock_key, b"1", Duration::from_millis(self.config.lock_timeout_ms)).await? {
                // Another request is fetching; wait and retry
                tokio::time::sleep(Duration::from_millis(50)).await;
                return self.get(key).await; // Retry (cache should be populated)
            }
        }

        // Step 3: Fetch from database
        let data = self.database.get(key).await?;

        // Step 4: Populate cache
        if let Some(ref data) = data {
            self.cache
                .set_ex(&cache_key, data, self.config.ttl_seconds as usize)
                .await?;
        }

        // Step 5: Release lock
        if self.config.enable_locking {
            let lock_key = format!("{}:lock", cache_key);
            self.cache.del(&lock_key).await?;
        }

        Ok(data)
    }

    pub async fn set(&self, key: &str, value: &[u8]) -> Result<()> {
        let cache_key = format!("{}{}", self.config.key_prefix, key);

        // Step 1: Write to database (source of truth)
        self.database.set(key, value).await?;

        // Step 2: Invalidate cache
        match self.config.invalidation {
            InvalidationStrategy::InvalidateOnWrite => {
                self.cache.del(&cache_key).await?;
            }
            InvalidationStrategy::NoInvalidation => {
                // Do nothing; rely on TTL
            }
            InvalidationStrategy::BackgroundRefresh => {
                // Trigger async refresh (not blocking write)
                let cache = self.cache.clone();
                let db = self.database.clone();
                let key = key.to_string();
                tokio::spawn(async move {
                    if let Ok(Some(data)) = db.get(&key).await {
                        let _ = cache.set_ex(&cache_key, &data, 300).await;
                    }
                });
            }
        }

        Ok(())
    }

    pub async fn warmup(&self) -> Result<usize> {
        if !self.config.enable_warmup || self.config.warmup_query.is_empty() {
            return Ok(0);
        }

        let rows = self.database.query(&self.config.warmup_query).await?;
        let mut count = 0;

        for row in rows {
            let key: String = row.get("id");
            let data: Vec<u8> = row.get("data");
            let cache_key = format!("{}{}", self.config.key_prefix, key);

            self.cache
                .set_ex(&cache_key, &data, self.config.ttl_seconds as usize)
                .await?;
            count += 1;
        }

        Ok(count)
    }
}

Write-Through Cache Pattern

Overview

Write-through ensures cache consistency by writing to both cache and database synchronously:

Read: Check cache; if miss, fetch from DB, store in cache
Write: Update cache AND database atomically

Pros:

Cache is always consistent with DB
No stale reads
Simpler consistency model

Cons:

Write latency (cache + DB)
Write failures affect both cache and DB
More complex error handling

Architecture Diagram

Protobuf Configuration

message WriteThroughCacheConfig {
  // Cache backend
  string cache_backend = 1;

  // Cache key prefix
  string key_prefix = 2;

  // TTL (optional; can be infinite for permanent config)
  optional int64 ttl_seconds = 3;

  // Write ordering
  enum WriteOrder {
    CACHE_THEN_DB = 0;  // Write cache first (faster, risk of inconsistency)
    DB_THEN_CACHE = 1;  // Write DB first (slower, safer)
  }
  WriteOrder write_order = 4 [default = DB_THEN_CACHE];

  // Rollback on failure
  bool enable_rollback = 5 [default = true];

  // Async write to cache (improves write latency)
  bool async_cache_write = 6 [default = false];
}

Namespace Configuration Example

namespaces:
  - name: application-config
    backend: postgres
    pattern: keyvalue
    cache:
      strategy: write_through
      cache_backend: redis
      key_prefix: "config:"
      ttl_seconds: null  # Infinite TTL (configuration data)
      write_order: DB_THEN_CACHE
      enable_rollback: true

  - name: user-settings
    backend: postgres
    pattern: keyvalue
    cache:
      strategy: write_through
      cache_backend: redis
      key_prefix: "settings:"
      ttl_seconds: 86400  # 24 hours
      write_order: DB_THEN_CACHE
      async_cache_write: false  # Synchronous for consistency

Rust Implementation

pub struct WriteThroughCache {
    cache: RedisBackend,
    database: PostgresBackend,
    config: WriteThroughCacheConfig,
}

impl WriteThroughCache {
    pub async fn get(&self, key: &str) -> Result<Option<Bytes>> {
        let cache_key = format!("{}{}", self.config.key_prefix, key);

        // Check cache first
        if let Some(data) = self.cache.get(&cache_key).await? {
            metrics::increment_counter!("cache_hits");
            return Ok(Some(data));
        }

        metrics::increment_counter!("cache_misses");

        // Fetch from database
        let data = self.database.get(key).await?;

        // Populate cache
        if let Some(ref data) = data {
            let ttl = self.config.ttl_seconds.unwrap_or(0);
            if ttl > 0 {
                self.cache.set_ex(&cache_key, data, ttl as usize).await?;
            } else {
                self.cache.set(&cache_key, data).await?;
            }
        }

        Ok(data)
    }

    pub async fn set(&self, key: &str, value: &[u8]) -> Result<()> {
        let cache_key = format!("{}{}", self.config.key_prefix, key);

        match self.config.write_order {
            WriteOrder::DbThenCache => {
                // Step 1: Write to database (source of truth)
                if let Err(e) = self.database.set(key, value).await {
                    // DB write failed; do NOT update cache
                    return Err(e);
                }

                // Step 2: Write to cache (DB succeeded)
                if self.config.async_cache_write {
                    // Async update (improves write latency)
                    let cache = self.cache.clone();
                    let cache_key = cache_key.clone();
                    let value = value.to_vec();
                    tokio::spawn(async move {
                        let _ = cache.set(&cache_key, &value).await;
                    });
                } else {
                    // Sync update (ensures consistency)
                    if let Err(e) = self.cache.set(&cache_key, value).await {
                        // Cache write failed; log but don't fail request
                        // (DB is source of truth)
                        warn!("Cache update failed: {}", e);
                    }
                }

                Ok(())
            }

            WriteOrder::CacheThenDb => {
                // Step 1: Write to cache (fast path)
                self.cache.set(&cache_key, value).await?;

                // Step 2: Write to database
                if let Err(e) = self.database.set(key, value).await {
                    // DB write failed; rollback cache if enabled
                    if self.config.enable_rollback {
                        let _ = self.cache.del(&cache_key).await;
                    }
                    return Err(e);
                }

                Ok(())
            }
        }
    }
}

Use Case: Table Reader with Look-Aside Cache

Scenario

Frequently accessed reference table (e.g., countries, categories, product_catalog) that rarely changes.

Requirements:

Low read latency (< 5ms P99)
Tolerate stale reads up to 1 hour
Handle 10,000 RPS peak load

Configuration

namespaces:
  - name: product-catalog
    backend: postgres
    pattern: keyvalue
    cache:
      strategy: look_aside
      cache_backend: redis
      key_prefix: "catalog:"
      ttl_seconds: 3600  # 1 hour
      invalidation: NO_INVALIDATION  # Read-only data
      enable_warmup: true
      warmup_query: |
        SELECT id::text,
               row_to_json(products)::text as data
        FROM products
        WHERE active = true

Client Usage

from prism_sdk import PrismClient

client = PrismClient(namespace="product-catalog")

# Read (cache hit: ~2ms, cache miss: ~25ms)
product = client.get("product:12345")

# Warmup cache (run on deployment)
client.warmup()

Performance

Metric	Look-Aside Cache	Direct DB Query
P50 Latency	1.5ms	15ms
P99 Latency	3.2ms	35ms
Cache Hit Rate	95%	N/A
DB Load	500 QPS	10,000 QPS

Use Case: Object Storage Metadata with Write-Through Cache

Scenario

Object metadata (size, content-type, ETag, last-modified) accessed on every file operation but infrequently updated.

Requirements:

Metadata always consistent with object storage
Low read latency (< 3ms P99)
Handle 5,000 metadata queries/sec

Configuration

namespaces:
  - name: object-metadata
    backend: postgres  # Metadata in PostgreSQL
    pattern: keyvalue
    cache:
      strategy: write_through
      cache_backend: redis
      key_prefix: "obj_meta:"
      ttl_seconds: 86400  # 24 hours
      write_order: DB_THEN_CACHE
      enable_rollback: true

Client Usage

client = PrismClient(namespace="object-metadata")

# Write metadata (updates cache + DB atomically)
client.set("bucket/file.jpg", {
    "size_bytes": 1024000,
    "content_type": "image/jpeg",
    "etag": "abc123",
    "last_modified": 1696780800,
})

# Read metadata (from cache: ~1ms)
metadata = client.get("bucket/file.jpg")
print(f"File size: {metadata['size_bytes']} bytes")

Monitoring and Observability

Cache Metrics

message CacheMetrics {
  string namespace = 1;
  string strategy = 2;  // "look_aside", "write_through"

  // Hit/Miss rates
  int64 cache_hits = 3;
  int64 cache_misses = 4;
  float hit_rate = 5;  // cache_hits / (cache_hits + cache_misses)

  // Latency
  float read_latency_p50_ms = 6;
  float read_latency_p99_ms = 7;
  float write_latency_p50_ms = 8;
  float write_latency_p99_ms = 9;

  // Cache operations
  int64 cache_evictions = 10;
  int64 cache_invalidations = 11;
  int64 warmup_count = 12;

  // Consistency
  int64 write_failures = 13;
  int64 rollback_count = 14;
}

Prometheus Metrics

# Cache hit rate
prism_cache_hits_total{namespace="product-catalog", strategy="look_aside"}
prism_cache_misses_total{namespace="product-catalog", strategy="look_aside"}

# Latency histograms
prism_cache_read_duration_seconds{namespace="product-catalog", quantile="0.5"}
prism_cache_read_duration_seconds{namespace="product-catalog", quantile="0.99"}

# Cache size
prism_cache_items_total{namespace="product-catalog"}
prism_cache_bytes_total{namespace="product-catalog"}

Grafana Dashboard Queries

# Cache hit rate
rate(prism_cache_hits_total[5m]) /
(rate(prism_cache_hits_total[5m]) + rate(prism_cache_misses_total[5m]))

# P99 read latency
histogram_quantile(0.99, rate(prism_cache_read_duration_seconds_bucket[5m]))

# Database load reduction
rate(prism_database_queries_total[5m]) vs. rate(prism_cache_misses_total[5m])

Cache Invalidation Strategies

Invalidation Comparison

Strategy	Consistency	Latency	Use Case
TTL Expiration	Eventual	Low	Read-only or rarely updated data
On-Write Invalidate	Strong	Medium	Frequent writes, require fresh
Background Refresh	Eventual	Low	Predictable updates (e.g., nightly)
Manual Invalidate	Strong	Low	Admin-triggered cache clear

Manual Invalidation via Admin CLI

# Invalidate specific cache entry
prism cache invalidate product-catalog --key "product:12345"

# Invalidate by prefix
prism cache invalidate product-catalog --prefix "category:"

# Flush entire namespace cache
prism cache flush product-catalog

# Trigger cache warmup
prism cache warmup product-catalog

Migration Path

Phase 1: Look-Aside Implementation (Week 1-2)

Redis Integration: Implement cache backend (ADR-010)
LookAsideCache: Rust implementation with thundering herd prevention
Namespace Config: Add cache configuration to namespace schema
Metrics: Cache hit rate, latency, evictions

Deliverable: Look-aside cache for KeyValue pattern

Phase 2: Write-Through Implementation (Week 3-4)

WriteThroughCache: Rust implementation with rollback
Configuration: Add write_order, rollback options
Integration Tests: Consistency validation tests
Documentation: Cache strategy selection guide

Deliverable: Write-through cache with consistency guarantees

Phase 3: Advanced Features (Week 5-6)

Cache Warmup: Background warmup on startup
Background Refresh: Async cache refresh for long-lived data
Admin CLI: Cache management commands
Monitoring: Grafana dashboards for cache observability

Deliverable: Production-ready caching with operational tools

Phase 4: Additional Patterns (Future)

Write-Back Cache: For write-heavy workloads
Refresh-Ahead: Predictive cache refresh
Multi-Level Cache: Local + distributed cache tiers
Cache Replication: Geo-distributed cache

Security Considerations

Cache Poisoning: Validate data before caching
PII in Cache: Apply encryption for sensitive data (see ADR-031)
Cache Isolation: Namespace-level cache isolation
TTL Enforcement: Prevent unbounded cache growth
Access Control: Cache operations require namespace permissions

Performance Targets

Pattern	Operation	P50 Latency	P99 Latency	Throughput
Look-Aside	Read (hit)	< 2ms	< 5ms	50k RPS
Look-Aside	Read (miss)	< 20ms	< 50ms	5k RPS
Look-Aside	Write	< 15ms	< 40ms	2k RPS
Write-Through	Read (hit)	< 2ms	< 5ms	50k RPS
Write-Through	Write	< 25ms	< 60ms	1k RPS

RFC-004: Redis Integration (cache backend)
RFC-005: ClickHouse Integration (aggregated cache)
ADR-031: TTL Defaults (cache expiration)
ADR-032: Object Storage Pattern (metadata caching)

References

Appendix: Cache Strategy Decision Tree

What's your access pattern? ├─ Read-heavy (90%+ reads) │ ├─ Can tolerate stale reads? → Look-Aside │ └─ Need fresh reads? → Write-Through └─ Write-heavy (50%+ writes) ├─ Can tolerate data loss? → Write-Back └─ Need durability? → Write-Through

What's your consistency requirement? ├─ Eventual consistency OK → Look-Aside + TTL ├─ Strong consistency → Write-Through └─ Real-time consistency → Write-Through + short TTL

What's your data update frequency? ├─ Rarely (hourly+) → Look-Aside + long TTL + warmup ├─ Occasionally (minutes) → Look-Aside + short TTL └─ Frequently (seconds) → Write-Through

---

**Status**: Draft
**Next Steps**:
1. Implement LookAsideCache in Rust proxy
2. Add cache configuration to namespace schema
3. Implement WriteThroughCache with rollback
4. Add cache metrics to monitoring
5. Document cache strategy best practices

Abstract​

Motivation​

Why Cache Strategies Matter​

Real-World Scenarios​

Goals​

Non-Goals​

Cache Strategies Overview​

Strategy Comparison​

Look-Aside (Cache-Aside) Pattern​

Overview​

Architecture Diagram​

Protobuf Configuration​

Namespace Configuration Example​

Rust Implementation​

Write-Through Cache Pattern​

Overview​

Architecture Diagram​

Protobuf Configuration​

Namespace Configuration Example​

Rust Implementation​

Use Case: Table Reader with Look-Aside Cache​

Scenario​

Configuration​

Client Usage​

Performance​

Use Case: Object Storage Metadata with Write-Through Cache​

Scenario​

Configuration​

Client Usage​

Monitoring and Observability​

Cache Metrics​

Prometheus Metrics​

Grafana Dashboard Queries​

Cache Invalidation Strategies​

Invalidation Comparison​

Manual Invalidation via Admin CLI​

Migration Path​

Phase 1: Look-Aside Implementation (Week 1-2)​

Phase 2: Write-Through Implementation (Week 3-4)​

Phase 3: Advanced Features (Week 5-6)​

Phase 4: Additional Patterns (Future)​

Security Considerations​

Performance Targets​

Related RFCs and ADRs​

References​

Appendix: Cache Strategy Decision Tree​

Abstract

Motivation

Why Cache Strategies Matter

Real-World Scenarios

Goals

Non-Goals

Cache Strategies Overview

Strategy Comparison

Look-Aside (Cache-Aside) Pattern

Overview

Architecture Diagram

Protobuf Configuration

Namespace Configuration Example

Rust Implementation

Write-Through Cache Pattern

Overview

Architecture Diagram

Protobuf Configuration

Namespace Configuration Example

Rust Implementation

Use Case: Table Reader with Look-Aside Cache

Scenario

Configuration

Client Usage

Performance

Use Case: Object Storage Metadata with Write-Through Cache

Scenario

Configuration

Client Usage

Monitoring and Observability

Cache Metrics

Prometheus Metrics

Grafana Dashboard Queries

Cache Invalidation Strategies

Invalidation Comparison

Manual Invalidation via Admin CLI

Migration Path

Phase 1: Look-Aside Implementation (Week 1-2)

Phase 2: Write-Through Implementation (Week 3-4)

Phase 3: Advanced Features (Week 5-6)

Phase 4: Additional Patterns (Future)

Security Considerations

Performance Targets

Related RFCs and ADRs

References

Appendix: Cache Strategy Decision Tree