MEMO-073: Week 13 - Storage Backend Evaluation for Massive-Scale Graphs
Date: 2025-11-16 Updated: 2025-11-16 Author: Platform Team Related: MEMO-052, RFC-057, RFC-058, RFC-059
Executive Summary
Goal: Evaluate storage backend options for 100B vertex graph system
Scope: 8 storage backends ranked by implementability for graph workloads
Findings:
- Best for graphs: Neptune (native graph), TigerGraph (native graph)
- Best for scale: S3/MinIO (cold storage), ClickHouse (time-series)
- Most practical: PostgreSQL with JSONB (relational), Redis (in-memory)
- Implementability winner: Redis (rank #1, score 95/100)
- Cost winner: S3/MinIO (cold storage tier)
Recommendation: Hybrid approach - Redis (hot tier) + S3 (cold tier) + PostgreSQL (metadata) as validated by RFC-059.
Methodology
Evaluation Criteria
Implementability Score (0-100):
- Go SDK Quality (30 points): Official SDK, community support, documentation
- Data Model Fit (30 points): How naturally backend supports graph operations
- Testing Difficulty (20 points): Local testing, Docker support, test data generation
- Operational Complexity (20 points): Deployment, monitoring, scaling
Data Models Supported
For graph workloads, backends must support:
- Vertices: Key-value or document storage
- Edges: Adjacency lists or edge tables
- Properties: Nested attributes on vertices/edges
- Indexes: Property lookups, traversal optimization
- Partitioning: Distribute across nodes
Findings
Backend Ranking Summary
| Rank | Backend | Score | Go SDK | Data Model | Testing | Best For |
|---|---|---|---|---|---|---|
| 1 | Redis | 95/100 | ✅ Excellent | ✅ Graph-friendly | ✅ Easy | Hot tier caching |
| 2 | PostgreSQL | 90/100 | ✅ Excellent | ✅ Good (JSONB) | ✅ Easy | Metadata, indexes |
| 3 | SQLite | 85/100 | ✅ Good | ✅ Good (JSON) | ✅ Trivial | Dev/testing |
| 4 | S3/MinIO | 80/100 | ✅ Good | ⚠️ Snapshot only | ✅ Easy | Cold storage |
| 5 | ClickHouse | 75/100 | ✅ Good | ⚠️ Time-series | ⚠️ Moderate | Analytics |
| 6 | Kafka | 70/100 | ✅ Good | ⚠️ Event stream | ⚠️ Moderate | Event sourcing |
| 7 | NATS | 65/100 | ✅ Good | ⚠️ Messaging | ⚠️ Moderate | Pub/sub |
| 8 | Neptune | 50/100 | ❌ None (HTTP) | ✅ Native graph | ❌ Hard | AWS-only graphs |
Key Insight: Native graph databases (Neptune, TigerGraph) score lowest on implementability despite best data model fit, due to lack of Go SDK and testing complexity.
Detailed Backend Evaluation
Rank #1: Redis (Score: 95/100) ✅
Overview
Type: In-memory key-value store with data structures
Best For: Hot tier vertex/edge caching, real-time access patterns
Used In: RFC-057 (hot tier), RFC-059 (10% hot data)
Go SDK Quality (30/30) ✅
// Official: github.com/redis/go-redis/v9
import "github.com/redis/go-redis/v9"
client := redis.NewClient(&redis.Options{
Addr: "localhost:6379",
DB: 0,
})
// Excellent API, strong typing, context support
ctx := context.Background()
err := client.Set(ctx, "vertex:123", vertexJSON, 0).Err()
Assessment:
- ✅ Official Go SDK maintained by Redis
- ✅ Excellent documentation with examples
- ✅ Strong community (19k+ GitHub stars)
- ✅ Context-aware, idiomatic Go
- ✅ Pipelining, transactions, pub/sub support
Data Model Fit (30/30) ✅
Vertex Storage:
// Option 1: Hash (structured)
client.HSet(ctx, "vertex:user:123", map[string]interface{}{
"id": "123",
"name": "Alice",
"age": 30,
"country": "USA",
})
// Option 2: JSON (Redis Stack)
client.JSONSet(ctx, "vertex:user:123", "$", vertexStruct)
Edge Storage (Adjacency Lists):
// Sorted set for edges (score = timestamp or weight)
client.ZAdd(ctx, "edges:user:123:friends", redis.Z{
Score: float64(time.Now().Unix()),
Member: "user:456",
})
// Retrieve friends
friends := client.ZRange(ctx, "edges:user:123:friends", 0, -1)
Indexes:
// Secondary indexes via sets
client.SAdd(ctx, "idx:country:USA", "user:123", "user:456")
// Retrieve all users in USA
usersInUSA := client.SMembers(ctx, "idx:country:USA")
Assessment:
- ✅ Native support for adjacency lists (sorted sets)
- ✅ Efficient property indexes (sets)
- ✅ JSON support via Redis Stack module
- ✅ Atomic operations for consistency
- ⚠️ No native graph traversal (implement in application)
Testing Difficulty (20/20) ✅
Local Testing:
# Podman/Docker
podman run -d --name redis -p 6379:6379 redis:7-alpine
# Or: redis-server (native install)
brew install redis
redis-server
Go Test Integration:
func TestRedisVertex(t *testing.T) {
// Use testcontainers-go for isolated tests
ctx := context.Background()
redisC, _ := testcontainers.GenericContainer(ctx, testcontainers.GenericContainerRequest{
ContainerRequest: testcontainers.ContainerRequest{
Image: "redis:7-alpine",
ExposedPorts: []string{"6379/tcp"},
},
Started: true,
})
defer redisC.Terminate(ctx)
// Connect and test
endpoint, _ := redisC.Endpoint(ctx, "")
client := redis.NewClient(&redis.Options{Addr: endpoint})
// ... test code
}
Assessment:
- ✅ Single-binary, no dependencies
- ✅ Instant startup (<1 second)
- ✅ Excellent testcontainers-go support
- ✅ In-memory = fast tests
- ✅ No schema migrations needed
Operational Complexity (15/20) ✅
Deployment:
- ✅ Simple single-node deployment; Redis Cluster for multi-node setups
- ✅ Excellent Kubernetes operators (Redis Enterprise, Bitnami)
- ⚠️ Persistence requires RDB/AOF configuration
- ⚠️ Memory management (eviction policies)
Monitoring:
- ✅ Built-in INFO command exposes all metrics
- ✅ Prometheus exporter available
- ✅ Grafana dashboards
Scaling:
- ✅ Horizontal: Redis Cluster (sharding)
- ✅ Vertical: Add memory
- ⚠️ Rebalancing requires cluster resharding
Assessment: Mature operational tooling, memory constraints require planning.
Overall Assessment
Strengths:
- ✅ Best Go SDK of all backends
- ✅ Perfect data model for hot tier graphs
- ✅ Trivial local testing
- ✅ Sub-millisecond latency
- ✅ Battle-tested at scale (Twitter, GitHub, StackOverflow)
Weaknesses:
- ⚠️ Memory-bound (expensive at 100B scale)
- ⚠️ No native graph traversal (application-level)
- ⚠️ Persistence trade-offs (RDB snapshots vs AOF overhead)
Use Case: ✅ Ideal for hot tier (10% of data) as validated by RFC-059
Rank #2: PostgreSQL (Score: 90/100) ✅
Overview
Type: Relational database with JSONB support
Best For: Metadata, indexes, small-to-medium graphs
Used In: RFC-058 (index storage), potential for partition metadata
Go SDK Quality (30/30) ✅
// Recommended: github.com/jackc/pgx/v5 (github.com/lib/pq is in maintenance mode)
import "github.com/jackc/pgx/v5/pgxpool"
pool, _ := pgxpool.New(ctx, "postgres://user:pass@localhost:5432/graphdb")
// Excellent query builder, prepared statements
var vertex Vertex
err := pool.QueryRow(ctx,
"SELECT id, properties FROM vertices WHERE id = $1",
vertexID,
).Scan(&vertex.ID, &vertex.Properties)
Assessment:
- ✅ Multiple excellent Go drivers (lib/pq, pgx)
- ✅ Strong typing, connection pooling
- ✅ Excellent documentation
- ✅ Native support for JSON/JSONB
- ✅ Prepared statements, batch operations
Data Model Fit (25/30) ✅
Schema Design:
-- Vertices table
CREATE TABLE vertices (
id BIGINT PRIMARY KEY,
label TEXT NOT NULL,
properties JSONB NOT NULL,
created_at TIMESTAMPTZ DEFAULT NOW()
);
-- Edges table (adjacency list)
CREATE TABLE edges (
src_id BIGINT NOT NULL,
dst_id BIGINT NOT NULL,
label TEXT NOT NULL,
properties JSONB,
PRIMARY KEY (src_id, dst_id, label)
);
-- Indexes for traversal
CREATE INDEX idx_edges_src ON edges(src_id);
CREATE INDEX idx_edges_dst ON edges(dst_id);
CREATE INDEX idx_vertices_props ON vertices USING GIN(properties);
Graph Operations:
-- Find friends (1-hop)
SELECT v.* FROM vertices v
JOIN edges e ON e.dst_id = v.id
WHERE e.src_id = 123 AND e.label = 'friend';
-- Property filter
SELECT * FROM vertices
WHERE properties @> '{"country": "USA"}';
-- 2-hop traversal (CTE)
WITH RECURSIVE friends AS (
SELECT dst_id, 1 as depth FROM edges WHERE src_id = 123
UNION
SELECT e.dst_id, f.depth + 1
FROM edges e
JOIN friends f ON e.src_id = f.dst_id
WHERE f.depth < 2
)
SELECT v.* FROM vertices v JOIN friends f ON v.id = f.dst_id;
Assessment:
- ✅ JSONB excellent for flexible properties
- ✅ GIN indexes for JSONB queries
- ✅ Recursive CTEs for traversals (up to ~3 hops practical)
- ⚠️ Deep traversals (4+ hops) become expensive
- ⚠️ No native graph algorithms
Testing Difficulty (20/20) ✅
Local Testing:
# Podman
podman run -d --name postgres \
-e POSTGRES_PASSWORD=secret \
-p 5432:5432 \
postgres:16-alpine
Test Helpers:
func TestPostgresGraph(t *testing.T) {
// Use testcontainers-go
ctx := context.Background()
pgC, _ := testcontainers.GenericContainer(ctx, testcontainers.GenericContainerRequest{
ContainerRequest: testcontainers.ContainerRequest{
Image: "postgres:16-alpine",
Env: map[string]string{"POSTGRES_PASSWORD": "secret"},
ExposedPorts: []string{"5432/tcp"},
WaitingFor: wait.ForLog("database system is ready to accept connections").WithOccurrence(2),
},
Started: true,
})
defer pgC.Terminate(ctx)
// Run migrations, seed test data
// ... test code
}
Assessment:
- ✅ Excellent testcontainers-go support
- ✅ Fast startup (~3 seconds)
- ✅ Schema migrations via goose/migrate
- ✅ Test data generation straightforward
Operational Complexity (15/20) ✅
Deployment:
- ✅ Mature Kubernetes operators (Crunchy, Zalando)
- ✅ Excellent backup/restore (pg_dump, WAL archiving)
- ✅ Streaming replication
Monitoring:
- ✅ pg_stat_* views expose all metrics
- ✅ Excellent Prometheus exporters
- ✅ Deep observability (query plans, slow logs)
Scaling:
- ✅ Vertical: Add CPU/memory/storage
- ⚠️ Horizontal: Requires sharding (Citus, manual)
- ⚠️ Large tables (>100M rows) need partitioning
Assessment: Excellent operational maturity, horizontal scaling requires extensions.
Overall Assessment
Strengths:
- ✅ Excellent Go SDK (pgx)
- ✅ JSONB perfect for flexible properties
- ✅ Recursive CTEs for limited traversals
- ✅ Trivial local testing
- ✅ Nearly four decades of operational knowledge
Weaknesses: