Cluster Administration

This document describes the Prism admin cluster configuration and operational controls. The admin cluster is the control plane that manages all namespace assignments, backend registrations, and cluster-wide policies.

Control Plane

The Prism admin cluster is a separate control plane from the data plane (proxies and pattern runners). It uses Raft consensus for distributed state management.

Architecture

Components

┌─────────────────────────────────────────────────────────────┐
│                     Admin Cluster (Raft)                     │
│  ┌──────────────┐   ┌──────────────┐   ┌──────────────┐     │
│  │  Admin Node  │   │  Admin Node  │   │  Admin Node  │     │
│  │   (Leader)   │   │  (Follower)  │   │  (Follower)  │     │
│  └──────┬───────┘   └──────┬───────┘   └──────┬───────┘     │
│         │                  │                  │             │
│         └──────────────────┼──────────────────┘             │
│                            │                                │
│                    SQLite FSM State                         │
│                 (Replicated via Raft)                       │
└─────────────────────────────────────────────────────────────┘
                             │
                             │ gRPC Control Plane
                             │
            ┌────────────────┼────────────────┐
            │                │                │
            ▼                ▼                ▼
      ┌──────────┐     ┌──────────┐     ┌──────────┐
      │  Proxy   │     │ Launcher │     │ Launcher │
      │   Node   │     │   Node   │     │   Node   │
      └──────────┘     └──────────┘     └──────────┘

Raft Consensus

The admin cluster uses Raft for distributed consensus:

  • Leader election: Automatic leader election on startup or failure
  • Log replication: All state changes replicated to followers
  • Strong consistency: Writes only committed when majority agrees
  • Fault tolerance: Tolerates (N-1)/2 failures in N-node cluster

  • Minimum cluster size: 3 nodes (tolerates 1 failure)
  • Recommended: 5 nodes (tolerates 2 failures)
  • Maximum practical: 7 nodes (diminishing returns, higher latency)
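
The sizing guidance follows directly from the quorum arithmetic. A small illustrative snippet (not Prism code) that computes quorum size and fault tolerance for the cluster sizes above:

package main

import "fmt"

// quorum is the number of nodes that must agree for a write to commit.
func quorum(n int) int { return n/2 + 1 }

// faultTolerance is how many node failures an n-node cluster survives.
func faultTolerance(n int) int { return (n - 1) / 2 }

func main() {
    for _, n := range []int{3, 5, 7} {
        fmt.Printf("%d nodes: quorum=%d, tolerates %d failure(s)\n",
            n, quorum(n), faultTolerance(n))
    }
}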

Admin Cluster Configuration

Node Configuration

Each admin node requires:

# Admin node configuration
admin:
  node_id: admin-01
  bind_address: 0.0.0.0:7070
  advertise_address: admin-01.prism.local:7070

# Raft configuration
raft:
  data_dir: /var/lib/prism/admin/raft
  snapshot_interval: 10m
  snapshot_threshold: 10000   # Snapshot after 10k operations

  # Cluster peers
  peers:
    - id: admin-01
      address: admin-01.prism.local:7070
    - id: admin-02
      address: admin-02.prism.local:7070
    - id: admin-03
      address: admin-03.prism.local:7070

  # Election timeout: 150-300ms (randomized)
  election_timeout_min: 150ms
  election_timeout_max: 300ms

  # Heartbeat interval: 50ms
  heartbeat_timeout: 50ms

  # Replication timeout: 1s
  replication_timeout: 1s

# SQLite FSM storage
storage:
  sqlite_path: /var/lib/prism/admin/prism-admin.db
  wal_mode: true
  journal_mode: WAL
  synchronous: FULL   # Strong durability
  cache_size: 10000   # Pages

# Control plane gRPC
control_plane:
  bind_address: 0.0.0.0:9000
  tls:
    enabled: true
    cert_path: /etc/prism/certs/admin.pem
    key_path: /etc/prism/certs/admin-key.pem
    ca_cert_path: /etc/prism/certs/ca.pem
    client_auth_required: true

# Observability
observability:
  metrics_port: 9090
  health_port: 8080
  log_level: info
  tracing_enabled: true

# Resource limits
resources:
  max_namespaces: 10000
  max_proxies: 100
  max_launchers: 200
  max_backends: 500

Cluster Bootstrap

Initial Cluster Setup

# 1. Start first node (bootstrap mode)
./prism-admin start --config admin-01.yaml --bootstrap

# 2. Start second node (joins cluster)
./prism-admin start --config admin-02.yaml --join admin-01.prism.local:7070

# 3. Start third node (joins cluster)
./prism-admin start --config admin-03.yaml --join admin-01.prism.local:7070

# 4. Verify cluster
./prism-admin cluster status

Adding Nodes

# Add new node to existing cluster
./prism-admin cluster add-node --node-id admin-04 --address admin-04.prism.local:7070

# Node joins
./prism-admin start --config admin-04.yaml --join admin-01.prism.local:7070

Removing Nodes

# Gracefully remove node
./prism-admin cluster remove-node --node-id admin-02

# Force remove (if node dead)
./prism-admin cluster remove-node --node-id admin-02 --force

FSM State Management

State Schema

The admin cluster maintains state in SQLite:

-- Namespaces
CREATE TABLE namespaces (
    namespace TEXT PRIMARY KEY,
    team TEXT NOT NULL,
    pattern TEXT NOT NULL,
    config_json TEXT NOT NULL,      -- Serialized NamespaceConfig
    partition_id INTEGER NOT NULL,
    assigned_proxy TEXT,
    status TEXT NOT NULL,           -- pending | active | draining | deleted
    created_by TEXT NOT NULL,
    created_at INTEGER NOT NULL,
    updated_at INTEGER NOT NULL,
    version INTEGER NOT NULL DEFAULT 1
);

-- Proxies
CREATE TABLE proxies (
    proxy_id TEXT PRIMARY KEY,
    address TEXT NOT NULL,
    region TEXT NOT NULL,
    version TEXT NOT NULL,
    capabilities TEXT NOT NULL,     -- JSON array
    status TEXT NOT NULL,           -- registered | active | draining | failed
    last_heartbeat_at INTEGER,
    registered_at INTEGER NOT NULL,
    metadata_json TEXT
);

-- Launchers
CREATE TABLE launchers (
    launcher_id TEXT PRIMARY KEY,
    address TEXT NOT NULL,
    region TEXT NOT NULL,
    version TEXT NOT NULL,
    max_processes INTEGER NOT NULL,
    status TEXT NOT NULL,           -- registered | active | draining | failed
    last_heartbeat_at INTEGER,
    registered_at INTEGER NOT NULL,
    metadata_json TEXT
);

-- Process Assignments
CREATE TABLE process_assignments (
    process_id TEXT PRIMARY KEY,
    launcher_id TEXT NOT NULL,
    process_type TEXT NOT NULL,     -- pattern | proxy | backend | utility
    namespace TEXT,
    config_json TEXT NOT NULL,
    status TEXT NOT NULL,           -- assigned | starting | running | stopping | stopped | failed
    assigned_at INTEGER NOT NULL,
    started_at INTEGER,
    stopped_at INTEGER,
    version INTEGER NOT NULL DEFAULT 1,
    FOREIGN KEY (launcher_id) REFERENCES launchers(launcher_id)
);

-- Backends
CREATE TABLE backends (
    backend_id TEXT PRIMARY KEY,
    backend_type TEXT NOT NULL,
    config_json TEXT NOT NULL,      -- Serialized BackendRegistration
    status TEXT NOT NULL,           -- registered | healthy | degraded | unhealthy | maintenance
    registered_at INTEGER NOT NULL,
    updated_at INTEGER NOT NULL
);

-- Teams
CREATE TABLE teams (
    team_id TEXT PRIMARY KEY,
    display_name TEXT NOT NULL,
    permission_level TEXT NOT NULL, -- guided | advanced | expert
    quotas_json TEXT NOT NULL,
    contact_json TEXT,
    created_at INTEGER NOT NULL,
    updated_at INTEGER NOT NULL
);

-- Audit Log
CREATE TABLE audit_log (
    log_id INTEGER PRIMARY KEY AUTOINCREMENT,
    timestamp INTEGER NOT NULL,
    operation TEXT NOT NULL,        -- create_namespace | assign_namespace | register_proxy, etc.
    resource_type TEXT NOT NULL,
    resource_id TEXT NOT NULL,
    principal TEXT NOT NULL,        -- Who performed the operation
    details_json TEXT,
    success BOOLEAN NOT NULL
);

FSM Operations

All state changes go through the Raft FSM:

import (
    "encoding/json"
    "fmt"

    "github.com/hashicorp/raft" // assumed Raft library, based on the raft.Log type below
)

// Raft command types
type Command struct {
    Type      string          // create_namespace | assign_namespace | etc.
    Payload   json.RawMessage // Command-specific data
    Principal string          // Who issued the command
    Timestamp int64           // When it was issued
}

// FSM Apply: invoked on every node once a log entry is committed
func (fsm *AdminFSM) Apply(log *raft.Log) interface{} {
    var cmd Command
    if err := json.Unmarshal(log.Data, &cmd); err != nil {
        return fmt.Errorf("malformed command: %w", err)
    }

    switch cmd.Type {
    case "create_namespace":
        return fsm.applyCreateNamespace(cmd)
    case "assign_namespace":
        return fsm.applyAssignNamespace(cmd)
    case "register_proxy":
        return fsm.applyRegisterProxy(cmd)
    case "proxy_heartbeat":
        return fsm.applyProxyHeartbeat(cmd)
    // ... other commands
    default:
        return fmt.Errorf("unknown command type: %q", cmd.Type)
    }
}
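
On the leader, a state change is proposed by serializing a Command and handing it to Raft; the FSM's return value is surfaced back to the caller once a quorum commits the entry. A minimal sketch of that submission path, assuming hashicorp/raft and a hypothetical AdminServer with raft and applyTimeout fields:

// Propose replicates a command through Raft and returns the FSM result.
func (s *AdminServer) Propose(cmd Command) (interface{}, error) {
    data, err := json.Marshal(cmd)
    if err != nil {
        return nil, err
    }
    // Apply succeeds only after a majority of nodes commit the entry.
    future := s.raft.Apply(data, s.applyTimeout)
    if err := future.Error(); err != nil {
        return nil, err // not leader, timeout, or replication failure
    }
    return future.Response(), nil // value returned by AdminFSM.Apply
}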

Namespace Management

Namespace Lifecycle

┌─────────┐   CreateNamespace   ┌─────────┐
│ pending │ ──────────────────> │ active  │
└─────────┘                     └────┬────┘
                                     │ DrainNamespace
                                     ▼
                                ┌─────────┐
                                │draining │
                                └────┬────┘
                                     │ DeleteNamespace
                                     ▼
                                ┌─────────┐
                                │ deleted │
                                └─────────┘
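
The lifecycle is strictly forward-only: a namespace never returns from draining to active. A minimal sketch of how the FSM might guard status changes (the transition table is taken from the diagram; the enforcement code itself is an assumption):

// validTransitions encodes the lifecycle diagram above.
var validTransitions = map[string][]string{
    "pending":  {"active"},
    "active":   {"draining"},
    "draining": {"deleted"},
}

// canTransition reports whether a status change is legal.
func canTransition(from, to string) bool {
    for _, next := range validTransitions[from] {
        if next == to {
            return true
        }
    }
    return false
}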

Create Namespace

command:
  type: create_namespace
  payload:
    namespace: order-processing
    team: payments-team
    requesting_proxy: proxy-01
    config:
      # Full NamespaceConfig
      backends:
        storage:
          backend_type: postgres
          connection_string: postgres-primary.prism.local:5432
      patterns:
        queue:
          pattern_name: queue
          settings:
            durability: strong
  principal: user@example.com

# FSM processing:
1. Validate namespace doesn't exist
2. Validate team quota not exceeded
3. Validate team has permission for config
4. Select partition (hash-based or load-balanced; see the sketch below)
5. Select proxy for partition
6. Insert into namespaces table
7. Return CreateNamespaceResponse
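
For the hash-based option in step 4, a deterministic hash of the namespace name onto the partition space is enough. A minimal sketch; the partition count of 256 is an assumption (the partition_ranges example later shows a proxy owning 0-127):

import "hash/fnv"

const numPartitions = 256 // assumed; not fixed by this document

// partitionFor maps a namespace deterministically onto a partition.
func partitionFor(namespace string) int {
    h := fnv.New32a()
    h.Write([]byte(namespace))
    return int(h.Sum32() % numPartitions)
}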

Assign Namespace to Proxy

command:
  type: assign_namespace
  payload:
    namespace: order-processing
    partition_id: 42
    proxy_id: proxy-01
    config:
      # Same as create_namespace
  principal: admin-cluster

# Proxy receives via control plane:
message: NamespaceAssignment
namespace: order-processing
partition_id: 42
config: { ... }
version: 1

# Proxy acknowledges:
message: NamespaceAssignmentAck
success: true
message: "Namespace assigned successfully"

Drain Namespace

command:
  type: drain_namespace
  payload:
    namespace: order-processing
    graceful_timeout: 30s
  principal: user@example.com

# FSM processing:
1. Set namespace status to 'draining'
2. Send RevokeNamespace to assigned proxy
3. Wait for proxy to finish active sessions (up to the timeout; sketched below)
4. Proxy sends RevokeNamespaceAck
5. Update namespace status to 'deleted'
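
A sketch of the drain flow end to end, using a context to bound the graceful wait; setStatus, sendRevoke, and the ack channel are hypothetical helpers:

// drainNamespace implements the flow above: mark draining, revoke, wait for
// the ack up to the graceful timeout, then mark deleted.
func (s *AdminServer) drainNamespace(ctx context.Context, ns string, graceful time.Duration) error {
    if err := s.setStatus(ns, "draining"); err != nil {
        return err
    }
    ackCh, err := s.sendRevoke(ns) // RevokeNamespace to the assigned proxy
    if err != nil {
        return err
    }
    ctx, cancel := context.WithTimeout(ctx, graceful)
    defer cancel()
    select {
    case <-ackCh: // RevokeNamespaceAck received
        return s.setStatus(ns, "deleted")
    case <-ctx.Done():
        return fmt.Errorf("drain of %s timed out: %w", ns, ctx.Err())
    }
}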

Proxy Management

Proxy Registration

# Proxy sends on startup
message: ProxyRegistration
proxy_id: proxy-01
address: proxy-01.prism.local:8980
region: us-east-1
version: 0.1.0
capabilities: [keyvalue, pubsub, queue]
metadata:
  az: us-east-1a
  instance_type: c5.2xlarge

# Admin responds
message: ProxyRegistrationAck
success: true
message: "Registered successfully"
initial_namespaces:
  - namespace: order-processing
    partition_id: 42
    config: { ... }
partition_ranges:
  - start: 0
    end: 127

Proxy Heartbeat

# Proxy sends every 30s
message: ProxyHeartbeat
proxy_id: proxy-01
namespace_health:
  order-processing:
    active_sessions: 245
    requests_per_second: 1250
    status: healthy
resources:
  cpu_percent: 45.2
  memory_mb: 2048
  goroutine_count: 512
  uptime_seconds: 86400
timestamp: 1700000000

# Admin responds
message: HeartbeatAck
success: true
server_timestamp: 1700000001

Proxy Failover

When a proxy fails (misses three consecutive heartbeats):

# Admin detects failure
proxy_status: failed
last_heartbeat: 90 seconds ago (threshold: 90s = 3 missed 30s heartbeats)

# Admin reassigns namespaces
for namespace in proxy.namespaces:
    new_proxy = select_proxy_for_partition(namespace.partition_id)
    send_namespace_assignment(new_proxy, namespace)

# Audit log
audit:
  operation: proxy_failover
  proxy_id: proxy-01
  affected_namespaces: 15
  reassigned_to: [proxy-02, proxy-03]
  reason: "Heartbeat timeout (3 consecutive misses)"
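
The detection side is a periodic sweep comparing each proxy's last heartbeat against the failure threshold. A minimal sketch; the proxy_failed command type and the listProxies helper are assumptions, and the threshold follows from the 30s heartbeat interval described above:

const (
    heartbeatInterval = 30 * time.Second
    failureThreshold  = 3 * heartbeatInterval // 3 consecutive misses = 90s
)

// detectFailedProxies marks proxies whose heartbeats have gone stale.
func (s *AdminServer) detectFailedProxies(now time.Time) {
    for _, p := range s.listProxies() {
        if p.Status != "active" || now.Sub(p.LastHeartbeat) <= failureThreshold {
            continue
        }
        payload, _ := json.Marshal(map[string]string{"proxy_id": p.ID})
        // Failover runs through the FSM so every node agrees on the decision.
        s.Propose(Command{
            Type:      "proxy_failed",
            Payload:   payload,
            Principal: "health-checker",
            Timestamp: now.Unix(),
        })
    }
}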

Launcher Management

Launcher Registration

# Launcher sends on startup
message: LauncherRegistration
launcher_id: launcher-01
address: launcher-01.prism.local:7070
region: us-east-1
version: 0.1.0
capabilities: [pattern, proxy, backend, utility]
max_processes: 50
process_types: [pattern-runner, backend-driver]
metadata:
  az: us-east-1a
  instance_type: c5.4xlarge

# Admin responds
message: LauncherRegistrationAck
success: true
message: "Registered successfully"
initial_processes:
  - process_id: pattern-order-processing-001
    process_type: pattern
    namespace: order-processing
    config: { ... }
assigned_capacity: 50

Process Assignment

# Admin assigns process to launcher
message: ProcessAssignment
process_id: pattern-order-processing-001
process_type: pattern
namespace: order-processing
config:
  binary: /usr/local/bin/pattern-runner
  args: [--config, /etc/prism/patterns/order-processing.yaml]
  env:
    NAMESPACE: order-processing
    PATTERN: queue
  port: 9095
  health_port: 9096
  log_level: info

pattern:
  pattern_type: queue
  isolation_level: namespace
  slots:
    storage:
      backend_type: postgres
      connection_string: postgres-primary.prism.local:5432
  settings:
    durability: strong
version: 1

# Launcher acknowledges
message: ProcessAssignmentAck
success: true
message: "Process started successfully"

Launcher Heartbeat

# Launcher sends every 30s
message: LauncherHeartbeat
launcher_id: launcher-01
process_health:
  pattern-order-processing-001:
    status: running
    pid: 12345
    restart_count: 0
    error_count: 0
    memory_mb: 512
    uptime_seconds: 3600
    cpu_percent: 15.3
resources:
  process_count: 15
  max_processes: 50
  total_memory_mb: 8192
  cpu_percent: 45.0
  available_slots: 35
timestamp: 1700000000
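
These heartbeat fields are what the admin can schedule against. A minimal sketch of a least-loaded placement policy, assuming a Launcher record mirroring the launchers table; the policy itself is an assumption, not something this document mandates:

import "errors"

// pickLauncher returns the active launcher with the most free capacity.
func pickLauncher(launchers []Launcher) (Launcher, error) {
    var best Launcher
    bestFree := 0
    for _, l := range launchers {
        if free := l.MaxProcesses - l.ProcessCount; l.Status == "active" && free > bestFree {
            best, bestFree = l, free
        }
    }
    if bestFree == 0 {
        return Launcher{}, errors.New("no launcher capacity available")
    }
    return best, nil
}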

Backend Registry Management

Register Backend

command:
  type: register_backend
  payload:
    backend_id: kafka-prod-us-east-1
    backend_type: kafka
    config:
      # Full BackendRegistration (see backend-registry.md)
      connection:
        endpoint: kafka-prod.prism.local:9092
      capabilities: [...]
      interfaces: [...]
  principal: operator@example.com

# FSM processing:
1. Validate backend_id unique
2. Validate configuration schema
3. Test backend connectivity
4. Insert into backends table
5. Publish backend_registered event

Update Backend

command:
  type: update_backend
  payload:
    backend_id: kafka-prod-us-east-1
    config:
      # Updated fields only
      capacity:
        max_write_rps: 2000000  # Increased capacity
  principal: operator@example.com

# FSM processing:
1. Load existing backend
2. Merge configuration
3. Update backends table
4. Publish backend_updated event

Backend Health Updates

# Background health checker updates
command:
  type: update_backend_health
  payload:
    backend_id: kafka-prod-us-east-1
    health:
      status: healthy
      last_check_at: 1700000000
      uptime_percent_24h: 99.95
  principal: health-checker

# If unhealthy, trigger failover
if health.status == unhealthy:
    for binding in find_bindings(backend_id):
        new_backend = select_alternative_backend(binding.slot)
        rebind_slot(binding, new_backend)
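
The same failover pseudocode written out in Go; findBindings, selectAlternativeBackend, and rebindSlot are hypothetical helpers:

// failOverBackend rebinds every slot currently bound to an unhealthy backend.
func (s *AdminServer) failOverBackend(backendID string) error {
    for _, binding := range s.findBindings(backendID) {
        alt, err := s.selectAlternativeBackend(binding.Slot)
        if err != nil {
            return fmt.Errorf("no alternative backend for slot %s: %w", binding.Slot, err)
        }
        if err := s.rebindSlot(binding, alt); err != nil {
            return err
        }
    }
    return nil
}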

Team Management

Register Team

command:
  type: register_team
  payload:
    team_id: payments-team
    display_name: Payments Team
    permission_level: advanced
    quotas:
      max_namespaces: 50
      max_total_write_rps: 500000
      max_total_data_size: 10TB
      max_monthly_cost: 50000
    contact:
      email: payments-team@example.com
      slack_channel: "#payments-ops"
  principal: admin@example.com

Update Team Quotas

command:
  type: update_team_quotas
  payload:
    team_id: payments-team
    quotas:
      max_total_write_rps: 1000000  # Increased
      max_monthly_cost: 100000      # Increased
  principal: admin@example.com

# Audit log
audit:
  operation: update_team_quotas
  team: payments-team
  changes:
    max_total_write_rps: 500000 → 1000000
    max_monthly_cost: 50000 → 100000
  principal: admin@example.com
  reason: "Approved capacity increase for Q4"
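
Quotas like these are what step 2 of the create_namespace flow checks. A minimal sketch of one such check; the Team/Quotas types and the countNamespaces helper are assumptions:

// checkNamespaceQuota rejects creation once a team reaches max_namespaces.
func (fsm *AdminFSM) checkNamespaceQuota(team Team) error {
    used := fsm.countNamespaces(team.ID)
    if used >= team.Quotas.MaxNamespaces {
        return fmt.Errorf("team %s at namespace quota (%d/%d)",
            team.ID, used, team.Quotas.MaxNamespaces)
    }
    return nil
}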

Observability

Metrics

# Raft metrics
prism_admin_raft_leader: 1 # Is this node leader?
prism_admin_raft_term: 42
prism_admin_raft_commit_index: 1234567
prism_admin_raft_applied_index: 1234567
prism_admin_raft_fsm_pending: 0
prism_admin_raft_last_log_appended_at: 1700000000

# State metrics
prism_admin_namespaces_total{status="active"}: 1250
prism_admin_namespaces_total{status="draining"}: 5
prism_admin_proxies_total{status="active"}: 15
prism_admin_launchers_total{status="active"}: 30
prism_admin_backends_total{status="healthy"}: 45

# Performance metrics
prism_admin_fsm_apply_duration_seconds: histogram
prism_admin_namespace_create_duration_seconds: histogram
prism_admin_heartbeat_processing_duration_seconds: histogram
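
A sketch of how one of the histogram metrics could be wired up with prometheus/client_golang (the library choice and default buckets are assumptions):

import "github.com/prometheus/client_golang/prometheus"

// fsmApplyDuration backs prism_admin_fsm_apply_duration_seconds.
var fsmApplyDuration = prometheus.NewHistogram(prometheus.HistogramOpts{
    Name:    "prism_admin_fsm_apply_duration_seconds",
    Help:    "Time spent applying a command in the admin FSM.",
    Buckets: prometheus.DefBuckets,
})

func init() { prometheus.MustRegister(fsmApplyDuration) }

// Inside AdminFSM.Apply:
//     timer := prometheus.NewTimer(fsmApplyDuration)
//     defer timer.ObserveDuration()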

Health Checks

# Node health
curl http://admin-01.prism.local:8080/health
{
  "status": "healthy",
  "raft_state": "leader",
  "applied_index": 1234567,
  "peers": 3,
  "leader_id": "admin-01"
}

# Cluster health
curl http://admin-01.prism.local:8080/health/cluster
{
  "status": "healthy",
  "leader": "admin-01",
  "nodes": [
    {"id": "admin-01", "state": "leader", "healthy": true},
    {"id": "admin-02", "state": "follower", "healthy": true},
    {"id": "admin-03", "state": "follower", "healthy": true}
  ],
  "quorum": true
}

Disaster Recovery

Backup

# Snapshot FSM state
./prism-admin snapshot create --output /backups/admin-20251116.snap

# Backup contains:
# - Full SQLite database
# - Raft configuration
# - Cluster topology

Restore

# Restore from snapshot
./prism-admin snapshot restore --input /backups/admin-20251116.snap

# Process:
1. Stop admin cluster
2. Clear Raft data directories on all nodes
3. Restore snapshot to each node
4. Bootstrap cluster from restored state
5. Verify cluster health

Split-Brain Prevention

# Raft prevents split-brain through:
1. Quorum requirement (majority must agree)
2. Term-based leadership (higher term wins)
3. Log consistency checks

# Example:
Cluster: [admin-01, admin-02, admin-03]
Network partition: admin-01 | admin-02, admin-03

admin-01:
  - Cannot form quorum (1/3)
  - Cannot accept writes
  - Steps down as leader

admin-02, admin-03:
  - Can form quorum (2/3)
  - Elects new leader (admin-02)
  - Continues accepting writes

# When partition heals:
admin-01:
  - Rejoins as follower
  - Replicates missed log entries
  - Catches up to cluster state
