Cluster Administration

This document describes the Prism admin cluster configuration and operational controls. The admin cluster is the control plane that manages all namespace assignments, backend registrations, and cluster-wide policies.

Control Plane

The Prism admin cluster is a separate control plane from the data plane (proxies and pattern runners). It uses Raft consensus for distributed state management.

Architecture

Components

┌─────────────────────────────────────────────────────────────┐
│                     Admin Cluster (Raft)                     │
│  ┌──────────────┐   ┌──────────────┐   ┌──────────────┐     │
│  │  Admin Node  │   │  Admin Node  │   │  Admin Node  │     │
│  │   (Leader)   │   │  (Follower)  │   │  (Follower)  │     │
│  └──────┬───────┘   └──────┬───────┘   └──────┬───────┘     │
│         │                  │                  │             │
│         └──────────────────┼──────────────────┘             │
│                            │                                │
│                    SQLite FSM State                         │
│                 (Replicated via Raft)                       │
└─────────────────────────────────────────────────────────────┘
                             │
                             │ gRPC Control Plane
                             │
            ┌────────────────┼────────────────┐
            │                │                │
            ▼                ▼                ▼
      ┌──────────┐     ┌──────────┐     ┌──────────┐
      │  Proxy   │     │ Launcher │     │ Launcher │
      │   Node   │     │   Node   │     │   Node   │
      └──────────┘     └──────────┘     └──────────┘

Raft Consensus

The admin cluster uses Raft for distributed consensus:

  • Leader election: Automatic leader election on startup or failure
  • Log replication: All state changes replicated to followers
  • Strong consistency: Writes only committed when majority agrees
  • Fault tolerance: Tolerates (N-1)/2 failures in N-node cluster

  • Minimum cluster size: 3 nodes (tolerates 1 failure)
  • Recommended: 5 nodes (tolerates 2 failures)
  • Maximum practical: 7 nodes (diminishing returns, higher latency)
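
The sizing guidance follows directly from the quorum arithmetic. A small illustrative snippet (not Prism code) that computes quorum size and fault tolerance for the cluster sizes above:

package main

import "fmt"

// quorum is the number of nodes that must agree for a write to commit.
func quorum(n int) int { return n/2 + 1 }

// faultTolerance is how many node failures an n-node cluster survives.
func faultTolerance(n int) int { return (n - 1) / 2 }

func main() {
    for _, n := range []int{3, 5, 7} {
        fmt.Printf("%d nodes: quorum=%d, tolerates %d failure(s)\n",
            n, quorum(n), faultTolerance(n))
    }
}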

Admin Cluster Configuration

Node Configuration

Each admin node requires:

# Admin node configuration
admin:
  node_id: admin-01
  bind_address: 0.0.0.0:7070
  advertise_address: admin-01.prism.local:7070

# Raft configuration
raft:
  data_dir: /var/lib/prism/admin/raft
  snapshot_interval: 10m
  snapshot_threshold: 10000   # Snapshot after 10k operations

  # Cluster peers
  peers:
    - id: admin-01
      address: admin-01.prism.local:7070
    - id: admin-02
      address: admin-02.prism.local:7070
    - id: admin-03
      address: admin-03.prism.local:7070

  # Election timeout: 150-300ms (randomized)
  election_timeout_min: 150ms
  election_timeout_max: 300ms

  # Heartbeat interval: 50ms
  heartbeat_timeout: 50ms

  # Replication timeout: 1s
  replication_timeout: 1s

# SQLite FSM storage
storage:
  sqlite_path: /var/lib/prism/admin/prism-admin.db
  wal_mode: true
  journal_mode: WAL
  synchronous: FULL   # Strong durability
  cache_size: 10000   # Pages

# Control plane gRPC
control_plane:
  bind_address: 0.0.0.0:9000
  tls:
    enabled: true
    cert_path: /etc/prism/certs/admin.pem
    key_path: /etc/prism/certs/admin-key.pem
    ca_cert_path: /etc/prism/certs/ca.pem
    client_auth_required: true

# Observability
observability:
  metrics_port: 9090
  health_port: 8080
  log_level: info
  tracing_enabled: true

# Resource limits
resources:
  max_namespaces: 10000
  max_proxies: 100
  max_launchers: 200
  max_backends: 500

Cluster Bootstrap

Initial Cluster Setup

# 1. Start first node (bootstrap mode)
./prism-admin start --config admin-01.yaml --bootstrap

# 2. Start second node (joins cluster)
./prism-admin start --config admin-02.yaml --join admin-01.prism.local:7070

# 3. Start third node (joins cluster)
./prism-admin start --config admin-03.yaml --join admin-01.prism.local:7070

# 4. Verify cluster
./prism-admin cluster status

Adding Nodes

# Add new node to existing cluster
./prism-admin cluster add-node --node-id admin-04 --address admin-04.prism.local:7070

# Node joins
./prism-admin start --config admin-04.yaml --join admin-01.prism.local:7070

Removing Nodes

# Gracefully remove node
./prism-admin cluster remove-node --node-id admin-02

# Force remove (if node dead)
./prism-admin cluster remove-node --node-id admin-02 --force

FSM State Management

State Schema

The admin cluster maintains state in SQLite:

-- Namespaces
CREATE TABLE namespaces (
    namespace TEXT PRIMARY KEY,
    team TEXT NOT NULL,
    pattern TEXT NOT NULL,
    config_json TEXT NOT NULL,      -- Serialized NamespaceConfig
    partition_id INTEGER NOT NULL,
    assigned_proxy TEXT,
    status TEXT NOT NULL,           -- pending | active | draining | deleted
    created_by TEXT NOT NULL,
    created_at INTEGER NOT NULL,
    updated_at INTEGER NOT NULL,
    version INTEGER NOT NULL DEFAULT 1
);

-- Proxies
CREATE TABLE proxies (
    proxy_id TEXT PRIMARY KEY,
    address TEXT NOT NULL,
    region TEXT NOT NULL,
    version TEXT NOT NULL,
    capabilities TEXT NOT NULL,     -- JSON array
    status TEXT NOT NULL,           -- registered | active | draining | failed
    last_heartbeat_at INTEGER,
    registered_at INTEGER NOT NULL,
    metadata_json TEXT
);

-- Launchers
CREATE TABLE launchers (
    launcher_id TEXT PRIMARY KEY,
    address TEXT NOT NULL,
    region TEXT NOT NULL,
    version TEXT NOT NULL,
    max_processes INTEGER NOT NULL,
    status TEXT NOT NULL,           -- registered | active | draining | failed
    last_heartbeat_at INTEGER,
    registered_at INTEGER NOT NULL,
    metadata_json TEXT
);

-- Process Assignments
CREATE TABLE process_assignments (
    process_id TEXT PRIMARY KEY,
    launcher_id TEXT NOT NULL,
    process_type TEXT NOT NULL,     -- pattern | proxy | backend | utility
    namespace TEXT,
    config_json TEXT NOT NULL,
    status TEXT NOT NULL,           -- assigned | starting | running | stopping | stopped | failed
    assigned_at INTEGER NOT NULL,
    started_at INTEGER,
    stopped_at INTEGER,
    version INTEGER NOT NULL DEFAULT 1,
    FOREIGN KEY (launcher_id) REFERENCES launchers(launcher_id)
);

-- Backends
CREATE TABLE backends (
    backend_id TEXT PRIMARY KEY,
    backend_type TEXT NOT NULL,
    config_json TEXT NOT NULL,      -- Serialized BackendRegistration
    status TEXT NOT NULL,           -- registered | healthy | degraded | unhealthy | maintenance
    registered_at INTEGER NOT NULL,
    updated_at INTEGER NOT NULL
);

-- Teams
CREATE TABLE teams (
    team_id TEXT PRIMARY KEY,
    display_name TEXT NOT NULL,
    permission_level TEXT NOT NULL, -- guided | advanced | expert
    quotas_json TEXT NOT NULL,
    contact_json TEXT,
    created_at INTEGER NOT NULL,
    updated_at INTEGER NOT NULL
);

-- Audit Log
CREATE TABLE audit_log (
    log_id INTEGER PRIMARY KEY AUTOINCREMENT,
    timestamp INTEGER NOT NULL,
    operation TEXT NOT NULL,        -- create_namespace | assign_namespace | register_proxy, etc.
    resource_type TEXT NOT NULL,
    resource_id TEXT NOT NULL,
    principal TEXT NOT NULL,        -- Who performed the operation
    details_json TEXT,
    success BOOLEAN NOT NULL
);

FSM Operations

All state changes go through the Raft FSM:

import (
    "encoding/json"
    "fmt"

    "github.com/hashicorp/raft" // assumed Raft library, based on the raft.Log type below
)

// Raft command types
type Command struct {
    Type      string          // create_namespace | assign_namespace | etc.
    Payload   json.RawMessage // Command-specific data
    Principal string          // Who issued the command
    Timestamp int64           // When it was issued
}

// FSM Apply: invoked on every node once a log entry is committed
func (fsm *AdminFSM) Apply(log *raft.Log) interface{} {
    var cmd Command
    if err := json.Unmarshal(log.Data, &cmd); err != nil {
        return fmt.Errorf("malformed command: %w", err)
    }

    switch cmd.Type {
    case "create_namespace":
        return fsm.applyCreateNamespace(cmd)
    case "assign_namespace":
        return fsm.applyAssignNamespace(cmd)
    case "register_proxy":
        return fsm.applyRegisterProxy(cmd)
    case "proxy_heartbeat":
        return fsm.applyProxyHeartbeat(cmd)
    // ... other commands
    default:
        return fmt.Errorf("unknown command type: %q", cmd.Type)
    }
}
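
On the leader, a state change is proposed by serializing a Command and handing it to Raft; the FSM's return value is surfaced back to the caller once a quorum commits the entry. A minimal sketch of that submission path, assuming hashicorp/raft and a hypothetical AdminServer with raft and applyTimeout fields:

// Propose replicates a command through Raft and returns the FSM result.
func (s *AdminServer) Propose(cmd Command) (interface{}, error) {
    data, err := json.Marshal(cmd)
    if err != nil {
        return nil, err
    }
    // Apply succeeds only after a majority of nodes commit the entry.
    future := s.raft.Apply(data, s.applyTimeout)
    if err := future.Error(); err != nil {
        return nil, err // not leader, timeout, or replication failure
    }
    return future.Response(), nil // value returned by AdminFSM.Apply
}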

Namespace Management

Namespace Lifecycle

┌─────────┐   CreateNamespace   ┌─────────┐
│ pending │ ──────────────────> │ active  │
└─────────┘                     └────┬────┘
                                     │ DrainNamespace
                                     ▼
                                ┌─────────┐
                                │draining │
                                └────┬────┘
                                     │ DeleteNamespace
                                     ▼
                                ┌─────────┐
                                │ deleted │
                                └─────────┘
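
The lifecycle is strictly forward-only: a namespace never returns from draining to active. A minimal sketch of how the FSM might guard status changes (the transition table is taken from the diagram; the enforcement code itself is an assumption):

// validTransitions encodes the lifecycle diagram above.
var validTransitions = map[string][]string{
    "pending":  {"active"},
    "active":   {"draining"},
    "draining": {"deleted"},
}

// canTransition reports whether a status change is legal.
func canTransition(from, to string) bool {
    for _, next := range validTransitions[from] {
        if next == to {
            return true
        }
    }
    return false
}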

Create Namespace

command:
  type: create_namespace
  payload:
    namespace: order-processing
    team: payments-team
    requesting_proxy: proxy-01
    config:
      # Full NamespaceConfig
      backends:
        storage:
          backend_type: postgres
          connection_string: postgres-primary.prism.local:5432
      patterns:
        queue:
          pattern_name: queue
          settings:
            durability: strong
  principal: user@example.com

# FSM processing:
1. Validate namespace doesn't exist
2. Validate team quota not exceeded
3. Validate team has permission for config
4. Select partition (hash-based or load-balanced; see the sketch below)
5. Select proxy for partition
6. Insert into namespaces table
7. Return CreateNamespaceResponse
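
For the hash-based option in step 4, a deterministic hash of the namespace name onto the partition space is enough. A minimal sketch; the partition count of 256 is an assumption (the partition_ranges example later shows a proxy owning 0-127):

import "hash/fnv"

const numPartitions = 256 // assumed; not fixed by this document

// partitionFor maps a namespace deterministically onto a partition.
func partitionFor(namespace string) int {
    h := fnv.New32a()
    h.Write([]byte(namespace))
    return int(h.Sum32() % numPartitions)
}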

Assign Namespace to Proxy

command:
  type: assign_namespace
  payload:
    namespace: order-processing
    partition_id: 42
    proxy_id: proxy-01
    config:
      # Same as create_namespace
  principal: admin-cluster

# Proxy receives via control plane:
message: NamespaceAssignment
namespace: order-processing
partition_id: 42
config: { ... }
version: 1

# Proxy acknowledges:
message: NamespaceAssignmentAck
success: true
message: "Namespace assigned successfully"

Drain Namespace

command:
  type: drain_namespace
  payload:
    namespace: order-processing
    graceful_timeout: 30s
  principal: user@example.com

# FSM processing:
1. Set namespace status to 'draining'
2. Send RevokeNamespace to assigned proxy
3. Wait for proxy to finish active sessions (up to the timeout; sketched below)
4. Proxy sends RevokeNamespaceAck
5. Update namespace status to 'deleted'
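
A sketch of the drain flow end to end, using a context to bound the graceful wait; setStatus, sendRevoke, and the ack channel are hypothetical helpers:

// drainNamespace implements the flow above: mark draining, revoke, wait for
// the ack up to the graceful timeout, then mark deleted.
func (s *AdminServer) drainNamespace(ctx context.Context, ns string, graceful time.Duration) error {
    if err := s.setStatus(ns, "draining"); err != nil {
        return err
    }
    ackCh, err := s.sendRevoke(ns) // RevokeNamespace to the assigned proxy
    if err != nil {
        return err
    }
    ctx, cancel := context.WithTimeout(ctx, graceful)
    defer cancel()
    select {
    case <-ackCh: // RevokeNamespaceAck received
        return s.setStatus(ns, "deleted")
    case <-ctx.Done():
        return fmt.Errorf("drain of %s timed out: %w", ns, ctx.Err())
    }
}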

Proxy Management

Proxy Registration

# Proxy sends on startup
message: ProxyRegistration
proxy_id: proxy-01
address: proxy-01.prism.local:8980
region: us-east-1
version: 0.1.0
capabilities: [keyvalue, pubsub, queue]
metadata:
  az: us-east-1a
  instance_type: c5.2xlarge

# Admin responds
message: ProxyRegistrationAck
success: true
message: "Registered successfully"
initial_namespaces:
  - namespace: order-processing
    partition_id: 42
    config: { ... }
partition_ranges:
  - start: 0
    end: 127

Proxy Heartbeat

# Proxy sends every 30s
message: ProxyHeartbeat
proxy_id: proxy-01
namespace_health:
  order-processing:
    active_sessions: 245
    requests_per_second: 1250
    status: healthy
resources:
  cpu_percent: 45.2
  memory_mb: 2048
  goroutine_count: 512
  uptime_seconds: 86400
timestamp: 1700000000

# Admin responds
message: HeartbeatAck
success: true
server_timestamp: 1700000001

Proxy Failover

When a proxy fails (misses three consecutive heartbeats):

# Admin detects failure
proxy_status: failed
last_heartbeat: 90 seconds ago (threshold: 90s = 3 missed 30s heartbeats)

# Admin reassigns namespaces
for namespace in proxy.namespaces:
    new_proxy = select_proxy_for_partition(namespace.partition_id)
    send_namespace_assignment(new_proxy, namespace)

# Audit log
audit:
  operation: proxy_failover
  proxy_id: proxy-01
  affected_namespaces: 15
  reassigned_to: [proxy-02, proxy-03]
  reason: "Heartbeat timeout (3 consecutive misses)"
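
The detection side is a periodic sweep comparing each proxy's last heartbeat against the failure threshold. A minimal sketch; the proxy_failed command type and the listProxies helper are assumptions, and the threshold follows from the 30s heartbeat interval described above:

const (
    heartbeatInterval = 30 * time.Second
    failureThreshold  = 3 * heartbeatInterval // 3 consecutive misses = 90s
)

// detectFailedProxies marks proxies whose heartbeats have gone stale.
func (s *AdminServer) detectFailedProxies(now time.Time) {
    for _, p := range s.listProxies() {
        if p.Status != "active" || now.Sub(p.LastHeartbeat) <= failureThreshold {
            continue
        }
        payload, _ := json.Marshal(map[string]string{"proxy_id": p.ID})
        // Failover runs through the FSM so every node agrees on the decision.
        s.Propose(Command{
            Type:      "proxy_failed",
            Payload:   payload,
            Principal: "health-checker",
            Timestamp: now.Unix(),
        })
    }
}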

Launcher Management

Launcher Registration

# Launcher sends on startup
message: LauncherRegistration
launcher_id: launcher-01
address: launcher-01.prism.local:7070
region: us-east-1
version: 0.1.0
capabilities: [pattern, proxy, backend, utility]
max_processes: 50
process_types: [pattern-runner, backend-driver]
metadata:
  az: us-east-1a
  instance_type: c5.4xlarge

# Admin responds
message: LauncherRegistrationAck
success: true
message: "Registered successfully"
initial_processes:
  - process_id: pattern-order-processing-001
    process_type: pattern
    namespace: order-processing
    config: { ... }
assigned_capacity: 50

Process Assignment

# Admin assigns process to launcher
message: ProcessAssignment
process_id: pattern-order-processing-001
process_type: pattern
namespace: order-processing
config:
  binary: /usr/local/bin/pattern-runner
  args: [--config, /etc/prism/patterns/order-processing.yaml]
  env:
    NAMESPACE: order-processing
    PATTERN: queue
  port: 9095
  health_port: 9096
  log_level: info

pattern:
  pattern_type: queue
  isolation_level: namespace
  slots:
    storage:
      backend_type: postgres
      connection_string: postgres-primary.prism.local:5432
  settings:
    durability: strong
version: 1

# Launcher acknowledges
message: ProcessAssignmentAck
success: true
message: "Process started successfully"

Launcher Heartbeat

# Launcher sends every 30s
message: LauncherHeartbeat
launcher_id: launcher-01
process_health:
  pattern-order-processing-001:
    status: running
    pid: 12345
    restart_count: 0
    error_count: 0
    memory_mb: 512
    uptime_seconds: 3600
    cpu_percent: 15.3
resources:
  process_count: 15
  max_processes: 50
  total_memory_mb: 8192
  cpu_percent: 45.0
  available_slots: 35
timestamp: 1700000000
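
These heartbeat fields are what the admin can schedule against. A minimal sketch of a least-loaded placement policy, assuming a Launcher record mirroring the launchers table; the policy itself is an assumption, not something this document mandates:

import "errors"

// pickLauncher returns the active launcher with the most free capacity.
func pickLauncher(launchers []Launcher) (Launcher, error) {
    var best Launcher
    bestFree := 0
    for _, l := range launchers {
        if free := l.MaxProcesses - l.ProcessCount; l.Status == "active" && free > bestFree {
            best, bestFree = l, free
        }
    }
    if bestFree == 0 {
        return Launcher{}, errors.New("no launcher capacity available")
    }
    return best, nil
}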

Backend Registry Management

Register Backend

command:
  type: register_backend
  payload:
    backend_id: kafka-prod-us-east-1
    backend_type: kafka
    config:
      # Full BackendRegistration (see backend-registry.md)
      connection:
        endpoint: kafka-prod.prism.local:9092
      capabilities: [...]
      interfaces: [...]
  principal: operator@example.com

# FSM processing:
1. Validate backend_id unique
2. Validate configuration schema
3. Test backend connectivity
4. Insert into backends table
5. Publish backend_registered event

Update Backend

command:
  type: update_backend
  payload:
    backend_id: kafka-prod-us-east-1
    config:
      # Updated fields only
      capacity:
        max_write_rps: 2000000  # Increased capacity
  principal: operator@example.com

# FSM processing:
1. Load existing backend
2. Merge configuration
3. Update backends table
4. Publish backend_updated event

Backend Health Updates

# Background health checker updates
command:
  type: update_backend_health
  payload:
    backend_id: kafka-prod-us-east-1
    health:
      status: healthy
      last_check_at: 1700000000
      uptime_percent_24h: 99.95
  principal: health-checker

# If unhealthy, trigger failover
if health.status == unhealthy:
    for binding in find_bindings(backend_id):
        new_backend = select_alternative_backend(binding.slot)
        rebind_slot(binding, new_backend)
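
The same failover pseudocode written out in Go; findBindings, selectAlternativeBackend, and rebindSlot are hypothetical helpers:

// failOverBackend rebinds every slot currently bound to an unhealthy backend.
func (s *AdminServer) failOverBackend(backendID string) error {
    for _, binding := range s.findBindings(backendID) {
        alt, err := s.selectAlternativeBackend(binding.Slot)
        if err != nil {
            return fmt.Errorf("no alternative backend for slot %s: %w", binding.Slot, err)
        }
        if err := s.rebindSlot(binding, alt); err != nil {
            return err
        }
    }
    return nil
}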

Team Management

Register Team

command:
  type: register_team
  payload:
    team_id: payments-team
    display_name: Payments Team
    permission_level: advanced
    quotas:
      max_namespaces: 50
      max_total_write_rps: 500000
      max_total_data_size: 10TB
      max_monthly_cost: 50000
    contact:
      email: payments-team@example.com
      slack_channel: "#payments-ops"
  principal: admin@example.com

Update Team Quotas

command:
  type: update_team_quotas
  payload:
    team_id: payments-team
    quotas:
      max_total_write_rps: 1000000  # Increased
      max_monthly_cost: 100000      # Increased
  principal: admin@example.com

# Audit log
audit:
  operation: update_team_quotas
  team: payments-team
  changes:
    max_total_write_rps: 500000 → 1000000
    max_monthly_cost: 50000 → 100000
  principal: admin@example.com
  reason: "Approved capacity increase for Q4"
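
Quotas like these are what step 2 of the create_namespace flow checks. A minimal sketch of one such check; the Team/Quotas types and the countNamespaces helper are assumptions:

// checkNamespaceQuota rejects creation once a team reaches max_namespaces.
func (fsm *AdminFSM) checkNamespaceQuota(team Team) error {
    used := fsm.countNamespaces(team.ID)
    if used >= team.Quotas.MaxNamespaces {
        return fmt.Errorf("team %s at namespace quota (%d/%d)",
            team.ID, used, team.Quotas.MaxNamespaces)
    }
    return nil
}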

Observability

Metrics

# Raft metrics
prism_admin_raft_leader: 1 # Is this node leader?
prism_admin_raft_term: 42
prism_admin_raft_commit_index: 1234567
prism_admin_raft_applied_index: 1234567
prism_admin_raft_fsm_pending: 0
prism_admin_raft_last_log_appended_at: 1700000000

# State metrics
prism_admin_namespaces_total{status="active"}: 1250
prism_admin_namespaces_total{status="draining"}: 5
prism_admin_proxies_total{status="active"}: 15
prism_admin_launchers_total{status="active"}: 30
prism_admin_backends_total{status="healthy"}: 45

# Performance metrics
prism_admin_fsm_apply_duration_seconds: histogram
prism_admin_namespace_create_duration_seconds: histogram
prism_admin_heartbeat_processing_duration_seconds: histogram
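
A sketch of how one of the histogram metrics could be wired up with prometheus/client_golang (the library choice and default buckets are assumptions):

import "github.com/prometheus/client_golang/prometheus"

// fsmApplyDuration backs prism_admin_fsm_apply_duration_seconds.
var fsmApplyDuration = prometheus.NewHistogram(prometheus.HistogramOpts{
    Name:    "prism_admin_fsm_apply_duration_seconds",
    Help:    "Time spent applying a command in the admin FSM.",
    Buckets: prometheus.DefBuckets,
})

func init() { prometheus.MustRegister(fsmApplyDuration) }

// Inside AdminFSM.Apply:
//     timer := prometheus.NewTimer(fsmApplyDuration)
//     defer timer.ObserveDuration()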

Health Checks

# Node health
curl http://admin-01.prism.local:8080/health
{
  "status": "healthy",
  "raft_state": "leader",
  "applied_index": 1234567,
  "peers": 3,
  "leader_id": "admin-01"
}

# Cluster health
curl http://admin-01.prism.local:8080/health/cluster
{
  "status": "healthy",
  "leader": "admin-01",
  "nodes": [
    {"id": "admin-01", "state": "leader", "healthy": true},
    {"id": "admin-02", "state": "follower", "healthy": true},
    {"id": "admin-03", "state": "follower", "healthy": true}
  ],
  "quorum": true
}

Disaster Recovery

Backup

# Snapshot FSM state
./prism-admin snapshot create --output /backups/admin-20251116.snap

# Backup contains:
# - Full SQLite database
# - Raft configuration
# - Cluster topology

Restore

# Restore from snapshot
./prism-admin snapshot restore --input /backups/admin-20251116.snap

# Process:
1. Stop admin cluster
2. Clear Raft data directories on all nodes
3. Restore snapshot to each node
4. Bootstrap cluster from restored state
5. Verify cluster health

Split-Brain Prevention

# Raft prevents split-brain through:
1. Quorum requirement (majority must agree)
2. Term-based leadership (higher term wins)
3. Log consistency checks

# Example:
Cluster: [admin-01, admin-02, admin-03]
Network partition: admin-01 | admin-02, admin-03

admin-01:
  - Cannot form quorum (1/3)
  - Cannot accept writes
  - Steps down as leader

admin-02, admin-03:
  - Can form quorum (2/3)
  - Elects new leader (admin-02)
  - Continues accepting writes

# When partition heals:
admin-01:
  - Rejoins as follower
  - Replicates missed log entries
  - Catches up to cluster state
