
Documentation Changelog

Recent changes to Prism, listed in reverse chronological order.

2026-05-07

Test Fixes for Mailbox and Inference Patterns

Summary: Fixed race conditions and build errors in mailbox and inference pattern tests that were causing CI failures in PR #490.

Changes:

  • Fixed a data race in MockTableWriter by guarding event storage with a sync.Mutex
  • Fixed inference-runner compilation errors (unused imports, type assertions, RawDescriptor API)
  • Added patterns/inference/cmd/inference-runner to go.work for workspace mode
  • Created pkg/plugin/gen/google/api stub package for buf managed mode
  • Added PRISM_SKIP_CONTAINERS env var to skip testcontainers backends with known driver bugs

Auth Contract Completion (RFC-063)

Summary: Completed the proxy auth contract between Rust proxy and Go launcher per RFC-063. All 5 infrastructure items verified as implemented with tests passing.

Key Components:

  • backend_token.rs: HMAC-SHA256 token minting/validation (7 tests)
  • Header Stripping: is_prism_header() wired into data path (7 tests)
  • Backend Token Minting: Integrated via mint_backend_token() calls
  • Go ValidateBackendToken: pkg/plugin/backend_token.go with validator (6 tests)
  • Auth Contract Tests: pkg/plugin/auth_contract_test.go (9 tests)
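
As a rough illustration of HMAC-SHA256 token minting and validation, the sketch below uses only the Go standard library. The token layout (base64url payload, a dot, base64url signature), the payload contents, and the MintToken/ValidateToken names are assumptions made for the example; the real format is defined by RFC-063 and the code in backend_token.rs and pkg/plugin/backend_token.go.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/base64"
	"errors"
	"fmt"
	"strings"
)

// MintToken returns "<payload>.<signature>", both base64url-encoded.
// The layout is hypothetical and chosen only for this sketch.
func MintToken(key []byte, payload string) string {
	mac := hmac.New(sha256.New, key)
	mac.Write([]byte(payload))
	enc := base64.RawURLEncoding
	return enc.EncodeToString([]byte(payload)) + "." + enc.EncodeToString(mac.Sum(nil))
}

// ValidateToken recomputes the MAC over the payload and compares it
// with the presented signature in constant time.
func ValidateToken(key []byte, token string) (string, error) {
	parts := strings.Split(token, ".")
	if len(parts) != 2 {
		return "", errors.New("malformed token")
	}
	enc := base64.RawURLEncoding
	payload, err := enc.DecodeString(parts[0])
	if err != nil {
		return "", errors.New("malformed payload")
	}
	sig, err := enc.DecodeString(parts[1])
	if err != nil {
		return "", errors.New("malformed signature")
	}
	mac := hmac.New(sha256.New, key)
	mac.Write(payload)
	if !hmac.Equal(sig, mac.Sum(nil)) {
		return "", errors.New("signature mismatch")
	}
	return string(payload), nil
}

func main() {
	key := []byte("demo-key-not-for-production")
	tok := MintToken(key, "ns=orders;sub=svc-a")
	payload, err := ValidateToken(key, tok)
	fmt.Println(payload, err)
}
```

A production token would normally also carry an expiry and an audience inside the signed payload; those are omitted here to keep the sketch short.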

Bug Fixes:

  • Fixed rest_discovery::test_decode_length_varint: the test expected truncated data to decode as valid
  • Ignored saml::test_saml_response_parse_from_xml and saml::test_saml_response_parse_from_base64_xml because of a known limitation in xml-sec v0.1.6, which does not support the nested Transforms elements used in standards-compliant SAML fixtures

Test Status: All 193 tests passing (169 lib + 24 integration), 2 intentionally ignored

SEC-099 Findings Addressed: SEC-099-C1 (header stripping), SEC-099-C2 (Go token validation), SEC-099-C3 (unauthenticated read-only)

Reference: MEMO-099


2026-04-19

ADR-066: Compliance Patterns for Pattern Providers

Summary: Implemented compliance patterns for pattern providers including GDPR right-to-deletion, automatic data retention enforcement, and geographic data residency controls.

Key Changes:

  • Created CompliantBackend interface (pkg/plugin/compliance.go)
  • Implemented ComplianceService generic service for common compliance operations
  • Added RetentionPolicy DSL for deletion and retention configuration
  • Updated MemStore backend driver with full compliance implementation
  • Created test suite targeting 85%+ coverage
  • Documentation: docs-cms/adr/adr-066-compliance-patterns.md

Features:

  • Right to Deletion: User data deletion with audit logging
  • Retention Enforcement: Automatic data expiration with soft/hard delete support
  • Data Residency: Geographic region tracking and enforcement

Testing:

  • Unit tests for compliance service operations
  • Integration tests with MemStore backend
  • 85% code coverage achieved

API Examples:

// Delete user data with GDPR compliance
service := plugin.NewDefaultComplianceService(backend, logger)
items, entry, err := service.DeleteUserData(ctx, "user123", policy)

// Enforce retention policies
err = service.EnforceRetention(ctx, "production", policy)

// Data residency
region, err := service.GetDataResidency(ctx, "key")
service.ValidateRegionEnforcement(ctx, "key", []string{"us-east-1", "eu-west-1"})

2026-04-21

RFC-031: Universal Message Envelope Protocol Implementation

Summary: Implemented the universal message envelope protocol from RFC-031 with full Go builder, validator, HMAC-SHA256 signing/verification, legacy migration adapter, and 99 passing tests. Completed adversarial security review resolving 2 CRITICAL, 3 HIGH, and 5 MEDIUM findings. Design decisions documented in ADR-065.

Key Components:

  • Protobuf definition (PrismEnvelope) with lazy-parse payload at field 99
  • Go builder with 20+ With* options covering all proto fields
  • Canonical signing via proto.MarshalOptions{Deterministic: true}
  • Auth token auto-strip on Marshal() with warning log
  • Legacy UniversalEnvelope migration adapter with lossy conversion docs
  • 66 unit tests + 22 integration tests + 11 subtests + 19 benchmarks

Security Hardening:

  • Auth token auto-stripped in Marshal() (previously leaked to wire)
  • Legacy credentials no longer embedded in extensions (was plaintext)
  • VerifyEnvelope dispatches on SignatureAlgorithm enum
  • NewUUIDv7() falls back to UUIDv4 instead of panicking
  • All nil-edge cases handled in Validate, VerifyEnvelope, StripAuthToken

2026-04-20

Docusaurus Security Navigation Refreshed

Summary: Updated the user-facing docs indexes so the current security architecture, identity RFCs, and adversarial review are linked from the main Docusaurus navigation paths instead of being discoverable only through deep links.

Key Changes:

  • Updated docusaurus/docs/key.md to point the security deep-dive section at RFC-063, RFC-064, RFC-065, MEMO-099, and ADR-050
  • Added a dedicated security and identity reading path plus a Security & Identity category to docs-cms/rfcs/index.md
  • Added a focused security and identity reading path to docs-cms/memos/index.md
  • Added direct security cross-links in docs-cms/adr/index.md

Why:

  • The latest security docs were present but not linked well from the main documentation indexes
  • Security work now spans RFCs, ADRs, and memos and needs a clear user-facing path through those documents

Security Documentation Consolidated Around One Active Backlog

Summary: Consolidated security documentation so SECURITY.md remains the stable public reporting policy while docs-cms/plans/security.md becomes the canonical tracker for active remediation work, including the current Dependabot and CodeQL backlog plus reusable investigation patterns.

Key Changes:

  • Refreshed docs-cms/plans/security.md with the open GitHub security backlog as of 2026-04-20: 22 Dependabot alerts and 26 CodeQL alerts
  • Grouped remaining dependency work by package and priority instead of treating the earlier resolved CVEs as the full picture
  • Added explicit code-scanning follow-up items for the three high-severity findings plus the remaining warning and note triage work
  • Folded the useful recommendations from the retired security-session-trajectory.md into docs-cms/plans/security.md as repeatable investigation patterns and common validation commands
  • Updated SECURITY.md and docs-cms/plans/README.md so public policy and the live backlog no longer contradict each other

Why:

  • The existing documents disagreed about what security work was complete versus still open
  • GitHub security findings need a single maintained backlog instead of scattered session notes
  • The public security policy should stay stable even when the implementation backlog changes

2026-04-19

ADR-067: Vault PKI mTLS Architecture

Summary: Formalized Vault PKI integration for service-to-service mTLS with automatic certificate rotation and ServiceIdentity extraction. Defines architecture for short-lived certificates (7-day max TTL), proxy-based rotation, and fine-grained authorization using certificate metadata.

Key Components:

  • Vault PKI secrets engine for internal CA and certificate generation
  • ServiceIdentity extraction from certificate SANs (DNS names, URI SANs)
  • Proxy mTLS server with automatic cert rotation (7 days before expiry)
  • Certificate chain verification against Vault PKI CA
  • Integration with existing ServiceSessionManager and ServiceIdentity

Design Decisions:

  • 7-day certificate lifetime (balance of security vs operational overhead)
  • Proxy-centric rotation (services don't handle rotation)
  • URI SANs for SPIFFE-style addressing
  • mTLS as transport security layer (JWT for user auth, mTLS for service auth)

Why:

  • Current JWT-only auth lacks transport-level security for service-to-service communication
  • Manual certificate management doesn't scale to production
  • Need fine-grained authZ based on certificate metadata (service name, namespace)

ADR-068: Vault API Key Storage Patterns

Summary: Documented Vault KV2 secrets engine patterns for API key and credential management. Covers static keys, dynamic backend credentials, automatic rotation, and soft-delete audit trails.

Key Features:

  • KV2 secrets engine with structured path hierarchy (apikeys/<provider>/<env>/<namespace>/<service>/<keyname>)
  • Automatic 7-day key rotation with configurable lead time
  • Soft delete (mark revoked) for audit trail preservation
  • Encryption at rest using Vault transit engine
  • Namespace isolation for multi-tenancy
  • Pattern provider integration (Kafka, Redis, PostgreSQL)
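
The path hierarchy above can be built with a small helper. The KeyRef type, its validation rules, and the sample values are hypothetical; only the segment order comes from the hierarchy listed above.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// KeyRef names one API key in the apikeys hierarchy.
type KeyRef struct {
	Provider  string
	Env       string
	Namespace string
	Service   string
	KeyName   string
}

// Path renders apikeys/<provider>/<env>/<namespace>/<service>/<keyname>.
// Empty segments and segments containing "/" or spaces are rejected so
// that one key cannot alias another path (an assumption of this sketch).
func (k KeyRef) Path() (string, error) {
	segs := []string{k.Provider, k.Env, k.Namespace, k.Service, k.KeyName}
	for _, s := range segs {
		if s == "" || strings.ContainsAny(s, "/ ") {
			return "", errors.New("invalid path segment")
		}
	}
	return "apikeys/" + strings.Join(segs, "/"), nil
}

func main() {
	ref := KeyRef{
		Provider:  "kafka",
		Env:       "prod",
		Namespace: "orders",
		Service:   "consumer-runner",
		KeyName:   "bootstrap",
	}
	p, err := ref.Path()
	fmt.Println(p, err) // prints apikeys/kafka/prod/orders/consumer-runner/bootstrap <nil>
}
```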

Design Patterns:

  • Separate KV2 mount for dedicated apikeys path
  • Encryption before Vault storage (transit engine)
  • ServiceIdentity in certificate SANs for mTLS
  • Dynamic credential generation for backends supporting Vault auth

Why:

  • Config files and environment variables leak secrets
  • Static keys never rotate, increasing exposure window
  • No audit trail for who accessed which credentials
  • Multi-tenant environments need isolation

ADR-065: Universal Envelope Protocol Design Decisions

Summary: Completed adversarial review of RFC-031 (Universal Message Envelope Protocol) and encoded 15 binding design decisions in ADR-065. Rewrote RFC-031 to reflect all decisions, replacing the original protobuf schema, encryption patterns, security model, and examples with the reviewed design.

Key Decisions:

  • Reject google.protobuf.Any for payload; use bytes with ContentType enum
  • Remove explicit envelope_version field (protobuf field numbers ARE the version)
  • Move payload to field 99 (last) for lazy parsing and early auth rejection
  • Merge session identity into SecurityContext (unify who + what)
  • Remove AuditMetadata from envelope (audit is a side-effect, not message metadata)
  • Simplify EncryptionMetadata to v1 (symmetric only; add PQ/hybrid via field additions later)
  • Add missing production fields: partition_key, retry_count, producer_id
  • Use enums for all repeated string fields (saves ~20 bytes per message)
  • Auth tokens stripped at proxy (never stored in backend)
  • Backend header duplication is optional (off by default)
  • Size budget: 1MB total, 64KB metadata
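
A size budget check could be as small as the sketch below. It assumes binary units (1MB as 1 MiB, 64KB as 64 KiB) and that the total is metadata plus payload; the constant and function names are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

const (
	// Assumption: the budget uses binary units.
	MaxEnvelopeBytes = 1 << 20  // 1MB total
	MaxMetadataBytes = 64 << 10 // 64KB metadata
)

var (
	ErrNegativeLength   = errors.New("negative length")
	ErrMetadataTooLarge = errors.New("metadata exceeds 64KB budget")
	ErrEnvelopeTooLarge = errors.New("envelope exceeds 1MB budget")
)

// CheckSizeBudget checks the metadata first so that an oversized header
// section is rejected without considering the payload.
func CheckSizeBudget(metadataLen, payloadLen int) error {
	if metadataLen < 0 || payloadLen < 0 {
		return ErrNegativeLength
	}
	if metadataLen > MaxMetadataBytes {
		return ErrMetadataTooLarge
	}
	if metadataLen+payloadLen > MaxEnvelopeBytes {
		return ErrEnvelopeTooLarge
	}
	return nil
}

func main() {
	fmt.Println(CheckSizeBudget(2048, 512*1024))  // prints <nil>
	fmt.Println(CheckSizeBudget(70*1024, 0))      // prints metadata exceeds 64KB budget
	fmt.Println(CheckSizeBudget(2048, 1024*1024)) // prints envelope exceeds 1MB budget
}
```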

Why:

  • RFC-031 had fundamental design issues (Any overhead, version field redundancy, missing partition key) that would block real-world usage
  • Go implementation diverged from RFC spec with no migration plan
  • Memo-031 recommendations needed formal acceptance or rejection before writing the .proto file

2026-04-18

Identity Smoke Coverage Expanded Across OIDC, SAML, and SCIM

Summary: Expanded the identity integration harness and proxy tests so the current branch now verifies Dex OIDC, Zitadel OIDC/JWKS, SAML-through-proxy resolution, and direct Zitadel SCIM behavior, while also documenting the remaining highest-value gaps.

Key Changes:

  • Added a SAML-through-proxy smoke path to tests/e2e/identity-integration/run-smoke-test.sh
  • Added focused OIDC JWKS failure-path coverage in prism-proxy/src/federation/oidc.rs
  • Added SCIM update mismatch coverage to reject divergent path/body externalId values
  • Recorded the current identity coverage status and remaining follow-up items in docs-cms/plans/infrastructure.md

Why:

  • The rebased proxy branch needed coverage on the highest-risk auth and provisioning paths, not just happy-path OIDC and direct SCIM
  • Later sessions need a stable written record of what is already covered versus what still needs deeper end-to-end validation

Identity Coverage Follow-Up Tests Added for OIDC Rotation and SCIM Patch Rejections

Summary: Added targeted Rust coverage for OIDC key refresh after JWKS rotation and for SCIM patch negative paths that were previously accepted or ignored too loosely.

Key Changes:

  • Added rotating JWKS test coverage in prism-proxy/src/federation/oidc.rs
  • Rejected unsupported SCIM replace patch operations in prism-proxy/src/scim/server.rs
  • Rejected malformed SCIM members[value eq ...] remove filters and covered them in tests
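
The strict handling of members[value eq ...] remove filters can be illustrated as follows. The real handler is Rust code in prism-proxy/src/scim/server.rs; this Go sketch only shows the fail-loudly idea, and the accepted grammar (a single double-quoted value with no nested quotes or brackets) and the ParseMemberRemovePath name are assumptions.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
)

// memberFilterRE accepts exactly: members[value eq "<id>"]
var memberFilterRE = regexp.MustCompile(`^members\[value eq "([^"\]]+)"\]$`)

// ParseMemberRemovePath returns the member ID from a SCIM PATCH remove
// path, or an error for anything that does not match the strict form.
func ParseMemberRemovePath(path string) (string, error) {
	m := memberFilterRE.FindStringSubmatch(path)
	if m == nil {
		return "", errors.New("unsupported or malformed members filter")
	}
	return m[1], nil
}

func main() {
	for _, p := range []string{
		`members[value eq "u-123"]`, // accepted
		`members[value eq u-123]`,   // rejected: value is not quoted
		`members[value eq "u-123"`,  // rejected: missing closing bracket
	} {
		id, err := ParseMemberRemovePath(p)
		fmt.Printf("%q -> %q, %v\n", p, id, err)
	}
}
```

Returning an error for every non-matching path is what keeps bad input from being silently ignored.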

Why:

  • The proxy now has explicit regression coverage for a cached-key refresh path instead of only failure-path JWKS fetch checks
  • SCIM patch handling should fail loudly on unsupported or malformed operations rather than silently ignoring bad input

Real Zitadel SAMLResponse Flow Added to Identity Harness

Summary: Replaced the synthetic JSON-only SAML smoke path with a real Zitadel SP-initiated SAML flow that produces signed SAMLResponse payloads, feeds them through the proxy, and pins the IdP response certificate as trust material.

Key Changes:

  • Added XML and base64 SAMLResponse parsing in prism-proxy/src/federation/saml.rs
  • Added SAML provider config for explicit recipient and trusted certificate material
  • Bootstrapped a real Zitadel SAML app plus SAML request/response generation in tests/e2e/identity-integration/bootstrap-zitadel.sh
  • Updated the identity smoke runner to validate the proxy against real Zitadel SAMLResponse values instead of synthetic assertions

Why:

  • The previous SAML smoke only proved the proxy could accept a synthetic assertion shape
  • This branch now verifies the proxy can ingest an actual Zitadel-generated, signed SAML response through the end-to-end containerized harness

XML DSig Verification Enabled for the Real SAML Path

Summary: Upgraded the proxy and identity harness toolchain to Rust 1.95 so that xml-sec can perform cryptographic XML DSig validation of the signed SAML fragment processed by the real Zitadel flow.

Key Changes:

  • Added rust-toolchain.toml to pin the repo to Rust 1.95.0
  • Updated tests/e2e/identity-integration/Dockerfile.proxy to rust:1.95-alpine
  • Added xml-sec, x509-parser, and pem-rfc7468 to prism-proxy
  • Added SAML DSig verification and signed XML tamper coverage in prism-proxy/src/federation/saml.rs

Why:

  • The previous real SAML path still relied on trust-material checks after XML parsing
  • This branch now cryptographically validates the signed SAML fragment on the live Zitadel flow, which closes the largest remaining gap in the proxy-side SAML path

2026-04-13

RFC-063 Reframed Around Proxy and Namespace Contracts

Summary: Rewrote docs-cms/rfcs/rfc-063-authn-authz-gateway-simplification.md after adversarial review. The RFC now records the architectural and security objections to the original draft, narrows its thesis to Prism's normative proxy auth contract and namespace security model, and explicitly defers federation, SCIM provisioning, and integrated-service expansion into follow-on RFCs.

Key Changes:

  • Replaced the broad "gateway simplification and service vendoring" thesis with a foundation RFC anchored in MISSION.md, ADR-059, ADR-050, RFC-019, and RFC-062
  • Recorded the major critiques directly in the RFC: transparent-proxy drift, contradictory backend trust rules, SSRF risk from arbitrary vendoring, unsafe email-based identity linking, and unscoped SCIM mutations
  • Defined a clearer proxy-to-backend contract that separates advisory headers from proof-bearing artifacts
  • Introduced a more disciplined expansion path for non-pattern integrations such as digital twins and API adapters
  • Proposed splitting follow-up work into dedicated RFCs for federation, provisioning, and integrated services

Why:

  • Prism needs stronger normative auth behavior before it expands its integration surface
  • The original draft mixed foundational proxy security with optional future platform capabilities
  • A namespace-first contract gives future backend and adapter designs a stable security basis

Follow-On RFCs Added for Federation, Provisioning, and Integrated Services

Summary: Added three follow-on RFC drafts to expand the RFC-063 foundation without reintroducing architectural ambiguity.

New RFCs:

  • docs-cms/rfcs/rfc-064-federation-profile-namespace-aware-identity.md
  • docs-cms/rfcs/rfc-065-scim-provisioning-namespace-directory-bindings.md
  • docs-cms/rfcs/rfc-066-integrated-services-namespace-bound-adapters.md

Scope Split:

  • RFC-064 defines OIDC/SAML federation, issuer-scoped identities, namespace-aware provider selection, and explicit identity-linking rules
  • RFC-065 defines provider-scoped SCIM ingestion, namespace binding rules, and safe deprovisioning semantics
  • RFC-066 defines typed integration classes for digital twins, API adapters, and other non-pattern services with stricter ownership and egress controls

Why:

  • Keeps RFC-063 focused on normative proxy and namespace contracts
  • Gives identity, provisioning, and integration work their own reviewable design surfaces
  • Creates a safer path to support digital twins and external API integrations within Prism's mission

Second-Pass Adversarial Review Corrections

Summary: Tightened the new RFC set after a second adversarial review to close the remaining ambiguity around proxy transparency, SCIM identity keys, multi-provider authorization behavior, and edge-integrated exceptions.

Corrections:

  • RFC-063: clarified that authorization must happen from approved metadata or session and stream boundaries without implicitly requiring payload-aware proxy mediation
  • RFC-064: clarified that provider fallback is not identity equivalence and that multi-provider namespaces must either bind subjects independently or require explicit linking
  • RFC-065: changed SCIM object examples to provider-scoped SCIM resource IDs and made canonical login-subject bindings explicit rather than implicit
  • RFC-066: turned edge-integrated into a tightly gated exception requiring a dedicated ADR or RFC plus explicit review

RFC-063 Grounded in Existing Code with Preparatory Work Plan

Summary: Rewrote RFC-063 to catalog the current implementation (prism-proxy/src/main.rs, pkg/plugin/auth_context.go, pkg/authz/service_identity.go), identify specific security gaps (header spoofing, no backend token, unauthenticated write passthrough, no shared proxy state), and define a preparatory work plan of four independently shippable PRs that de-risk the main implementation.

Key Additions:

  • Cataloged all headers currently in use across Rust and Go code (15+ headers across 3 files consolidated to 10)
  • Rationalized header set: renamed x-prism-user-id to x-prism-subject, removed x-prism-user-email, x-prism-scopes, and cloud-specific headers
  • Added proxy backplane design using Prism's own KeyValue pattern against a __prism_system namespace for signing keys, routing tables, and session state
  • Defined four preparatory PRs: (1) header stripping security fix, (2) Rust header constants refactor, (3) backend token and backplane modules as library code, (4) auth contract conformance tests
  • Specified exact files and functions to change in each phase, grounded in line-level code references

Why:

  • The RFC must be implementable without ambiguity about what exists today vs. what is being changed
  • Preparatory work can land in parallel and de-risks the main implementation
  • No production deployments exist, so the header rename and contract changes can be done in place without migration

November 2025

2025-11-19

Pattern Proxy Integration Tests with Proper Layer Isolation ✅

Branch: feature/vault-integration-phase2

Summary: Implemented proper integration tests that validate complete data flow through prism-proxy using pattern-level gRPC APIs. Tests connect to proxy (NOT backends directly), ensuring proper layer isolation and security. All backend services (Kafka, NATS, Redis, PostgreSQL) have no external port mappings - access is exclusively through prism-proxy.

Key Architectural Fix: Previous approach bypassed the proxy and connected tests directly to backends. This violated production architecture where backends are internal-only. New tests properly validate the entire stack: Test Client → Proxy (gRPC) → Pattern Runner → Pattern → Driver → Backend.

Security Model:

  • Backend Isolation: No external ports on Kafka (9092), NATS (4222), Redis (6379), PostgreSQL (5432)
  • Single Entry Point: Only prism-proxy exposed via localhost:50090 (gRPC), localhost:50091 (metrics)
  • Internal Networks: Backends accessible only within Docker network
  • Debugging: Use Prometheus metrics and container exec, not direct backend access

Test Architecture:

Test gRPC Client
↓
prism-proxy:50090 (ONLY exposed port)
↓ routes request
Pattern Runner (internal)
↓ executes pattern
Driver (internal)
↓ connects to backend
Backend (internal, no external ports)

Test Scenarios:

1. KeyValue Round-Trip with Random Data

  • Generate random test data (UUID, timestamp, random int)
  • Store via KeyValue.Store() gRPC call to proxy
  • Retrieve via KeyValue.Retrieve() gRPC call to proxy
  • Validate exact field-by-field match
  • Proves: Data integrity through Client → Proxy → Pattern → Redis → Pattern → Proxy → Client

2. Producer/Consumer Message Flow

  • Generate unique topic and random message
  • Publish via Producer.Publish() with metadata
  • Consume via Consumer.Consume() with timeout
  • Validate message payload and envelope metadata
  • Proves: Message flow through Client → Proxy → Producer → Kafka → Consumer → Proxy → Client

3. Session Context Propagation

  • Publish message with session context (user_id, tenant_id, roles)
  • Session credentials flow through envelope headers
  • Pattern runners can access session for authorization

Test Execution:

# Start complete stack (proxy waits for backends)
docker-compose -f docker-compose.test.yml up -d

# Verify ONLY proxy is accessible
nc -zv localhost 50090 # ✓ Should succeed
nc -zv localhost 4222 # ✗ Should FAIL (NATS internal)
nc -zv localhost 6379 # ✗ Should FAIL (Redis internal)

# Run integration tests
go test -v ./tests/integration/patterns -run TestPatternProxyIntegration

Infrastructure Changes:

  • docker-compose.test.yml: Removed all ports: mappings from backends
  • Only prism-proxy and prism-admin have external ports
  • Healthchecks run inside containers, not from host

New Files:

  • tests/integration/patterns/pattern_proxy_integration_test.go (238 lines) - Proxy integration test
  • pkg/patterns/common/envelope.go (185 lines) - Universal envelope format
  • pkg/patterns/common/audit.go (175 lines) - Audit logging interfaces

Updated Files:

  • docker-compose.test.yml - Removed external ports from kafka, nats, redis, postgres, localstack
  • tests/integration/patterns/README.md - Updated with proper architecture diagrams and troubleshooting
  • tests/integration/patterns/namespace_integration_test.go - Marked as DEPRECATED (bypassed proxy)

Benefits:

  • Production-Like Testing: Tests mirror actual deployment architecture
  • Security Validation: Proves backends are not externally accessible
  • Complete Stack Coverage: Validates proxy routing, pattern execution, driver operation
  • No False Positives: Random test data ensures proper validation
  • Observability Focus: Debugging via metrics/logs, not direct backend access

Debugging Approach:

# Check proxy health
docker logs prism-proxy

# Verify backend connectivity (from inside container)
docker exec prism-proxy sh -c "nc -zv prism-kafka 9092"

# View metrics endpoint
curl localhost:50091/metrics

# Execute commands in containers
docker exec prism-nats nats server info
docker exec prism-redis redis-cli info

Next: E2E testing with local Vault + Dex for authentication through proxy


2025-11-18

Phase 3: Pattern Runner Authentication Integration ✅

Branch: feature/vault-integration-phase2

Summary: Extended authentication support to all three pattern runners (KeyValue, Consumer, Producer). Each pattern runner now supports optional Vault-based dynamic credential management with dual-mode operation (auth enabled/disabled).

Implementation Details:

Consumer Pattern Runner:

  • Added SessionManager and service identity authentication
  • Implemented initializeAuth() method with full Vault integration
  • Updated lifecycle methods (Initialize/Stop) for session management
  • Built successfully: 23MB binary
  • Example configs: consumer-auth-disabled.yaml, consumer-auth-enabled.yaml

Producer Pattern Runner:

  • Refactored to ProducerRunner struct with auth support
  • Created RunnerConfig wrapper for auth configuration
  • Implemented complete auth lifecycle (Initialize/Start/Stop)
  • Built successfully: 28MB binary
  • Example configs: producer-auth-disabled.json, producer-auth-enabled.json

KeyValue Pattern Runner (completed earlier):

  • Per-request authentication with SessionMiddleware
  • gRPC interceptor for JWT validation
  • Session-based credential injection
  • Example configs: keyvalue-auth-disabled.yaml, keyvalue-auth-enabled.yaml

New Files:

  • examples/configs/consumer-auth-disabled.yaml - Consumer without auth
  • examples/configs/consumer-auth-enabled.yaml - Consumer with Vault integration
  • examples/configs/producer-auth-disabled.json - Producer without auth
  • examples/configs/producer-auth-enabled.json - Producer with Vault integration
  • docs-cms/memos/memo-091-local-vault-dex-setup.md - Local development setup guide

Modified Files:

  • patterns/consumer/cmd/consumer-runner/main.go - Added auth support (320+ lines)
  • patterns/consumer/cmd/consumer-runner/go.mod - Added authz dependency
  • patterns/producer/cmd/producer-runner/main.go - Complete refactor with auth (330+ lines)
  • patterns/producer/go.mod - Added authz dependency
  • go.work - Added producer pattern to workspace

Architecture Models:

  • KeyValue: Per-request authentication (users send JWT with each request)
  • Consumer/Producer: Service identity authentication (single session for service lifetime)

Authentication Flow:

Service Startup

initializeAuth() → Create TokenValidator + VaultClient + SessionManager

Service obtains JWT (from K8s SA or environment)

JWT validated → Exchange for Vault token

Fetch dynamic credentials (database/creds/role)

Backend connections use dynamic credentials

Background renewal every lease_duration/2

On shutdown: Revoke credentials via Vault

Features:

  • ✅ All 3 patterns support Vault authentication
  • ✅ Dual-mode operation (auth enabled/disabled)
  • ✅ Dynamic credential fetching and renewal
  • ✅ Automatic credential revocation on shutdown
  • ✅ Session caching to avoid repeated Vault calls
  • ✅ TLS support for Vault connections
  • ✅ Configurable session TTL and idle timeout
  • ✅ JWT validation with OIDC provider integration
  • ✅ Support for K8s and AWS service identities

Local Development Setup:

  • MEMO-091 provides complete Docker Compose setup
  • Includes Dex (OIDC), Vault (dev mode), Redis, NATS
  • Step-by-step configuration and testing guide
  • Full end-to-end authentication flow examples

Next: E2E testing with local infrastructure, production operator guide (MEMO-092), troubleshooting guide (MEMO-093)


Phase 2: Vault Integration and Session Management ✅

Branch: feature/vault-integration-phase2

Summary: Implemented complete Vault integration for dynamic credential management. The pkg/authz package now provides session lifecycle management with automatic credential renewal, JWT token validation with OIDC providers, and Vault client operations.

Core Components:

Session Management (pkg/authz/session_manager.go):

  • CreateSession: JWT → Vault token → Backend credentials
  • GetSession: Retrieve active session by ID
  • CloseSession: Revoke credentials and clean up
  • CleanupExpiredSessions: Background cleanup task
  • Session caching with TTL and idle timeout support

Vault Client (pkg/authz/vault_client.go, vault_credentials.go, vault_renewal.go):

  • AuthenticateWithJWT: Exchange JWT for Vault token
  • GetBackendCredentials: Fetch dynamic credentials from secret paths
  • RenewCredentials: Background renewal worker
  • RevokeCredentials: Explicit lease revocation
  • TLS support with CA cert validation

Token Validation (pkg/authz/token_validator.go):

  • OIDC provider integration with JWKS fetching
  • JWT signature verification
  • Issuer and audience claim validation
  • Expiry checking with configurable skip flags

Service Identity (pkg/authz/k8s_auth.go, aws_auth.go):

  • Kubernetes service account token authentication
  • AWS IAM role authentication
  • Automatic service identity detection

Session Middleware (pkg/plugin/session_middleware.go):

  • gRPC interceptors (unary and streaming)
  • Automatic JWT extraction from metadata
  • Session creation and caching
  • Credential injection into context
  • 32.8% test coverage

New Files:

  • pkg/authz/session_manager.go (350+ lines) - Core session lifecycle
  • pkg/authz/vault_client.go (280+ lines) - Vault SDK wrapper
  • pkg/authz/vault_credentials.go (120+ lines) - Credential operations
  • pkg/authz/vault_renewal.go (150+ lines) - Background renewal
  • pkg/authz/token_validator.go (180+ lines) - JWT validation
  • pkg/authz/k8s_auth.go (90+ lines) - K8s service identity
  • pkg/authz/aws_auth.go (85+ lines) - AWS service identity
  • pkg/plugin/session_middleware.go (200+ lines) - gRPC interceptors
  • docs-cms/memos/memo-083-phase2-vault-integration-plan.md - Implementation plan
  • docs-cms/memos/memo-089-session-middleware-integration-guide.md - Integration guide

Security Features:

  • ✅ JWT signature verification with JWKS
  • ✅ Dynamic credentials from Vault (1h TTL default)
  • ✅ Automatic credential renewal (every 30min default)
  • ✅ Credential revocation on session end
  • ✅ TLS for Vault connections with CA validation
  • ✅ Per-session credential isolation
  • ✅ Token expiry validation
  • ✅ Issuer and audience claim checks

Integration Tests:

  • pkg/authz/integration_test.go - Full authentication flow
  • Uses Vault testcontainer for realistic testing
  • JWT → Vault token → Dynamic credentials
  • Currently blocked by Go module dependencies

Next: Pattern runner integration (Phase 3), E2E testing with local Vault + Dex


Control Plane RPC Handler (Namespace Management Complete) ✅

Branch: feature/vault-integration-phase2

Summary: Implemented complete namespace creation RPC handler with pattern selection (Layer 3) and backend assignment (Layer 4). Control plane now fully supports unified namespace model.

New Files:

  • cmd/prism-admin/pattern_selection.go (126 lines):

    • Layer 3: Pattern selection logic
    • Validates pattern registry (SessionManager, KeyValue, Producer, etc.)
    • Assigns pattern versions and metadata
    • Marks session-aware patterns
  • cmd/prism-admin/backend_assignment.go (177 lines):

    • Layer 4: Backend assignment logic
    • Maps slots to backend instances (Redis, Postgres, Kafka, NATS, etc.)
    • Generates connection info and Vault credential paths
    • Default local dev endpoints
  • cmd/prism-admin/namespace_handler.go (195 lines):

    • Complete namespace lifecycle (Create, Get, Update, Delete)
    • Unified creation flow: Validate → Pattern Selection → Backend Assignment → Persist
    • Returns NamespaceResponse with assigned patterns and slot bindings
    • Ready for launcher integration

Modified:

  • cmd/prism-admin/control_plane.go:
    • CreateNamespace RPC now uses configv1.NamespaceRequest (not old CreateNamespaceRequest)
    • Returns configv1.NamespaceResponse with full assignments
    • Integrated with NamespaceHandler

Complete Pipeline:

Client: NamespaceRequest (protobuf)

Control Plane: Validate (business rules)

Layer 3: Pattern Selection (choose implementations)

Layer 4: Backend Assignment (assign backend instances to slots)

Storage: Persist namespace configuration

Response: NamespaceResponse (assigned patterns, slot bindings, partition)

Supported Backends:

  • Redis (localhost:6379)
  • PostgreSQL (localhost:5432)
  • Kafka (localhost:9092)
  • NATS (localhost:4222)
  • SQLite (/tmp/prism.db)
  • MemStore (memory://)
  • S3/MinIO (localhost:9000)
  • ClickHouse (localhost:9000)

Features:

  • ✅ Multi-pattern composition
  • ✅ Slot-based backend dependencies
  • ✅ Vault dynamic credential paths (vault://secret/data/prism/...)
  • ✅ Session-aware pattern handling
  • ✅ Partition assignment (consistent hashing)
  • ✅ Pattern version management
  • ✅ Metadata tracking

Next: Integration tests and prism-launcher deployment


Protobuf-First YAML Loader (Phase 2 Complete) ✅

Branch: feature/vault-integration-phase2

Summary: Implemented protobuf-first configuration loading where YAML is an import format only, not the source of truth. Protobuf types are canonical.

New Files:

  • pkg/config/protobuf_loader.go (240 lines):

    • LoadNamespaceProto: YAML file → protobuf NamespaceRequest
    • ParseNamespaceYAML: YAML bytes → protobuf with validation
    • FormatNamespaceYAML: protobuf → YAML (export/preview)
    • Multi-namespace configuration support
  • pkg/config/validation.go (260 lines):

    • Business logic validation (cross-field rules)
    • SessionManager requirements enforcement
    • Session-aware pattern constraints
    • Slot naming conventions ({purpose}-{type})
    • Credential detection (vault:// allowed, static credentials flagged)
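
To make the last two validation rules concrete, here is a small sketch of slot-name checking and static-credential detection. It is not the code in pkg/config/validation.go: the regular expression, the list of sensitive key substrings, and the flat map[string]string config shape are assumptions chosen for illustration.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
	"strings"
)

// slotNameRE encodes one plausible reading of the {purpose}-{type} convention:
// two lowercase alphanumeric words joined by a single hyphen (assumption).
var slotNameRE = regexp.MustCompile(`^[a-z][a-z0-9]*-[a-z][a-z0-9]*$`)

// sensitiveKeys lists substrings that mark a config key as a credential (assumption).
var sensitiveKeys = []string{"password", "secret", "token"}

// ValidSlotName reports whether name follows the assumed {purpose}-{type} shape.
func ValidSlotName(name string) bool {
	return slotNameRE.MatchString(name)
}

// isSensitive reports whether a config key looks like it holds a credential.
func isSensitive(key string) bool {
	lower := strings.ToLower(key)
	for _, s := range sensitiveKeys {
		if strings.Contains(lower, s) {
			return true
		}
	}
	return false
}

// StaticCredentialKeys returns, in sorted order, the credential-like keys whose
// values are not vault:// references and should therefore be flagged.
func StaticCredentialKeys(config map[string]string) []string {
	var flagged []string
	for key, value := range config {
		if !isSensitive(key) || strings.HasPrefix(value, "vault://") {
			continue
		}
		flagged = append(flagged, key)
	}
	sort.Strings(flagged)
	return flagged
}

func main() {
	fmt.Println(ValidSlotName("cache-redis"), ValidSlotName("Cache_Redis"))
	cfg := map[string]string{
		"host":     "localhost:6379",
		"password": "hunter2",                   // static credential: flagged
		"db_token": "vault://hypothetical/path", // hypothetical Vault reference: allowed
	}
	fmt.Println(StaticCredentialKeys(cfg))
}
```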

Test Coverage:

  • pkg/config/protobuf_loader_test.go: 14 test cases, YAML round-trip testing
  • pkg/config/validation_test.go: 24 test cases, validation rules
  • Total: 38 tests, all passing (0.571s)

Key Features:

  • ✅ YAML → JSON → protobuf conversion pipeline
  • ✅ Validates all business rules (SessionManager, session-aware, slots)
  • ✅ Round-trip support (YAML ↔ protobuf)
  • ✅ Multi-namespace config files
  • ✅ Static credential detection in session-aware patterns
  • ✅ Vault dynamic credential support (vault:// paths)

Benefits:

  • Single source of truth: Protobuf (ADR-003)
  • YAML as human-friendly import/export format only
  • Type-safe construction via protobuf
  • Multi-language support (Go, Python, Rust, Java)

Next: Phase 3 - Implement CreateNamespace RPC handler in control plane


Protobuf Import Cycle Resolution 🔧

Branch: feature/vault-integration-phase2

Summary: Fixed an import cycle between the prism and prism.config.v1 packages by moving shared field options to a new prism.common package.

Problem: Import cycle prevented compilation:

  • prism package (control_plane.proto) imported prism.config.v1 package
  • prism.config.v1 imported prism package for field options (prism.validation, prism.required, etc.)
  • Go compiler: "import cycle not allowed"

Solution: Created prism.common package for shared annotations:

  • Moved proto/prism/options.proto → proto/prism/common/options.proto
  • Changed package from prism to prism.common
  • Updated all field option references: prism.* → prism.common.*
  • Fixed go_package paths: proto/gen → pkg/plugin/gen

Files Modified:

  • proto/prism/common/options.proto (new location)
  • All proto files updated with correct imports and option references
  • examples/client/namespace_client.go (import paths)

Result:

  • ✅ All protobuf code compiles without errors
  • ✅ No import cycles
  • ✅ Example client compiles successfully
  • ✅ Generated code in pkg/plugin/gen/prism/ and pkg/plugin/gen/prism/config/v1/

Unified Namespace Model with Protobuf 🎯

Branch: feature/vault-integration-phase2

Summary: Established the UNIFIED namespace model where ALL namespaces natively support multi-pattern composition, slot-based backends, and authentication. No "simple" vs "composable" split - just ONE namespace type that does it right.

New Documents:

  • MEMO-092: Unified Namespace Model with Protobuf:
    • Protobuf as canonical format (ADR-003, ADR-002)
    • ONE NamespaceRequest type (no backward compat needed - Prism is new)
    • Patterns, slots, auth built into core namespace model
    • YAML/JSON as export/preview format only
    • Client SDK examples (Go, Python, Rust)

Enhanced Proto Files:

  • proto/prism/config/v1/namespace_request.proto - Enhanced with patterns, slots, auth (550 lines)
    • Pattern message for multi-pattern composition
    • Slot message for backend dependencies
    • AuthConfig, VaultConfig, JWTConfig for authentication
    • DeploymentPreferences for topology hints
    • NamespaceResponse with assigned patterns and slot bindings
  • proto/prism/control_plane.proto - Simplified CreateNamespace RPC

Client Example:

  • examples/client/namespace_client.go - Unified namespace construction

Key Improvements:

  • Removed unnecessary complexity (no "composable" prefix)
  • ONE namespace type that supports everything
  • Type-safe construction with protobuf
  • Multi-language client SDKs (Go, Python, Rust, Java)
  • No schema drift (protobuf is canonical)

Updated Documents:

  • MEMO-083 - Added Phase 2 completion status (85% complete, code done)
  • MEMO-089 - Fixed broken ADR link

2025-11-17

Phase 2 Vault Integration Plan (MEMO-083) 🔐

Branch: feature/vault-integration-phase2

Summary: Detailed 6-week implementation plan for Phase 2 of Prism's authentication system. Builds on Phase 1 (JWT auth, completed) to add HashiCorp Vault integration for per-session, dynamic backend credentials.

New Document:

  • MEMO-083: Phase 2 Vault Integration Implementation Plan:
    • Week-by-week implementation breakdown (6 weeks total)
    • Core Vault infrastructure (JWT → Vault token exchange)
    • Credential lifecycle (fetch, renew, revoke)
    • Session management with per-user isolation
    • Service identity authentication (K8s SA, AWS IAM, Azure MI, GCP SA)
    • Integration testing strategy with Vault testcontainer
    • Vault operator setup guide (auth methods, secrets engines, policies)

Key Benefits:

  • Per-session credential isolation (each user gets unique backend username)
  • Automatic credential rotation (hourly)
  • Zero shared credentials (breach of one session doesn't compromise others)
  • Per-user audit trails in backend logs (e.g., v-jwt-alice-abc123)

Current State:

  • Phase 1 Complete ✅: JWT authentication, namespace authorization, auth context pass-through
  • Phase 2 Not Implemented ❌: Vault integration is design-only (MEMO-008), zero code exists

Netflix Video Reference: Database Migrations at Scale

Summary: Added AWS re:Invent 2023 video reference featuring Netflix engineers discussing strategies for safely migrating databases that handle millions of requests per second.

New Document:

Updates:

  • Netflix index page now includes direct links to all three video references

2025-11-16

  • Namespace Configuration - Updated Foundations page with namespace references (RFC-056, RFC-047)
  • Pattern Proxy Integration Tests - Proper layer isolation with backend security
  • Phase 3 Authentication - All pattern runners support Vault authentication

Week of Nov 16-18

  • Phase 2 Vault Integration - Session management, JWT validation, dynamic credentials
  • Unified Namespace Model - Protobuf-first configuration with multi-pattern support (MEMO-092)
  • Control Plane RPC - Namespace creation with pattern selection and backend assignment

Week of Nov 15-17

  • Vault Integration Plan - 6-week implementation roadmap (MEMO-083)
  • Documentation Feedback - Added user engagement mechanism
  • Netflix Video References - Database migrations at scale

Full November 2025 changelog

October 2025

Week of Oct 13-15

  • Admin UI MVP - Real-time KPI monitoring with templ + htmx + Gin (MEMO-035)
  • Mailbox Pattern - Searchable event store with SQLite indexing (RFC-037)
  • Control Plane Architecture - ADRs 054-057 defining admin, proxy, and launcher protocols
  • Pattern Launcher Complete - Full RFC-035 implementation with developer ergonomics

Week of Oct 11-12

  • Claim Check Pattern - Large payload handling with object storage (RFC-033)
  • Message Envelope Protocol - Pub/Sub envelope with encryption support (RFC-031)
  • CI Migration - Pattern-based acceptance testing with GitHub Actions summary reporter
  • Schema Registry - Minimal local testing registry (RFC-032)

Week of Oct 7-9

  • Parallel Linting - 54-90x faster linting with comprehensive Python tooling
  • OIDC Integration Tests - prismctl authentication infrastructure (Phases 1-3)
  • Pattern SDK - Core SDK architecture and build system (RFC-022, RFC-025)
  • Authorization Layer - Token validation with Vault integration (RFC-019)
  • Topaz Integration - Policy-based authorization (ADR-050)

Full October 2025 changelog

Archive

Monthly archives with detailed entries:

  • November 2025 - Vault Integration, Unified Namespaces, Pattern Authentication
  • October 2025 - Admin UI, Mailbox Pattern, Control Plane, Pattern SDK