MEMO-082: Configuration Schema and Code Generation System
Overview
This memo describes a configuration schema system that prevents drift between documentation, schemas, and actual configuration files. The system uses JSON Schema as the source of truth for all configuration formats, with automated generation of documentation, validation, and examples.
Problem Statement
Currently, Prism has configuration drift across multiple layers:
- Documentation drift: RFCs describe ideal config, but examples don't match
- Schema drift: YAML configs use fields not defined in protobuf schemas
- Validation drift: Validation logic scattered across Go/Rust/Python code
- Example drift: Example configs in README don't match actual working configs
Concrete Examples of Drift
From patterns/multicast_registry/examples/redis-nats.yaml:
pattern: multicast-registry  # Not in any proto definition
slots:                       # Not in NamespaceConfig proto
  registry:
    backend: redis
From proto/prism/control_plane.proto:
message NamespaceConfig {
  map<string, BackendConfig> backends = 1;  // Different structure!
  map<string, PatternConfig> patterns = 2;
}
From RFC-056 (documentation):
needs:
  durability: strong  # Structured in RFC-056
  write_rps: 5000
From current code (actual implementation):
message NamespaceConfig {
  map<string, string> metadata = 4;  // Everything dumped in metadata!
}
Solution: Schema-Driven Configuration
Principles
- JSON Schema is the source of truth for all configuration formats
- Documentation is generated from schemas (never hand-written)
- Validation is centralized using JSON Schema validators
- Examples are validated against schemas in CI
- Protobuf is generated from JSON Schema for gRPC APIs
Architecture
┌─────────────────────────────────────────────────────────────┐
│                       Source of Truth                       │
│                schemas/config/*.schema.json                 │
│          (JSON Schema with custom x-* extensions)           │
└─────────────────────┬───────────────────────────────────────┘
                      │
          ┌───────────┼───────────┐
          │           │           │
          ▼           ▼           ▼
     ┌─────────┐ ┌─────────┐ ┌──────────┐
     │ Validate│ │Generate │ │ Generate │
     │  YAMLs  │ │  Docs   │ │ Examples │
     └─────────┘ └─────────┘ └──────────┘
          │           │           │
          ▼           ▼           ▼
     ┌─────────┐ ┌─────────┐ ┌──────────┐
     │ CI Check│ │ Markdown│ │   YAML   │
     │  (fail) │ │ in docs │ │ in repo  │
     └─────────┘ └─────────┘ └──────────┘
Implementation
Phase 1: Schema Definition (Completed)
Created JSON Schema files with rich metadata:
schemas/config/namespace-request.schema.json:
- Defines Layer 1 (User Request) from RFC-056
- Includes validation rules, permission level constraints, examples
- Custom x-* extensions for:
  - x-rfc: RFC/ADR references
  - x-permission-level: Permission constraints per field
  - x-triggers-pattern: Which patterns are triggered
  - x-backend-preference: Backend selection hints
  - x-validation-error: Custom error messages
  - x-quota: Which quota this counts against
Example schema snippet:
{
  "properties": {
    "needs": {
      "properties": {
        "durability": {
          "enum": ["strong", "eventual", "best-effort"],
          "x-enum-descriptions": {
            "strong": "Sync disk writes before ack (no data loss)",
            "eventual": "Async disk writes (minimal data loss window)",
            "best-effort": "In-memory only (fastest)"
          },
          "x-permission-level": {
            "guided": ["strong", "eventual", "best-effort"],
            "advanced": ["strong", "eventual", "best-effort"],
            "expert": ["strong", "eventual", "best-effort"]
          },
          "x-rfc": "RFC-056"
        }
      }
    }
  }
}
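As an illustration of how a tool might consume the x-permission-level extension, here is a minimal sketch in Python. The function names (allowed_values, is_allowed) and the fallback to the full enum when a level is not listed are assumptions made for this example; the memo does not specify how config_schema_tools.py interprets the extension.

```python
from typing import Any

# Trimmed copy of the "durability" field from the schema snippet above.
DURABILITY_SCHEMA: dict[str, Any] = {
    "enum": ["strong", "eventual", "best-effort"],
    "x-permission-level": {
        "guided": ["strong", "eventual", "best-effort"],
        "advanced": ["strong", "eventual", "best-effort"],
        "expert": ["strong", "eventual", "best-effort"],
    },
}


def allowed_values(field_schema: dict[str, Any], level: str) -> list[str]:
    """Return the enum values that a permission level may use.

    Assumption of this sketch: if the level is not listed under
    x-permission-level, fall back to the field's full enum.
    """
    per_level = field_schema.get("x-permission-level", {})
    return list(per_level.get(level, field_schema.get("enum", [])))


def is_allowed(field_schema: dict[str, Any], value: str, level: str) -> bool:
    """Check one value against the permission level's allowed set."""
    return value in allowed_values(field_schema, level)


print(is_allowed(DURABILITY_SCHEMA, "strong", "guided"))
print(is_allowed(DURABILITY_SCHEMA, "unknown", "expert"))
```

In the snippet above every level permits every value, so the check only rejects values outside the enum; a stricter schema would list fewer values under "guided".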
Phase 2: Validation and Documentation Tools (Completed)
tooling/config_schema_tools.py provides:
- Validation: uv run tooling/config_schema_tools.py validate <config.yaml>
  - Validates YAML against JSON Schema
  - Returns detailed error messages with field paths
  - Exit code 0 if valid, 1 if invalid
- Documentation Generation: uv run tooling/config_schema_tools.py docs <schema.json>
  - Generates Markdown documentation from schema
  - Includes all validation rules, permission levels, examples
  - Documents enum values with descriptions
  - References RFCs/ADRs automatically
- Example Generation: uv run tooling/config_schema_tools.py example <schema.json>
  - Generates valid example YAML from schema
  - Uses schema examples where provided
  - Generates valid values based on validation rules
- Drift Detection: uv run tooling/config_schema_tools.py drift <config-dir>
  - Finds configs that don't match schema
  - Detects deprecated fields (used but not in schema)
  - Detects unused fields (in schema but never used)
  - Provides summary report
Phase 3: CI Integration (To-Do)
Add to .github/workflows/ci.yml:
- name: Validate Configuration Schemas
  run: |
    # Validate all example configs against schemas
    uv run tooling/config_schema_tools.py validate patterns/*/examples/*.yaml

    # Check for drift
    uv run tooling/config_schema_tools.py drift patterns/

    # Ensure documentation is up-to-date
    uv run tooling/config_schema_tools.py docs schemas/config/namespace-request.schema.json \
      --output docs-cms/config/namespace-request.md

    # Fail if docs have changed (means they weren't regenerated)
    git diff --exit-code docs-cms/config/
Phase 4: Schema Coverage (To-Do)
Create schemas for all configuration types:
- ✅ namespace-request.schema.json (Layer 1: User Request)
- ⬜ platform-policy.schema.json (Layer 2: Team Quotas & Permissions)
- ⬜ pattern-selection.schema.json (Layer 3: Pattern Selection Output)
- ⬜ backend-registry.schema.json (Layer 4: Backend Definitions)
- ⬜ frontend-registry.schema.json (Layer 5: API Bindings)
- ⬜ runtime-config.schema.json (Layer 6: Runtime Process Config)
Phase 5: Protobuf Generation (To-Do)
Generate protobuf definitions from JSON Schema:
tooling/json_schema_to_proto.py:
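from pathlib import Path


def generate_proto_from_schema(schema_path: Path) -> str:
    """Convert JSON Schema to .proto definition."""
    # Planned steps (Phase 5 is still to-do):
    # 1. Read schema
    # 2. Generate message types from object schemas
    # 3. Generate enums from enum schemas
    # 4. Add field options from x-* extensions
    # 5. Write .proto file
    raise NotImplementedError("Phase 5 (protobuf generation) is not implemented yet")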
Usage:
uv run tooling/json_schema_to_proto.py \
schemas/config/namespace-request.schema.json \
--output proto/prism/config/v1/namespace_request.proto
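To make the planned conversion more concrete, the following is a rough sketch of how object schemas could map to proto3 messages and enum schemas to proto3 enums. Everything here is an assumption for illustration: the function name message_from_schema, the type mapping in SCALARS, the UNSPECIFIED zero value, and numbering fields by property order are not taken from json_schema_to_proto.py, which is not yet written. It also ignores x-* extensions.

```python
import re
from typing import Any

# Assumed mapping from JSON Schema scalar types to proto3 scalar types.
SCALARS = {"string": "string", "integer": "int64", "number": "double", "boolean": "bool"}


def _camel(name: str) -> str:
    """write_rps -> WriteRps (used for nested message and enum type names)."""
    return "".join(part.capitalize() for part in re.split(r"[_\-]", name) if part)


def _const(name: str) -> str:
    """best-effort -> BEST_EFFORT (used for enum value names)."""
    return re.sub(r"[^A-Za-z0-9]", "_", name).upper()


def message_from_schema(name: str, schema: dict[str, Any], indent: str = "") -> list[str]:
    """Render an object schema as the lines of a proto3 message."""
    lines = [f"{indent}message {name} {{"]
    inner = indent + "  "
    for number, (field, sub) in enumerate(schema.get("properties", {}).items(), start=1):
        type_name = _camel(field)
        if "enum" in sub:
            # Enum schema -> nested proto enum with a zero "unspecified" value.
            lines.append(f"{inner}enum {type_name} {{")
            lines.append(f"{inner}  {_const(field)}_UNSPECIFIED = 0;")
            for i, value in enumerate(sub["enum"], start=1):
                lines.append(f"{inner}  {_const(field)}_{_const(value)} = {i};")
            lines.append(f"{inner}}}")
        elif sub.get("type") == "object" or "properties" in sub:
            # Object schema -> nested message.
            lines.extend(message_from_schema(type_name, sub, inner))
        else:
            type_name = SCALARS.get(sub.get("type", "string"), "string")
        lines.append(f"{inner}{type_name} {field} = {number};")
    lines.append(f"{indent}}}")
    return lines


SCHEMA = {
    "properties": {
        "needs": {
            "properties": {
                "durability": {"enum": ["strong", "eventual", "best-effort"]},
                "write_rps": {"type": "integer"},
            }
        }
    }
}

proto_text = "\n".join(message_from_schema("NamespaceRequest", SCHEMA))
print(proto_text)
```

One caveat with this approach: deriving field numbers from property order means that reordering properties in the schema would change the wire format, so a real generator would likely need stable field numbers stored in the schema (for example in an x-* extension).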
Usage Examples
Validating a Config File
# Validate a namespace request config
uv run tooling/config_schema_tools.py validate \
patterns/multicast_registry/examples/redis-nats.yaml \
--schema schemas/config/namespace-request.schema.json
# Output:
# ❌ patterns/multicast_registry/examples/redis-nats.yaml has validation errors:
# - needs.durability: Required property missing
# - slots: Additional property not allowed (not in schema)
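The error format above (a dotted field path followed by a message) can be sketched with a small recursive validator. This is an illustrative subset written with the standard library only; it handles just required, properties, enum, and additionalProperties: false, and it is not the implementation used by config_schema_tools.py. The SCHEMA and CONFIG values are made up for the example.

```python
from typing import Any


def validate(instance: Any, schema: dict[str, Any], path: str = "") -> list[str]:
    """Validate against a small JSON Schema subset; return path-prefixed errors."""
    errors: list[str] = []

    if "enum" in schema and instance not in schema["enum"]:
        errors.append(f"{path or '<root>'}: {instance!r} is not one of {schema['enum']}")

    if isinstance(instance, dict):
        props = schema.get("properties", {})
        # Report required properties that are absent.
        for name in schema.get("required", []):
            if name not in instance:
                child = f"{path}.{name}" if path else name
                errors.append(f"{child}: Required property missing")
        # Recurse into known properties; flag unknown ones if the schema is closed.
        for name, value in instance.items():
            child = f"{path}.{name}" if path else name
            if name in props:
                errors.extend(validate(value, props[name], child))
            elif schema.get("additionalProperties") is False:
                errors.append(f"{child}: Additional property not allowed")
    return errors


SCHEMA = {
    "required": ["needs"],
    "additionalProperties": False,
    "properties": {
        "needs": {
            "required": ["durability"],
            "properties": {"durability": {"enum": ["strong", "eventual", "best-effort"]}},
        }
    },
}

# Hypothetical parsed YAML: "durability" is missing and "slots" is not in the schema.
CONFIG = {"needs": {"write_rps": 5000}, "slots": {"registry": {"backend": "redis"}}}

errors = validate(CONFIG, SCHEMA)
for error in errors:
    print(f"- {error}")
```

A production tool would more likely delegate to a full JSON Schema validator; the point of the sketch is only how field paths are threaded through the recursion.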
Generating Documentation
# Generate Markdown documentation
uv run tooling/config_schema_tools.py docs \
schemas/config/namespace-request.schema.json \
--output docs-cms/config/namespace-request.md
# Output: docs-cms/config/namespace-request.md with:
# - Field descriptions
# - Validation rules
# - Permission level constraints
# - Examples
# - RFC references
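As a rough idea of how Markdown could be derived from a schema, the sketch below walks the properties tree and emits a heading per field, the enum values with their x-enum-descriptions, and the x-rfc reference. The function name render_markdown and the output layout are assumptions for illustration; the actual format produced by the docs command is not described in this memo.

```python
from typing import Any


def render_markdown(name: str, schema: dict[str, Any], depth: int = 2) -> list[str]:
    """Render one schema node and its children as Markdown lines."""
    lines = [f"{'#' * depth} {name}"]
    if "description" in schema:
        lines += ["", schema["description"]]
    if "enum" in schema:
        descriptions = schema.get("x-enum-descriptions", {})
        lines.append("")
        for value in schema["enum"]:
            detail = descriptions.get(value)
            lines.append(f"- `{value}`: {detail}" if detail else f"- `{value}`")
    if "x-rfc" in schema:
        lines += ["", f"Reference: {schema['x-rfc']}"]
    # Child fields get a deeper heading and a dotted path as their title.
    for child, child_schema in schema.get("properties", {}).items():
        lines += [""] + render_markdown(f"{name}.{child}", child_schema, depth + 1)
    return lines


SCHEMA = {
    "properties": {
        "durability": {
            "enum": ["strong", "eventual", "best-effort"],
            "x-enum-descriptions": {
                "strong": "Sync disk writes before ack (no data loss)",
                "eventual": "Async disk writes (minimal data loss window)",
                "best-effort": "In-memory only (fastest)",
            },
            "x-rfc": "RFC-056",
        }
    }
}

markdown = "\n".join(render_markdown("needs", SCHEMA))
print(markdown)
```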
Checking for Drift
# Check all pattern configs for drift
uv run tooling/config_schema_tools.py drift patterns/
# Output:
# 📊 Configuration Drift Report
#
# Total configs: 15
# Invalid configs: 5
# Deprecated fields: 8
# Unused fields: 3
#
# ❌ Invalid Configs:
# patterns/multicast_registry/examples/redis-nats.yaml:
# - needs.durability: Required property missing
# - slots: Additional property not allowed
#
# ⚠️ Deprecated Fields (used but not in schema):
# - slots
# - config.pattern_name
# - behavior.max_identities
#
# 💤 Unused Fields (in schema but never used):
# - policies.encryption.key_rotation
# - needs.partition_count
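The two field categories in the report can be expressed as set differences between the field paths declared in the schema and the field paths that appear in configs. The sketch below is an assumption about how such a comparison might work, not the tool's actual logic; the helper names and the choice to report only the top-most unknown field (so "slots" rather than "slots.registry.backend") are illustrative.

```python
from typing import Any


def schema_paths(schema: dict[str, Any], prefix: str = "") -> set[str]:
    """Collect dotted paths of every property declared in the schema."""
    paths: set[str] = set()
    for name, sub in schema.get("properties", {}).items():
        path = f"{prefix}.{name}" if prefix else name
        paths.add(path)
        paths |= schema_paths(sub, path)
    return paths


def config_paths(config: Any, prefix: str = "") -> set[str]:
    """Collect dotted paths of every key that appears in a parsed config."""
    paths: set[str] = set()
    if isinstance(config, dict):
        for name, value in config.items():
            path = f"{prefix}.{name}" if prefix else name
            paths.add(path)
            paths |= config_paths(value, path)
    return paths


def drift_report(schema: dict[str, Any], configs: list[Any]) -> dict[str, list[str]]:
    """Compare used fields against declared fields across all configs."""
    known = schema_paths(schema)
    used: set[str] = set()
    for config in configs:
        used |= config_paths(config)
    unknown = used - known
    # Keep only the top-most unknown field; its children add no information.
    deprecated = sorted(p for p in unknown if p.rpartition(".")[0] not in unknown)
    return {"deprecated": deprecated, "unused": sorted(known - used)}


SCHEMA = {
    "properties": {
        "needs": {
            "properties": {
                "durability": {"enum": ["strong", "eventual", "best-effort"]},
                "partition_count": {"type": "integer"},
            }
        }
    }
}
CONFIGS = [{"needs": {"durability": "strong"}, "slots": {"registry": {"backend": "redis"}}}]

report = drift_report(SCHEMA, CONFIGS)
print(report)
```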
Generating Example Configs
# Generate example YAML
uv run tooling/config_schema_tools.py example \
schemas/config/namespace-request.schema.json \
--output examples/namespace-request-example.yaml
# Output: Valid YAML with all required fields filled with example values
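One plausible way to fill required fields is to prefer the schema's own examples, then a default, then the first enum value, and only then a type-based placeholder. The sketch below follows that order; the precedence, the PLACEHOLDERS table, and the function name example_for are assumptions made for illustration rather than the behavior of the example command.

```python
from typing import Any

# Assumed fallback values when the schema offers nothing more specific.
PLACEHOLDERS = {"string": "example", "number": 0.0, "boolean": False}


def example_for(schema: dict[str, Any]) -> Any:
    """Build an example value for a schema node, covering required fields only."""
    if schema.get("examples"):
        return schema["examples"][0]
    if "default" in schema:
        return schema["default"]
    if schema.get("enum"):
        return schema["enum"][0]
    if "properties" in schema:
        required = schema.get("required", [])
        return {
            name: example_for(sub)
            for name, sub in schema["properties"].items()
            if name in required
        }
    if schema.get("type") == "integer":
        # Respect a lower bound so the generated value passes validation.
        return int(schema.get("minimum", 0))
    return PLACEHOLDERS.get(schema.get("type"))


SCHEMA = {
    "type": "object",
    "required": ["needs"],
    "properties": {
        "needs": {
            "type": "object",
            "required": ["durability", "write_rps"],
            "properties": {
                "durability": {"enum": ["strong", "eventual", "best-effort"]},
                "write_rps": {"type": "integer", "minimum": 1, "examples": [5000]},
                "partition_count": {"type": "integer", "minimum": 1},
            },
        },
        "owner": {"type": "string"},
    },
}

example = example_for(SCHEMA)
print(example)
```

The result is a plain dictionary; writing it out as YAML would require a YAML serializer such as PyYAML's yaml.safe_dump, which is left out here to keep the sketch dependency-free.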
Migration Plan
Step 1: Audit Current Configs (Week 1)
- Run drift detection on all existing configs
- Document all deprecated fields
- Create migration guide for each pattern
Step 2: Update Schemas (Week 1-2)
- Update JSON Schema to match actual usage
- Add missing fields from current configs
- Document which fields are deprecated vs new
Step 3: Update Configs (Week 2-3)
- Migrate example configs to new schema
- Validate all configs pass
- Update README/documentation
Step 4: Update Code (Week 3-4)
- Generate new protobuf from schemas
- Update Go/Rust code to use new proto
- Update validation logic to use schema
Step 5: CI Enforcement (Week 4)
- Add schema validation to CI
- Make schema validation required for PR merge
- Add documentation generation to CI
Benefits
Prevents Configuration Drift
Before:
RFC-056 (docs) ➜ Hand-written examples ➜ Protobuf ➜ Go/Rust code
      ↓                    ↓                ↓            ↓
   Drift 1              Drift 2          Drift 3      Drift 4
After:
JSON Schema (source of truth)
├── Validates ➜ YAML configs (CI enforced)
├── Generates ➜ Documentation (always in sync)
├── Generates ➜ Protobuf (single source)
└── Validates ➜ Examples (CI enforced)
Improves Developer Experience
- Faster validation: Immediate feedback on invalid configs
- Better errors: Detailed error messages with field paths
- Always current docs: Documentation generated from schema
- Correct examples: Examples validated in CI
Enables Advanced Features
- Auto-completion: IDE support using JSON Schema
- Web UI: JSON Schema powers form generation
- Migration tools: Automated config version upgrades
- Policy validation: Permission levels enforced at schema level
Alternatives Considered
Alternative 1: Protobuf as Source of Truth
Pros:
- Already using protobuf
- Native gRPC support
Cons:
- Protobuf doesn't support rich validation rules
- No standard for documentation generation
- Custom options not well-supported by tooling
- YAML ↔ Protobuf conversion loses information
Decision: Use JSON Schema, generate protobuf from it
Alternative 2: Manual Documentation
Pros:
- Full control over documentation format
Cons:
- Documentation drifts from implementation
- No automated validation
- Examples become stale
Decision: Generate documentation from schema
Alternative 3: Code as Source of Truth
Pros:
- Code always reflects actual runtime behavior
Cons:
- Code doesn't capture intent (only implementation)
- No single source for validation logic
- Hard to generate documentation from code
Decision: Schema is higher-level than code
Future Work
Schema Registry Service
Create a schema registry service that:
- Stores all configuration schemas
- Provides schema validation API
- Tracks schema evolution history
- Validates backward compatibility
Config Migration Tool
Create a migration tool that:
- Detects config version from schema
- Automatically upgrades configs to latest version
- Validates migration completeness
- Generates migration reports
Web-based Config Editor
Create a web UI that:
- Uses JSON Schema for form generation
- Validates in real-time
- Shows inline documentation
- Generates valid YAML
Related Documentation
- RFC-056: Unified Configuration Model (6-layer model)
- ADR-002: Client-Originated Configuration (permission levels)
- ADR-022: Dynamic Client Configuration (runtime updates)
- RFC-039: Backend Configuration Registry (backend definitions)
- RFC-027: Namespace Configuration Client Perspective (user-facing config)
Implementation Checklist
Phase 1: Schema Definition
- Create namespace-request.schema.json
- Create platform-policy.schema.json
- Create pattern-selection.schema.json
- Create backend-registry.schema.json
- Create frontend-registry.schema.json
- Create runtime-config.schema.json
Phase 2: Tooling
- Build config_schema_tools.py (validation, docs, examples, drift)
- Build json_schema_to_proto.py (protobuf generation)
- Build config_migration_tool.py (version upgrades)
Phase 3: Migration
- Audit all existing configs (drift report)
- Update schemas to match reality
- Migrate example configs
- Update code to use new schemas
Phase 4: CI Integration
- Add schema validation to CI
- Add documentation generation to CI
- Add drift detection to CI
- Make validation required for merge
Phase 5: Documentation
- Generate docs from schemas
- Update READMEs with new config format
- Create migration guide
- Update RFCs with schema references
Success Metrics
- Zero drift: All configs validate against schemas
- 100% coverage: All config types have schemas
- Always current: Documentation auto-generated from schemas
- Fast feedback: CI validates configs in <1 minute
- Developer satisfaction: Positive feedback on DX improvements
Conclusion
The configuration schema system solves drift problems by making JSON Schema the single source of truth. Documentation and examples are generated from schemas, and configs are validated against them, so all three stay in sync.
The system is implemented in tooling/config_schema_tools.py with schemas in schemas/config/. Next steps are CI integration and migrating existing configs to the new format.