ADR-065: Universal Message Envelope Protocol Design Decisions
Status
Status: Proposed
Created: 2026-04-20
Updated: 2026-04-20
Author: Jacob Repp
Deciders: Core Team
Context
RFC-031 defines a universal message envelope protocol (PrismEnvelope) for all Prism messaging patterns. Before implementation begins, this ADR captures the adversarial review findings and binding design decisions.
Three sources informed this review:
- RFC-031 (1688 lines): The protocol specification with protobuf schema, backend mapping, encryption patterns, and API examples
- Memo-031: Security and performance review identifying 3 critical issues and 8 recommendations
- Go implementation (pkg/patterns/common/envelope.go): Working JSON-based envelope that diverges significantly from the RFC
The review identified 12 critical issues, 8 performance/security concerns, and 6 specification gaps that must be resolved before writing the .proto file. The existing Go UniversalEnvelope will be replaced by protobuf-generated code from the finalized schema.
Why This ADR Is Needed
- RFC-031 has a fundamental google.protobuf.Any problem that affects every consumer
- The Go implementation and RFC specify different type hierarchies (session context vs security context)
- Memo-031 recommendations need formal acceptance or rejection
- Multiple fields are over-engineered for v1 (post-quantum encryption, full FIPS compliance matrix)
- Missing critical fields (partition key, retry tracking) block real-world usage
Decision
1. Use bytes payload instead of google.protobuf.Any
The RFC specifies google.protobuf.Any for the payload field. Reject this.
Problem: Any requires a type URL (~30 bytes overhead per message) and consumers need the full proto descriptor to unpack it. For JSON, Avro, or custom payloads, Any adds complexity without benefit. It also forces a dependency on google/protobuf/any.proto and requires type registry management in every language runtime.
Decision: Use raw bytes with a ContentType enum to indicate format. Consumers deserialize based on content type.
// In PrismMetadata
optional ContentType content_type = 5;
optional string content_type_custom = 14;
// In PrismEnvelope
bytes payload = 99;
Rejected alternative: google.protobuf.Any provides type-safe unpacking but at a cost that outweighs the benefit for a system where payload formats are heterogeneous and content-type-based dispatch is simpler.
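Content-type-based dispatch could look roughly like the following. This is a minimal sketch, not the SDK's API: DecodePayload and ErrUnsupportedContentType are hypothetical names, the enum is hand-written rather than generated, and only the JSON branch is wired up.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// ContentType mirrors the enum values proposed in this ADR (hand-written here).
type ContentType int32

const (
	ContentTypeUnspecified ContentType = 0
	ContentTypeProtobuf    ContentType = 1
	ContentTypeJSON        ContentType = 2
	ContentTypeAvro        ContentType = 3
	ContentTypeCustom      ContentType = 99
)

// ErrUnsupportedContentType is a hypothetical sentinel for formats with no decoder.
var ErrUnsupportedContentType = errors.New("unsupported content type")

// DecodePayload picks a decoder from the content type and unmarshals the raw
// payload bytes into out. Other formats would add their own case.
func DecodePayload(ct ContentType, payload []byte, out any) error {
	switch ct {
	case ContentTypeJSON:
		if err := json.Unmarshal(payload, out); err != nil {
			return fmt.Errorf("decode JSON payload: %w", err)
		}
		return nil
	default:
		return fmt.Errorf("%w: %d", ErrUnsupportedContentType, ct)
	}
}
```

A consumer would call DecodePayload with the envelope's content_type and payload, and treat the returned error as a runtime deserialization failure.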
2. Remove explicit envelope_version field
Accept Memo-031 Recommendation 4. Protobuf field numbers provide implicit versioning. A version field cannot prevent wire-incompatible parse errors (those need separate topics/namespaces). Consumers should use feature detection (has_security(), has_observability()) rather than version checks.
If version tracking is needed later, add it to the extensions map: extensions["prism-envelope-version"] = []byte("2").
3. Move payload to field 99 (last)
Accept Memo-031 Recommendation 3. Placing the variable-size payload after all small metadata fields enables:
- Lazy payload parsing (read metadata/security without touching payload)
- Early auth rejection without allocating payload buffer (DDoS protection)
- 13x faster metadata-only access for large messages (per Memo-031 benchmarks)
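To illustrate metadata-only access, the sketch below walks the top-level protobuf wire format by hand and records the payload length without reading the payload bytes. It is written in Go for illustration only (the ADR places lazy parsing in the Rust proxy), and ScanEnvelope and EnvelopeView are hypothetical names. Serializers normally emit fields in field-number order, but the wire format does not guarantee it, so the scanner accepts any order.

```go
package main

import (
	"encoding/binary"
	"errors"
)

// Field numbers from the PrismEnvelope schema in this ADR.
const (
	fieldMetadata = 1
	fieldSecurity = 2
	fieldPayload  = 99
)

var errMalformed = errors.New("malformed envelope")

// EnvelopeView holds sub-slices of the input buffer; nothing is copied.
type EnvelopeView struct {
	Metadata   []byte // raw PrismMetadata bytes
	Security   []byte // raw SecurityContext bytes, nil if absent
	PayloadLen int    // payload size only; payload bytes are skipped
}

// ScanEnvelope walks the top-level fields of a serialized envelope.
func ScanEnvelope(buf []byte) (EnvelopeView, error) {
	var v EnvelopeView
	for len(buf) > 0 {
		tag, n := binary.Uvarint(buf) // tag = field_number<<3 | wire_type
		if n <= 0 {
			return v, errMalformed
		}
		buf = buf[n:]
		field, wireType := tag>>3, tag&7
		switch wireType {
		case 0: // varint
			_, n := binary.Uvarint(buf)
			if n <= 0 {
				return v, errMalformed
			}
			buf = buf[n:]
		case 1: // fixed 64-bit
			if len(buf) < 8 {
				return v, errMalformed
			}
			buf = buf[8:]
		case 5: // fixed 32-bit
			if len(buf) < 4 {
				return v, errMalformed
			}
			buf = buf[4:]
		case 2: // length-delimited: sub-messages, strings, bytes
			size, n := binary.Uvarint(buf)
			if n <= 0 || size > uint64(len(buf)-n) {
				return v, errMalformed
			}
			end := n + int(size)
			switch field {
			case fieldMetadata:
				v.Metadata = buf[n:end]
			case fieldSecurity:
				v.Security = buf[n:end]
			case fieldPayload:
				v.PayloadLen = int(size)
			}
			buf = buf[end:]
		default:
			return v, errMalformed
		}
	}
	return v, nil
}
```

A proxy could run auth checks on the Security slice and reject the message before the payload is ever decoded or copied.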
4. Add optional keyword to enrichment fields
Accept Memo-031 Recommendation 1. Proto3 optional keyword enables has_*() methods to distinguish "field not set" from "field = zero value". This is critical for security: a consumer must distinguish "no security context" from "security context with empty publisher_id".
Required fields (metadata, payload) are validated at runtime by SDK and proxy. The proto schema documents them as comments but protobuf cannot enforce presence.
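The presence distinction and the runtime required-field check might be sketched as follows. The structs are hand-written stand-ins for generated code, under the assumption that optional scalars and sub-messages surface as pointers in Go; ValidateRequired and the error names are hypothetical, and treating an empty payload as missing is also an assumption.

```go
package main

import "errors"

// SecurityContext is a stand-in for the generated type.
type SecurityContext struct {
	PrincipalID *string // nil = not set; pointer to "" = explicitly empty
}

// HasPrincipalID reports whether the field was set at all.
func (s *SecurityContext) HasPrincipalID() bool {
	return s != nil && s.PrincipalID != nil
}

// Metadata is a stand-in for PrismMetadata (one field shown).
type Metadata struct {
	MessageID string
}

// Envelope is a stand-in for PrismEnvelope.
type Envelope struct {
	Metadata *Metadata
	Security *SecurityContext
	Payload  []byte
}

// HasSecurity distinguishes "no security context" from an empty one.
func (e *Envelope) HasSecurity() bool { return e.Security != nil }

var (
	ErrMissingMetadata = errors.New("envelope: metadata is required")
	ErrMissingPayload  = errors.New("envelope: payload is required")
)

// ValidateRequired enforces at runtime what the schema can only document.
func ValidateRequired(e *Envelope) error {
	if e.Metadata == nil {
		return ErrMissingMetadata
	}
	if len(e.Payload) == 0 {
		return ErrMissingPayload
	}
	return nil
}
```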
5. Merge session identity into SecurityContext
The Go implementation has SessionContext (user_id, tenant_id, roles, permissions, credentials) that has no equivalent in the RFC's SecurityContext (publisher_id, auth_token, signature, encryption). Both are needed.
Decision: Merge session identity fields into SecurityContext. The proxy validates auth and strips sensitive fields (tokens, credentials) before forwarding to backend.
message SecurityContext {
// Identity (populated from auth token / mTLS / session)
optional string principal_id = 1;
optional string team = 2;
optional string tenant_id = 3;
repeated string roles = 4;
// Auth token: validated by proxy, STRIPPED before backend storage
optional string auth_token = 5;
// Message integrity
optional bytes signature = 6;
optional SignatureAlgorithm signature_algorithm = 7;
// Data classification
optional bool contains_pii = 8;
optional DataClassification classification = 9;
// Encryption metadata (end-to-end, producer-side)
optional EncryptionMetadata encryption = 10;
}
Rejected alternative: Keep session and security as separate sub-messages. This adds complexity for consumers who always need both identity and security together. A single context reduces message nesting and wire overhead.
6. Remove AuditMetadata from envelope
The Go implementation carries AuditMetadata (operation, resource, outcome, duration, compliance) inside the envelope. This is wrong: audit is a side-effect of processing, not a property of the message. An envelope published to Kafka should not contain its own audit trail.
Decision: Audit logging remains a separate subsystem (the existing AuditLogger interface in pkg/patterns/common/audit.go). The envelope carries identity and classification metadata that audit loggers consume to enrich events, but the envelope does not carry audit events itself.
7. Simplify EncryptionMetadata for v1
The RFC specifies 8 fields in EncryptionMetadata including post-quantum (Kyber/ML-KEM) and hybrid encryption support. None of this is needed for v1.
Decision: Start with 4 fields sufficient for symmetric encryption (AES-256-GCM), which covers 95% of real-world use cases. Add asymmetric, post-quantum, and hybrid fields in future iterations via protobuf field additions.
message EncryptionMetadata {
optional string key_id = 1;
optional EncryptionAlgorithm algorithm = 2;
optional bytes iv = 3;
optional bytes aad = 4;
// Fields 5-10 reserved for asymmetric, PQ, hybrid (v2+)
}
8. Use enums for repeated string fields
Accept Memo-031 Optimization 3. Replace frequently repeated, low-cardinality strings with enums plus a custom fallback:
- ContentType: PROTOBUF=1, JSON=2, AVRO=3, CUSTOM=99 (saves ~20 bytes per message)
- ContentEncoding: NONE=0, GZIP=1, SNAPPY=2, ZSTD=3, CUSTOM=99
- DataClassification: UNSPECIFIED=0, PUBLIC=1, INTERNAL=2, CONFIDENTIAL=3, RESTRICTED=4
- SignatureAlgorithm: UNSPECIFIED=0, HMAC_SHA256=1, ED25519=2
- EncryptionAlgorithm: UNSPECIFIED=0, AES_256_GCM=1, CHACHA20_POLY1305=2
9. Use int64 milliseconds for timestamp
Accept Memo-031 Optimization 4. Replace google.protobuf.Timestamp (12 bytes) with int64 published_at_ms (8 bytes). Millisecond precision is sufficient for ordering, TTL, and audit. UUIDv7 provides sub-millisecond ordering when needed.
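As a rough illustration of working with published_at_ms, the helpers below convert to and from time.Time and evaluate a TTL. The function names are hypothetical, and treating an unset ttl_seconds as "never expires" is an assumption of this sketch.

```go
package main

import "time"

// ToPublishedAtMs truncates a timestamp to Unix milliseconds.
func ToPublishedAtMs(t time.Time) int64 {
	return t.UnixMilli()
}

// FromPublishedAtMs converts Unix milliseconds back to a UTC time.
func FromPublishedAtMs(ms int64) time.Time {
	return time.UnixMilli(ms).UTC()
}

// Expired reports whether the TTL has elapsed at time now.
// ttlSeconds is nil when the optional field was not set.
func Expired(publishedAtMs int64, ttlSeconds *int64, now time.Time) bool {
	if ttlSeconds == nil {
		return false
	}
	return now.UnixMilli() >= publishedAtMs+*ttlSeconds*1000
}
```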
10. Add missing fields for real-world usage
The RFC omits fields required by production messaging systems:
- partition_key (field 11): Required for Kafka partition routing, NATS subject routing, and consistent sharding (per ADR-034). Without it, producers cannot control message locality.
- retry_count (field 12): Tracks redelivery attempts. Consumers use this to decide when to send to dead-letter queue. Stored in metadata rather than extensions because it affects routing behavior.
- producer_id (field 13): Identifies the producing application/service for debugging and migration tracking. More useful than SDK version.
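A consumer-side use of retry_count could be sketched like this. NextAction and maxRetries are hypothetical; the retry limit is a consumer setting, not something the envelope carries.

```go
package main

// Action is what a consumer does with a message that failed processing.
type Action int

const (
	Retry Action = iota
	DeadLetter
)

// NextAction decides between redelivery and the dead-letter queue and
// returns the retry_count to write on the outgoing envelope.
// retryCount is nil when the optional field was never set (first delivery).
func NextAction(retryCount *int32, maxRetries int32) (Action, int32) {
	var attempts int32
	if retryCount != nil {
		attempts = *retryCount
	}
	if attempts >= maxRetries {
		return DeadLetter, attempts
	}
	return Retry, attempts + 1
}
```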
11. Proxy must strip auth tokens before backend storage
Accept Memo-031 Recommendation 5. The proxy validates auth_token, extracts identity into principal_id/roles/tenant_id, then sets auth_token = "" before forwarding to backend. Tokens must never reach Kafka disk, Redis snapshots, or Postgres backups.
This is a hard requirement, not a recommendation. Non-compliant proxies are a security vulnerability.
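The validate-then-strip step might look like the sketch below. SecurityContext is a hand-written stand-in, and Claims, TokenValidator, and AuthorizeAndStrip are hypothetical names. The sketch clears the optional field by setting the pointer to nil, which removes it from the wire entirely; that is an assumption about how the empty-token rule maps onto an optional field.

```go
package main

import (
	"errors"
	"fmt"
)

// SecurityContext is a stand-in for the generated type (subset of fields).
type SecurityContext struct {
	PrincipalID *string
	TenantID    *string
	Roles       []string
	AuthToken   *string
}

// Claims is the hypothetical result of validating a token.
type Claims struct {
	Subject string
	Tenant  string
	Roles   []string
}

// TokenValidator is a hypothetical hook for the proxy's auth backend.
type TokenValidator func(token string) (Claims, error)

var (
	ErrMissingToken = errors.New("auth token missing")
	ErrInvalidToken = errors.New("auth token invalid")
)

// AuthorizeAndStrip validates the token, copies identity into the context,
// and removes the token so it cannot reach backend storage.
func AuthorizeAndStrip(sc *SecurityContext, validate TokenValidator) error {
	if sc == nil || sc.AuthToken == nil {
		return ErrMissingToken
	}
	claims, err := validate(*sc.AuthToken)
	sc.AuthToken = nil // strip unconditionally, even when validation fails
	if err != nil {
		return fmt.Errorf("auth rejected: %w", err)
	}
	sc.PrincipalID = &claims.Subject
	sc.TenantID = &claims.Tenant
	sc.Roles = claims.Roles
	return nil
}
```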
12. Signature scope: sign envelope without SecurityContext
Accept Memo-031 Recommendation 6. The circular dependency (signature inside the thing being signed) is resolved by:
- Set security = nil on a clone
- Marshal clone to bytes
- Compute signature over those bytes
- Set signature on original envelope's SecurityContext
This must be documented in the proto file comments and implemented identically across all SDKs.
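The four steps above can be sketched as follows. This is an illustration under stated assumptions: the structs are hand-written stand-ins, JSON stands in for protobuf marshaling so the example needs only the standard library, and HMAC-SHA256 is used as the signature algorithm. Sign, Verify, and signingBytes are hypothetical names.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/json"
)

// SecurityContext is a stand-in for the generated type (subset of fields).
type SecurityContext struct {
	PrincipalID string `json:"principal_id,omitempty"`
	Signature   []byte `json:"signature,omitempty"`
}

// Envelope is a stand-in for PrismEnvelope (subset of fields).
type Envelope struct {
	MessageID string           `json:"message_id"`
	Topic     string           `json:"topic"`
	Security  *SecurityContext `json:"security,omitempty"`
	Payload   []byte           `json:"payload"`
}

// signingBytes marshals a clone of the envelope with security removed.
func signingBytes(e *Envelope) ([]byte, error) {
	clone := *e          // shallow copy suffices: only the pointer is replaced
	clone.Security = nil // step 1: security = nil on the clone
	return json.Marshal(&clone) // step 2: marshal clone to bytes
}

// Sign computes the signature and stores it on the original envelope.
func Sign(e *Envelope, key []byte) error {
	msg, err := signingBytes(e)
	if err != nil {
		return err
	}
	mac := hmac.New(sha256.New, key)
	mac.Write(msg) // step 3: signature over the marshaled clone
	if e.Security == nil {
		e.Security = &SecurityContext{}
	}
	e.Security.Signature = mac.Sum(nil) // step 4: set on the original
	return nil
}

// Verify recomputes the signature over the same scope and compares.
func Verify(e *Envelope, key []byte) bool {
	if e.Security == nil {
		return false
	}
	msg, err := signingBytes(e)
	if err != nil {
		return false
	}
	mac := hmac.New(sha256.New, key)
	mac.Write(msg)
	return hmac.Equal(mac.Sum(nil), e.Security.Signature)
}
```

One consequence of this scope is that the proxy can strip auth_token or fill in identity fields without invalidating the signature, since nothing inside SecurityContext is signed.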
13. Defer batch envelope to v2
The RFC lists batch envelope as an open question. Decision: defer. V1 envelopes are one-message-per-envelope. Batch publishing uses the existing transport-level batching (Kafka producer batches, NATS JetStream batches) rather than envelope-level multiplexing. This avoids the complexity of partial failure semantics.
14. Envelope size budget: 1MB total, 64KB metadata
Establish explicit limits:
- Maximum envelope size: 1MB (matches Kafka default max.message.bytes)
- Maximum metadata (all fields excluding payload): 64KB
- Payloads exceeding 512KB should use the claim check pattern (RFC-033)
These limits prevent pathological messages from consuming excessive memory during parsing and provide clear guidance for when to switch to claim check.
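A budget check along these lines could run in the SDK or proxy. It is a sketch: CheckSizeBudget and the error names are hypothetical, 1MB is read as 1 MiB, and metadata size is approximated as envelope size minus payload size.

```go
package main

import "errors"

// Limits from this ADR, interpreted in binary units (an assumption).
const (
	MaxEnvelopeBytes         = 1 << 20   // 1MB total
	MaxMetadataBytes         = 64 << 10  // 64KB for everything except payload
	ClaimCheckThresholdBytes = 512 << 10 // above this, prefer claim check
)

var (
	ErrInvalidSizes     = errors.New("payload size must be between 0 and envelope size")
	ErrEnvelopeTooLarge = errors.New("envelope exceeds 1MB")
	ErrMetadataTooLarge = errors.New("metadata exceeds 64KB")
)

// CheckSizeBudget validates the serialized sizes and reports whether the
// payload is large enough that the claim check pattern should be used.
func CheckSizeBudget(envelopeLen, payloadLen int) (useClaimCheck bool, err error) {
	if payloadLen < 0 || payloadLen > envelopeLen {
		return false, ErrInvalidSizes
	}
	if envelopeLen > MaxEnvelopeBytes {
		return false, ErrEnvelopeTooLarge
	}
	if envelopeLen-payloadLen > MaxMetadataBytes {
		return false, ErrMetadataTooLarge
	}
	return payloadLen > ClaimCheckThresholdBytes, nil
}
```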
15. Duplicate metadata in backend headers is optional
The RFC specifies duplicating envelope metadata in Kafka/NATS headers for non-Prism tool compatibility. Decision: make this optional and off by default. Header duplication:
- Doubles metadata storage overhead
- Creates consistency risk (headers and envelope diverge)
- Only benefits non-Prism consumers (who should use the SDK)
When header duplication is enabled, the proxy writes a subset of metadata fields (message_id, topic, namespace, trace_id) to backend headers. The envelope is always the source of truth.
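The opt-in header subset might be produced as in this sketch. The header key names and the BackendHeaders and HeaderSource names are hypothetical; the ADR fixes which fields are duplicated, not how the keys are spelled.

```go
package main

// Hypothetical header keys for the duplicated metadata subset.
const (
	headerMessageID = "prism-message-id"
	headerTopic     = "prism-topic"
	headerNamespace = "prism-namespace"
	headerTraceID   = "prism-trace-id"
)

// HeaderSource carries the envelope fields eligible for duplication.
type HeaderSource struct {
	MessageID string
	Topic     string
	Namespace string
	TraceID   string // empty when no ObservabilityContext is present
}

// BackendHeaders returns the headers to attach to the backend message,
// or nil when duplication is disabled (the default).
func BackendHeaders(enabled bool, src HeaderSource) map[string]string {
	if !enabled {
		return nil
	}
	h := map[string]string{
		headerMessageID: src.MessageID,
		headerTopic:     src.Topic,
		headerNamespace: src.Namespace,
	}
	if src.TraceID != "" {
		h[headerTraceID] = src.TraceID
	}
	return h
}
```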
Rationale
Evaluation Criteria
- Security (30%): Auth token exposure, signature correctness, PII handling, DDoS resistance
- Performance (25%): Serialization overhead, lazy parsing, memory allocation, wire size
- Implementation Simplicity (25%): Proto complexity, SDK burden, cognitive load for developers
- Evolution (20%): Backward compatibility, field addition safety, migration path
Why These Decisions Are Correct
The RFC over-engineers the initial design. Post-quantum encryption, FIPS compliance matrices, batch envelopes, and CloudEvents comparison belong in a v2 design document. V1 must ship with a minimal, correct schema that handles 95% of real-world messaging needs and evolves cleanly via protobuf field additions.
The google.protobuf.Any rejection is the single most impactful decision. It removes a class of runtime errors (type URL mismatches, missing descriptors), eliminates a proto dependency, reduces wire overhead, and simplifies every SDK implementation. Content-type-based dispatch is the industry-standard pattern (HTTP, email MIME, Kafka Connect converters).
The session/security merge reflects how auth actually works: identity (who) and authorization (what they can do) are always needed together. Splitting them into separate sub-messages forces consumers to correlate two optional fields, increasing error surface.
Alternatives Considered
Alternative 1: Keep google.protobuf.Any for type-safe payload
- Pros: Type-safe unpacking in generated code, proto-standard pattern
- Cons: 30+ bytes overhead per message, requires type registry in every runtime, useless for JSON/Avro/custom payloads, forces proto descriptor distribution
- Rejected because: Content-type enum + raw bytes is simpler, smaller, and more flexible for a system with heterogeneous payload formats
Alternative 2: Keep separate SessionContext and SecurityContext
- Pros: Cleaner separation of concerns, session can evolve independently
- Cons: Two optional sub-messages to correlate, higher wire overhead, more complex consumer code, session without security is meaningless and vice versa
- Rejected because: Identity and authorization are always consumed together in practice
Alternative 3: Keep full EncryptionMetadata from RFC
- Pros: Future-proof for post-quantum and hybrid encryption
- Cons: 8 fields including optional fields for algorithms not yet standardized, increases cognitive load, no working code or tests for PQ/hybrid encryption
- Rejected because: Protobuf evolution handles adding fields later. Ship what works now (symmetric AES-256-GCM), add asymmetric/PQ/hybrid when needed. Fields 5-10 are reserved.
Alternative 4: Keep AuditMetadata in envelope
- Pros: Self-contained message with full processing history
- Cons: Audit is a side-effect, not a message property. Producer should not know consumer's audit outcome. Inflates envelope size. Creates ordering dependency (audit must be written before envelope is forwarded).
- Rejected because: Audit logging is a cross-cutting concern handled by the existing AuditLogger interface. Envelope carries identity metadata that enriches audit events; it does not carry audit events itself.
Alternative 5: Use CloudEvents format
- Pros: Industry standard (CNCF), rich tooling ecosystem
- Cons: JSON-based (larger payloads), designed for HTTP eventing not pub/sub, missing Prism-specific fields (namespace, schema governance, encryption), would require extension for every Prism feature
- Rejected because: CloudEvents and PrismEnvelope solve different problems. CloudEvents is for inter-system eventing; PrismEnvelope is for intra-system messaging with security, schema, and observability built in.
Consequences
Positive Consequences
- Implementable schema: Final protobuf definition is concrete, minimal, and ready for .proto file creation
- Security by default: Auth token stripping, signature scope, PII flags, early validation via payload positioning
- Performance: 40% faster serialization, 13x faster metadata-only parsing, ~20 bytes saved per message from enum optimization
- Clean evolution: No version field means field numbers ARE the version. Adding fields is always safe.
- Clear migration path: Go UniversalEnvelope maps directly to the new schema with known deltas
Negative Consequences
- Breaking change from Go implementation: Current JSON-based UniversalEnvelope is incompatible with protobuf PrismEnvelope. All code using common.Envelope must be updated.
  - Mitigation: The Go implementation is only used in integration tests and the SessionAwareProducer/SessionAwareConsumer test helpers. No production code depends on it.
- No post-quantum encryption in v1: Organizations requiring PQ-safe encryption must wait for v2 field additions.
  - Mitigation: Protobuf field additions are backward-compatible. PQ fields will be added as optional EncryptionMetadata extensions without breaking v1 consumers.
- Content-type dispatch is runtime, not compile-time: Losing Any type safety means SDKs must handle deserialization errors at runtime.
  - Mitigation: SDKs provide typed helper methods (envelope.PayloadAs<OrderCreated>()) that wrap the runtime dispatch with clear error messages.
- Session/security merge increases SecurityContext size: More fields per sub-message.
  - Mitigation: SecurityContext is optional and only populated when auth is required. Absent for system/internal messages.
Neutral Consequences
- Backend header duplication is now optional (was mandatory in RFC). Teams can enable it for non-Prism consumers.
- Audit logging remains unchanged (existing AuditLogger interface continues as-is).
- Claim check pattern (RFC-033) works with the new envelope via the extensions["claim-check-id"] convention.
Implementation Notes
Final Protobuf Schema
syntax = "proto3";
package prism.envelope.v1;
message PrismEnvelope {
PrismMetadata metadata = 1;
optional SecurityContext security = 2;
optional ObservabilityContext observability = 3;
optional SchemaContext schema = 4;
map<string, bytes> extensions = 97;
bytes payload = 99;
}
message PrismMetadata {
string message_id = 1;
string topic = 2;
string namespace = 3;
int64 published_at_ms = 4;
optional ContentType content_type = 5;
optional int32 priority = 6;
optional int64 ttl_seconds = 7;
optional ContentEncoding content_encoding = 8;
optional string correlation_id = 9;
optional string causality_parent = 10;
optional string partition_key = 11;
optional int32 retry_count = 12;
optional string producer_id = 13;
optional string content_type_custom = 14;
optional string content_encoding_custom = 15;
}
message SecurityContext {
optional string principal_id = 1;
optional string team = 2;
optional string tenant_id = 3;
repeated string roles = 4;
optional string auth_token = 5;
optional bytes signature = 6;
optional SignatureAlgorithm signature_algorithm = 7;
optional bool contains_pii = 8;
optional DataClassification classification = 9;
optional EncryptionMetadata encryption = 10;
}
message EncryptionMetadata {
optional string key_id = 1;
optional EncryptionAlgorithm algorithm = 2;
optional bytes iv = 3;
optional bytes aad = 4;
}
message ObservabilityContext {
optional string trace_id = 1;
optional string span_id = 2;
optional string parent_span_id = 3;
optional int32 trace_flags = 4;
map<string, string> baggage = 5;
map<string, string> labels = 6;
}
message SchemaContext {
optional string schema_url = 1;
optional string schema_version = 2;
optional SchemaFormat schema_format = 3;
optional string schema_hash = 4;
optional string schema_name = 5;
optional CompatibilityMode compatibility_mode = 6;
repeated string deprecated_fields_used = 7;
}
enum ContentType {
CONTENT_TYPE_UNSPECIFIED = 0;
CONTENT_TYPE_PROTOBUF = 1;
CONTENT_TYPE_JSON = 2;
CONTENT_TYPE_AVRO = 3;
CONTENT_TYPE_CUSTOM = 99;
}
enum ContentEncoding {
CONTENT_ENCODING_NONE = 0;
CONTENT_ENCODING_GZIP = 1;
CONTENT_ENCODING_SNAPPY = 2;
CONTENT_ENCODING_ZSTD = 3;
CONTENT_ENCODING_CUSTOM = 99;
}
enum DataClassification {
DATA_CLASSIFICATION_UNSPECIFIED = 0;
DATA_CLASSIFICATION_PUBLIC = 1;
DATA_CLASSIFICATION_INTERNAL = 2;
DATA_CLASSIFICATION_CONFIDENTIAL = 3;
DATA_CLASSIFICATION_RESTRICTED = 4;
}
enum SignatureAlgorithm {
SIGNATURE_ALGORITHM_UNSPECIFIED = 0;
SIGNATURE_ALGORITHM_HMAC_SHA256 = 1;
SIGNATURE_ALGORITHM_ED25519 = 2;
}
enum EncryptionAlgorithm {
ENCRYPTION_ALGORITHM_UNSPECIFIED = 0;
ENCRYPTION_ALGORITHM_AES_256_GCM = 1;
ENCRYPTION_ALGORITHM_CHACHA20_POLY1305 = 2;
}
enum SchemaFormat {
SCHEMA_FORMAT_UNSPECIFIED = 0;
SCHEMA_FORMAT_PROTOBUF = 1;
SCHEMA_FORMAT_JSON_SCHEMA = 2;
SCHEMA_FORMAT_AVRO = 3;
}
enum CompatibilityMode {
COMPATIBILITY_MODE_UNSPECIFIED = 0;
COMPATIBILITY_MODE_BACKWARD = 1;
COMPATIBILITY_MODE_FORWARD = 2;
COMPATIBILITY_MODE_FULL = 3;
COMPATIBILITY_MODE_NONE = 4;
}
Migration from Go UniversalEnvelope
The existing Go UniversalEnvelope struct and its dependencies (SessionContext, AuditMetadata, ClaimCheckMessage) map to the new schema as follows:
| Go Struct | New Proto Equivalent | Action |
|---|---|---|
| UniversalEnvelope.EnvelopeVersion | Removed | Delete (per Decision 2) |
| UniversalEnvelope.MessageID | PrismMetadata.message_id | Map directly |
| UniversalEnvelope.CorrelationID | PrismMetadata.correlation_id | Map directly |
| UniversalEnvelope.CreatedAt | PrismMetadata.published_at_ms | Convert time.Time to int64 ms |
| UniversalEnvelope.ExpiresAt | PrismMetadata.ttl_seconds | Convert to TTL from now |
| UniversalEnvelope.SourcePattern | extensions["source-pattern"] | Move to extensions |
| UniversalEnvelope.SourceNamespace | PrismMetadata.namespace | Map directly |
| UniversalEnvelope.DestinationPattern | extensions["destination-pattern"] | Move to extensions |
| UniversalEnvelope.Topic | PrismMetadata.topic | Map directly |
| UniversalEnvelope.ContentType | PrismMetadata.content_type | Map to enum |
| UniversalEnvelope.TraceID/SpanID/ParentSpanID | ObservabilityContext.{trace,span,parent_span}_id | Move to sub-message |
| UniversalEnvelope.Session | SecurityContext.{principal_id,roles,tenant_id} | Flatten into SecurityContext |
| UniversalEnvelope.Session.Credentials | extensions["credentials"] | Move to extensions (encrypted) |
| UniversalEnvelope.Audit | Remove | Use AuditLogger separately |
| UniversalEnvelope.Metadata | extensions map | Migrate key-value pairs |
| UniversalEnvelope.Payload | bytes payload | Map directly |
| UniversalEnvelope.ClaimCheck | extensions["claim-check-id"] | Store reference string |
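For the "Map to enum" row, a migration helper could look like the sketch below. The MIME strings in the lookup table are assumptions about what the legacy ContentType field contains, and MapContentType is a hypothetical name; unknown values fall back to CUSTOM with the original string preserved for content_type_custom.

```go
package main

import "strings"

// ContentType mirrors the enum values proposed in this ADR (hand-written here).
type ContentType int32

const (
	ContentTypeUnspecified ContentType = 0
	ContentTypeProtobuf    ContentType = 1
	ContentTypeJSON        ContentType = 2
	ContentTypeAvro        ContentType = 3
	ContentTypeCustom      ContentType = 99
)

// legacyContentTypes lists strings the legacy field is assumed to use.
var legacyContentTypes = map[string]ContentType{
	"application/protobuf":   ContentTypeProtobuf,
	"application/x-protobuf": ContentTypeProtobuf,
	"application/json":       ContentTypeJSON,
	"application/avro":       ContentTypeAvro,
}

// MapContentType converts a legacy content-type string into the enum value
// plus the content_type_custom string (non-empty only for CUSTOM).
func MapContentType(legacy string) (ContentType, string) {
	key := strings.ToLower(strings.TrimSpace(legacy))
	if key == "" {
		return ContentTypeUnspecified, ""
	}
	if ct, ok := legacyContentTypes[key]; ok {
		return ct, ""
	}
	return ContentTypeCustom, legacy
}
```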
Implementation Order
- Create proto/prism/envelope/v1/envelope.proto with the final schema above
- Generate Go, Rust, Python code from proto (per ADR-003)
- Replace pkg/patterns/common/envelope.go with generated proto types + builder helpers
- Update SessionAwareProducer/SessionAwareConsumer test helpers
- Update integration tests to use new envelope
- Implement proxy validation (required fields check, auth token stripping)
- Implement lazy payload parsing in Rust proxy (per Memo-031 optimization)
- Add benchmark suite for serialization/deserialization
Size Budget Validation
Envelope overhead (no payload):
PrismMetadata: ~120 bytes (message_id=36, topic=~20, namespace=~15, timestamp=8, content_type=1)
SecurityContext: ~80 bytes (principal_id=~20, tenant_id=~15, roles=~30)
ObservabilityContext: ~60 bytes (trace_id=32, span_id=16, parent_span_id=16)
SchemaContext: ~100 bytes (schema_url=~30, version=~5, format=1, hash=64)
Extensions: ~0 bytes (empty map)
Tag overhead: ~5 bytes
Total: ~365 bytes (well under 64KB budget)
With 1KB payload: ~1.4KB total
With 100KB payload: ~100.4KB total
With 1MB payload: ~1.0MB total (exceeds the 1MB limit once envelope overhead is added; payloads this large should use claim check)
Related Documents
- RFC-031: Universal Message Envelope Protocol
- Memo-031: Security and Performance Review
- ADR-003: Protobuf as Single Source of Truth
- ADR-023: gRPC-First Interface Design
- RFC-033: Claim Check Pattern
- RFC-037: Mailbox Pattern
- ADR-034: Sharding Strategy
Revision History
- 2026-04-20: Initial draft from adversarial review of RFC-031, Memo-031, and Go implementation