Schema Registry & Compatibility Rules
The Contract System Behind Event-Driven Architectures
Event streaming without schema governance is just distributed chaos.
A Schema Registry is not about serialization.
It is about controlling change over time.
1️⃣ Why You Need a Schema Registry
Without a registry:
Producers change fields silently
Consumers break in production
Replays fail
Historical data becomes unreadable
Rollbacks become impossible
In event systems, schema evolution is not optional: it is continuous.
2️⃣ What a Schema Registry Actually Does
At a high level:
Stores event schemas (Avro / Protobuf / JSON Schema)
Assigns version IDs
Enforces compatibility rules
Prevents breaking changes
Provides schema lookup at runtime
The registry sits between developers and chaos.
3️⃣ The Core Concept: Compatibility Modes
This is where depth begins.
There are four primary compatibility types:
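| Mode | Who must survive change? |
|---|---|
| Backward | Old data and old producers (upgraded consumers can still read them) |
| Forward | Old consumers (they can still read data from upgraded producers) |
| Full | Both |
| None | No guarantees |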
4️⃣ Backward Compatibility (Most Common)
New schema can read old data.
Used when:
Consumers upgrade before producers
Replay is common
Example
V1
{
"orderId": "string",
"total": "number"
}
V2 (Add optional field)
{
"orderId": "string",
"total": "number",
"currency": "string?"
}
✅ Backward compatible
Old events still readable.
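A minimal sketch of what "new schema can read old data" means, assuming a simplified reader that fills missing fields from defaults. `V2_READER`, `REQUIRED`, and `read_with` are hypothetical names for illustration, not part of any registry client.

```python
REQUIRED = object()  # sentinel: the field has no default

# Hypothetical V2 reader schema: field name -> default value (or REQUIRED).
V2_READER = {
    "orderId": REQUIRED,
    "total": REQUIRED,
    "currency": None,  # optional: falls back to None when absent
}

def read_with(reader: dict, event: dict) -> dict:
    """Project an event onto a reader schema, filling defaults for missing fields."""
    out = {}
    for name, default in reader.items():
        if name in event:
            out[name] = event[name]
        elif default is REQUIRED:
            raise ValueError(f"missing required field: {name}")
        else:
            out[name] = default
    return out

old_event = {"orderId": "o-1", "total": 42.0}                      # written with V1
new_event = {"orderId": "o-2", "total": 10.0, "currency": "EUR"}   # written with V2

v1_as_v2 = read_with(V2_READER, old_event)
v2_as_v2 = read_with(V2_READER, new_event)
```

The V1 event decodes because the only field it lacks has a default. Had `currency` been marked `REQUIRED`, the same call would raise.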
5️⃣ Forward Compatibility
Old schema can read new data.
Used when:
Producers upgrade first
Consumers lag behind
Rule:
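Fields may be added freely, because old readers ignore fields they do not know. Fields may be removed only if they were optional or had a default in the old schema.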
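A minimal sketch of an old consumer reading newer data, assuming a V1 reader that knows only `orderId` and `total` and has no defaults for either. `V1_FIELDS` and `read_as_v1` are hypothetical names.

```python
V1_FIELDS = ("orderId", "total")  # hypothetical old consumer, no defaults

def read_as_v1(event: dict) -> dict:
    """Keep only the fields V1 knows about; unknown fields are ignored."""
    missing = [f for f in V1_FIELDS if f not in event]
    if missing:
        raise ValueError(f"old reader cannot fill: {missing}")
    return {f: event[f] for f in V1_FIELDS}

# Producer added "currency": the old reader simply drops it.
added = read_as_v1({"orderId": "o-1", "total": 42.0, "currency": "EUR"})

# Producer removed "total": the old reader has nothing to fall back on.
try:
    read_as_v1({"orderId": "o-1", "currency": "EUR"})
    removal_survived = True
except ValueError:
    removal_survived = False
```

Under these assumptions, additions are harmless to the old reader, while removing a field it depends on is what breaks it.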
6️⃣ Full Compatibility
Both backward and forward compatible.
This is safest but most restrictive.
7️⃣ The Dangerous One: None
No compatibility checks.
This is how production outages happen.
8️⃣ Compatibility Deep Dive (Avro Example)
Let's get precise.
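Rule 1: Adding a field
✅ Allowed if (so that new readers can fill it in when decoding old data):
Field has default
Or is optional
Rule 2: Removing a field
✅ Allowed only if (so that old readers can fill it in when decoding new data):
Field was optional
Or had default
Rule 3: Changing field type
❌ Usually breaking. Avro permits only a few widening promotions, such as int to long or float to double.
"total": "string" → "number"
Breaks deserialization.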
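The three rules above can be sketched as a toy compatibility checker. This is a simplification under stated assumptions: it models a schema as a flat mapping of field name to type and default, and it ignores Avro's type promotions, unions, and aliases. `Field`, `can_read`, `is_backward`, and `is_forward` are hypothetical names, not a real Avro or registry API.

```python
from dataclasses import dataclass

NO_DEFAULT = object()  # sentinel: the field has no default

@dataclass(frozen=True)
class Field:
    type: str
    default: object = NO_DEFAULT

def can_read(reader: dict, writer: dict) -> list:
    """Return the reasons `reader` cannot decode data written with `writer`."""
    problems = []
    for name, rf in reader.items():
        wf = writer.get(name)
        if wf is None:
            if rf.default is NO_DEFAULT:
                problems.append(f"{name}: missing in writer and has no default")
        elif wf.type != rf.type:
            problems.append(f"{name}: type {wf.type} -> {rf.type}")
    return problems

def is_backward(new: dict, old: dict) -> bool:
    return not can_read(new, old)   # new schema reads old data

def is_forward(new: dict, old: dict) -> bool:
    return not can_read(old, new)   # old schema reads new data

v1 = {"orderId": Field("string"), "total": Field("double")}
add_with_default = {**v1, "currency": Field("string", "USD")}
add_required = {**v1, "country": Field("string")}
removed_total = {"orderId": Field("string")}
retyped = {"orderId": Field("string"), "total": Field("string")}

results = {
    "add_with_default": (is_backward(add_with_default, v1), is_forward(add_with_default, v1)),
    "add_required": (is_backward(add_required, v1), is_forward(add_required, v1)),
    "removed_total": (is_backward(removed_total, v1), is_forward(removed_total, v1)),
    "retyped": (is_backward(retyped, v1), is_forward(retyped, v1)),
}
```

Each tuple is `(backward, forward)`. Only the change that adds a field with a default passes both directions, which is what FULL compatibility demands.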
9️⃣ Why Type Systems Matter
Avro:
Strong schema enforcement
Supports evolution rules
JSON:
Looser
Riskier
Often breaks silently
Protobuf:
Field numbers matter
Never reuse field numbers
Reserve removed fields
🔟 Hard Production Rules
NEVER:
Rename fields
Reuse field numbers (Protobuf)
Change meaning of a field
Remove required fields
Change enum values carelessly
1️⃣1️⃣ Compatibility vs Semantics
Compatibility rules check structure, not meaning.
This is the subtle danger.
Example:
"amount": number
Before: cents
After: dollars
Schema compatible.
Semantically catastrophic.
Schema registry protects structure, not intent.
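A small illustration of the cents-versus-dollars trap, assuming a consumer that still interprets `amount` as cents. `structurally_valid` and `charge_in_cents` are hypothetical helpers; the structural check stands in for what a registry can verify.

```python
def structurally_valid(event: dict) -> bool:
    """All a structural check can see: 'amount' exists and is a number."""
    return isinstance(event.get("amount"), (int, float))

def charge_in_cents(event: dict) -> int:
    """Consumer logic written when 'amount' meant cents."""
    return int(event["amount"])

before = {"amount": 1999}    # producer meant 1999 cents
after = {"amount": 19.99}    # producer now means 19.99 dollars

both_pass = structurally_valid(before) and structurally_valid(after)
charged_before = charge_in_cents(before)
charged_after = charge_in_cents(after)
```

Both events pass the structural check, yet the consumer charges 1999 cents for the first and 19 cents for the second. Catching this takes a new field name, a new event type, or a contract test, not a compatibility mode.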
1️⃣2️⃣ Subject-Level Compatibility
In Confluent-style systems:
Each topic/subject has:
Its own compatibility mode
Its own version history
Example:
orders-value → BACKWARD
payments-value → FULL
audit-log → NONE
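As a sketch, per-subject modes like these could be set through the Confluent Schema Registry REST API, which accepts `PUT /config/{subject}` with a `compatibility` value. The registry URL below is an assumption (a local instance), and the code only builds the requests; nothing is sent until they are passed to `urllib.request.urlopen`.

```python
import json
import urllib.request

REGISTRY_URL = "http://localhost:8081"  # assumption: a local registry

SUBJECT_MODES = {
    "orders-value": "BACKWARD",
    "payments-value": "FULL",
    "audit-log": "NONE",
}

def config_request(subject: str, mode: str) -> urllib.request.Request:
    """Build (but do not send) the request that sets a subject's mode."""
    body = json.dumps({"compatibility": mode}).encode("utf-8")
    return urllib.request.Request(
        url=f"{REGISTRY_URL}/config/{subject}",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/vnd.schemaregistry.v1+json"},
    )

pending = [config_request(s, m) for s, m in SUBJECT_MODES.items()]
# To apply against a real registry:
#   for req in pending:
#       urllib.request.urlopen(req)
```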
1️⃣3️⃣ Rolling Deploy Safety Pattern
Safe rollout order depends on compatibility mode.
If BACKWARD:
Deploy consumers
Deploy producers
If FORWARD:
Deploy producers
Deploy consumers
Get this wrong → outage.
Compatibility determines safe order.
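The ordering rule can be encoded as a small helper, for example as a guard in a deploy pipeline. `rollout_order` is a hypothetical function. Its handling of FULL (either order is safe) and NONE (no safe order) goes beyond the two cases listed above and is an assumption drawn from the mode definitions.

```python
def rollout_order(mode: str) -> list:
    """Return the deploy order that is safe under a compatibility mode."""
    base = mode.upper().replace("_TRANSITIVE", "")
    if base == "BACKWARD":
        return ["consumers", "producers"]
    if base == "FORWARD":
        return ["producers", "consumers"]
    if base == "FULL":
        # Both directions hold, so either order works; this choice is arbitrary.
        return ["consumers", "producers"]
    raise ValueError(f"no safe rollout order for mode {mode!r}")

backward_order = rollout_order("BACKWARD")
forward_order = rollout_order("forward_transitive")
```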
1️⃣4️⃣ Schema ID Encoding (Wire Format)
Producer sends:
[magic byte (0)][schema ID (4 bytes, big-endian)][serialized payload]
Consumer:
Reads schema ID
Fetches schema
Deserializes
This enables:
Multiple schema versions in same topic
Long-lived history
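A minimal sketch of that framing, assuming the Confluent wire format: one magic byte with value 0, then a 4-byte big-endian schema ID, then the payload. `frame` and `unframe` are hypothetical helper names; real serializers do this internally.

```python
import struct

MAGIC_BYTE = 0
HEADER = ">bI"                        # 1 signed byte + 4-byte unsigned int, big-endian
HEADER_SIZE = struct.calcsize(HEADER)

def frame(schema_id: int, payload: bytes) -> bytes:
    """Prefix a serialized payload with the magic byte and schema ID."""
    return struct.pack(HEADER, MAGIC_BYTE, schema_id) + payload

def unframe(message: bytes) -> tuple:
    """Split a framed message into (schema_id, payload)."""
    if len(message) < HEADER_SIZE:
        raise ValueError("message shorter than header")
    magic, schema_id = struct.unpack(HEADER, message[:HEADER_SIZE])
    if magic != MAGIC_BYTE:
        raise ValueError(f"unexpected magic byte: {magic}")
    return schema_id, message[HEADER_SIZE:]

message = frame(42, b"avro-bytes")
schema_id, payload = unframe(message)
```

Because every message carries its own schema ID, a consumer can look up the exact writer schema per record, which is what lets several versions coexist in one topic.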
1️⃣5️⃣ Multi-Team Governance Problem
Without registry:
Team A changes event
Team B deploys later
Production crash
With registry:
CI fails on incompatible schema
Change blocked before deploy
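One way a CI gate like this could look, sketched against the Confluent Schema Registry REST API, which exposes `POST /compatibility/subjects/{subject}/versions/{version}` and answers with an `is_compatible` flag. The registry URL, the `Order` schema, and the function names are assumptions for illustration. `ci_gate` is defined but not called here, since it needs a live registry.

```python
import json
import urllib.request

REGISTRY_URL = "http://localhost:8081"  # assumption: a reachable registry

def compatibility_request(subject: str, schema: dict) -> urllib.request.Request:
    """Build the request that tests a candidate schema against the latest version."""
    body = json.dumps({"schema": json.dumps(schema)}).encode("utf-8")
    return urllib.request.Request(
        url=f"{REGISTRY_URL}/compatibility/subjects/{subject}/versions/latest",
        data=body,
        method="POST",
        headers={"Content-Type": "application/vnd.schemaregistry.v1+json"},
    )

def is_compatible(response_body: bytes) -> bool:
    """Interpret the registry's JSON answer; anything unexpected counts as a failure."""
    return bool(json.loads(response_body).get("is_compatible", False))

def ci_gate(subject: str, schema: dict) -> int:
    """Return a process exit code: 0 lets the build pass, 1 blocks it."""
    with urllib.request.urlopen(compatibility_request(subject, schema)) as resp:
        return 0 if is_compatible(resp.read()) else 1

candidate = {
    "type": "record",
    "name": "Order",  # hypothetical schema
    "fields": [
        {"name": "orderId", "type": "string"},
        {"name": "total", "type": "double"},
        {"name": "currency", "type": ["null", "string"], "default": None},
    ],
}
request = compatibility_request("orders-value", candidate)
```

In a pipeline, the result of `ci_gate` would be passed to `sys.exit`, so an incompatible schema fails the build before any producer ships.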
1️⃣6️⃣ Advanced: Transitive Compatibility
Instead of comparing only to latest schema:
Compare against all historical schemas.
Why?
Because replay from 3 years ago must still work.
Modes:
BACKWARD_TRANSITIVE
FORWARD_TRANSITIVE
FULL_TRANSITIVE
Transitive compatibility is required for event sourcing systems.
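A sketch of why the transitive variant matters, using a toy model in which a schema maps field names to defaults. `can_read` and `NO_DEFAULT` are hypothetical, and the check only covers missing fields. The third version passes against its immediate predecessor but cannot read the oldest data.

```python
NO_DEFAULT = object()  # sentinel: the field has no default

def can_read(reader: dict, writer: dict) -> bool:
    """True if every reader field is either present in the writer or has a default."""
    return all(
        name in writer or default is not NO_DEFAULT
        for name, default in reader.items()
    )

v1 = {"orderId": NO_DEFAULT}
v2 = {"orderId": NO_DEFAULT, "note": ""}           # adds "note" with a default
v3 = {"orderId": NO_DEFAULT, "note": NO_DEFAULT}   # drops the default

history = [v1, v2]

backward = can_read(v3, history[-1])                             # latest only
backward_transitive = all(can_read(v3, old) for old in history)  # every version
```

Here `backward` is true and `backward_transitive` is false: v3 reads v2 data, but a replay that reaches v1 events has no value for `note`.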
1️⃣7️⃣ Schema Evolution in Event Sourcing
Event sourcing demands:
Transitive backward compatibility
Infinite retention safety
Upcasting support
Otherwise:
You cannot rebuild projections.
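Upcasting can be sketched as a chain of small functions, each lifting an event one version forward, so that projections only ever see the latest shape. This assumes events carry an explicit `version` field. The field names and the `"USD"` fallback are hypothetical; choosing such a fallback is a semantic decision, not a structural one.

```python
def v1_to_v2(event: dict) -> dict:
    # Assumption: events before v2 were all priced in USD.
    return {**event, "currency": "USD", "version": 2}

def v2_to_v3(event: dict) -> dict:
    # v3 adds an optional "country" field.
    return {**event, "country": None, "version": 3}

UPCASTERS = {1: v1_to_v2, 2: v2_to_v3}
LATEST_VERSION = 3

def upcast(event: dict) -> dict:
    """Apply upcasters in sequence until the event reaches the latest version."""
    while event["version"] < LATEST_VERSION:
        event = UPCASTERS[event["version"]](event)
    return event

stored = {"version": 1, "orderId": "o-1", "total": 42.0}
current = upcast(stored)
```

The stored event is never rewritten. The upcast happens on read, which keeps history immutable while letting consumer code target a single version.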
1️⃣8️⃣ Versioning Strategy Matrix
| Strategy | Pros | Cons |
|---|---|---|
| Additive fields | Simple | Limited flexibility |
| Versioned events | Clear | Proliferation |
| Upcasters | Clean consumers | Central complexity |
| Dual publish | Safe migration | Storage overhead |
1️⃣9️⃣ Failure Story (Realistic)
Team adds required field:
"country": "string"
Without default.
Old consumer crashes.
Kafka lag spikes.
Autoscaling reacts.
Retry storm begins.
Outage.
Root cause:
No compatibility enforcement.
2️⃣0️⃣ Production-Grade Setup Checklist
If you run Kafka in production:
✅ Schema Registry enforced in CI
✅ Transitive backward compatibility
✅ Consumer contract tests
✅ Replay test environment
✅ Schema documentation
✅ Event semantic versioning
