Spaxiom Technical Series - Part 15

Forensics and Explainability

Event Timelines, Agent Explainability, and Schema Evolution for Production Systems

Joe Scanlin

November 2025

About This Section

This section demonstrates how Spaxiom's structured event abstraction enables powerful forensic analysis and agent explainability. Instead of sifting through millions of raw sensor readings after an incident, you get structured event timelines that make it trivial to understand what happened and why.

You'll learn about event-based forensics vs. raw sensor logs; explainable agent decisions built on human-readable event vocabularies; semantic versioning for event schemas (MAJOR.MINOR.PATCH); backward and forward compatibility strategies; schema migration patterns (dual-write, adapters); schema registries for coordinating upgrades across thousands of sites; and best practices from production deployments. The section includes a timeline visualization and a complete HVAC schema migration case study showing a 15% performance improvement.

9. Forensics and Explainability

9.1 Raw logs vs event timelines

Suppose a major evacuation went poorly in a large facility: people got stuck near exits, some areas were overcrowded, and others were underutilized.

Naïve forensic data: millions of raw sensor readings (occupancy counts, door states, motion traces) that investigators must sift through and correlate by hand.

Spaxiom forensic data: a structured event timeline such as:

[
  {"type": "AlarmTriggered", "zone": "lobby", "t": "13:02:00Z"},
  {"type": "CrowdFormation", "zone": "exit-west", "start": "13:02:30Z", "peak_occupancy_pct": 95},
  {"type": "DoorBlocked", "zone": "exit-west", "start": "13:04:10Z"},
  {"type": "UnderutilizedExit", "zone": "exit-east", "start": "13:04:30Z"},
  {"type": "EvacuationComplete", "zone": "building", "t": "13:12:00Z"}
]

This enables forensic queries like:

  - How long after AlarmTriggered did CrowdFormation begin at each exit?
  - Which DoorBlocked events preceded an UnderutilizedExit elsewhere in the building?
  - Did any zone exceed 90% occupancy before the evacuation completed?

These are straightforward to express as temporal logic or event-graph queries atop Spaxiom's event store.
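
As a sketch of what such queries look like in practice, the timeline above can be interrogated with a few lines of Python. The helpers here (`ts`, `seconds_between`) are illustrative, not part of any Spaxiom API:

```python
# Sample timeline from Section 9.1 (times abbreviated to HH:MM:SSZ)
events = [
    {"type": "AlarmTriggered", "zone": "lobby", "t": "13:02:00Z"},
    {"type": "CrowdFormation", "zone": "exit-west", "start": "13:02:30Z",
     "peak_occupancy_pct": 95},
    {"type": "DoorBlocked", "zone": "exit-west", "start": "13:04:10Z"},
    {"type": "UnderutilizedExit", "zone": "exit-east", "start": "13:04:30Z"},
    {"type": "EvacuationComplete", "zone": "building", "t": "13:12:00Z"},
]

def ts(event):
    """Timestamp of an event ('t' or 'start') as seconds since midnight."""
    h, m, s = (event.get("t") or event["start"]).rstrip("Z").split(":")
    return int(h) * 3600 + int(m) * 60 + int(s)

def seconds_between(events, type_a, type_b):
    """Seconds from the first type_a event to the first type_b event."""
    return min(ts(e) for e in events if e["type"] == type_b) - \
           min(ts(e) for e in events if e["type"] == type_a)

# "How long after the alarm did crowding begin?"
print(seconds_between(events, "AlarmTriggered", "CrowdFormation"))  # 30

# "Was one exit underutilized while another was blocked?"
blocked = {e["zone"] for e in events if e["type"] == "DoorBlocked"}
underused = {e["zone"] for e in events if e["type"] == "UnderutilizedExit"}
print(blocked and underused and blocked.isdisjoint(underused))  # True
```

Five events answer questions that would require correlating hours of raw sensor logs by hand.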

9.2 Explainable agents

Suppose an agent made a controversial decision (e.g., temporarily locking an entrance to redirect evacuees). We can ask:

"Explain your decision using only the event history, not raw sensor values."

Because the agent's inputs are already INTENT events, it can answer in terms humans understand:

"At 13:02:30, CrowdFormation at exit-west exceeded 90% occupancy.
At 13:04:10, DoorBlocked was detected there.
UnderutilizedExit at exit-east persisted for 3 minutes.
Redirecting traffic to exit-east was predicted to reduce peak density at exit-west by 40%."

Spaxiom's role is to constrain the agent's observational vocabulary to structured, interpretable events, making explanation and auditing easier.
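
To make this concrete, here is a minimal sketch of rendering a rationale directly from structured events. `explain_decision` is a hypothetical helper, not part of the Spaxiom agent API; it shows why a constrained event vocabulary makes explanations mechanical to produce:

```python
def explain_decision(events, decision):
    """Render a human-readable rationale from the structured event history.
    Illustrative sketch -- not the Spaxiom agent API."""
    lines = []
    for e in events:
        when = e.get("t") or e.get("start", "?")
        detail = ", ".join(
            f"{k}={v}" for k, v in e.items() if k not in ("type", "t", "start")
        )
        lines.append(f"At {when}: {e['type']} ({detail})")
    lines.append(f"Decision: {decision}")
    return "\n".join(lines)

evidence = [
    {"type": "CrowdFormation", "zone": "exit-west", "start": "13:02:30Z",
     "peak_occupancy_pct": 95},
    {"type": "DoorBlocked", "zone": "exit-west", "start": "13:04:10Z"},
    {"type": "UnderutilizedExit", "zone": "exit-east", "start": "13:04:30Z"},
]
print(explain_decision(evidence, "Redirect traffic to exit-east"))
```

Because every input is already a named, typed event, the explanation never needs to mention raw sensor values.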

Figure 6 (Event Timeline Visualization)

[Figure: the five evacuation events (AlarmTriggered at lobby, 13:02:00; CrowdFormation at exit-west, 95%, 13:02:30; DoorBlocked at exit-west, 13:04:10; UnderutilizedExit at exit-east, 13:04:30; EvacuationComplete, building, 13:12:00) laid out on a timeline with a color-coded legend.]

Timeline showing evacuation events with causal arrows (CrowdFormation → DoorBlocked → EvacuationDelay). Each event is color-coded and positioned temporally, making it easy to understand the sequence and relationships between events.

9.3 Event Schema Evolution and Versioning

Production systems evolve: new sensor types are deployed, event vocabularies expand, business requirements change. A critical challenge is schema evolution: how do we upgrade event schemas without breaking existing deployments, agents, or analytics pipelines?

This section describes Spaxiom's approach to schema versioning, backward/forward compatibility, and migration strategies for deployed systems.

The schema evolution problem

Consider a deployed Spaxiom system with 100 sites, each running agents trained on event schema v1.0. We want to deploy schema v2.0 with new fields or event types. Challenges:

  - Sites upgrade on different schedules, so v1.0 and v2.0 events coexist for months.
  - Agents and analytics pipelines built against v1.0 may break on renamed or missing fields.
  - Historical event stores must remain queryable alongside newly formatted events.

Without careful versioning, schema evolution leads to fragmentation, breakage, and technical debt.

Semantic versioning for event schemas

Spaxiom adopts semantic versioning (SemVer) for event schemas:

Version = MAJOR.MINOR.PATCH

  - MAJOR: breaking changes (removed or renamed fields, changed semantics)
  - MINOR: backward-compatible additions (new optional fields, new event types)
  - PATCH: non-structural fixes (documentation, validation tweaks)

Each event includes a schema_version field:

{
    "type": "DoorOpened",
    "schema_version": "2.1.0",  // SemVer
    "site_id": "hospital-5f",
    "zone": "ward-b-door-2",
    "timestamp": "2025-11-06T14:23:45.123456Z",
    "occupancy_before": 12,      // Added in v2.0
    "occupancy_after": 13,       // Added in v2.0
    "access_badge_id": "A1234"   // Added in v2.1 (optional)
}

Backward compatibility: old consumers, new schemas

When introducing minor version changes (e.g., v2.0 → v2.1), new fields must be optional. Old consumers (agents, analytics) that expect v2.0 can safely ignore v2.1's new fields.

Spaxiom enforces this via schema validation:

from spaxiom.schema import EventSchema

# Define schema v2.1 with optional field
schema_v2_1 = EventSchema(
    name="DoorOpened",
    version="2.1.0",
    required_fields=["type", "schema_version", "site_id", "zone", "timestamp"],
    optional_fields=["occupancy_before", "occupancy_after", "access_badge_id"]
)

# Old consumer expects v2.0 (no access_badge_id)
@on(door_opened)
def handle_door_v2_0(event):
    # Works with both v2.0 and v2.1 events
    # access_badge_id is None if not present
    badge = event.get("access_badge_id", None)
    log_entry(event["zone"], badge)

Backward compatibility rules:

  - New fields introduced in a MINOR release must be optional, with sensible defaults.
  - Required fields are never removed or renamed within a MAJOR version.
  - Field types and semantics never change within a MAJOR version.

Forward compatibility: new consumers, old schemas

When a consumer expects v2.1 but receives v2.0 events (missing access_badge_id), it must handle gracefully:

@on(door_opened)
def handle_door_v2_1(event):
    # Explicitly check the schema version. (String comparison is only safe
    # while every component is single-digit; prefer the version_gte utility
    # shown next.)
    if event["schema_version"] >= "2.1.0":
        badge = event["access_badge_id"]
    else:
        # Fallback for v2.0: badge unknown
        badge = "UNKNOWN"

    log_entry(event["zone"], badge)

Spaxiom provides utilities for version comparison:

from spaxiom.schema import version_gte

if version_gte(event["schema_version"], "2.1.0"):
    # Use v2.1 features
    process_badge(event["access_badge_id"])
else:
    # Fall back to v2.0 behavior
    process_no_badge()
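
For intuition, here is a minimal sketch of the semantics `version_gte` presumably implements: compare versions as integer tuples rather than strings, since lexicographic comparison misorders multi-digit components:

```python
def parse_version(v):
    """Split 'MAJOR.MINOR.PATCH' into an integer tuple for correct ordering."""
    return tuple(int(part) for part in v.split("."))

def version_gte(a, b):
    """True if schema version a >= b under SemVer ordering.
    Sketch of the semantics only; the real utility lives in spaxiom.schema."""
    return parse_version(a) >= parse_version(b)

# Plain string comparison gets multi-digit components wrong:
print("2.10.0" >= "2.9.0")             # False -- lexicographic, wrong
print(version_gte("2.10.0", "2.9.0"))  # True  -- numeric, correct
```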

Breaking changes and major version upgrades

Sometimes breaking changes are unavoidable:

  - Renaming fields (e.g., timestamp → event_timestamp)
  - Removing fields or changing their types
  - Changing the meaning of an existing field

These require a MAJOR version bump (v2.x → v3.0) and explicit migration.

Migration strategy 1: Dual-write during transition

During migration, emit events in both v2.x and v3.0 formats:

import time

from spaxiom.schema import EventEmitter

emitter = EventEmitter()

# Emit both versions during migration window
def on_door_opened():
    # v2.x event (legacy)
    emitter.emit({
        "type": "DoorOpened",
        "schema_version": "2.1.0",
        "timestamp": time.time(),
        "zone": "ward-b-door-2"
    })

    # v3.0 event (new)
    emitter.emit({
        "type": "DoorOpened",
        "schema_version": "3.0.0",
        "event_timestamp": time.time(),  # Renamed field
        "zone_id": "ward-b-door-2"       # Renamed field
    })

Consumers subscribe to either v2.x or v3.0 stream during transition. After all consumers upgrade, v2.x stream is deprecated.

Migration strategy 2: Schema adapters

For complex migrations, use schema adapters that translate between versions:

from spaxiom.schema import SchemaAdapter

# Adapter translates v2.x → v3.0
adapter_v2_to_v3 = SchemaAdapter(
    from_version="2.1.0",
    to_version="3.0.0",
    field_mappings={
        "timestamp": "event_timestamp",  # Rename
        "zone": "zone_id"                # Rename
    }
)

# Consumer receives v2.x events, adapter translates to v3.0
@on(door_opened)
def handle_door_v3(event_v2):
    event_v3 = adapter_v2_to_v3.translate(event_v2)
    process(event_v3["event_timestamp"], event_v3["zone_id"])

Adapters can run at:

  - the edge (site-side, before events leave the gateway),
  - the cloud aggregator (as in the HVAC case study below), or
  - the consumer (as in the handler above).

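
The translation step itself is simple to sketch. `SimpleSchemaAdapter` below is an illustrative stand-in for `spaxiom.schema.SchemaAdapter`, covering only the field-renaming case:

```python
class SimpleSchemaAdapter:
    """Minimal field-renaming adapter (illustrative sketch of the idea;
    spaxiom.schema.SchemaAdapter is the real interface)."""

    def __init__(self, from_version, to_version, field_mappings):
        self.from_version = from_version
        self.to_version = to_version
        self.field_mappings = field_mappings

    def translate(self, event):
        # Copy every field, renaming where a mapping exists,
        # then stamp the target schema version.
        out = {self.field_mappings.get(k, k): v for k, v in event.items()}
        out["schema_version"] = self.to_version
        return out

adapter = SimpleSchemaAdapter(
    from_version="2.1.0",
    to_version="3.0.0",
    field_mappings={"timestamp": "event_timestamp", "zone": "zone_id"},
)

event_v2 = {"type": "DoorOpened", "schema_version": "2.1.0",
            "timestamp": "2025-11-06T14:23:45Z", "zone": "ward-b-door-2"}
event_v3 = adapter.translate(event_v2)
print(event_v3["zone_id"])         # ward-b-door-2
print(event_v3["schema_version"])  # 3.0.0
```

Real adapters also need to handle type conversions and defaulted new fields, but the shape is the same.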
Schema registry and discovery

To coordinate schema versions across 1000s of sites, Spaxiom provides a centralized schema registry:

from spaxiom.schema import SchemaRegistry

# Connect to registry (e.g., hosted on cloud)
registry = SchemaRegistry(url="https://schema-registry.spaxiom.io")

# Publish new schema version
door_schema_v3 = EventSchema(name="DoorOpened", version="3.0.0", ...)
registry.publish(door_schema_v3)

# Sites query registry for latest compatible schema
latest_compatible = registry.get_latest("DoorOpened", compatible_with="2.1.0")
# Returns the newest v2.x release at or above v2.1.0 (same MAJOR; newer MINOR/PATCH allowed)

Registry features:

  - Compatibility checks at publish time (e.g., rejecting a MINOR bump that removes a required field)
  - Version discovery (get_latest with compatibility constraints)
  - Deprecation metadata and sunset dates (see below)

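
A toy in-memory registry illustrates how `get_latest(compatible_with=...)` can resolve versions. Here compatibility is assumed to mean same MAJOR and an equal-or-newer MINOR/PATCH; the class below is a sketch, not the hosted registry:

```python
class InMemorySchemaRegistry:
    """Toy registry illustrating get_latest(compatible_with=...) resolution."""

    def __init__(self):
        self.versions = {}  # schema name -> list of "MAJOR.MINOR.PATCH" strings

    def publish(self, name, version):
        self.versions.setdefault(name, []).append(version)

    def get_latest(self, name, compatible_with):
        want = tuple(int(p) for p in compatible_with.split("."))
        candidates = [tuple(int(p) for p in v.split("."))
                      for v in self.versions.get(name, [])]
        # Same MAJOR version, at least as new as the requested version
        compatible = [c for c in candidates if c[0] == want[0] and c >= want]
        return ".".join(map(str, max(compatible))) if compatible else None

registry = InMemorySchemaRegistry()
for v in ["2.0.0", "2.1.0", "2.1.3", "3.0.0"]:
    registry.publish("DoorOpened", v)

print(registry.get_latest("DoorOpened", compatible_with="2.1.0"))  # 2.1.3
```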
Handling heterogeneous schema versions

In federated deployments (Section 8), different sites may run different schema versions. The aggregator must handle this gracefully.

Approach 1: Normalize to lowest common denominator (LCD)

Aggregator translates all events to the lowest supported version:

from spaxiom.schema import version_gte

# Site A sends v2.0, Site B sends v2.1, Site C sends v3.0
# Aggregator normalizes all to v2.0 (LCD)

for event in incoming_stream:
    if version_gte(event["schema_version"], "3.0.0"):
        event = adapter_v3_to_v2.translate(event)
    elif version_gte(event["schema_version"], "2.1.0"):
        event = adapter_v2_1_to_v2_0.translate(event)

    process_v2_event(event)

Pro: simple, all consumers see uniform schema.
Con: loses information from newer schema versions.

Approach 2: Preserve version, annotate capabilities

Aggregator preserves original schema versions but annotates with capability flags:

{
    "type": "DoorOpened",
    "schema_version": "2.1.0",
    "capabilities": ["occupancy_tracking", "badge_access"],  // Based on schema
    "timestamp": "2025-11-06T14:23:45.123456Z",
    ...
}

Consumers filter events by required capabilities:

@on(door_opened)
def handle_with_badge(event):
    if "badge_access" in event["capabilities"]:
        process_badge(event["access_badge_id"])
    else:
        skip_event()  # This event doesn't have badge data

Pro: preserves full schema diversity.
Con: consumers must handle multiple schemas.
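
A sketch of the aggregator-side annotation, assuming a hand-maintained mapping from schema versions to capability flags (both the mapping and the `annotate_capabilities` helper below are hypothetical):

```python
# Hypothetical mapping from DoorOpened schema versions to capability flags
CAPABILITIES_BY_VERSION = {
    (2, 0): ["occupancy_tracking"],
    (2, 1): ["occupancy_tracking", "badge_access"],
}

def annotate_capabilities(event):
    """Attach capability flags based on the event's schema version
    (sketch of the aggregator-side annotation described above)."""
    major, minor, _ = (int(p) for p in event["schema_version"].split("."))
    event = dict(event)  # don't mutate the caller's event
    event["capabilities"] = CAPABILITIES_BY_VERSION.get((major, minor), [])
    return event

e = annotate_capabilities({"type": "DoorOpened", "schema_version": "2.1.0",
                           "zone": "ward-b-door-2"})
print("badge_access" in e["capabilities"])  # True
```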

Deprecation and sunset policies

Old schema versions should be deprecated explicitly:

  1. Announce deprecation: mark schema v1.x as deprecated at time T. Sunset date: T + 6 months.
  2. Warning period (months 1-3): sites emitting v1.x events receive warnings but still function.
  3. Grace period (months 4-6): sites must upgrade or face reduced functionality (e.g., no federated learning).
  4. Sunset (month 6+): v1.x events rejected by aggregator. Sites must upgrade to v2.x or later.
Deprecation is declared via the schema API:

from spaxiom.schema import deprecate_schema

deprecate_schema(
    name="DoorOpened",
    version="1.0.0",
    sunset_date="2026-06-01",
    replacement="2.0.0",
    migration_guide_url="https://docs.spaxiom.io/migration/v1-to-v2"
)

Case study: HVAC event schema migration

A real-world example from a Spaxiom deployment in a smart campus with 50 buildings.

Initial schema (v1.0):

{
    "type": "TemperatureAnomaly",
    "schema_version": "1.0.0",
    "zone": "building-5-floor-3",
    "temp_celsius": 28.5
}

Problem: v1.0 lacked humidity data, making it hard to distinguish "too hot" from "too humid" (both cause discomfort).

New schema (v2.0):

{
    "type": "ThermalComfortAnomaly",  // Renamed for clarity
    "schema_version": "2.0.0",
    "zone_id": "bldg-5-fl-3",         // Renamed field
    "temperature_c": 28.5,             // Renamed field
    "humidity_pct": 65.0,              // New required field
    "pmv_index": 1.8                   // Predicted Mean Vote: thermal comfort metric
}

Migration approach:

  1. Months 1-2: Dual-write. Emit both v1.0 and v2.0 events.
  2. Month 3: Deploy adapters to cloud aggregator: translate v1.0 → v2.0 (infer humidity from historical data, use default PMV).
  3. Month 4: Upgrade agent training pipeline to use v2.0 events.
  4. Month 5: Mark v1.0 as deprecated.
  5. Month 6: Stop emitting v1.0. All buildings upgraded to v2.0 sensors.

Result: Smooth migration with zero downtime. Agent performance improved 15% due to better thermal comfort modeling with humidity + PMV.
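
Step 2 of this migration can be sketched as a one-way adapter. The default humidity and PMV values below are illustrative placeholders; the actual deployment inferred humidity per zone from historical data:

```python
# Sketch of the aggregator-side v1.0 -> v2.0 translation (step 2 above).
DEFAULT_HUMIDITY_PCT = 50.0  # placeholder; deployment inferred per-zone values
DEFAULT_PMV = 0.0            # placeholder Predicted Mean Vote

def upgrade_hvac_event(event_v1):
    if event_v1["schema_version"] != "1.0.0":
        return event_v1  # already v2.0
    return {
        "type": "ThermalComfortAnomaly",            # renamed event type
        "schema_version": "2.0.0",
        "zone_id": event_v1["zone"],                # renamed field
        "temperature_c": event_v1["temp_celsius"],  # renamed field
        "humidity_pct": DEFAULT_HUMIDITY_PCT,       # new field, defaulted
        "pmv_index": DEFAULT_PMV,                   # new field, defaulted
    }

old = {"type": "TemperatureAnomaly", "schema_version": "1.0.0",
       "zone": "building-5-floor-3", "temp_celsius": 28.5}
new = upgrade_hvac_event(old)
print(new["type"], new["temperature_c"])  # ThermalComfortAnomaly 28.5
```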

Schema evolution best practices

Summary of lessons learned from production deployments:

  - Make every new field optional within a MAJOR version; anything else requires a MAJOR bump.
  - Dual-write during MAJOR migrations, and keep the transition window short (months, not years).
  - Publish every schema version to the registry so sites can discover compatible upgrades.
  - Announce deprecation early, with a published sunset date and a migration guide.

Future directions: learned schema evolution

Currently, schema evolution is manual (experts design v2.0, write adapters). Future work could automate this using learned schema evolution:

  - mining event streams for recurring field patterns that suggest new event types,
  - auto-generating field-mapping adapters between versions, and
  - validating proposed schemas against the consumers that would receive them.

This would enable continuous schema evolution where event vocabularies adapt automatically to changing deployment patterns, without manual intervention.