We trained the first wave of modern AI on words about the world. The next wave will be trained on the world itself.
Agents that plan, coordinate, and act in physical spaces don't need more prose; they need context: who's here, what's moving, where the boundaries are, which patterns matter, and when something changes. That context is plentiful: floors that feel footsteps, meters that taste air, cameras that see depth, motors that announce torque, doors that confess traffic. But today it's scattered across brittle scripts and raw streams.
Spaxiom is my bet that the future belongs to a language of experience, a way to turn heterogeneous sensors into compact, actionable semantics for agents, and that INTENT is the pattern layer that makes those semantics reusable across buildings, clinics, factories, and cities.
This is an approachable guide to what we're building and why the opportunity is so large.
⸻
I. The current phase: language as simulation
Today's AI systems, from GPT to Claude to Gemini, are linguistic machines. They've read the collective library of human thought and can compress it into plausible sentences, code, and plans. They model meaning through text, not through sensation.
That's their superpower, but also their constraint. These systems don't see, touch, or measure. They infer. They reconstruct the world from written traces of it, like a scholar describing the ocean from shipping logs.
This text-based intelligence has carried us astonishingly far, into automation, reasoning, even creativity. But it's still a simulation layer. It's mind without matter.
⸻
II. The coming phase: the world as dataset
The next era of AI will be trained not just on data, but within environments.
Models will learn through embodiment, by reading sensors, watching flows, controlling systems, and closing loops between perception and consequence. This is already happening in robotics, climate modeling, energy systems, healthcare, and smart infrastructure. But it's fragmented: every domain speaks its own dialect of signals, every site its own messy schema.
If language models gave us semantic interoperability for human knowledge, we now need spatial interoperability for machine experience.
That's where Spaxiom comes in: a universal grammar for physical context, one that allows the built environment to speak fluently to AI.
Think of air traffic control. Raw radar returns millions of radio-wave reflections per second: meaningless blips. But air traffic control screens don't show blips. They show flight AA123, altitude 35,000 ft, speed 480 kts, heading toward collision zone. That translation, from radio echoes to actionable context, is exactly what Spaxiom does for buildings. Not "sensor 47 triggered," but "Queue formed, lobby entrance, 5 people, 3 minutes, growing."
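That translation step can be sketched as a small fold over a raw time series. A minimal illustration, not the Spaxiom API: the sensor readings, field names, and `QueueFormed` shape below are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical raw readings: (sensor_id, timestamp_s, people_in_view)
RAW = [
    ("sensor_47", 0, 1), ("sensor_47", 60, 3),
    ("sensor_47", 120, 4), ("sensor_47", 180, 5),
]

@dataclass
class QueueFormed:
    zone: str
    people: int
    duration_min: float
    growing: bool

def summarize(readings, zone):
    """Collapse a raw time series into one semantic event."""
    counts = [r[2] for r in readings]
    duration = (readings[-1][1] - readings[0][1]) / 60
    return QueueFormed(zone=zone, people=counts[-1],
                       duration_min=duration,
                       growing=counts[-1] > counts[0])

event = summarize(RAW, "lobby_entrance")
# e.g. QueueFormed(zone='lobby_entrance', people=5, duration_min=3.0, growing=True)
```

Four raw samples in, one fact out: the agent never sees the blips, only the queue.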
⸻
III. Preparing the physical world for intelligent collaboration
Think of Spaxiom as the Rosetta Stone for sensors.
Existing and emerging devices (smart floors, air monitors, PLCs, cameras, wearables) all observe fragments of reality. Spaxiom unifies them into a single spatiotemporal language that AI can parse and reason over.
By fusing raw sensor signals into meaningful abstractions like "five people entered this zone," "air quality degrading in aisle C," or "mobility hesitation detected in corridor 2," Spaxiom provides the bridge between environment and intelligence.
It turns the built world into a computable surface.
Where INTENT comes in is as the pattern layer, a library of reusable world-logic that lets AIs understand and respond to familiar situations (a queue forming, a machine overheating, a fall risk emerging) across any domain.
Together, Spaxiom and INTENT don't just collect data. They curate experience.
⸻
IV. From datasets to experiences
LLMs are extraordinary at compressing human culture, but they're starved for situational truth. In the physical world, meaning is spatiotemporal: where something happened, how long it persisted, what it co-occurred with, and who it affected. We don't want agents re-reading manuals; we want them noticing queues, easing congestion, anticipating maintenance, catching hazards, and stewarding energy in real time.
This requires a substrate that treats sensors as nerves, space as an addressable medium, and time as first-class logic.
⸻
V. What Spaxiom is (in one breath)
Spaxiom is a spatial + temporal DSL and runtime that fuses raw streams (pressure mats, depth/radar, air, vibration, PLCs, wearables, MQTT, etc.) into high-level events (e.g., QueueFormed, CrowdingInZone, NeedsService, ADL.Anomaly). Those events are compressed, typed, and agent-ready, so they can slot directly into planning loops or small LLM contexts without shipping kilobytes of raw time series. Think of it as a sensor cortex: it digests sensation into experience.
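A minimal sketch of what "compressed, typed, and agent-ready" could mean in practice. The event name, fields, and serialization are illustrative assumptions, not Spaxiom's actual schema:

```python
import json
from dataclasses import dataclass, asdict

@dataclass(frozen=True)
class CrowdingInZone:
    """Typed, agent-ready event: compact enough for a small LLM context."""
    zone: str
    occupancy: int
    capacity: int
    ts: int  # unix seconds

    def to_json(self) -> str:
        # Compact separators: a few dozen bytes replace kilobytes of raw series.
        return json.dumps({"type": type(self).__name__, **asdict(self)},
                          separators=(",", ":"))

evt = CrowdingInZone(zone="aisle_C", occupancy=23, capacity=20, ts=1700000000)
payload = evt.to_json()
```

Because the event is typed and frozen, a planning loop can pattern-match on it directly instead of parsing telemetry.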
⸻
VI. INTENT: a library of embodied patterns
On top of the DSL sits INTENT, a catalog of reusable spatiotemporal patterns (queue flow, occupancy fields, activities of daily living, facilities stewardship, safety envelopes for robots, and more). Patterns are opinionated mini-models that:
- Compress semantics (hundreds of readings into a few trusted facts),
- Embed domain logic (queueing theory, elder-care heuristics, FM best practices),
- Expose clean schemas (JSON events an agent or LLM can reason over).
Instead of re-wiring every site, you instantiate patterns and get a consistent vocabulary of real-world phenomena, portable across domains.
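As a hedged illustration of an "opinionated mini-model," here is what instantiating a queue pattern might look like. `QueueFlowPattern`, its parameters, and the wait-time heuristic (queue length over service rate) are assumptions made for this sketch, not the INTENT catalog:

```python
from dataclasses import dataclass

@dataclass
class QueueFlowPattern:
    """Hypothetical INTENT-style pattern: embeds a queueing heuristic."""
    zone: str
    service_rate_per_min: float  # domain parameter, e.g. measured checkout speed
    threshold: int = 5           # below this, emit nothing

    def evaluate(self, queue_length: int):
        if queue_length < self.threshold:
            return None  # compress silence: no event for uninteresting states
        # Simple heuristic: expected wait = queue length / service rate
        wait_min = queue_length / self.service_rate_per_min
        return {"type": "QueueFormed", "zone": self.zone,
                "length": queue_length, "est_wait_min": round(wait_min, 1)}

p = QueueFlowPattern(zone="checkout_3", service_rate_per_min=2.0)
short = p.evaluate(3)   # None: nothing noteworthy
long = p.evaluate(8)    # emits a clean JSON-ready dict
```

The same class, instantiated with a different zone and service rate, gives a clinic corridor or a factory gate the same vocabulary.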
⸻
VII. Why this unlocks the next AI platform
1) Token and energy efficiency
A handful of semantically dense events replace thousands of raw samples. You feed agents what changed and why it matters, not telemetry dumps. That means smaller prompts, faster decisions, cheaper loops, lower carbon per action.
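The compression claim is easy to check back-of-the-envelope. The telemetry shape and sampling rate below are invented for illustration:

```python
import json

# Hypothetical: one minute of raw pressure-mat telemetry at 10 Hz
raw = [{"sensor": "mat_12", "t": i / 10, "pressure_kpa": 3.2}
       for i in range(600)]
# One semantic event summarizing the same minute
event = {"type": "QueueFormed", "zone": "lobby", "people": 5, "minutes": 3}

raw_bytes = len(json.dumps(raw))
evt_bytes = len(json.dumps(event))
ratio = raw_bytes // evt_bytes  # hundreds of times smaller
```

Fewer bytes in the prompt means fewer tokens, faster loops, and less energy per decision.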
2) Generalizable agency
Because patterns are site-independent abstractions, you can move an agent from a retail lobby to a clinic corridor and it still understands crowding, queuing, wayfinding, routines, and anomalies. The world becomes legible and transferable.
3) Safety and forensics built in
When a decision matters, you want a trail of reasons, not a mystery. Structured events, temporal logic, and zone semantics create explainable, replayable histories of "what the agent knew, when."
⸻
VIII. What this looks like in practice
Spaxiom enables intelligent systems across wildly different domains. Here are four examples of how sensor fusion becomes semantic action:
- Cold Chain Logistics: pharmaceutical shipment integrity monitoring
- Medical Sterilization: predictive maintenance and compliance monitoring
- Wildfire Risk: predictive forest-fire danger assessment
- Human-Robot Safety: collision avoidance for collaborative robots
⸻
IX. The experience fabric
Single buildings are interesting. Networks of sites are transformative. Here's how it works:
The basic idea: Imagine you run 80 retail stores. Each store has cameras, occupancy sensors, and checkout scanners generating thousands of raw data points per second. Traditionally, you'd need to either (1) send all that raw video and sensor data to the cloud - expensive, slow, and privacy-invasive - or (2) analyze each store in isolation, learning nothing from patterns across your network.
The Spaxiom approach: Each store runs a local edge runtime that compresses raw sensors into semantic events using INTENT patterns. Instead of streaming video, Store #47 emits: QueueFormed(checkout_3, length=8, wait_time=4.2min). These high-level events - not raw pixels - flow to regional hubs where the system learns patterns: "Stores near universities see queue spikes at 3pm on Fridays" or "Locations with >15% elderly customers need 20% more staffing at checkout."
The payoff: Insights learned from your Boston store can improve operations in Seattle - without ever sharing camera footage between sites. Regional hubs discover these patterns and push back policies: updated thresholds, new event definitions, or refined INTENT patterns. This is federated learning - sites contribute knowledge to a shared intelligence layer, but raw sensor data stays local. You get cross-site benchmarking and continuous improvement for the physical world, with privacy built in.
Spaxiom supports this multi-site experience fabric: local edge runtimes emit events; regional hubs aggregate and learn; global policies fan back out. You get federated learning, cross-site benchmarking, and policy as code for the physical world, without centralizing raw video or sensitive streams.
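The edge-to-hub flow can be sketched in a few lines. Site names, the threshold, and the event shape are hypothetical; the point is that only events cross the site boundary:

```python
from collections import defaultdict

def edge_events(site, raw_queue_lengths, threshold=5):
    """Hypothetical edge runtime: raw frames never leave the site,
    only semantic events do."""
    for t, length in raw_queue_lengths:
        if length >= threshold:
            yield {"site": site, "t": t, "type": "QueueFormed",
                   "length": length}

# Regional hub: aggregates events across sites, never sees raw streams.
hub = defaultdict(list)
for evt in edge_events("store_47", [(0, 2), (60, 6), (120, 8)]):
    hub[evt["type"]].append(evt)

peak = max(e["length"] for e in hub["QueueFormed"])  # cross-site learnable fact
```

From here, a hub can compare `peak` across sites and push back updated thresholds without any video ever leaving a store.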
Reading the graph: Figure 6 shows how different sites (red circles) - a hospital and a retail store - share a common language. Each site has zones (orange circles) that generate event instances (small green circles). The key insight: both sites use the same event types (blue circles) like "OccupancyChanged" or "QueueFormed." When a hospital ward detects an occupancy change (e₁) and a retail checkout also detects one (e₂), the system recognizes these as instances of the same pattern. This shared vocabulary - the purple "Shared Semantic Ontology" - enables cross-site learning: a hospital lobby's queue management insights can inform retail checkout optimization, even though the physical contexts are completely different.
Scale and diversity: Figure 7 shows the full network in action. Six different industry verticals - hospitals, retail chains, smart buildings, manufacturing plants, agriculture, and data centers - all feeding semantic events into a central INTENT layer.
Here's why this matters: diversity makes the system smarter. A retail store that only learns from other retail stores will plateau quickly - there are only so many ways to optimize a checkout line. But when that same store can learn from hospital patient flow patterns, data center thermal dynamics, and manufacturing shift transitions, it discovers strategies it would never find in isolation. A data center's thermal optimization breakthrough might reveal a novel HVAC scheduling pattern that transforms smart building efficiency. A hospital's occupancy prediction model - trained on decades of patient flow - might revolutionize retail staffing algorithms. This is federated learning's superpower: the network gets smarter as it gets more diverse, not just as it gets bigger.
Events flow in (solid green arrows), patterns are learned centrally, and refined policies flow back out (dashed purple arrows). The system collectively processes 23,000 semantic events per day across 2,130 zones - all without ever moving raw sensor data between sites. Every site contributes to the collective intelligence; every site benefits from insights they could never generate alone.
⸻
X. Why now
Sensors are exploding in number; context-aware computing is becoming an infrastructure category; agents are moving from chat to control. The missing layer is a common grammar for space and time that agents can trust and developers can extend. That is exactly what Spaxiom/INTENT provides.
⸻
XI. The opportunity (and obligation)
- Markets: smart buildings, healthcare ops, manufacturing, logistics, retail, campuses, cities, and anywhere else the gap between what is sensed and what is decided wastes money, energy, or human patience.
- Economics: events are the new API. They enable marketplaces for experience data products: privacy-respecting aggregates that power planning, safety, and sustainability.
- Civics: many sensors (pressure, RF, thermal, air) are inherently privacy-preserving. Use them to build useful, non-extractive systems that improve comfort, safety, and throughput without watching faces.
⸻
XII. How to think with it
- Treat your environment like a programmable instrument.
- Model zones, routines, episodes; subscribe to meaning, not noise.
- Let agents ask for more context only when uncertainty is high.
- Make explainability default: every action linked to events and rules.
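"Subscribe to meaning, not noise" can be sketched as a tiny event bus. The `EventBus` class and event names are illustrative assumptions, not a real Spaxiom interface:

```python
class EventBus:
    """Hypothetical pub/sub: agents register for semantic events,
    never for raw sensor feeds."""
    def __init__(self):
        self.handlers = {}

    def subscribe(self, event_type, handler):
        self.handlers.setdefault(event_type, []).append(handler)

    def publish(self, event):
        for h in self.handlers.get(event["type"], []):
            h(event)

actions = []
bus = EventBus()
# The agent subscribes to meaning ("QueueFormed"), not raw ticks.
bus.subscribe("QueueFormed", lambda e: actions.append(("open_lane", e["zone"])))

bus.publish({"type": "RawTick", "sensor": 47})        # ignored: no subscriber
bus.publish({"type": "QueueFormed", "zone": "lobby"})  # triggers an action
```

Because every action is keyed to a named event, the explainability default comes almost for free: the action log is the reason log.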
⸻
XIII. Closing
Most AI still lives where language lives. The real world lives where experience happens. Spaxiom/INTENT is the bridge: a way to compress reality into the few, faithful facts an agent needs to act well, safely, efficiently, and humanely.
The shift is simple to say and enormous in consequence: stop feeding models words about places; start feeding them places, expressed as words.