How Spaxiom Reduces LLM Token Usage by 100-1500×
Joe Scanlin
November 2025
This section demonstrates how Spaxiom acts as a context compressor for AI agents, turning raw sensor deluges into compact intent streams that dramatically reduce token and energy usage.
You'll see the simple token model behind the 100-1500× compression ratios, how token savings translate into energy savings (kWh), and a visual comparison of raw sensor streams versus Spaxiom events over increasing time horizons. The analysis shows how Spaxiom enables long-horizon reasoning for agents without exploding token budgets.
A central claim of this paper is that a Spaxiom + INTENT stack can be drastically more token- and energy-efficient than sending raw sensor logs into LLMs.
Consider a deployment with S sensors, each sampled at f readings per second, observed over a horizon of T seconds.
If you naively serialize each reading as text for an LLM, at roughly k_r tokens per reading, the token count over horizon T is approximately:

tokens_raw(T) ≈ S · f · T · k_r
For example, take S = 100 sensors, f = 1 Hz, k_r = 10 tokens per reading, and T = 1 hour = 3600 s.
Then:

tokens_raw ≈ 100 · 1 · 3600 · 10 = 3.6 × 10⁶ tokens
Even if you aggressively compress and downsample, you're still in the millions of tokens for a modest time window.
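To make the arithmetic concrete, here is a minimal Python sketch of the raw-token model. The helper name `raw_tokens` is hypothetical, and the inputs are the illustrative example values above, not measurements:

```python
# A minimal sketch of the raw-token model: tokens_raw(T) ≈ S · f · T · k_r.
# All numeric values are the illustrative assumptions from the text.

def raw_tokens(num_sensors: int, sample_hz: float, horizon_s: float,
               tokens_per_reading: int) -> float:
    """Tokens needed to serialize every reading over the horizon."""
    return num_sensors * sample_hz * horizon_s * tokens_per_reading

# 100 sensors at 1 Hz for one hour, ~10 tokens per serialized reading
print(raw_tokens(100, 1.0, 3600, 10))  # 3600000.0 tokens
```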
With Spaxiom, the goal is to produce a small set of semantically dense events over the same horizon T: E events in total, each serializing to roughly k_e tokens.
Now the token cost becomes:

tokens_intent(T) ≈ E · k_e

with E ≪ S · f · T by design.
If we take E = 120 events per hour (roughly two salient events per minute) and k_e = 20 tokens per event, then over the same one-hour horizon:

tokens_intent ≈ 120 · 20 = 2.4 × 10³ tokens

That is a reduction factor of:

R = tokens_raw / tokens_intent ≈ (3.6 × 10⁶) / (2.4 × 10³) = 1500×
Even if our assumptions are off by an order of magnitude, 100× reductions are very plausible in realistic deployments.
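A companion sketch for the Spaxiom side of the ledger, reusing the same assumed numbers; `intent_tokens` is a hypothetical helper and the event rate is a modeling assumption, not a measured figure:

```python
# Sketch of the Spaxiom-side token cost, tokens_intent(T) ≈ E · k_e,
# and the resulting reduction factor R. Values are assumptions from the text.

def intent_tokens(num_events: int, tokens_per_event: int) -> int:
    """Tokens needed to serialize E semantically dense events."""
    return num_events * tokens_per_event

raw = 100 * 1.0 * 3600 * 10      # tokens_raw from the example above
intent = intent_tokens(120, 20)  # 120 events/hour, ~20 tokens each
print(intent)                    # 2400
print(raw / intent)              # 1500.0 — the headline reduction factor
```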
Recent work has begun to measure energy per token for LLM inference, with values on the order of a few joules per token for large models, depending on hardware and optimizations.
Let e be the marginal inference energy per token (J/token).
Then the energy cost of feeding a horizon T to a model is:

energy(T) ≈ e · tokens(T)
Using the numeric example above with e = 3 J/token:

energy_raw ≈ 3.6 × 10⁶ tokens × 3 J/token ≈ 10.8 MJ ≈ 3.0 kWh
energy_intent ≈ 2.4 × 10³ tokens × 3 J/token ≈ 7.2 kJ ≈ 0.002 kWh
Again, this is a back-of-the-envelope illustration, but it supports the claim that:
Spaxiom can act as a context compressor for agents, turning raw sensor deluges into compact intent streams that dramatically reduce token (and therefore energy) usage.
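The energy conversion is easy to sanity-check in code. This sketch assumes the ~3 J/token figure above; `inference_energy_kwh` is a hypothetical helper, and the constant is just the joules-to-kWh conversion:

```python
# Back-of-the-envelope energy conversion, assuming ~3 J/token
# (an order-of-magnitude figure from the text, not a benchmarked number).

JOULES_PER_KWH = 3.6e6  # 1 kWh = 3.6 MJ

def inference_energy_kwh(tokens: float, joules_per_token: float = 3.0) -> float:
    """energy(T) ≈ e · tokens(T), converted to kWh."""
    return tokens * joules_per_token / JOULES_PER_KWH

print(inference_energy_kwh(3.6e6))  # ≈ 3.0 kWh for the raw stream
print(inference_energy_kwh(2.4e3))  # ≈ 0.002 kWh for Spaxiom events
```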
Figure 1 (Context Compression Curves): Plot tokens vs. time horizon T on a log–log scale. Curve 1 (Raw): tokens_raw(T) ∝ T. Curve 2 (Spaxiom): tokens_intent(T) grows sublinearly or saturates as the number of salient events per unit time plateaus. The gap between the curves widens as T increases, showing how Spaxiom enables long-horizon reasoning for agents without exploding token budgets.
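The shape of Figure 1 can be reproduced with a few lines of matplotlib. The logarithmic event-growth model below is purely an assumption chosen to illustrate a saturating Spaxiom curve, not a measured property of Spaxiom:

```python
# Sketch of the Figure 1 compression curves on a log–log scale.
# Raw tokens grow linearly in T; intent tokens are modeled here as
# growing logarithmically (an illustrative assumption).
import numpy as np
import matplotlib.pyplot as plt

hours = np.logspace(0, 3, 100)                   # horizons from 1 h to 1000 h
raw = 100 * 1.0 * 3600 * 10 * hours              # tokens_raw(T) ∝ T
intent = 120 * 20 * np.log1p(hours) / np.log(2)  # sublinear event growth (assumed)

plt.loglog(hours, raw, label="Raw sensor stream")
plt.loglog(hours, intent, label="Spaxiom intent events")
plt.xlabel("Horizon T (hours)")
plt.ylabel("Tokens fed to the LLM")
plt.legend()
plt.show()
```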