stele
structured data format optimized for LLM consumption
The Problem
Every API call has a cost. Every token in that call adds to it.
JSON burns tokens on ceremony:
```json
{"users":[{"id":1,"name":"alice","role":"admin"},{"id":2,"name":"bob","role":"user"}]}
```
Quotes. Braces. Colons. Repeated keys. Every row repeats the same field names.
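To make the overhead concrete, here is a minimal sketch that splits the example payload's characters into values and everything else. It assumes characters are an acceptable rough proxy for tokens; real token counts depend on the tokenizer. `ceremony_report` is a hypothetical helper name and is not part of any stele tooling.

```python
import json

# The JSON example from above, kept verbatim.
PAYLOAD = '{"users":[{"id":1,"name":"alice","role":"admin"},{"id":2,"name":"bob","role":"user"}]}'


def ceremony_report(text: str, collection: str) -> dict:
    """Split a JSON payload's characters into values versus everything else.

    Hypothetical helper for illustration. Characters only approximate tokens.
    """
    rows = json.loads(text)[collection]
    # Characters that carry actual data: the values themselves, unquoted.
    value_chars = sum(len(str(v)) for row in rows for v in row.values())
    # Every key is spelled out again in every row.
    key_mentions = sum(len(row) for row in rows)
    total = len(text)
    return {
        "total_chars": total,
        "value_chars": value_chars,
        "ceremony_chars": total - value_chars,
        "key_mentions": key_mentions,
    }


report = ceremony_report(PAYLOAD, "users")
print(report)
```

For this two-row payload, only 19 of the 86 characters are values. The rest is quotes, braces, colons, commas, and key names, and the key names recur in every row.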
The Solution
stele declares the schema once, then streams the values.
Schema declared once. Data rows are pure values. 30-50% fewer tokens than JSON.
Design Philosophy
stele is built on one principle: minimize tokens while maximizing model comprehension.
| Goal | How |
|---|---|
| Token efficiency | Eliminate JSON’s syntactic overhead |
| Model parseability | Structure that LLMs extract accurately without examples |
| Schema compression | Declare field names once, reference by position |
Human readability is a secondary benefit, useful for debugging. But make no mistake: stele exists because every token costs money, and JSON burns tokens on ceremony.
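The schema compression idea from the table can be sketched in a few lines. This is an illustration of "declare field names once, reference by position" only: the pipe-delimited layout below is an assumption chosen for the example and is not stele's actual grammar. The names `encode_rows` and `decode_rows` are hypothetical.

```python
def encode_rows(name: str, records: list[dict]) -> str:
    """Encode flat, uniform records as one header line plus value-only rows.

    Illustrative layout, NOT stele's real syntax. Assumes every record has
    the same keys and that no value contains '|' or a newline (no escaping).
    """
    fields = list(records[0].keys())
    # Field names appear exactly once, in the header.
    header = "|".join([name] + fields)
    # Each row carries only values, matched to fields by position.
    rows = ["|".join(str(record[f]) for f in fields) for record in records]
    return "\n".join([header] + rows)


def decode_rows(text: str) -> tuple[str, list[dict]]:
    """Rebuild records by zipping the header's field names with each row.

    This sketch keeps no type information, so every value comes back as a string.
    """
    lines = text.split("\n")
    name, *fields = lines[0].split("|")
    records = [dict(zip(fields, line.split("|"))) for line in lines[1:]]
    return name, records


users = [
    {"id": 1, "name": "alice", "role": "admin"},
    {"id": 2, "name": "bob", "role": "user"},
]
encoded = encode_rows("users", users)
print(encoded)
decoded_name, decoded_records = decode_rows(encoded)
```

The header costs the same no matter how many rows follow, so the saving over repeated JSON keys grows with the number of records. A real format also has to handle types, escaping, and nesting, which this sketch leaves out.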
Quick Comparison
| Format | Haiku Accuracy | Tokens (50 records) |
|---|---|---|
| JSON | baseline | 6,757 |
| TOON | 59.8% | 8,744 (+29%) |
| stele | 100% | 5,918 (-12%) |
In this comparison stele matches JSON's accuracy while using fewer tokens. Smaller models handle it cold, with no examples in the prompt.
Implementation
The reference implementation is base-d, a Rust CLI and library.
```sh
# JSON to stele
echo '{"users":[{"id":1,"name":"alice"}]}' | base-d stele

# stele to JSON
echo '@users|id^i|name^s*1|alice' | base-d stele -d
```
Related Formats
| Format | Model Reads Structure | Compression | Use Case |
|---|---|---|---|
| stele | Yes | 30-50% | Working data |
| carrier98 | No | 90-97% | Shuttle data |
They are siblings. Same family, different jobs.