JSON vs TOON: Save tokens on your LLMs
Discover TOON, the new format that promises to dethrone JSON for feeding LLMs and save you a fortune in tokens.
Hello everyone! Welcome back to the blog.
Today I'm here to talk about a topic that is on the lips of everyone working with Artificial Intelligence and LLMs (Large Language Models): token optimization. Because, my friends, in the world of AI, tokens are money 💸. Literally.
If you've been "tinkering" with models like GPT-4, Claude, or Gemini for a while, you'll know that feeding the beast with structured data is our daily bread. Whether it's for RAG (Retrieval-Augmented Generation), fine-tuning, or simply providing context, we need to pass data to the model. And this is where our old friend JSON has been the undisputed king… until now.
The problem with JSON: The "verbosity tax"
Don't get me wrong, I love JSON. It's readable, easy to parse, and universal. It has been the de facto standard for data exchange on the web for years. But when it comes to sending large amounts of data to an LLM, JSON has a "small" problem: it is extremely verbose.
Imagine you have a list of 1000 users. In JSON, you repeat the keys id, name, role, etc., 1000 times!
{
  "users": [
    { "id": 1, "name": "Jorge", "role": "Developer", "active": true },
    { "id": 2, "name": "Maria", "role": "Designer", "active": false }
    // ... imagine this repeated 1000 times 😱
  ]
}
Every time you repeat "name":, you are spending tokens. Tokens that you could be using to give more context to the model or to get a longer response. It's redundant structure that the model doesn't need to see constantly to understand the data.
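To get a feel for the size of that "tax", here is a rough sketch that counts how many characters of a JSON payload go to repeated key names. The sample data is made up for illustration, and characters are only a proxy for tokens: real counts depend on the tokenizer of the model you use.

```python
import json

# Hypothetical sample data mirroring the users example above.
users = [
    {"id": i, "name": f"user{i}", "role": "Developer", "active": True}
    for i in range(1, 1001)
]

payload = json.dumps({"users": users})

# Characters spent on each quoted key plus its colon, paid once per record.
key_chars_per_record = sum(len(json.dumps(key)) + 1 for key in users[0])
key_overhead = key_chars_per_record * len(users)

print(f"total characters: {len(payload)}")
print(f"characters spent on repeated keys: {key_overhead}")
print(f"share of payload: {key_overhead / len(payload):.0%}")
```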
What is TOON? 🦸‍♂️
This is where TOON (Token-Oriented Object Notation) comes into play. It is a format designed specifically to be "token-friendly". Its goal is to maintain the structure of the data but eliminate the syntactic redundancy that costs us so much on the API bill.
Think of TOON as a translation layer: you use JSON in your code (because it's comfortable), but you encode it to TOON before sending it to the LLM.
TOON's philosophy is simple but brilliant: it combines the indentation-based structure of YAML (for nested objects) with the tabular efficiency of CSV (for uniform arrays).
How does it work?
The key is to declare the structure only once and then "stream" the data.
Let's see the previous example converted to TOON:
users[2]{id,name,role,active}:
  1,Jorge,Developer,true
  2,Maria,Designer,false
Look at the cleanliness! 🧹
- users[2]: Explicitly declares the length of the array. This helps the LLM know if generation has been cut off or if data is missing.
- {id,name,role,active}: Defines the headers (the keys) only once.
- 1,Jorge...: The data goes in rows, separated by commas, just like in a CSV.
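To make those three pieces concrete, here is a minimal sketch of an encoder for the tabular case only. It is not the official TOON implementation: `encode_table` is a hypothetical helper, the two-space row indentation is an assumption about the layout, and values that would need quoting (commas, quotes, newlines) are rejected rather than escaped.

```python
def encode_table(name, rows):
    """Encode a uniform list of flat dicts as a TOON-style table (sketch)."""
    if not rows:
        raise ValueError("need at least one row")
    fields = list(rows[0])
    if any(list(row) != fields for row in rows):
        raise ValueError("all rows must share the same keys in the same order")

    def fmt(value):
        # JSON-style literals for booleans and null; everything else via str().
        if value is True:
            return "true"
        if value is False:
            return "false"
        if value is None:
            return "null"
        if isinstance(value, str) and any(ch in value for ch in ',"\n'):
            raise ValueError("quoting is not handled in this sketch")
        return str(value)

    # Header: name, declared length, and the field names listed once.
    header = name + "[" + str(len(rows)) + "]{" + ",".join(fields) + "}:"
    lines = ["  " + ",".join(fmt(row[f]) for f in fields) for row in rows]
    return "\n".join([header, *lines])


users = [
    {"id": 1, "name": "Jorge", "role": "Developer", "active": True},
    {"id": 2, "name": "Maria", "role": "Designer", "active": False},
]
print(encode_table("users", users))
```

The length and the field list are computed from the data, so the header can never disagree with the rows that follow it.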
We have eliminated all the quotes from the keys, the repetitive braces, and the repeated keys themselves. For an array of 2 elements it might not seem like much, but scale this to thousands of records and the difference is enormous.
A more complex example: The best of both worlds
TOON is not just a glorified CSV. Its real power is seen when we mix objects and arrays. Look at this example taken from its official documentation, where we have a context (object) and lists of data (arrays):
In JSON:
{
  "context": {
    "task": "Our favorite hikes",
    "location": "Pyrenees",
    "season": "summer_2025"
  },
  "friends": ["ana", "luis", "sam"],
  "hikes": [
    { "id": 1, "name": "Monte Perdido", "km": 15.5, "hard": true },
    { "id": 2, "name": "Aneto", "km": 12.2, "hard": true },
    { "id": 3, "name": "Cola de Caballo", "km": 18.0, "hard": false }
  ]
}
In TOON:
context:
  task: Our favorite hikes
  location: Pyrenees
  season: summer_2025
friends[3]: ana,luis,sam
hikes[3]{id,name,km,hard}:
  1,Monte Perdido,15.5,true
  2,Aneto,12.2,true
  3,Cola de Caballo,18.0,false
Here we see the magic:
- context uses a YAML-like style (key-value with indentation).
- friends is an array of primitives, super compact.
- hikes is an array of objects, rendered as a table.
The format automatically adapts to the structure of your data to be as efficient as possible.
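That "adapts to the structure" idea can be sketched as a small recursive function that picks one of the three shapes per key. Again, this is an illustrative sketch and not the official encoder: the function names are hypothetical, quoting and escaping are ignored, and anything outside these three shapes raises an error.

```python
def fmt(value):
    """Render a primitive using JSON-style literals."""
    if value is True:
        return "true"
    if value is False:
        return "false"
    if value is None:
        return "null"
    return str(value)


def is_primitive(value):
    return value is None or isinstance(value, (bool, int, float, str))


def is_uniform(items):
    """True for a non-empty list of dicts with identical keys and primitive values."""
    if not items or not all(isinstance(item, dict) for item in items):
        return False
    fields = list(items[0])
    return all(
        list(item) == fields and all(is_primitive(v) for v in item.values())
        for item in items
    )


def encode(obj, depth=0):
    """Return the TOON-style lines for a dict, choosing a shape per key."""
    pad = "  " * depth
    lines = []
    for key, value in obj.items():
        if isinstance(value, dict):
            # Nested object: YAML-like key followed by an indented block.
            lines.append(f"{pad}{key}:")
            lines.extend(encode(value, depth + 1))
        elif is_primitive(value):
            lines.append(f"{pad}{key}: {fmt(value)}")
        elif isinstance(value, list) and all(is_primitive(v) for v in value):
            # Array of primitives: declared length plus one inline row.
            head = f"{pad}{key}[{len(value)}]:"
            if value:
                head += " " + ",".join(fmt(v) for v in value)
            lines.append(head)
        elif isinstance(value, list) and is_uniform(value):
            # Array of uniform objects: header once, then one row per object.
            fields = list(value[0])
            lines.append(pad + key + "[" + str(len(value)) + "]{" + ",".join(fields) + "}:")
            for row in value:
                lines.append(pad + "  " + ",".join(fmt(row[f]) for f in fields))
        else:
            raise ValueError(f"shape of {key!r} is not handled in this sketch")
    return lines


data = {
    "context": {"task": "Our favorite hikes", "location": "Pyrenees", "season": "summer_2025"},
    "friends": ["ana", "luis", "sam"],
    "hikes": [
        {"id": 1, "name": "Monte Perdido", "km": 15.5, "hard": True},
        {"id": 2, "name": "Aneto", "km": 12.2, "hard": True},
        {"id": 3, "name": "Cola de Caballo", "km": 18.0, "hard": False},
    ],
}
print("\n".join(encode(data)))
```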
Why use TOON? (Design Goals) 🎯
According to its creators, TOON has very clear design goals that make it ideal for LLMs:
- Token Efficiency: Reduces token usage by 30% to 60% compared to pretty-printed JSON.
- Schema-Aware: By including the array length [N] and headers, we give explicit hints to the model. This reduces hallucinations and helps validate that the output is complete.
- Human Readability: Unlike binary formats or extreme minification, TOON remains readable by us.
- Lossless: It is a lossless representation of the JSON data model. You can go from JSON -> TOON -> JSON without losing anything.
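The "validate that the output is complete" point is easy to sketch: parse the header, then compare the declared length with the rows you actually received. The parser below is a simplified assumption of mine, not the official decoder. `parse_table` is a hypothetical name, it handles a single table, and it leaves every cell as a string instead of restoring types.

```python
import re

# name[N]{field1,field2,...}:
HEADER = re.compile(r"^(\w+)\[(\d+)\]\{([^}]*)\}:$")


def parse_table(text):
    """Parse one TOON-style table and verify the declared row count."""
    header, *rows = [line for line in text.splitlines() if line.strip()]
    match = HEADER.match(header.strip())
    if match is None:
        raise ValueError(f"not a table header: {header!r}")
    name = match.group(1)
    declared = int(match.group(2))
    fields = match.group(3).split(",")
    if len(rows) != declared:
        raise ValueError(f"{name}: declared {declared} rows, found {len(rows)}")
    records = []
    for row in rows:
        cells = row.strip().split(",")
        if len(cells) != len(fields):
            raise ValueError(f"{name}: expected {len(fields)} cells in {row!r}")
        records.append(dict(zip(fields, cells)))
    return name, records


complete = "users[2]{id,name,role,active}:\n  1,Jorge,Developer,true\n  2,Maria,Designer,false"
truncated = complete.rsplit("\n", 1)[0]  # simulate a response cut off after row 1

print(parse_table(complete))
try:
    parse_table(truncated)
except ValueError as error:
    print("incomplete output detected:", error)
```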
When to use it (and when NOT)? 🚦
Like any tool, it's not a silver bullet. Here is my recommendation:
✅ Use it when:
- You have uniform arrays of objects (lists of products, users, logs, transactions). This is where TOON shines and destroys JSON in efficiency.
- Token cost is a concern (and when isn't it?).
- You need to maximize the context window to fit more information.
❌ Do not use it when:
- Your data is very irregular or deeply nested without repetitive patterns. If every object has different keys, TOON cannot use its tabular format and ends up looking like YAML, losing its advantage.
- You need ultra-low latency in very small local models that may never have seen this format (although large models understand it perfectly with a proper prompt).
- It's purely tabular and flat data: there a simple CSV might be even lighter (although TOON adds safety with types and lengths).
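If you want to automate the first check, a quick heuristic is to test whether an array is "uniform" before bothering to convert it. This is a sketch under my own definition of uniform (same keys everywhere, only primitive values), and `is_tabular` is a hypothetical helper, not part of any TOON library.

```python
def is_tabular(items):
    """True if items is a non-empty list of flat dicts that all share the same keys."""
    if not isinstance(items, list) or not items:
        return False
    if not all(isinstance(item, dict) for item in items):
        return False
    keys = set(items[0])
    primitive = (str, int, float, bool, type(None))
    return all(
        set(item) == keys and all(isinstance(v, primitive) for v in item.values())
        for item in items
    )


products = [{"sku": "A1", "price": 9.5}, {"sku": "B2", "price": 12.0}]
irregular = [{"sku": "A1"}, {"name": "Widget", "tags": ["x", "y"]}]

print(is_tabular(products))   # uniform keys, primitive values
print(is_tabular(irregular))  # different keys and a nested list
```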
Benchmarks and Savings
Preliminary tests are impressive. In typical RAG datasets, the savings are substantial.
Token counts per format. The percentage is TOON's change relative to that format (negative means TOON uses fewer tokens).

E-commerce orders with nested structures (Tabular: 33%)
  TOON                72,771 tokens
  vs JSON            108,806 tokens  (-33.1%)
  vs JSON compact     68,975 tokens  (+5.5%)
  vs YAML             84,780 tokens  (-14.2%)
  vs XML             122,406 tokens  (-40.5%)

Semi-uniform event logs (Tabular: 50%)
  TOON               153,211 tokens
  vs JSON            180,176 tokens  (-15.0%)
  vs JSON compact    127,731 tokens  (+19.9%)
  vs YAML            154,505 tokens  (-0.8%)
  vs XML             204,777 tokens  (-25.2%)

Deeply nested configuration (Tabular: 0%)
  TOON                   631 tokens
  vs JSON                919 tokens  (-31.3%)
  vs JSON compact        564 tokens  (+11.9%)
  vs YAML                673 tokens  (-6.2%)
  vs XML               1,008 tokens  (-37.4%)

Total
  TOON               226,613 tokens
  vs JSON            289,901 tokens  (-21.8%)
  vs JSON compact    197,270 tokens  (+14.9%)
  vs YAML            239,958 tokens  (-5.6%)
  vs XML             328,191 tokens  (-31.0%)
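Imagine trimming your OpenAI or Anthropic bill just by changing the input data format: in these tests TOON uses about 22% fewer tokens than pretty-printed JSON overall, and up to 33% fewer on the e-commerce dataset. 🤯 One caveat: compact (minified) JSON still comes out smaller than TOON in all three datasets, so measure with your own data before switching.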
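If you want to double-check those percentages, they can be recomputed from the total token counts in the benchmark. A small sketch, where the only inputs are the totals copied from the figures above:

```python
# Total token counts copied from the benchmark figures above.
totals = {
    "JSON": 289_901,
    "JSON compact": 197_270,
    "YAML": 239_958,
    "XML": 328_191,
}
toon_total = 226_613


def change_vs(baseline, toon=toon_total):
    """Relative change in tokens when switching from the baseline format to TOON."""
    return (toon - baseline) / baseline


for name, count in totals.items():
    print(f"TOON vs {name}: {change_vs(count):+.1%}")
```

A negative number means TOON is cheaper than that format, a positive one means it is more expensive.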
Conclusion
Optimization is key in this new era of AI. Tools like TOON help us be more efficient and build better products. It's not just about saving money, but making our applications faster and capable of processing more information.
TOON is still young, but its value proposition is undeniable. If you are building data-intensive applications with LLMs, I encourage you to try it. You can find more information and the full documentation on their official website.
And you? Have you already tried alternative formats to JSON like YAML or XML for your prompts? Tell me on social media!
I hope you liked this post and that it helps you save a few tokens (and dollars).
Greetings and see you in the next post!
Peace ✌️.