singular-particular-space/skills/commissioning-skill/token-budgets.md

# Token budgets

Token budgets for spore files are set by the parent agent at commission time
based on the scion's model class. The goal is not to minimise tokens — it is
to maximise signal density within the model's reliable execution window.

---

## Why this matters

Context rot is empirically universal: every model degrades before its nominal
context window limit, and smaller models degrade faster. A dense, denoised
500-token spore file produces better scion outputs than a verbose 2,000-token
one — not just cheaper outputs. Signal density is a quality constraint, not
only a cost constraint.

Source: Chroma research 2025; Liu et al. "Lost in the Middle" 2024;
CompLLM 2025 (compressed context outperforming uncompressed on small models).

The nominal context window is almost irrelevant. What matters is the model's
reliable execution window — where multi-step reasoning remains coherent. For
agentic tasks this is a fraction of the nominal limit.

---

## Per-model ceilings

These are working estimates. Not empirically tested against this architecture.
Measure scion task success rate as a function of spore file size for your
specific task classes, then adjust.

| Scion model class     | Nominal window | Spore file ceiling | Per-entry target |
|-----------------------|----------------|-------------------|-----------------|
| Haiku 4.5             | 200K           | 600 tokens        | ~60 tokens      |
| Gemini 2.5 Flash      | 1M             | 800 tokens        | ~60 tokens      |
| Sonnet-class scion    | 200K           | 1,200 tokens      | ~80 tokens      |

The ceiling applies to the full spore file content, not per entry.
At 60 tokens per entry, a 600-token ceiling accommodates approximately
8-10 entries — enough for most well-scoped task classes.

---

## Frontier model note

A frontier parent reading a spore file written for Haiku-class scions will
process it with less interpretive load, not more. The density that is required
for a small model is simply easier for a large model. There is no upper bound
on how dense context can be for a frontier model — only a lower bound on how
dense it must be for a small one.

This means spore files composed for the smallest expected scion are valid for
any larger scion that might also read them. Write for the least capable consumer.

---

## Prompt caching note

Spore files are near-perfect prompt caching targets because they change slowly
and load at the beginning of the scion's context. Anthropic's prompt caching
charges 10% of base input token price on cache hits. A stable spore file
achieves near-100% cache hit rate — the token cost is effectively paid once
across all scion runs in its task class.

Keep spore file content stable between runs. Avoid injecting timestamps,
run IDs, or other variable content into the spore file — variable content
breaks the cache prefix and forces a full re-read on every run.

---

## Counting tokens

Rough heuristic for English prose: 1 token ≈ 0.75 words, or ~4 characters.
A 60-token entry is approximately 45 words or 240 characters. Write an entry,
count the words, multiply by 1.33 — if the result exceeds 60, trim.

For precise counts, use the Anthropic tokenizer or the `tiktoken` library
(cl100k_base encoding as a close approximation for Claude models).