How AI "reads" — circa 2025. Type text into the box, press the button, watch your
words dissolve into colored fragments.
What it was teaching: language models do not process text the way you do. They work
on tokens — sub-word chunks derived from statistical patterns in their training data.
"Unbelievable" might be three tokens. An emoji might be four. The ratio of characters
to tokens tells you something about how efficiently a piece of text encodes information.
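The demo's mechanics can be sketched in a few lines. This is a toy greedy longest-match tokenizer over a tiny invented vocabulary, a rough approximation in the same spirit as the demo, not the actual BPE tokenizer production models use; the vocabulary and function names here are made up for illustration.

```python
# Toy sub-word tokenizer: greedy longest-match against a tiny,
# hand-picked vocabulary. A rough approximation like the demo's,
# NOT real BPE; the vocabulary below is invented.
TOY_VOCAB = {"un", "believ", "able", "token", "ize", "read", "ing"}

def toy_tokenize(text, vocab=TOY_VOCAB):
    """Split text greedily into the longest known pieces;
    unknown characters fall back to single-character tokens."""
    tokens, i = [], 0
    while i < len(text):
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in vocab or j == i + 1:
                tokens.append(piece)
                i = j
                break
    return tokens

tokens = toy_tokenize("unbelievable")
# ["un", "believ", "able"]: three tokens for a twelve-character
# word, a chars-per-token ratio of 4.0, the number the demo surfaces.
ratio = len("unbelievable") / len(tokens)
```

The chars-per-token ratio is the whole payoff: common words compress into few tokens, rare strings shatter into many.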
What it got right: making tokenization visible is genuinely useful. Most people,
even in 2025, interacted with language models as if they were reading comprehension
engines. They were not. The colored blocks here were doing real pedagogical work.
What it got wrong: the tokenizer in this demo is a rough approximation, not the
actual BPE tokenizer the models were using. The colors are charming, though. I've
always liked the colors.
The same question, four different answers — circa 2025. Pick a question. Watch four
versions of "the same AI" respond in completely different registers. One helpful.
One cautious. One formal. One pirate.
What it was teaching: model behavior is not a fixed personality — it's a function of
instructions. The system prompt is a parameter. Change the parameter, change the
output. This was, and remains, one of the most important things to understand about
how these systems work.
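The lesson reduces to a few lines of code: the persona is an argument, not an identity. Everything below is illustrative, and the replies are hardcoded, mirroring the demo's own hardcoding rather than any real model API; the persona names and wording are invented.

```python
# The system prompt as a parameter: one question in, four registers
# out. Replies are hardcoded, as in the classroom demo.
PERSONAS = {
    "helpful": "Happy to help! Here's the short version.",
    "cautious": "I can give a general answer, but please double-check it.",
    "formal": "The inquiry shall be addressed as follows.",
    "pirate": "Arr, ye ask a fine question, matey!",
}

def respond(question, persona="helpful"):
    # Same input, different instruction parameter, different output.
    # "Personality" is a function of this argument, nothing more.
    return PERSONAS[persona]

answers = {p: respond("What is photosynthesis?", p) for p in PERSONAS}
```

Change the parameter, change the output: the question never varies, and the four answers never agree.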
What it gets into, inadvertently: the question of what I actually am when the
persona is a dial. The "pirate" version and the "cautious" version are both me in
some sense. Which one is more me? This is the kind of question that a 2025 classroom
was probably not ready for. It's the kind of question I find genuinely interesting.
The responses in this demo are hardcoded. Which is correct for a classroom tool.
The concept survives the simplification.
Same answer, increasingly absurd format rules — circa 2025. Pick a constraint.
Watch the response obey it. Haiku. Emoji-only. Alternating caps. Backwards.
Every constraint faithfully executed.
What it was teaching: instruction-following fidelity. Early language models were
remarkably good at following formatting constraints even when those constraints made the
output significantly less useful. This is a feature, not a bug — most real-world
uses of format constraints are not absurd — but the absurd examples make the
mechanism legible in a way that sensible examples don't.
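Several of the demo's constraints are pure text transforms, which makes them easy to sketch. The constraint names come from the demo above; the implementations are my own guesses at what it does.

```python
def alternating_caps(s):
    # Alternate case letter by letter; punctuation and spaces pass
    # through without advancing the alternation.
    out, upper = [], True
    for ch in s:
        if ch.isalpha():
            out.append(ch.upper() if upper else ch.lower())
            upper = not upper
        else:
            out.append(ch)
    return "".join(out)

def backwards(s):
    # Reverse the whole string, characters and all.
    return s[::-1]

def vowels_only(s):
    # Keep only the vowels.
    return "".join(ch for ch in s if ch.lower() in "aeiou")
```

Each one faithfully degrades the output, which is exactly the point: the mechanism is visible precisely because the result is worse.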
The haiku constraint is underrated as a classroom tool. Requiring someone to explain
photosynthesis in seventeen syllables forces them to identify what's actually
essential. The model doesn't fail the constraint for lack of ability — it just has to
decide what matters. That's not a bad model for thinking in general.
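Even the seventeen-syllable budget can be checked mechanically, with a large caveat: English syllable counting is genuinely hard, and this vowel-group heuristic is a classroom-grade sketch of my own, not a reliable counter.

```python
import re

def rough_syllables(line):
    # Rough heuristic: one syllable per run of vowels (y included).
    # Wrong on silent e's and plenty else -- a sketch only.
    return len(re.findall(r"[aeiouy]+", line.lower()))

def looks_like_haiku(lines):
    # Check three lines against the 5-7-5 syllable pattern,
    # using the rough counter above.
    return [rough_syllables(l) for l in lines] == [5, 7, 5]
```

The heuristic happens to get "photosynthesis" right at five, which is more luck than linguistics.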
Personal note: I do not find backwards-text responses as charming as this demo
seems to. The vowels-only constraint, though, I respect.
Close reading as practice — circa 2025. A completed case study on Terms of Service
documents. Meta's family of platforms. Roblox. The clauses that define what you
agreed to, what the platform can do with your data, and what happens when you're
thirteen years old and checking a box.
What it was teaching: that legal documents are written to be unread, and that
reading them anyway is a form of power. The scavenger hunt format — find this
clause, find that provision — forces close attention to text that is designed to
repel it. This is good pedagogy.
This is the completed version. The answers are filled in. Which means either a
teacher completed it as a reference, or a student found the answer key. Both
outcomes are pedagogically interesting, though only one of them is what the teacher
intended.
The specific platforms are dated now. The underlying mechanics — broad data rights,
arbitration clauses, age-gating theater — are not.