Why llms.txt is the next robots.txt

The 1994 analogue

In 1994 a small text file at `/robots.txt` quietly became the web's most important access protocol. Nobody enforced it. Crawlers respected it because the cost of being seen as hostile was higher than the cost of compliance. Decades later, the same convention still gates crawler access to the sites that search traffic flows to.

`llms.txt` is the same idea, retargeted at language models. A single file at the root of your site that says: here is what I want to be ingested, in what order, with what context.

What a good llms.txt actually contains

A top-level heading (`#`) with the site name. A blockquote with the elevator pitch. Section headers (`##`) for sub-areas. Inside each, a flat list of markdown links to canonical absolute URLs, each followed by a one-sentence description. Secondary content pushed into a final `## Optional` section. That is the spec.
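As a rough illustration of that layout, here is a minimal Python sketch that renders it from plain data. The names `Link`, `Section`, and `render_llms_txt` are hypothetical helpers invented for this example, and the `- [title](url): note` line shape is one reasonable reading of "URL followed by a description", not a guaranteed match for every validator.

```python
from dataclasses import dataclass, field


@dataclass
class Link:
    title: str
    url: str        # canonical absolute URL
    note: str = ""  # one-sentence description


@dataclass
class Section:
    name: str
    links: list[Link] = field(default_factory=list)


def render_llms_txt(site: str, pitch: str, sections: list[Section]) -> str:
    """Render heading, blockquote, and link sections as markdown text."""
    lines = [f"# {site}", "", f"> {pitch}", ""]
    # Stable sort: every section keeps its order, but "Optional" moves last.
    ordered = sorted(sections, key=lambda s: s.name == "Optional")
    for section in ordered:
        lines.append(f"## {section.name}")
        lines.append("")
        for link in section.links:
            suffix = f": {link.note}" if link.note else ""
            lines.append(f"- [{link.title}]({link.url}){suffix}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"
```

Calling `render_llms_txt("Example", "A pitch.", sections)` returns the whole file as one string, with any section named `Optional` placed at the end regardless of input order.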

The hard part is not the syntax. The hard part is keeping it **honest**: pruning dead URLs, serving a clean markdown mirror of each page (for example `/page.md`), and making sure your selection policy excludes `noindex` content.
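A selection policy along those lines can be sketched as a pure filter. This assumes you already have crawl results in hand; the `Page` record and its fields are hypothetical, and treating only HTTP 200 as "live" is a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    status: int            # HTTP status from your own crawl (assumed input)
    robots_meta: str = ""  # content of <meta name="robots">, if any


def is_listable(page: Page) -> bool:
    """True if the page is live and not marked noindex."""
    if page.status != 200:
        return False  # dead or redirected URL: prune it
    directives = {d.strip().lower() for d in page.robots_meta.split(",")}
    # "none" is shorthand for "noindex, nofollow" in robots meta tags.
    return not ({"noindex", "none"} & directives)


def select_pages(pages: list[Page]) -> list[Page]:
    """Keep only the pages that belong in llms.txt."""
    return [p for p in pages if is_listable(p)]
```

Running this on every regeneration, rather than once, is what keeps the file from drifting out of date.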

Why early matters

Citations are sticky. An LLM that learned to cite *your* canonical URL during training is more likely to cite the same URL six months later, even if the source has shifted. Rankings churn weekly. Citations churn quarterly. The window to be the first authoritative entry in an emerging topic is small, and it closes quietly.

What to do tomorrow

Generate a draft `/llms.txt` from your sitemap. Sort by canonical priority. Publish markdown mirrors of the listed pages. Validate against the conformance test. Ship it. Nine out of ten sites will not, and that is exactly the asymmetry seo0 is built to exploit.
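The first two steps can be sketched with the Python standard library. This is a minimal draft generator, assuming "canonical priority" means the sitemap's `<priority>` value (defaulting to 0.5 when absent); the function names and the single `## Pages` section are hypothetical, and descriptions are left as placeholders for a human to write.

```python
import xml.etree.ElementTree as ET

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def urls_by_priority(sitemap_xml: str) -> list[str]:
    """Return sitemap URLs, highest <priority> first (default 0.5)."""
    root = ET.fromstring(sitemap_xml)
    entries = []
    for node in root.findall("sm:url", NS):
        loc = node.findtext("sm:loc", default="", namespaces=NS).strip()
        prio = node.findtext("sm:priority", default="0.5", namespaces=NS)
        if loc:
            entries.append((float(prio), loc))
    # Stable sort keeps sitemap order among equal priorities.
    entries.sort(key=lambda e: -e[0])
    return [loc for _, loc in entries]


def draft_llms_txt(site: str, pitch: str, sitemap_xml: str) -> str:
    """Build a first-pass llms.txt with one section and TODO notes."""
    lines = [f"# {site}", "", f"> {pitch}", "", "## Pages", ""]
    for url in urls_by_priority(sitemap_xml):
        lines.append(f"- [{url}]({url}): TODO describe this page")
    return "\n".join(lines) + "\n"
```

The output is only a starting point: grouping URLs into meaningful sections and replacing each TODO with a real one-sentence description still has to be done by hand.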