llms.txt — the new robots.txt for AI engines
llms.txt is an emerging standard for declaring AI-friendly content sources at your domain root. What it does, who reads it, and how to ship one this week.
What llms.txt is
llms.txt is a Markdown-formatted file at the root of your domain (/llms.txt) that declares the canonical, machine-readable sources of information for AI consumption. Think of it as a curated table of contents for LLMs: instead of crawling 500 pages and guessing what's authoritative, an AI engine can read your llms.txt and know which 20 pages hold your canonical product information, pricing, FAQ and so on.
What it looks like
Minimal valid llms.txt:
# YourBrand
> One-line description of what your company / product does.
## Documentation
- [Product overview](https://yoursite.com/product): what we do, key features
- [Pricing](https://yoursite.com/pricing): plans and limits
- [API reference](https://yoursite.com/api): full endpoint list
## Comparison
- [vs. Competitor A](https://yoursite.com/vs/comp-a)
- [vs. Competitor B](https://yoursite.com/vs/comp-b)
## Support
- [FAQ](https://yoursite.com/faq)
- [Contact](https://yoursite.com/contact)
Who reads llms.txt today
Adoption is partial but growing. Anthropic's Claude has explicit support, Perplexity has signalled they parse it, and several smaller research-grade engines read it routinely, as do a growing number of developer tools (IDE assistants, doc generators). Google and OpenAI haven't formally committed, but neither has signalled they will ignore it. The file costs little to produce and does no harm if an engine ignores it, so the cost-benefit is overwhelmingly in favour of shipping one.
When to ship llms.txt vs. when to skip it
Ship one if:
- You have more than 50 indexable pages (signal-to-noise ratio benefits from curation).
- You sell something complex (API, SaaS, professional services) where the wrong page being cited is worse than no citation.
- You have multiple language variants (DE/EN/FR) and want AI engines to know which is canonical for your primary market.
Skip if:
- You have fewer than 20 pages and your homepage is already a clear summary.
- You actively don't want AI engines to consume your content (block via robots.txt instead).
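If you take the robots.txt route, it helps to confirm that the rules block what you intend before deploying them. The following is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content is hypothetical, and the user-agent tokens (`GPTBot`, `ClaudeBot`) are examples that should be checked against each crawler's current documentation.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for illustration only. Verify the
# user-agent tokens against each crawler's own documentation.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: *
Allow: /
"""


def is_blocked(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules disallow user_agent from url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("GPTBot", "ClaudeBot", "SomeOtherBot"):
        blocked = is_blocked(ROBOTS_TXT, agent, "https://yoursite.com/pricing")
        print(f"{agent}: {'blocked' if blocked else 'allowed'}")
```

This only tests what the rules say. Whether a given crawler honours robots.txt is up to that crawler.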
Common mistakes
- Listing too many pages. llms.txt is a curation file, not a sitemap. Aim for 10–30 entries grouped by section.
- Forgetting the H1. Without `# YourBrand` at the top, AI engines can't establish entity identity for the file.
- Linking to deep blog posts. Link to evergreen pillar content. Time-sensitive blog posts go in your RSS feed or sitemap, not your llms.txt.
- Mixing languages without declaration. If you have a multilingual site, ship `/llms.txt` for the primary language and `/en/llms.txt`, `/de/llms.txt` per locale.
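Several of the mistakes above can be caught mechanically before the file is published. The sketch below is a rough linter, not a validator for any official specification. The function name `lint_llms_txt`, the 30-entry ceiling (taken from the guideline above) and the `/blog/` URL heuristic are all assumptions made for illustration.

```python
import re

# Matches a Markdown list entry of the form "- [Title](https://...)".
LINK_RE = re.compile(r"^- \[([^\]]+)\]\(([^)]+)\)")


def lint_llms_txt(text: str, max_entries: int = 30) -> list[str]:
    """Return warnings for common llms.txt mistakes (heuristic checks only)."""
    warnings = []
    lines = [line.strip() for line in text.splitlines() if line.strip()]

    # The first non-empty line should be the H1, e.g. "# YourBrand".
    if not lines or not lines[0].startswith("# "):
        warnings.append("missing H1: first line should look like '# YourBrand'")

    # Collect the URL of every list entry.
    urls = [m.group(2) for line in lines if (m := LINK_RE.match(line))]
    if len(urls) > max_entries:
        warnings.append(f"too many entries: {len(urls)} (limit {max_entries})")

    # Assumption: URLs containing /blog/ are time-sensitive posts.
    for url in urls:
        if "/blog/" in url:
            warnings.append(f"blog post linked: {url}")

    return warnings


if __name__ == "__main__":
    sample = "## Docs\n- [Launch post](https://yoursite.com/blog/launch)\n"
    for warning in lint_llms_txt(sample):
        print(warning)
```

Running a check like this in CI keeps the file from drifting back toward a sitemap as pages are added.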
How to verify it's working
Three checks:
- `curl https://yourdomain.com/llms.txt` returns HTTP 200 with a plain-text body (not a 404, and not a 301 redirect to another URL).
- Ask Claude or Perplexity about your brand and look at the citation pattern. Pages from your llms.txt curated list should appear disproportionately.
- Track your AI-search visibility over time to see whether your citation count rose after deploying llms.txt. Two weeks is usually enough to see a signal.
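The first check is easy to automate. Here is a minimal sketch using only the Python standard library. It refuses to follow redirects so that a 301 shows up as a failure, and it treats `text/plain` and `text/markdown` as acceptable content types. Accepting `text/markdown` is an assumption, and the helper names are hypothetical.

```python
import urllib.error
import urllib.request

# Assumption: either of these content types counts as "plain text" here.
ACCEPTED_TYPES = ("text/plain", "text/markdown")


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Surface 3xx responses as HTTPError instead of following them."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None


def evaluate_response(status: int, content_type: str) -> list[str]:
    """Return a list of problems for a given status code and Content-Type."""
    if status != 200:
        return [f"expected HTTP 200, got {status}"]
    if not content_type.lower().startswith(ACCEPTED_TYPES):
        return [f"unexpected Content-Type: {content_type or '(none)'}"]
    return []


def check_llms_txt(base_url: str) -> list[str]:
    """Fetch <base_url>/llms.txt without following redirects and evaluate it."""
    url = base_url.rstrip("/") + "/llms.txt"
    opener = urllib.request.build_opener(NoRedirect)
    try:
        with opener.open(url, timeout=10) as resp:
            return evaluate_response(resp.status, resp.headers.get("Content-Type", ""))
    except urllib.error.HTTPError as err:
        # 3xx, 4xx and 5xx responses all arrive here.
        return evaluate_response(err.code, err.headers.get("Content-Type", ""))


if __name__ == "__main__":
    problems = check_llms_txt("https://yourdomain.com")
    print("OK" if not problems else "\n".join(problems))
```

An empty list means the file is reachable and served as text. It says nothing about whether any engine has actually read it, which is what the second and third checks are for.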