llms.txt – The New Addition to AI SEO
The llms.txt file is a new addition to the world of AI SEO. This text file is designed to help artificial intelligence models identify relevant content and can be used to improve visibility in AI search tools such as ChatGPT, Claude, Gemini and others.
What is llms.txt?
llms.txt is a simple text file that provides AI models with a list of the website’s URLs that contain comprehensive and structured content. Unlike the robots.txt file, which is used to block or restrict access to website content, the llms.txt file directs artificial intelligence tools to specific website pages.
The file serves as a dedicated sitemap for large language models. Since these models can only process a limited amount of text at a time (their context window), the file’s purpose is to help them find the most relevant content for users’ search queries.
The file format was proposed as a standard by Jeremy Howard from Answer.AI in September 2024. As of now, major AI companies have not yet implemented the llms.txt standard in their web crawling.
What’s the difference between llms.txt, robots.txt, and sitemap?
llms.txt – A simple text file that guides AI tools to specific webpages when they generate responses.
robots.txt – A text file that tells crawling robots which pages or directories they should not access.
sitemap.xml – An XML file containing a list of the website’s URLs, optionally with priority hints for crawling.
Why do we need an llms.txt file?
Artificial intelligence models currently power various search platforms – from Google’s AI overviews to searches in chatbots like ChatGPT, Gemini, and Claude, and AI-based search engines like Perplexity. When these models search for information on websites, factors like unclear site structure, orphaned pages (without internal links), or pages hidden deep within the site can prevent them from finding the most relevant content for user queries. llms.txt offers a solution to this problem by directing AI models to the most relevant webpages for a given query.
Which webpages to include in the file?
It’s recommended to include comprehensive and structured webpages that also meet these criteria:
- Clear heading hierarchy (from H1 to H3).
- Short paragraphs that are easy to scan.
- Pages with bulleted lists, numbered lists, and tables.
- Content summary at the top of the page.
- No pop-ups or ads that may block the content.
- Pages that include phrases such as: “Bottom line…”, “The main conclusion…”.
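The first criterion can be checked automatically. The sketch below is one possible reading of "clear heading hierarchy", not a rule from the llms.txt proposal: it assumes a page qualifies when it has exactly one H1, the H1 comes first, and no heading level is skipped on the way down. The names HeadingCollector and has_clear_heading_hierarchy are hypothetical.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collects the levels (1-6) of heading tags in document order."""

    def __init__(self):
        super().__init__()
        self.levels = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so "h1".."h6" is enough here.
        if len(tag) == 2 and tag[0] == "h" and tag[1].isdigit():
            self.levels.append(int(tag[1]))


def has_clear_heading_hierarchy(html: str) -> bool:
    """Assumed rule: one H1, placed first, and no level skipped going deeper."""
    collector = HeadingCollector()
    collector.feed(html)
    levels = collector.levels
    if levels.count(1) != 1 or levels[0] != 1:
        return False
    # Going from H1 to H3 without an H2 counts as a skipped level.
    return all(nxt - cur <= 1 for cur, nxt in zip(levels, levels[1:]))
```

A check like this only covers page structure; the other criteria in the list (summaries, pop-ups, phrasing) still need a manual review.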
Avoid putting all of the website’s URLs in the file. Instead, focus on the following content:
- Relevant and current content that answers common user queries.
- Pages that highlight the site’s authority, experience, and expertise.
- Valuable guides, information sources and hub pages.
Usually, there’s also no need to include the homepage, unless it contains valuable and relevant information. If the homepage serves as a kind of advertisement for the company or site, you can skip it and focus on comprehensive content instead.
How to build an llms.txt file?
- Place it in the root of the domain, at the address https://example.com/llms.txt (with no trailing slash).
- Include one link per line.
- Write the file in markdown, not XML or JSON (according to the proposed standard).
- Note that the file must be saved as llms.txt and not llm.txt.
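The location and naming rules above can be sketched in a few lines of Python. This is a minimal illustration that uses only the standard library; the function names llms_txt_url and is_valid_location are hypothetical, and example.com is a placeholder domain.

```python
from urllib.parse import urlsplit


def llms_txt_url(page_url: str) -> str:
    """Return the root-level llms.txt address for the site that serves page_url."""
    parts = urlsplit(page_url)
    if not parts.scheme or not parts.netloc:
        raise ValueError(f"expected an absolute URL, got {page_url!r}")
    return f"{parts.scheme}://{parts.netloc}/llms.txt"


def is_valid_location(candidate: str) -> bool:
    """True only when the path is exactly /llms.txt in the domain root."""
    return urlsplit(candidate).path == "/llms.txt"
```

For example, llms_txt_url("https://example.com/blog/post") gives the root address, while is_valid_location rejects both a misspelled llm.txt and a file placed in a subdirectory.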
File structure:
H1 heading starting with the # symbol and containing the site or project name.
Link structure: a markdown list item with a title, a link (URL), and an optional description, written as - [title](URL): description.
Optional components:
- blockquote (>) containing a summary or context of the links to be detailed below.
- Standard markup for content sections (paragraphs, lists, etc.).
- H2 headings (##) for link categories.
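To make the structure concrete, here is a minimal sketch that assembles an llms.txt file from the components listed above: the H1 heading, an optional blockquote summary, H2 categories, and one link per line. The names Link, Section and render_llms_txt are hypothetical helpers, and the site name, URLs and descriptions are placeholder data rather than a recommended set of pages.

```python
from dataclasses import dataclass, field


@dataclass
class Link:
    title: str
    url: str
    description: str = ""  # optional component


@dataclass
class Section:
    heading: str  # rendered as an H2 (##) link category
    links: list[Link] = field(default_factory=list)


def render_llms_txt(site_name: str, summary: str, sections: list[Section]) -> str:
    """Build the markdown text of an llms.txt file."""
    lines = [f"# {site_name}", ""]
    if summary:
        lines += [f"> {summary}", ""]  # optional blockquote
    for section in sections:
        lines += [f"## {section.heading}", ""]
        for link in section.links:
            entry = f"- [{link.title}]({link.url})"
            if link.description:
                entry += f": {link.description}"
            lines.append(entry)
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


if __name__ == "__main__":
    # Placeholder content for illustration only.
    print(render_llms_txt(
        "Example Site",
        "Guides and reference pages about the example product.",
        [Section("Guides", [
            Link("Getting started", "https://example.com/guides/start", "Setup in five steps"),
            Link("FAQ", "https://example.com/faq"),
        ])],
    ))
```

The output can be saved as llms.txt and uploaded to the domain root. Keeping the page list in a small data structure like this makes it easier to regenerate the file when content changes.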
In summary, llms.txt is a tool that allows you to communicate with AI models and highlight the most relevant content. While it doesn’t guarantee that website content will appear in AI search results, it may improve the chances that it will be found and cited.
If you need help implementing an llms.txt file or applying AI optimization – contact us for professional consultation.