Ahrefs Data: 97% of llms.txt Files Ignored

Ahrefs data suggests that llms.txt files saw extremely low engagement in May 2026, with 97% of files receiving no requests at all. Despite growing discussion around the format, real-world usage remains minimal across the web.

The study analysed logs from around 137,000 domains to understand how often llms.txt files are being accessed and by which types of bots or tools. Roughly 28% of these domains were found to host an llms.txt file, although Ahrefs notes that this figure is likely higher than the broader internet average because its customer base skews technical. That works out to around 38,000 domains with valid files, of which only about 1,100 recorded any traffic.
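The headline 97% figure can be roughly reproduced from the rounded numbers quoted above. The following is a minimal sketch of that arithmetic, not Ahrefs' own calculation; the function name and the use of the rounded inputs are assumptions made for illustration.

```python
def zero_request_share(domains_sampled, share_with_file, files_with_traffic):
    """Estimate the share of llms.txt files that received no requests.

    Inputs are the rounded figures quoted in the article, so the result
    is approximate.
    """
    domains_with_file = domains_sampled * share_with_file
    share_with_traffic = files_with_traffic / domains_with_file
    return domains_with_file, 1 - share_with_traffic


# Rounded figures from the article: ~137,000 domains, ~28% with a file,
# ~1,100 files with any traffic.
domains_with_file, share_without_traffic = zero_request_share(137_000, 0.28, 1_100)
print(f"Domains with a file: about {round(domains_with_file):,}")
print(f"Share with zero requests: about {share_without_traffic:.0%}")
```

With these inputs, about 3% of files saw any traffic, which is consistent with the reported 97% that saw none.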

Very limited overall usage

The data indicates that 97% of llms.txt files received zero requests during the period studied, meaning they were accessed by neither humans nor bots. Even among the small number of files that were accessed, most showed extremely low activity.

Where requests did exist, the overwhelming majority came from automated systems rather than human visitors. Around 96% of all observed requests were bot-driven, but only a very small fraction of these were linked to AI retrieval systems. Retrieval bots associated with tools such as ChatGPT and Perplexity accounted for roughly 1% of total requests, suggesting that direct AI engagement with llms.txt remains extremely limited.

Who is actually accessing llms.txt files?

The breakdown of traffic sources shows a strong skew towards technical and diagnostic tools rather than AI systems using the files for retrieval.

SEO auditing tools represented the largest share at around 21% of requests. This was followed by unidentified bots at 14%, traditional search engine crawlers such as Googlebot at 13%, and technical profiling services like BuiltWith at 11%.

AI-related bots across all categories collectively accounted for around 19% of requests, making them a noticeable presence but not the dominant group. Within that segment, coding agents were responsible for approximately 10%, training crawlers around 5%, and AI assistants about 2%. Individual tools such as Claude-Code and GPTBot were among the most active AI-related agents, while Slackbot actually recorded more llms.txt requests than PerplexityBot.

This distribution highlights a key mismatch between expectations and reality. While llms.txt is often discussed as a tool for AI retrieval systems, the data suggests it is currently being accessed more by infrastructure tools than by AI search systems themselves.
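Site owners who want to run a similar check on their own server can tally llms.txt requests by user agent. The sketch below is a minimal illustration, not the method Ahrefs used. It assumes access logs in the common "combined" log format. The substring-to-category map is hypothetical: the category names loosely follow the article, but the substrings and the category assigned to each bot are assumptions that should be checked against real logs.

```python
import re
from collections import Counter

# Hypothetical substring -> category map (an assumption, not Ahrefs' taxonomy).
CATEGORIES = [
    ("gptbot", "training crawler"),
    ("claude-code", "coding agent"),
    ("chatgpt-user", "AI assistant"),
    ("perplexitybot", "AI assistant"),
    ("googlebot", "search crawler"),
    ("slackbot", "link preview"),
]

# Combined log format: ... "METHOD path HTTP/x" status bytes "referer" "user-agent"
LINE_RE = re.compile(r'"(?:GET|HEAD) (\S+) [^"]*" (\d{3}) \S+ "[^"]*" "([^"]*)"')


def classify(user_agent):
    """Return the first matching category, or a catch-all bucket."""
    ua = user_agent.lower()
    for needle, category in CATEGORIES:
        if needle in ua:
            return category
    return "other or unidentified"


def llms_txt_breakdown(log_lines):
    """Count requests to /llms.txt per user-agent category."""
    counts = Counter()
    for line in log_lines:
        match = LINE_RE.search(line)
        if not match:
            continue  # skip lines that are not in the expected format
        path, _status, user_agent = match.groups()
        if path.split("?")[0] != "/llms.txt":
            continue  # ignore requests for other paths
        counts[classify(user_agent)] += 1
    return counts


# Made-up sample lines, for illustration only.
sample = [
    '1.2.3.4 - - [10/May/2026:10:00:00 +0000] "GET /llms.txt HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; GPTBot/1.0)"',
    '1.2.3.5 - - [10/May/2026:10:01:00 +0000] "GET /index.html HTTP/1.1" 200 1024 "-" "Googlebot/2.1"',
    '1.2.3.6 - - [10/May/2026:10:02:00 +0000] "GET /llms.txt HTTP/1.1" 200 512 "-" "Slackbot-LinkExpanding 1.0"',
]
print(llms_txt_breakdown(sample))
```

Note that the catch-all bucket also captures ordinary browsers, so it is not a measure of unidentified bots alone.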

A large share of “self-analysis” traffic

One of the more unusual findings in the dataset is that around 12% of all requests came from tools designed to analyse, audit, or validate llms.txt files rather than consume their content in practice.

Within this category, GEO and AEO readiness tools accounted for roughly 5% of requests, while dedicated scanners and validators made up around 3%. This means that more traffic is currently being generated by tools assessing or testing the format than by AI assistants or retrieval systems using it directly.

A further 2% of requests came from research-focused bots, including systems designed to evaluate prompt injection risks. This indicates that llms.txt is already being examined from a security and reliability standpoint, even before it has achieved widespread adoption.

Missing files and low human interaction

The study also found that requests to missing llms.txt files (those returning 404 errors) did not come from AI bots at all. Instead, they appear to come mostly from human users manually entering URLs, likely to test whether competitors or other sites are using the format.

Even high-profile checks, such as Chrome’s Lighthouse audit for llms.txt, generated only a very small number of requests across the dataset. This further reinforces the idea that awareness exists, but active usage remains limited.

What the findings suggest

Overall, the data aligns with the view that llms.txt has not yet become a meaningful input source for AI retrieval systems. Its current usage is dominated by technical crawlers, SEO tools, and experimental systems, not by AI agents generating live answers or citations.

Earlier commentary from industry figures has already suggested that llms.txt is not primarily designed for search, and this dataset appears to support that interpretation. Rather than being widely used by AI assistants, the file is currently more relevant to coding agents, infrastructure tools, and systems testing how AI-readable formats might evolve.

Previous studies have also shown little correlation between having an llms.txt file and improved AI visibility, and the low request volume may help explain why.

Looking ahead

One area that stands out is the growing attention around prompt injection analysis. Some crawlers are actively testing llms.txt files for potential security risks, reflecting increasing caution around how AI systems ingest external content.

It is also important to remember that these figures only reflect requests, not how (or whether) the data was ultimately used by the systems that accessed it. Even when llms.txt files are fetched, there is no clear evidence that they are influencing AI outputs at scale.

For now, the data paints a clear picture: interest in llms.txt is growing in discussion and tooling, but actual usage remains extremely limited in practice.