Google has issued an update to its official documentation to clarify how file size limits work across its crawling systems, separating general crawler rules from those that apply specifically to Googlebot when indexing content for Google Search.
The change is intended to remove long-standing confusion about how much content Google can fetch from a webpage and how those file size limits differ depending on the crawler being used. Previously, information about file size restrictions was housed entirely within the Googlebot documentation, even though those limits applied more broadly to Google’s entire crawling infrastructure.
As part of the update, Google has reorganised its guidance so that default file size limits are now explained in its crawler infrastructure documentation. At the same time, the Googlebot page has been revised to focus only on limits that are relevant to Google Search.
Google described the update in its changelog as a two-step clarification. First, the company moved the default file size rules to a more logical location, explaining that these limits affect all Google crawlers and fetchers, not just Googlebot. Second, it refined the Googlebot documentation so that it now shows more precise limits for content being crawled specifically for search results.
Under the newly updated crawler infrastructure documentation, Google states that its crawlers generally apply a default file size limit of 15MB when fetching content. This figure applies across Google’s wider crawling ecosystem, which supports products such as Search, News, Shopping, Gemini and AdSense.
Meanwhile, the Googlebot documentation now outlines different thresholds depending on the type of file being crawled. For HTML pages and supported text-based file formats, Googlebot applies a limit of 2MB. For PDF documents, however, Googlebot can crawl files up to 64MB in size when indexing them for Google Search.
Google also reiterates that each resource referenced by a webpage is fetched separately. This means that CSS files, JavaScript files and images are treated as individual requests and are not bundled together under a single size limit. As a result, very large or complex pages may still encounter crawling challenges if individual files exceed the permitted thresholds.
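For site operators who want a rough sense of whether any individually fetched file on a page is unusually large, the check can be approximated with a short script. The sketch below is illustrative only: it parses a page for script, image and stylesheet references, issues a HEAD request for each, and flags anything whose reported Content-Length exceeds the documented 15MB default. It does not reproduce Google's actual fetching behaviour, and servers do not always send a Content-Length header, so the output should be treated as a rough signal rather than a definitive audit. The example URL is hypothetical.

```python
# A minimal sketch for pre-checking a page's individually fetched resources
# against the documented 15MB default. It relies on Content-Length headers,
# which servers do not always send, so treat results as a rough signal only.
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup

DEFAULT_LIMIT_BYTES = 15 * 1024 * 1024  # 15MB default cited for Google's crawlers


def oversized_resources(page_url: str) -> list[tuple[str, int]]:
    """Return (url, size) pairs for referenced files larger than the default limit."""
    html = requests.get(page_url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")

    # Scripts, images and stylesheets are each fetched as separate requests,
    # so every referenced file is checked against the limit on its own.
    urls = set()
    for tag in soup.find_all(["script", "img"]):
        if tag.get("src"):
            urls.add(urljoin(page_url, tag["src"]))
    for tag in soup.find_all("link"):
        if "stylesheet" in (tag.get("rel") or []) and tag.get("href"):
            urls.add(urljoin(page_url, tag["href"]))

    flagged = []
    for url in sorted(urls):
        head = requests.head(url, allow_redirects=True, timeout=10)
        size = int(head.headers.get("Content-Length", 0))
        if size > DEFAULT_LIMIT_BYTES:
            flagged.append((url, size))
    return flagged


# Hypothetical usage:
# print(oversized_resources("https://example.com/"))
```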
This documentation change is part of a broader restructuring effort that Google has been carrying out since late 2025. In November, the company moved much of its crawling documentation away from Search Central and into a new standalone site dedicated to crawling infrastructure. The reasoning behind this shift was that Google’s crawling systems now serve many products beyond traditional web search.
In December, Google followed up with further documentation updates covering areas such as crawl budget optimisation and guidance for handling faceted navigation. The latest clarification on file size limits continues this process of separating universal crawler guidance from product-specific information.
Although the documentation has changed, the actual limits themselves are not new. Google first publicly documented the 15MB default file size limit in 2022. At the time, Google’s John Mueller confirmed that the restriction had been in place for years and that Google was simply making it more transparent rather than introducing a new rule.
What has changed is how those limits are explained and where they appear. Previously, website owners may have assumed that the 15MB figure applied only to Googlebot. Now, Google makes clear that this is a general rule across its crawling systems, while Googlebot has its own specific limits for different file types when crawling for search.
However, Google has not fully explained how the various limits relate to one another in practice. The crawler infrastructure overview refers to a 15MB default limit, while the Googlebot documentation lists 2MB for text-based files and 64MB for PDFs. The company’s changelog does not clarify whether these figures operate independently or how they interact in different crawling scenarios.
This distinction is particularly important for website owners managing large volumes of content or complex pages with heavy scripts and resources. When troubleshooting crawling or indexing problems, site operators now need to consider which crawler is involved and which documentation applies.
For example, a page that is accessible to Google’s general crawlers may still face limitations when being indexed for Google Search if individual files exceed Googlebot’s stated limits. This could have implications for large HTML pages, dynamically generated content, or extensive PDF libraries.
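To make that example concrete, the short sketch below compares a file's size against both published figures side by side: the 15MB default from the crawler infrastructure documentation and the Googlebot-specific thresholds of 2MB for HTML and text files and 64MB for PDFs. Because Google's changelog does not explain how these figures interact, the sketch deliberately reports each comparison separately rather than producing a single verdict; the function name and structure are purely illustrative.

```python
# A simple sketch comparing a file's size against both published figures.
# Google's changelog does not specify how the 15MB default and the
# Googlebot-specific thresholds interact, so this reports each comparison
# side by side rather than a single pass/fail result.
GENERAL_DEFAULT_MB = 15          # crawler infrastructure documentation
GOOGLEBOT_LIMITS_MB = {          # Googlebot documentation (Google Search)
    "html": 2,
    "text": 2,
    "pdf": 64,
}


def size_report(file_type: str, size_mb: float) -> dict[str, bool]:
    """Report whether a file fits under each documented figure."""
    report = {"within_15mb_default": size_mb <= GENERAL_DEFAULT_MB}
    if file_type in GOOGLEBOT_LIMITS_MB:
        report["within_googlebot_limit"] = size_mb <= GOOGLEBOT_LIMITS_MB[file_type]
    return report


# A 40MB PDF fits under Googlebot's 64MB figure but exceeds the 15MB default;
# a 5MB HTML page is the reverse.
print(size_report("pdf", 40))   # {'within_15mb_default': False, 'within_googlebot_limit': True}
print(size_report("html", 5))   # {'within_15mb_default': True, 'within_googlebot_limit': False}
```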
The reorganisation also reflects Google’s evolving technical landscape. As the company continues to expand its AI-driven products and services, including Gemini and other data-driven platforms, its crawling infrastructure must support more use cases than ever before. Separating crawler-wide defaults from search-specific guidance allows Google to document future crawlers more easily as new tools are introduced.
From a practical perspective, this update encourages webmasters and SEO professionals to be more precise when reviewing Google’s technical guidance. Rather than relying on a single Googlebot page for all crawling behaviour, they now need to consult both the crawler infrastructure documentation and the Googlebot-specific pages depending on the issue they are investigating.
Looking ahead, Google’s documentation changes suggest that further updates are likely in the coming months. As its crawling systems become more specialised, Google appears keen to ensure that its guidance remains structured, scalable and easier to maintain.
The separation of default crawler limits from Googlebot-specific rules also improves transparency. It highlights that Google Search is only one of many services that rely on crawling and fetching content, and that technical requirements may differ depending on the product being served.
For publishers, developers and SEO practitioners, the key takeaway is that no new restrictions have been introduced. Instead, Google has clarified where file size limits apply and how they should be interpreted. Understanding these distinctions will be increasingly important when diagnosing performance issues, managing crawl budgets, or ensuring that important content is accessible to Google’s systems.
Overall, this update reflects Google’s continuing effort to modernise its technical documentation in line with the expanding role of its crawling infrastructure. While the limits themselves remain unchanged, the clearer structure should help website owners better understand how their content is accessed, processed and indexed across Google’s growing range of platforms.