Google’s Gary Illyes has stated that content produced by AI should undergo editorial review to verify its accuracy before publication.

He clarified that Google has no issue with AI-generated material, provided it maintains a high standard of quality. According to Illyes, the term “human created” doesn’t fully reflect Google’s stance; instead, “human curated” would be a more accurate way to describe the policy.

These remarks were made during an exclusive interview with Kenichi Suzuki, where Illyes responded to questions about Google’s approach to AI content.


AI Overviews and AI Mode Models

Kenichi Suzuki asked Gary Illyes about the AI models powering Google’s AI Overviews (AIO) and AI Mode. Illyes explained that both are driven by custom-built Gemini models. While he didn’t have full details on how these models were trained, he confirmed they were tailored specifically for these features.

Suzuki then moved on to ask whether AIO and AI Mode rely on separate indexes for “grounding.” Grounding refers to the process where a large language model links its responses to reliable sources, such as a database or search index, to ensure answers are accurate, verifiable, and less prone to hallucinations. In Google’s case, this grounding is typically based on data from its web index.

Illyes clarified that, as far as he knows, both AI Overviews and AI Mode use Google Search for grounding. They work by sending multiple queries to Google Search, which then returns relevant results for each query.
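
To make that flow concrete, here is a minimal sketch of grounding via query fan-out. This is not Google’s implementation: the search backend below is a stub, and every function name is hypothetical, standing in for calls to a real search index.

```python
# Hypothetical sketch of grounding via query fan-out. The search
# backend below is a stub standing in for a real search index.

def search_index(query: str) -> list[dict]:
    """Stand-in for a search API call; returns ranked documents."""
    fake_index = {
        "is site speed a ranking factor": [
            {"url": "https://example.com/speed", "snippet": "Page speed is a ranking signal."},
        ],
        "how does google measure page speed": [
            {"url": "https://example.com/cwv", "snippet": "Core Web Vitals measure loading and stability."},
        ],
    }
    return fake_index.get(query, [])

def ground_answer(question: str, fanout_queries: list[str]) -> dict:
    """Fan the question out into several search queries and collect
    the sources the model would be constrained to when answering."""
    sources = []
    for query in fanout_queries:
        sources.extend(search_index(query))
    return {
        "question": question,
        "sources": [doc["url"] for doc in sources],
        # A real system would now pass these documents to the model,
        # so the generated answer cites retrieved results instead of
        # relying only on the model's parametric memory.
    }

print(ground_answer(
    "Does site speed affect rankings?",
    ["is site speed a ranking factor", "how does google measure page speed"],
))
```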

The discussion then turned to the role of the Google Extended crawler. Suzuki asked whether AIO and AI Mode get their training data from regular Google crawling rather than from Google Extended.

Illyes explained that grounding does not involve AI; it is part of the process that happens before the AI generates its final output. He added that Google Extended can influence grounding: if a site blocks Google Extended, Gemini will not use that site’s data for grounding purposes.
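
For publishers, the practical control Illyes refers to is the Google-Extended product token in robots.txt. Per Google’s crawler documentation, blocking it stops a site’s content being used for Gemini training and grounding, while ordinary Googlebot crawling for Search is governed separately. A minimal example:

```
# Opt out of Gemini training and grounding via Google-Extended.
User-agent: Google-Extended
Disallow: /

# Normal search crawling is controlled separately by Googlebot.
User-agent: Googlebot
Allow: /
```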


AI Content in LLMs and the Search Index

Later in the interview, Illyes was asked whether the growing amount of AI-generated content online is polluting large language models (LLMs). He clarified that while this is not a concern for search indexing, it could be problematic for LLM training.

The question, posed by Suzuki, was:

“With more content being produced by AI, and LLMs learning from that material, what do you see as the potential risks?”

Illyes responded that he is not worried about the impact on search engines. However, he stressed that AI model training needs a way to filter out AI-generated material; without one, models could enter a “training loop” in which they continually learn from AI-created content, which is far from ideal. He added that he is unsure how big an issue this currently is, possibly because of the way training data is selected.
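
As an illustration of the filtering step he describes, here is a toy sketch of dropping suspected AI-generated documents from a training corpus. The `looks_ai_generated` detector is a hypothetical stub; reliably detecting AI text is an open problem, and nothing here reflects how Google actually selects training data.

```python
# Toy sketch of filtering suspected AI-generated documents out of a
# training corpus, to avoid the "training loop" of models learning
# from model output. The detector is a placeholder heuristic.

def looks_ai_generated(text: str) -> bool:
    """Hypothetical stub; a real detector is an open research problem."""
    return "as an ai language model" in text.lower()

def filter_training_corpus(documents: list[str]) -> list[str]:
    """Keep only documents the detector does not flag."""
    return [doc for doc in documents if not looks_ai_generated(doc)]

corpus = [
    "Handwritten notes on crawling and indexing.",
    "As an AI language model, I cannot browse the web.",
]
print(filter_training_corpus(corpus))  # keeps only the first document
```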


Content Quality and AI-Generated Content

Suzuki then steered the conversation towards the link between AI and content quality.

He asked:
“So, you’re saying it doesn’t matter how content is produced, provided the quality is high?”

Illyes confirmed that when it comes to training large language models, the main priority is the quality of the material, regardless of whether it was written by humans or generated by AI. He highlighted factual accuracy as a key consideration and warned against letting extremely similar pieces of content into the search index.

Illyes clarified that Google generally doesn’t focus on how content is created, but there are important caveats. If accuracy and high standards are maintained, he explained, the method of production is largely irrelevant. Problems arise when content is too similar to existing material, which ideally should not be present in the training index, or when it contains inaccuracies. The latter, he warned, is more serious, as it can introduce bias or incorrect information into AI models.
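
One common way to catch the “extremely similar” content Illyes mentions is near-duplicate detection. Below is a small sketch using word shingles and Jaccard similarity; the shingle size and any cutoff score are illustrative assumptions, not Google’s actual deduplication method.

```python
# Sketch of near-duplicate detection with word shingles and Jaccard
# similarity, one common technique for spotting "extremely similar"
# documents. Parameters here are illustrative, not Google's.

def shingles(text: str, size: int = 3) -> set[tuple[str, ...]]:
    """Break text into overlapping word n-grams (shingles)."""
    words = text.lower().split()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}

def jaccard(a: str, b: str) -> float:
    """Overlap of two shingle sets: 0.0 (disjoint) to 1.0 (identical)."""
    sa, sb = shingles(a), shingles(b)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)

doc1 = "site speed is a confirmed ranking factor for google search"
doc2 = "site speed is a confirmed ranking factor for google search results"
print(f"similarity: {jaccard(doc1, doc2):.2f}")  # 0.89: flags a near-duplicate
```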

He concluded by noting that in most cases, ensuring high quality still requires human review of AI-generated output before it is used for training.


Human-Reviewed AI-Generated Content

Illyes went on to discuss AI-generated content that has been checked by a human. He stressed that human review should not be treated as something publishers need to declare on their pages, but rather as an essential step before the content goes live.

He clarified that simply stating “human reviewed” on a webpage is not a reliable signal and is not what he was suggesting.

According to Illyes, Google’s guidance on reviewing content is unlikely to change anytime soon. He believes the term “human created” is misleading and that “human curated” is more accurate. In other words, content should have editorial oversight, with someone verifying that it is correct and reliable.

