Court documents from the ongoing Google antitrust case suggest that links may play little or no role in how pages are ranked for AI Overviews.
A search marketer recently spotted a possible explanation for why spammy websites were appearing in AI Overviews. A section of the case's Memorandum Opinion sheds light on this, and may indicate that Google is moving away from links as a key ranking signal.
Ryan Jones, founder of SERPrecon, highlighted the passage in the court document that explains how Google grounds its Gemini models.
Grounding Generative AI Answers
The relevant section of the court documents discusses how Google grounds its AI answers using search data. Normally, one might expect that links influence the ranking of pages an AI system retrieves before creating a summary. Typically, when a user poses a question, Google Search would be queried, and the AI would generate an overview based on those results.
However, the case reveals that Google follows a different process. Instead of relying on its main search engine, the company uses a separate system that pulls in fewer pages, but at a faster pace.
The passage explains:
“To ground its Gemini models, Google uses a proprietary technology called FastSearch. … FastSearch is based on RankEmbed signals—a set of search ranking signals—and generates abbreviated, ranked web results that a model can use to produce a grounded response. … FastSearch delivers results more quickly than Search because it retrieves fewer documents, but the resulting quality is lower than Search’s fully ranked web results.”
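The process the Memorandum describes can be sketched in a few lines: a small, fast retriever returns only a handful of documents, and their text is packed into the model's prompt as grounding context. This is a toy illustration only; the function names (`fast_search`, `build_grounded_prompt`), the lexical-overlap scoring, and the sample documents are all assumptions, not Google's actual FastSearch implementation.

```python
import math

# Toy corpus standing in for a web index.
DOCS = {
    "doc1": "Photosynthesis converts sunlight into chemical energy in plants.",
    "doc2": "The stock market closed higher on strong earnings reports.",
    "doc3": "Chlorophyll absorbs light, driving photosynthesis in leaves.",
}

def tokenize(text):
    return text.lower().replace(".", "").split()

def score(query, doc):
    # Crude lexical overlap stands in for RankEmbed-style signals.
    q, d = set(tokenize(query)), set(tokenize(doc))
    return len(q & d) / math.sqrt(len(d))

def fast_search(query, k=2):
    # Retrieve only a few documents, trading ranking depth for speed --
    # the trade-off the Memorandum attributes to FastSearch.
    ranked = sorted(DOCS.items(), key=lambda kv: score(query, kv[1]), reverse=True)
    return ranked[:k]

def build_grounded_prompt(query, results):
    # Pack the abbreviated results into the model prompt as context.
    context = "\n".join(text for _, text in results)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

results = fast_search("how does photosynthesis work")
prompt = build_grounded_prompt("how does photosynthesis work", results)
```

The key point the sketch captures is architectural: the retriever is shallow and fast, and quality filtering happens (or doesn't) before the model ever sees the text.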
Commenting on this, Ryan Jones highlighted that the findings match what many specialists had suspected. According to him, Google is not relying on the same ranking system for grounding its AI. Instead, speed takes priority, and fewer signals are factored in. The goal is simply to gather text to support AI-generated answers.
Jones suggested that many of the normal filters and quality signals may not apply in FastSearch, which could explain why earlier versions of AI Overviews occasionally displayed spammy or even penalised websites.
He further argued that links do not seem to influence this process, as the grounding appears to focus more on semantic relevance than traditional ranking signals.
What Is FastSearch?
The court documents also reveal more details about how FastSearch works, noting that it produces only a limited set of search results.
According to the Memorandum, “FastSearch is a technology that rapidly generates limited organic search results for certain use cases, such as grounding of LLMs, and is derived primarily from the RankEmbed model.”
This naturally raises the question: what exactly is the RankEmbed model?
The same document clarifies that RankEmbed is a deep-learning model. In simple terms, deep learning involves training systems to recognise patterns within huge datasets. Instead of truly “understanding” information in a human sense, the model is designed to pick out semantic relationships and patterns in data.
The Memorandum explains further: “At the other end of the spectrum are innovative deep-learning models, which are machine-learning models that discern complex patterns in large datasets.”
Google has developed a range of “top-level” signals that contribute to the final ranking score of a web page. These signals can include factors such as a page’s quality and its popularity.
Among those top-level signals are the outputs from deep-learning systems like RankEmbed. This means RankEmbed plays a role in shaping how Google evaluates and ranks web content.
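One way to picture "top-level signals feeding a final score" is a weighted combination, where a deep-learning model's output is just one input among several. The signal names and weights below are illustrative assumptions, not Google's actual formula.

```python
# Illustrative only: combining several "top-level" signals into one
# ranking score. Weights and signal names are hypothetical.

def final_score(signals, weights):
    # Weighted sum of normalized top-level signals.
    return sum(weights[name] * value for name, value in signals.items())

page = {
    "quality": 0.8,      # e.g. an assessed page-quality signal
    "popularity": 0.6,   # e.g. a traffic- or visit-based signal
    "rank_embed": 0.9,   # output of a deep-learning relevance model
}
weights = {"quality": 0.4, "popularity": 0.2, "rank_embed": 0.4}

print(round(final_score(page, weights), 2))  # → 0.8
```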
User-Side Data
The Memorandum also sheds light on how RankEmbed, the model underpinning FastSearch, actually works and the type of data it relies on.
It explains that RankEmbed makes use of “user-side” data, describing it as: “User-side Data used to train, build, or operate the RankEmbed model(s).”
Further details reveal that both RankEmbed and its later version, RankEmbedBERT, are ranking systems built on two core sources of information: a large sample of search log data and quality scores from human reviewers, which Google uses to assess the relevance of search results.
The document goes on to describe RankEmbed as an AI-driven, deep-learning model with advanced natural language capabilities. This allows it to identify relevant documents more effectively, even when a search query does not include every term related to the topic.
According to the Memorandum, “embedding based retrieval is effective at semantic matching of docs and queries.”
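Semantic matching of this kind can be shown with a tiny example: queries and documents are compared as vectors, so a document can rank as relevant even when it shares no exact words with the query. The three-dimensional vectors below are hand-made stand-ins for learned embeddings, purely for illustration.

```python
import math

def cosine(a, b):
    # Cosine similarity: how closely two vectors point the same way.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

query_vec = [0.9, 0.1, 0.0]  # embedding for "laptop won't turn on"
docs = {
    "computer power issues": [0.85, 0.2, 0.1],  # no shared words, close meaning
    "best hiking trails":    [0.05, 0.1, 0.95], # unrelated topic
}

# The semantically related document wins despite zero term overlap.
best = max(docs, key=lambda d: cosine(query_vec, docs[d]))
```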
Interestingly, RankEmbed is trained on only a fraction of the data compared to earlier systems—just one-hundredth of the volume—yet it is said to deliver better quality search outcomes.
One of its key contributions has been improving how Google handles long-tail queries, which tend to be more specific and less commonly searched.
The underlying training data used by RankEmbed combines details about the user’s query, the most important terms identified by Google within that query, and the web pages returned in response.
This dataset is then supplemented with click-and-query behaviour, as well as human evaluation scores of different web pages, ensuring the system learns not only from user activity but also from human judgement.
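A plausible shape for one such training record, based on the Memorandum's description, pairs the query and its salient terms with the returned pages, click behaviour, and rater scores; preference pairs can then be derived for training a ranker. The field names, the example data, and the pairwise-label scheme are assumptions for illustration, not details from the court documents.

```python
from dataclasses import dataclass

@dataclass
class TrainingExample:
    # Hypothetical record: query, Google's most salient terms in it,
    # and the returned pages with click and rater-score data.
    query: str
    salient_terms: list
    results: list  # (url, clicked: bool, rater_score: float)

def pairwise_labels(example):
    # Derive preference pairs: a page raters scored higher should
    # rank above a lower-scored page for the same query.
    pairs = []
    for url_a, _, score_a in example.results:
        for url_b, _, score_b in example.results:
            if score_a > score_b:
                pairs.append((url_a, url_b))
    return pairs

ex = TrainingExample(
    query="symptoms of vitamin d deficiency",
    salient_terms=["vitamin d", "deficiency", "symptoms"],
    results=[
        ("health-site.example/vit-d", True, 0.9),
        ("forum.example/thread", False, 0.4),
    ],
)
```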
The Memorandum also notes that RankEmbedBERT, the more recent version of the model, requires regular retraining to keep up with fresh data and changes in search behaviour.
A New Perspective On AI Search
A key question being asked is whether links play any part in how Google selects pages for its AI Overviews. The system behind this, known as FastSearch, is designed to prioritise speed. Ryan Jones suggests that this may indicate Google operates more than one index, with FastSearch relying on a version that highlights sites most frequently visited. This would align with the role of RankEmbed, which reportedly draws on a mix of “click-and-query data” and input from human evaluators.
When it comes to human rater data, it’s worth noting the sheer scale of Google’s index, which contains billions if not trillions of pages. Clearly, it would be impossible for reviewers to assess more than a very small portion of this. Instead, their input is used to create labelled examples that guide the training of ranking models. These labels serve as quality markers, helping the system to recognise the patterns that distinguish strong content from weaker material.