Google Patent US20250103662 describes how document-level signals - link authority, freshness, traffic, and engagement - are integrated directly into transformer attention matrices, making the trustworthiness of a source a structural part of how models understand its content.
What is the Patent About?
This patent describes a system for unifying transformer-based models with document-level ranking signals, such as link-based authority, freshness, traffic, and engagement.
The core problem it addresses is fundamental:
Transformers are excellent at understanding relationships between tokens, but they traditionally ignore the importance of the document that contains those tokens. In the context of web pages, this means that conventional transformer attention mechanisms focus on what is written, but not where it comes from.
The invention introduces a way to incorporate web page signals - signals that apply to the entire document - and use them to adjust the attention matrix of a transformer model.
Key aspects of the patent include:
- tokenizing a web page into a training sequence of tokens
- associating the page with document-level signals (author, ranking, traffic, freshness, interactions)
- computing a standard attention matrix
- adjusting attention weights based on document signals and a relevancy threshold
- generating predictions using the adjusted attention
As described in the Summary, the system explicitly adjusts the attention matrix using web page signals, rather than relying on token-level information alone.
Patent Metadata
- Patent / Publication: US20250103662 (Application)
- Filed: Sep 21, 2023
- Publication date: Mar 27, 2025
- Applicant: Google LLC (Mountain View, CA)
- Inventor: Antonino Gulli (Paradiso)
- Official title: "Unifying Transformers with Link Based Ranking"
Key Patent Elements
1. Document / Web Page Signals
The patent introduces document-level signals that describe the importance of the entire web page, not individual tokens.
Examples explicitly listed in the patent include:
- Author
- Link-based ranking
- Traffic statistics
- Creation date
- Modification date
- Interaction metrics
These signals represent contextual information about the source of the content.
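To make the signal set concrete, here is a minimal Python sketch of how the listed signals could be held in one record and collapsed into a single score. The field names, the [0, 1] scaling, the freshness formula, and the weights in `relevancy_score` are all assumptions made for illustration; the article does not say how the patent combines the signals.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class DocumentSignals:
    """Page-level signals named in the patent; field names and scales are assumed."""
    author: Optional[str]   # attributed author, if any
    link_rank: float        # link-based ranking, assumed normalised to [0, 1]
    traffic: float          # traffic statistic, assumed normalised to [0, 1]
    created: date           # creation date
    modified: date          # last modification date
    interactions: float     # interaction metric, assumed normalised to [0, 1]


def relevancy_score(s: DocumentSignals, today: date) -> float:
    """Collapse the signals into one number in [0, 1] (hypothetical weighting)."""
    age_days = max((today - s.modified).days, 0)
    freshness = 1.0 / (1.0 + age_days / 365.0)  # 1.0 today, 0.5 after 365 days
    has_author = 1.0 if s.author else 0.0
    return (0.4 * s.link_rank + 0.2 * s.traffic + 0.2 * s.interactions
            + 0.1 * freshness + 0.1 * has_author)


page = DocumentSignals(
    author="Jane Doe", link_rank=0.8, traffic=0.5,
    created=date(2023, 1, 10), modified=date(2025, 1, 1), interactions=0.5,
)
score = relevancy_score(page, today=date(2025, 1, 1))
print(round(score, 2))  # prints 0.72
```

A single scalar is convenient because it can be compared against one threshold, but a real system could just as well keep the signals separate.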
2. Training Sequence of Tokens
Each document (e.g., a web page) is tokenized into a training sequence of tokens.
For web pages:
- tokens represent words or sub-words
- embeddings are generated as usual
The patent also generalizes this approach to other modalities (images, audio, video), but the web page case is most relevant for search and SEO.
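As a rough illustration of the web page case, the sketch below tokenizes a snippet of page text and bundles the tokens with the page's signals. `toy_tokenize` is a simple word and punctuation splitter standing in for a real sub-word tokenizer, and `TrainingExample` and its signal keys are hypothetical names, not terms from the patent.

```python
import re
from dataclasses import dataclass
from typing import Dict, List


def toy_tokenize(text: str) -> List[str]:
    """Lower-case word/punctuation split; a stand-in for a real sub-word tokenizer."""
    return re.findall(r"[a-z0-9]+|[^\sa-z0-9]", text.lower())


@dataclass
class TrainingExample:
    tokens: List[str]          # the training sequence of tokens
    signals: Dict[str, float]  # page-level signals, shared by every token


example = TrainingExample(
    tokens=toy_tokenize("Transformers rank pages, too."),
    signals={"link_rank": 0.8, "traffic": 0.5},
)
print(example.tokens)  # prints ['transformers', 'rank', 'pages', ',', 'too', '.']
```

The point of the structure is that the signals are stored once per page rather than once per token.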
3. Attention Matrix Construction
The model builds a standard transformer attention mechanism:
- query matrix
- key matrix
- value matrix
- attention weights computed via dot products
At this stage, the system behaves like a conventional transformer.
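For reference, the conventional mechanism is scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V. The pure-Python sketch below computes it for tiny hand-written matrices. In a real transformer Q, K, and V come from learned projections of the token embeddings; here they are passed in directly to keep the example short.

```python
import math
from typing import List

Matrix = List[List[float]]


def softmax(row: List[float]) -> List[float]:
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(q: Matrix, k: Matrix) -> Matrix:
    """Row-wise softmax of Q K^T / sqrt(d_k)."""
    d_k = len(k[0])
    scores = [[sum(qi * ki for qi, ki in zip(q_row, k_row)) / math.sqrt(d_k)
               for k_row in k] for q_row in q]
    return [softmax(row) for row in scores]


def attend(weights: Matrix, v: Matrix) -> Matrix:
    """Weighted sum of value vectors for each query position."""
    return [[sum(w * v_row[j] for w, v_row in zip(w_row, v))
             for j in range(len(v[0]))] for w_row in weights]


q = [[1.0, 0.0], [0.0, 1.0]]
k = [[1.0, 0.0], [0.0, 1.0]]
v = [[1.0, 2.0], [3.0, 4.0]]
weights = attention_weights(q, k)  # each row sums to 1
output = attend(weights, v)        # one output vector per query position
```

Nothing in this computation knows which page a token came from, which is the gap the patent targets.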
4. Attention Matrix Adjustment Using Document Signals
This is the core innovation.
After the attention matrix is computed, the system adjusts its weights using document-level signals.
The patent describes a relevancy threshold:
- if the document satisfies the threshold → attention weights are increased
- if it fails → attention weights are reduced or not boosted
In effect, the importance of the document influences how strongly its tokens contribute to the model's understanding.
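The threshold rule can be sketched as a post-processing step on an already computed attention matrix. Several things here are assumptions rather than details from the patent: the sequence mixes tokens from two pages (within a single-page sequence, a uniform boost followed by renormalisation would cancel out), each key position carries its page's score, and the `boost` and `damp` factors and the row renormalisation are illustrative choices.

```python
from typing import List

Matrix = List[List[float]]


def adjust_attention(weights: Matrix, key_scores: List[float],
                     threshold: float, boost: float = 1.5,
                     damp: float = 1.0) -> Matrix:
    """Scale each key column by its source page's status, then renormalise rows."""
    # Pages at or above the threshold get boosted; the rest get the damp factor.
    factors = [boost if s >= threshold else damp for s in key_scores]
    adjusted = []
    for row in weights:
        scaled = [w * f for w, f in zip(row, factors)]
        total = sum(scaled)
        adjusted.append([x / total for x in scaled])
    return adjusted


# Two key positions with identical baseline attention, from different pages.
weights = [[0.5, 0.5], [0.5, 0.5]]
key_scores = [0.9, 0.3]  # hypothetical per-token document scores
adjusted = adjust_attention(weights, key_scores, threshold=0.6)
print(adjusted[0])  # prints [0.6, 0.4]
```

With `damp` left at 1.0, failing pages are simply not boosted; a value below 1.0 would actively reduce their weights. Both behaviours match the two outcomes described above.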
5. Training vs. Inference
The patent notes that attention adjustment can occur during training, which then affects behavior at inference time.
Stated benefits include:
- reduced hallucinations - the model is biased toward tokens from documents with higher authority
- improved grounding - outputs are anchored to real, signal-verified sources
This explicitly positions document signals as a trust and reliability mechanism, not just a ranking input.
SEO Implications
Understanding, not only ranking
Ranking signals can shape how content is interpreted, not merely where it appears. The question shifts from "does this page rank?" to "does this page get trusted by the model?"
Authority as a weight multiplier
Two pages may contain identical information, but the model may weight one's tokens more heavily. Authority no longer just boosts position - it can amplify the impact of every word on the page.
Freshness gains structural importance
Creation and modification dates are explicit signals in this system. Meaningful updates can therefore influence attention weighting directly, rather than only earning a temporary freshness boost in rankings.
Engagement as a confidence signal
Traffic and interaction metrics feed the document signal set. Content that users actually use becomes a more trusted input to the model - usage patterns reinforce model confidence in the source.
Strategic SEO Recommendations
What to Do
- Build topical authority, not isolated pages - the patent's signals attach to individual pages, but link authority and engagement tend to build across a site's related content rather than within a single article
- Strengthen document-level trust - treat content as something models should trust, not just crawl
- Make authorship explicit - the patent lists authorship as a named signal; ensure your content clearly attributes the expert behind it
- Update content meaningfully - modification date is a signal; changes that reflect real informational updates carry more weight than cosmetic edits
- Pursue links that signal authority to Google - link-based ranking is the signal named in the patent's title; incoming links from trusted sources would feed the adjustment mechanism it describes
What to Avoid
- Assuming "great content alone is enough" - the patent shows that two identical texts can be weighted differently based on their source's signals
- Publishing content disconnected from domain context - topical focus consolidates document-level signals; scattered topics dilute them
- Mass updates without real informational change - updating modification dates through superficial edits may not satisfy the relevancy threshold
Practical Implementation Examples Based on This Patent
Authority-weighted content hubs
Create hub-and-spoke structures with one primary reference document. Consolidating document-level signals in this way clarifies importance and makes attention boosting more likely.
Update strategy aligned with relevancy thresholds
Update content when intent changes or data evolves - not cosmetically. The modification date becomes a meaningful signal, not noise, when the change reflects real informational value.
Internal linking as signal consolidation
Use internal links to reinforce which documents carry the most authority. Internal linking strengthens ranking signals that may directly influence the attention weighting mechanism.
Author and source clarity
Clearly define who created the content and why they are qualified. Authorship is explicitly named as a document signal: in this system it is a hard input, not a soft E-E-A-T signal.
In Conclusion
This patent is a strong signal of direction.
It shows how document-level ranking signals - including link-based authority, freshness, engagement, and trust - can be integrated directly into transformer attention mechanisms.
Not in the sense of "ranking signals decide who's #1" - but in a deeper, more structural way:
Ranking signals can influence what the model pays attention to, and how much it trusts the tokens it sees.
What makes this particularly important is how the inventor himself frames the goal of the patent.
According to Antonino Gulli, this work is explicitly about bridging LLMs with Search - using real-world web pages and their associated signals to ground large language models. The motivation is clear: while LLMs are fluent, they can hallucinate. Search, on the other hand, provides verified, contextual, and signal-rich data.
By injecting document-level signals into attention:
- hallucinations can be reduced by biasing attention toward reliable sources
- quality and accuracy improve as the model focuses on trusted, recent, and authoritative documents
- explainability increases, because outputs can be tied back to document-level context and signals
In other words, this patent is not just about ranking. It is about trust-aware attention.
In an era where search is increasingly mediated by models - AI summaries, assistants, and generative answers - the competitive advantage shifts toward publishers who can combine:
- strong content (token-level relevance and clarity) with
- strong signals (document-level authority, freshness, engagement, and trust)
Patents aren't product announcements. But this one reads like a clean blueprint for how classic SEO signals and LLM-style understanding converge into a single mechanism.
If attention is the steering wheel of modern models, then document-level signals may well be the power steering.
And SEO, in that world, is no longer about optimizing pages - it's about designing documents that models can safely rely on.
Summary
Patent US20250103662 introduces a method for adjusting transformer attention matrices using document-level ranking signals, making authority, freshness, authorship, and engagement structural inputs to how models process web content - not just ranking factors.
The core innovation lies in the relevancy threshold mechanism: documents that satisfy the threshold have their tokens' attention weights boosted, while weaker documents are deprioritized. This creates a direct channel through which classic SEO signals influence transformer behavior.
For SEO strategists, the implication is clear: success depends on building documents that models can trust - combining topical authority, explicit authorship, meaningful freshness, and engagement-backed relevance. If systems like this are deployed, the ranking game and the attention game become the same game.