Google Patent US20250103662 - Unifying Transformers with Link-Based Ranking

Google Patent US20250103662 describes how document-level signals - link authority, freshness, traffic, and engagement - are integrated directly into transformer attention matrices, making the trustworthiness of a source a structural part of how models understand its content.

What is the Patent About?

This patent describes a system for unifying transformer-based models with document-level ranking signals, such as link-based authority, freshness, traffic, and engagement.

The core problem it addresses is fundamental:

Transformers are excellent at understanding relationships between tokens, but they traditionally ignore the importance of the document that contains those tokens. In the context of web pages, this means that conventional transformer attention mechanisms focus on what is written, but not where it comes from.

The invention introduces a way to incorporate web page signals - signals that apply to the entire document - and use them to adjust the attention matrix of a transformer model.

Key aspects of the patent include:

  • tokenizing a web page into a training sequence of tokens
  • associating the page with document-level signals (author, ranking, traffic, freshness, interactions)
  • computing a standard attention matrix
  • adjusting attention weights based on document signals and a relevancy threshold
  • generating predictions using the adjusted attention

As described in the Summary, the system explicitly adjusts the attention matrix using web page signals, rather than relying on token-level information alone.

Patent Metadata

  • Patent / Publication: US20250103662 (Application)
  • Filed: Sep 21, 2023
  • Publication date: Mar 27, 2025
  • Applicant: Google LLC (Mountain View, CA)
  • Inventor: Antonino Gulli (Paradiso)
  • Official title: "Unifying Transformers with Link Based Ranking"

Key Patent Elements

1. Document / Web Page Signals

The patent introduces document-level signals that describe the importance of the entire web page, not individual tokens.

Examples explicitly listed in the patent include:

  • Author
  • Link-based ranking
  • Traffic statistics
  • Creation date
  • Modification date
  • Interaction metrics

These signals represent contextual information about the source of the content.
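
To make the signal set concrete, here is a minimal sketch of how one page's document-level signals could be held in a single record. The class name, field names, value ranges, and the freshness helper are all hypothetical; the patent names the signal types but this layout is only an illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class PageSignals:
    """Document-level signals for one web page (hypothetical field names)."""
    author: Optional[str]   # named author, if the page has one
    link_rank: float        # link-based ranking score, assumed scaled to [0, 1]
    monthly_visits: int     # a traffic statistic
    created: date           # creation date
    modified: date          # last modification date
    interactions: int       # an interaction metric, e.g. clicks or shares

    def days_since_update(self, today: date) -> int:
        """Age of the latest modification in days, a simple freshness proxy."""
        return (today - self.modified).days


# Illustrative values only, not data from the patent.
page = PageSignals(
    author="Jane Doe",
    link_rank=0.82,
    monthly_visits=12_000,
    created=date(2024, 1, 10),
    modified=date(2025, 3, 1),
    interactions=340,
)
```

The point of the record is that every field describes the page as a whole, so one `PageSignals` value applies to every token taken from that page.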

2. Training Sequence of Tokens

Each document (e.g., a web page) is tokenized into a training sequence of tokens.

For web pages:

  • tokens represent words or sub-words
  • embeddings are generated as usual

The patent also generalizes this approach to other modalities (images, audio, video), but the web page case is most relevant for search and SEO.
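
As a rough illustration of this step for the web page case, the sketch below splits text into tokens, maps them to ids, and looks up an embedding per token. The regex tokenizer and the random embedding table are stand-ins chosen for brevity; production models use learned sub-word tokenizers and trained embeddings, and nothing here is taken from the patent.

```python
import random
import re
from typing import Dict, List


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into word and punctuation tokens."""
    return re.findall(r"\w+|[^\w\s]", text.lower())


def build_vocab(tokens: List[str]) -> Dict[str, int]:
    """Assign each distinct token an integer id in order of first appearance."""
    vocab: Dict[str, int] = {}
    for tok in tokens:
        vocab.setdefault(tok, len(vocab))
    return vocab


def embed(token_ids: List[int], vocab_size: int, dim: int = 4, seed: int = 0) -> List[List[float]]:
    """Look up one vector per token id in a randomly initialized table."""
    rng = random.Random(seed)
    table = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(vocab_size)]
    return [table[i] for i in token_ids]


tokens = tokenize("Links signal trust. Trust shapes attention.")
vocab = build_vocab(tokens)
ids = [vocab[t] for t in tokens]
vectors = embed(ids, len(vocab))
```

Repeated tokens, such as "trust" here, share one id and therefore one embedding vector.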

3. Attention Matrix Construction

The model builds a standard transformer attention mechanism:

  • query matrix
  • key matrix
  • value matrix
  • attention weights computed via dot products

At this stage, the system behaves like a conventional transformer.
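
For readers who want to see what "conventional" means here, this is a minimal pure-Python sketch of standard scaled dot-product attention. The scaling by the square root of the key dimension and the softmax follow the usual transformer formulation; the patent summary above only mentions dot products, so treat those details as the textbook version rather than the patent's wording. Function names are illustrative.

```python
import math
from typing import List

Matrix = List[List[float]]


def softmax(row: List[float]) -> List[float]:
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(queries: Matrix, keys: Matrix) -> Matrix:
    """Softmax of scaled query-key dot products; each row sums to 1."""
    d_k = len(keys[0])
    scores = [
        [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        for q in queries
    ]
    return [softmax(row) for row in scores]


def attend(weights: Matrix, values: Matrix) -> Matrix:
    """Mix the value vectors using the attention weights."""
    dim = len(values[0])
    return [
        [sum(w * v[j] for w, v in zip(row, values)) for j in range(dim)]
        for row in weights
    ]


# Toy two-token sequence with two-dimensional vectors.
Q = [[1.0, 0.0], [0.0, 1.0]]
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[1.0, 2.0], [3.0, 4.0]]

weights = attention_weights(Q, K)
output = attend(weights, V)
```

Note that nothing in this computation knows which page the tokens came from, which is exactly the gap the next step addresses.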

4. Attention Matrix Adjustment Using Document Signals

This is the core innovation.

After the attention matrix is computed, the system adjusts its weights using document-level signals.

The patent describes a relevancy threshold:

  • if the document satisfies the threshold → attention weights are increased
  • if it fails → attention weights are reduced or not boosted

In effect, the importance of the document influences how strongly its tokens contribute to the model's understanding.
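
One possible reading of this step is sketched below. It assumes the document signals are already normalized to [0, 1] and combined into a single score by a weighted sum, and that the adjustment is a simple multiplication of every attention weight. The score formula, the threshold of 0.6, the boost of 1.5, and all names are assumptions made for illustration; the patent does not publish these values.

```python
from typing import Dict, List

Matrix = List[List[float]]


def document_score(signals: Dict[str, float], mix: Dict[str, float]) -> float:
    """Weighted sum of normalized document signals (hypothetical formula)."""
    return sum(mix[name] * signals[name] for name in mix)


def adjust_attention(
    attn: Matrix,
    score: float,
    threshold: float = 0.6,  # assumed relevancy threshold
    boost: float = 1.5,      # assumed factor when the threshold is satisfied
    damp: float = 1.0,       # 1.0 means "not boosted"; below 1.0 means "reduced"
) -> Matrix:
    """Scale all attention weights by a factor chosen from the document score."""
    factor = boost if score >= threshold else damp
    return [[w * factor for w in row] for row in attn]


attn = [[0.7, 0.3], [0.4, 0.6]]
mix = {"link_rank": 0.5, "freshness": 0.25, "engagement": 0.25}

trusted_page = {"link_rank": 0.9, "freshness": 0.8, "engagement": 0.7}
weak_page = {"link_rank": 0.2, "freshness": 0.3, "engagement": 0.1}

boosted = adjust_attention(attn, document_score(trusted_page, mix))
unchanged = adjust_attention(attn, document_score(weak_page, mix))
```

A design note on this sketch: the rows are deliberately left unnormalized after scaling. If every weight in a row were multiplied by the same factor and the row were then renormalized to sum to 1, the boost would cancel out. A uniform document-level adjustment therefore only matters if it changes the magnitude of what the page's tokens contribute, for example during training, or if the sequence mixes tokens from several documents with different scores.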

5. Training vs. Inference

The patent notes that attention adjustment can occur during training, which then affects behavior at inference time.

Stated benefits include:

  • reduced hallucinations - the model is biased toward tokens from documents with higher authority
  • improved grounding - outputs are anchored to real, signal-verified sources

This explicitly positions document signals as a trust and reliability mechanism, not just a ranking input.

SEO Implications

Understanding, not only ranking

Ranking signals can shape how content is interpreted, not merely where it appears. The question shifts from "does this page rank?" to "does this page get trusted by the model?"

Authority as a weight multiplier

Two pages may contain identical information, but the model may weight one's tokens more heavily. Authority no longer just boosts position - it can amplify the impact of every word on the page.

Freshness gains structural importance

Creation and modification dates are explicit signals in this system. Updates that genuinely improve relevance can therefore influence attention weighting itself, rather than only earning a temporal freshness boost.

Engagement as a confidence signal

Traffic and interaction metrics feed the document signal set. Content that users actually use becomes a more trusted input to the model - usage patterns reinforce model confidence in the source.

Strategic SEO Recommendations

What to Do

  • Build topical authority, not isolated pages - document-level signals such as link-based ranking tend to reflect a domain's wider content history, not just a single article
  • Strengthen document-level trust - treat content as something models should trust, not just crawl
  • Make authorship explicit - the patent lists authorship as a named signal; ensure your content clearly attributes the expert behind it
  • Update content meaningfully - modification date is a signal; changes that reflect real informational updates carry more weight than cosmetic edits
  • Pursue links that signal authority to Google - link-based ranking is the foundational signal in this patent; incoming links from trusted sources directly feed the adjustment mechanism

What to Avoid

  • Assuming "great content alone is enough" - the patent shows that two identical texts can be weighted differently based on their source's signals
  • Publishing content disconnected from domain context - topical focus consolidates document-level signals; scattered topics dilute them
  • Mass updates without real informational change - updating modification dates through superficial edits may not satisfy the relevancy threshold

Practical Implementation Examples Based on This Patent

Authority-weighted content hubs

Create hub-and-spoke structures with one primary reference document. Consolidating document-level signals in this way clarifies importance and makes attention boosting more likely.

Update strategy aligned with relevancy thresholds

Update content when intent changes or data evolves - not cosmetically. The modification date becomes a meaningful signal, not noise, when the change reflects real informational value.

Internal linking as signal consolidation

Use internal links to reinforce which documents carry the most authority. Internal linking strengthens ranking signals that may directly influence the attention weighting mechanism.

Author and source clarity

Clearly define who created the content and why they are qualified. Authorship is explicitly named as a document signal: in this design it is a direct input to the system, not only a soft E-E-A-T consideration.

Figure: Web page signals - rank, traffic, date - adjust the attention matrix before the transformer produces its predictions

In Conclusion

This patent is a strong signal of direction.

It shows how document-level ranking signals - including link-based authority, freshness, engagement, and trust - can be integrated directly into transformer attention mechanisms.

Not in the sense of "ranking signals decide who's #1" - but in a deeper, more structural way:

Ranking signals can influence what the model pays attention to, and how much it trusts the tokens it sees.

What makes this particularly important is how the inventor himself frames the goal of the patent.

According to Antonino Gulli, this work is explicitly about bridging LLMs with Search - using real-world web pages and their associated signals to ground large language models. The motivation is clear: while LLMs are fluent, they can hallucinate. Search, on the other hand, provides verified, contextual, and signal-rich data.

By injecting document-level signals into attention:

  • hallucinations can be reduced by biasing attention toward reliable sources
  • quality and accuracy improve as the model focuses on trusted, recent, and authoritative documents
  • explainability increases, because outputs can be tied back to document-level context and signals

In other words, this patent is not just about ranking. It is about trust-aware attention.

In an era where search is increasingly mediated by models - AI summaries, assistants, and generative answers - the competitive advantage shifts toward publishers who can combine:

  • strong content (token-level relevance and clarity) with
  • strong signals (document-level authority, freshness, engagement, and trust)

Patents aren't product announcements. But this one reads like a clean blueprint for how classic SEO signals and LLM-style understanding converge into a single mechanism.

If attention is the steering wheel of modern models, then document-level signals may well be the power steering.

And SEO, in that world, is no longer about optimizing pages - it's about designing documents that models can safely rely on.

Summary

Patent US20250103662 introduces a method for adjusting transformer attention matrices using document-level ranking signals, making authority, freshness, authorship, and engagement structural inputs to how models process web content - not just ranking factors.

The core innovation lies in the relevancy threshold mechanism: documents that satisfy the threshold have their tokens' attention weights boosted, while weaker documents are deprioritized. This creates a direct channel through which classic SEO signals influence transformer behavior.

For SEO strategists, the implication is clear: success depends on building documents that models can trust - combining topical authority, explicit authorship, meaningful freshness, and engagement-backed relevance. In the system this patent describes, the ranking game and the attention game become the same game.

About the author

Rafał Borowiec

Rafał Borowiec is an SEO expert and creator of the Patent-Based SEO methodology - an approach where every SEO recommendation is grounded in a specific Google patent number, not industry speculation.

He has analyzed over 1,000 Google patent documents to understand ranking mechanisms at their source. His approach combines Semantic SEO and Topical Authority with knowledge drawn directly from search engine engineers - creating strategies resistant to algorithm changes.

Since 2010, he has worked with e-commerce, SaaS and B2B companies, helping them build stable organic visibility and predictable, long-term results. He works personally on every project - no delegation, no intermediary layers.

He treats SEO as information engineering, not a marketing campaign. He's interested not only in visibility, but in how the search engine understands a client's brand - that's why every word, every content structure, and every semantic connection in his strategies serves a specific purpose.

Founder of Patent Core Digital
