Google’s Martin Splitt explains how Google analyzes web page content
In a recent webinar, Google’s Martin Splitt shared valuable inside information about how Google analyzes web page content. He also introduced a new term, Centerpiece Annotation, which Google uses when analyzing web content.
How Google analyzes web page content
According to Martin Splitt, Google uses what he referred to as Centerpiece Annotation to identify the main component or topic of a page. Based on that main topic, Google divides the page content into multiple components and assigns each a different weight according to its relevance.
“We have a thing called the Centerpiece Annotation, for instance, and there’s a few other annotations that we have where we look at the semantic content, as well as potentially the layout tree.
But fundamentally, we can read that from the content structure in HTML already and figure out ‘Oh! This looks like from all the natural language processing that we did on this entire text content here that we got, it looks like this is primarily about topic A, dog food.’”
“And then there’s this other thing here, which seems to be like links to related products, but it’s not really part of the centerpiece. It’s not really the main content here. This seems to be [the] additional stuff.
And then there’s like a bunch of boilerplate or, ‘Hey, we figured out that the menu looks pretty much the same on all these pages and lists. This looks pretty much like that menu that we have on all the other pages of this domain,’ for instance, or we’ve seen this before. We don’t even actually go by domain or like, ‘Oh, this looks like a menu.’ We figure out what looks like boilerplate, and then, that gets weighted differently as well.”
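To make the boilerplate idea concrete, here is a minimal Python sketch. It is not Google's method, which Splitt does not describe in detail. It simply assumes that a text block repeated on most pages of a site is probably boilerplate. The function names `find_boilerplate` and `split_page`, the `min_share` threshold, and the sample pages are all hypothetical.

```python
from collections import Counter


def normalize(block):
    """Lowercase a block and collapse whitespace so near-identical copies match."""
    return " ".join(block.lower().split())


def find_boilerplate(pages, min_share=0.8):
    """Return normalized blocks that appear on at least `min_share` of the pages.

    `pages` is a list of pages, each given as a list of text blocks.
    The 0.8 default is an arbitrary assumption, not a known value.
    """
    counts = Counter()
    for blocks in pages:
        # Use a set so a block is counted once per page.
        counts.update({normalize(b) for b in blocks})
    threshold = min_share * len(pages)
    return {block for block, n in counts.items() if n >= threshold}


def split_page(blocks, boilerplate):
    """Separate one page's blocks into (main content, repeated boilerplate)."""
    main, repeated = [], []
    for b in blocks:
        (repeated if normalize(b) in boilerplate else main).append(b)
    return main, repeated


# Hypothetical site: the menu and footer repeat, the article text does not.
pages = [
    ["Home | Shop | Contact", "Choosing dry dog food for puppies", "© Example Pet Store"],
    ["Home | Shop | Contact", "How much dog food does a large breed need?", "© Example Pet Store"],
    ["Home | Shop | Contact", "Grain-free dog food explained", "© Example Pet Store"],
]

boilerplate = find_boilerplate(pages)
main, repeated = split_page(pages[0], boilerplate)
print(main)
print(repeated)
```

In this toy example the menu and footer land in `repeated`, while the article text is left in `main` as the candidate centerpiece.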
So, the most important component of the page, the “centerpiece,” gets the most weight. The other sections are not given “as much of a consideration.”
As Martin explained:
“So if you happen to have content on a page that is not related to the main topic of the rest of the content, we might not give it as much of a consideration as you think.
We still use that information for the link discovery and figuring out your site structure and all of that.
But if a page has 10,000 words on dog food and then 3000 or 2000 or 1000 words on bikes, then probably this is not good content for bikes.”
This is an important insight into how Google works and analyzes web page content.
We always knew that content relevance was important, but we now know that relevance can vary from section to section on the same page.
For content creators and SEO professionals, it is important that each page covers one distinct topic in detail. It is not worth mixing a bunch of different topics on the same page and expecting it to rank for all the corresponding queries.
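As a rough way to check your own pages against the dog food and bikes example, the Python sketch below measures how much of a page's word count each topic takes up. It assumes you have already labeled each section with a topic by hand. It says nothing about how Google actually weights content, and the names `topic_shares` and `centerpiece` are hypothetical.

```python
def topic_shares(sections):
    """Map each topic to its share of the page's total word count.

    `sections` is a list of (topic, text) pairs labeled by the page author.
    """
    totals = {}
    for topic, text in sections:
        totals[topic] = totals.get(topic, 0) + len(text.split())
    page_total = sum(totals.values())
    if page_total == 0:
        return {}
    return {topic: count / page_total for topic, count in totals.items()}


def centerpiece(sections):
    """Return the topic with the largest word share, or None for an empty page."""
    shares = topic_shares(sections)
    return max(shares, key=shares.get) if shares else None


# Stand-in text mirroring the quoted example:
# 10,000 words on dog food and 2,000 words on bikes.
sections = [
    ("dog food", "kibble " * 10000),
    ("bikes", "bike " * 2000),
]

shares = topic_shares(sections)
print(centerpiece(sections))
print({topic: round(share, 2) for topic, share in shares.items()})
```

Here bikes account for roughly a sixth of the page, which suggests that the bike material would be better served by a page of its own.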
If you are interested, you can watch the full video here.