Food for Thought
Looking at Google's patent for clues is smart SEO detective work. We need to thank Slawski for paving the way for us. That being said, can we safely assume that Google doesn't implement anything that wasn't mentioned in the patent?
I've heard of this for years. That is how Latent Semantic Indexing (LSI) got in the picture. It's smart but all conjecture. Just because they have a patent doesn't mean they are using it.
Bell Labs had the patent on LSI, from 1988, so they were able to legally exclude Google from using LSI till 2008. The thing is, Google had no need for LSI, because it only works on small unchanging databases, unlike the Web.
And the patent doesn't cover LSI Keywords. That is something that an SEO toolmaker made up to profit off LSI technology. They didn't have the patent, and Google wasn't ever using it.
Jason » Bill
You would know better than I
I believe the question is a bit broader. The simplest answer is "yes". Whatever Google implements as part of the algo will be mirrored in some of their patents.
That being said, it is very difficult to sift through the numerous Google patents, draw inferences, identify patterns and model some cohesive hypotheses regarding the convergence of various technologies and how they might affect the algo. The tech detailed in patents can rarely be directly correlated to some precise ranking factor.
This is where Slawski really shines, in my opinion, and why his work is uniquely brilliant. Anyone can access the USPTO database and query it for patents filed by Google. And I trust a lot of us do. But Bill manages to identify commonalities between very different technologies and form coherent hypotheses as to how they might practically affect the algo. Not to mention he goes the extra mile to make his insights available and accessible to the general audience.
"can we safely assume that Google doesn't implement anything that wasn't mentioned in the patent?"
No. The patents are just the tip of the iceberg.
Google is testing all sorts of things.
Also, Bill doesn't analyze every patent Google applies for. We've seen a few cases where processes patented by other teams appear to be implemented by the Web Search team.
Through the years, I've seen other people suggest that Pay Per Click (PPC)/AdWords team patents are being used by Google search. I can't think of any examples where I was convinced their analyses were correct, but they were certainly right to look beyond patents that are only sought by the Web search team.
Google's internal teams learn from each other. They can share ideas and I don't believe they have to apply for a second patent every time they adapt a process to a new section of their business. I know they sometimes do this, but I don't know (calling Slawski) if they must do that.
Anyway, there's a lot going on with the machine learning that never appears in the patents but which is used in the Web search system (and other search systems).
Yep. I mostly look at the patents about search. I often ignore the patents about ads, along with many others that they file. This week, they were granted around 40, and I only saved about 5 of those, which I may or may not write about, and that is totally my call.
They have been granted some machine learning patents, and I have written about a few of those and skipped some too.
Ankit » Slawski
How about you train a few people under you on how to find, analyse and write about these patents (like you do) and eventually form a team so that the process is spread out more? Does that make sense?
Very few people do the thing that you do (and those who do, do it very inconsistently). And we need more of this on a bigger scale so that the SEO industry is more of a research-based thing and not just tips and tricks learned from blog posts.
Slawski » Ankit
I learned how to break down and analyze search patents based on years of education, after an undergrad English degree and a graduate law degree, and then years of writing about patents (probably over 2,000 posts). I have mentored people on how to do Search Engine Optimization (SEO), and it was a lot of work. Teaching someone to read new patents every week, write about those, and possibly share SEO experiences or how it could help with doing SEO would be a lot of work.
I'm thinking that it might be easier to write a book.
Ankit » Slawski
That makes sense.
Couldn't imagine the depth and breadth of it.
Lieven » Slawski
We need to feed patents to a new Enigma-style Google simulation machine, like the one developed at the Polish Cipher Bureau by Marian Rejewski in 1933 and used by Alan Turing in 1943, to decode Google's patents. I would love to use this instead of guessing a domain's traffic history with Ahrefs. More good things coming from Poland. I saw such a computer-like QWERTY machine in the Gdansk WW2 museum.
No. Certain things aren't patentable.
For example, Google has several patents about determining block elements on a page – header, footer, sidebars, navigation, etc. These are limited to determining elements that aren't labeled because… well, labeled ones are labeled and you don't need and can't patent reading a document that has been formatted to be read by something like this.
So… to say that Google ONLY uses the patented "I can use these clues to determine that this is actually the header of the page" and does NOT use the "This is the header because it's properly labeled as the header" signals wouldn't make a lot of sense.
Google's patents are there to protect their IP rights on things that take a unique process to execute. You can't patent something like "<h1> tags tend to describe the entire purpose of the document therefore the <h1> tag is the most important h tag to have a keyword in it." That's the point of the h1 tag by its very nature.
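To make the "labeled ones are labeled" point concrete, here is a minimal sketch of reading blocks that the page has already labeled. It assumes the page uses HTML5 semantic tags (`header`, `nav`, `main`, `aside`, `footer`); the class and function names are hypothetical, and this is only an illustration, not anything taken from a Google patent. The patented part would be the harder case where no such labels exist.

```python
from html.parser import HTMLParser

# Tags that explicitly label a page block (an assumption for this sketch).
LABELED_BLOCKS = {"header", "nav", "main", "aside", "footer"}


class LabeledBlockParser(HTMLParser):
    """Collects text from explicitly labeled blocks and from <h1> tags."""

    def __init__(self):
        super().__init__()
        self.stack = []      # labeled blocks that are currently open
        self.blocks = {}     # block tag -> list of text fragments
        self.h1_text = []    # text fragments found inside <h1>
        self.in_h1 = False

    def handle_starttag(self, tag, attrs):
        if tag in LABELED_BLOCKS:
            self.stack.append(tag)
            self.blocks.setdefault(tag, [])
        elif tag == "h1":
            self.in_h1 = True

    def handle_endtag(self, tag):
        if tag in LABELED_BLOCKS and self.stack and self.stack[-1] == tag:
            self.stack.pop()
        elif tag == "h1":
            self.in_h1 = False

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self.in_h1:
            self.h1_text.append(text)
        if self.stack:
            # Attribute the text to the innermost open labeled block.
            self.blocks[self.stack[-1]].append(text)


def read_labeled_blocks(html):
    """Return ({block tag: text}, h1 text) for one HTML string."""
    parser = LabeledBlockParser()
    parser.feed(html)
    parser.close()
    blocks = {tag: " ".join(parts) for tag, parts in parser.blocks.items()}
    return blocks, " ".join(parser.h1_text)
```

No inference is needed here: the markup says which block is which. That is why reading labels is not something you could patent, while guessing the blocks on an unlabeled page is.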
Page segmentation has been patented by Google. They have another one that tells us about segmentation, mostly in a couple of sentences, and by parts of a page, which is also linked to in this post:
Google's Page Segmentation Patent Granted
Marco » Slawski
The mighty Bill has spoken! I read the actual patents, but I also read your ideas.
Microsoft has patented how they distinguish between different blocks on pages by looking at different features of those blocks, in a little more detail than Google has provided.
Right – that is about how it might use the information in the segments. And the other one I was thinking of is determining the various blocks when they aren't explicitly labeled.
We can always safely assume, but as big as G is and as much data as they're collecting, I'm guessing they've got other implementations that only a few may know of. Kinda the ol' "let's distract them with patents and slip this or that in while they're not looking". All that being said though, I really don't know how many more ways there are to collect data.
Things that are prior art and in the public domain are not patented, such as hypertext relevance. These types of things might be mentioned in a patent, but are not claimed by a patent.
If a search engine invents something and they want to try to exclude others from doing the same thing, they may patent it.
Many ideas patented by one search engine have been invented in other ways and patented as well by other search engines.
For example, Microsoft, Google, and Yahoo each have at least one patent on Page Segmentation, identifying 3 different ways that each may follow.
I've never seen anyone else in the search industry mention this so I will share it here.
Search engines license research and patents from universities.
There is a world of research papers published by certain universities that the search industry is completely unaware of and represents a new area for understanding what is possible. These universities earn royalties from their research and patents.
Google negotiated with Tencent in China, and has an agreement to use technology in their patents. I didn't look through them.
Roger » Slawski
That's interesting! Knowledge transfer between countries seemed to have become a sensitive issue in both countries.
Slawski » Roger
They had more than a few hundred (possibly thousands). I had heard they were big in machine learning too.
This is how the PageRank patent from Lawrence Page was filed and assigned to Stanford, and exclusively licensed to Google until it expired.
Ammon Johns 🎓
As SEO users, I tend to advise we *assume* as little as possible. Just because there is a patent on something doesn't even mean they did implement exactly what is in the patent, let alone exactly the way it was shown in the patent. It is only when we see three or more patents all looking at the same things in the same ways that we can feel a lot more confident that that thing is being used.
There's also a lot of stuff that Google use that not only have they not patented, but someone else has. Like all those lovely graphics cards they use for machine learning. The web is built on an awful lot of separate intellectual properties, some in the public domain, some under creative commons, and some that definitely belong to someone.
So we most certainly cannot assume that just because something isn't in a Google patent that Google don't use it. They use the Internet and the Web, for example, though they neither invented nor patented either. 😃
Plus of course the sneaky beggars will file patents under subsidiaries, and also licence existing patented stuff.
It's as clear as mud!
Slawski » Chris Edwards
There are some patents filed under subsidiaries' names, but most of those were reassigned to Google.
There are a lot of medical patents now assigned to Verily, which is a subdivision.
603 medical patents to Verily:
Chris Edwards 🎓
You are the patent King and always have been.
You know you're a geek when you have uspto.gov as the number 1 favourite
United States Patent and Trademark Office
Slawski » Ammon Johns
I love that there are at least 20 patents about phrase-based indexing at Google.
I especially love the patent on phrase posting lists (an inverted index of phrases), because it would be a lot of work, but would show that they likely use phrase-based indexing.
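For anyone who hasn't met the term, a phrase posting list is just an inverted index keyed by phrases instead of single words. Here is a minimal sketch, assuming every run of two adjacent words counts as a "phrase". That is a big simplification made for this example, and the function name is hypothetical. The patents describe a more involved way of deciding which phrases are worth indexing, and this code does not attempt that.

```python
from collections import defaultdict


def build_phrase_postings(docs, n=2):
    """Map each n-word phrase to the sorted list of doc ids containing it.

    docs: dict of doc id -> text. Treating every n-word window as a phrase
    is an assumption of this sketch, not the patented phrase selection.
    """
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        words = text.lower().split()
        for i in range(len(words) - n + 1):
            phrase = " ".join(words[i:i + n])
            postings[phrase].add(doc_id)
    return {phrase: sorted(ids) for phrase, ids in postings.items()}


docs = {
    1: "phrase based indexing at Google",
    2: "Google uses phrase based indexing",
    3: "page segmentation patent",
}
postings = build_phrase_postings(docs)
print(postings["phrase based"])  # doc ids that contain the phrase
```

Even in this toy form, the index grows much faster than a single-word index, which hints at why building the real thing would be a lot of work.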
I love Bill's work here, but I think we have to be VERY careful with patents, especially with Google. Patents are a way of protecting IP, but with a company the size of Google, which has money to spend on lawyers ad infinitum, they could file all day long for any and every idea that crosses their table but only implement a handful that actually work or test out as a way to make the mousetrap better. I'm sure Bill will tell you that after you study enough of them, they start to contradict each other.
That's true in several cases, where different inventors have approached an issue in different ways. Sometimes I find those conflicts especially interesting though, as the simple fact that multiple people were engaged in researching the same problem, regardless of the difference in approach, certainly reinforces that the problem is one that was not only worth solving, but one multiple teams worked on independently.
And it always seems like that 'conflict' is even more true than it really is when people so often mistake where a patent has come from – for example the TrustRank patent that so many attributed to Google but IIRC was from Yahoo? Certainly several PageRank refinements were actually from Microsoft, etc.
Slawski » Ammon Johns
IBM wrote the patent on dangling nodes for documents such as jpgs or pdfs without links, and how PageRank was calculated for them. Google acquired that patent from IBM.
Jeff » Slawski
Dangling Nodes? This is a family chat room!!! 🤣
Slawski » Jeff
Search engineers have fun too:
Search Indexing Dead Ends: IBM Patent Explores Dangling Nodes
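For readers wondering why dangling nodes matter at all: a page with no outgoing links (a pdf, a jpg) receives PageRank but has nowhere to pass it on. Below is a minimal sketch of one common textbook way to handle that, where the rank held by dangling nodes is spread evenly across all pages on each iteration. This is an assumption chosen for illustration. It is not taken from the IBM patent, which may well describe a different approach, and the function name and damping value are just the usual defaults.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simple PageRank by power iteration.

    links: dict of page -> list of pages it links to. Pages with no
    outgoing links are dangling nodes; their rank is redistributed
    evenly to every page (a textbook treatment, assumed here).
    """
    nodes = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Total rank currently sitting on pages that link nowhere.
        dangling_mass = sum(rank[node] for node in nodes if not links.get(node))
        base = (1.0 - damping) / n + damping * dangling_mass / n
        new_rank = {node: base for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            for target in targets:
                new_rank[target] += damping * rank[node] / len(targets)
        rank = new_rank
    return rank


# "report.pdf" is linked to but links to nothing: a dangling node.
links = {"home": ["about", "report.pdf"], "about": ["home"]}
print(pagerank(links))
```

Without the redistribution step, the rank flowing into report.pdf would leak out of the system on every pass and the scores would no longer sum to 1.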
Slawski » Jeff
No, most Google patents are consistent with each other. All of them set out the problem they are intended to solve, show prior art, and then set out the patent's solution. Research the inventors, and see what they do, what papers they have written and other patents in their name. The people writing patents are not today's Jules Verne or Jorge Luis Borges. They are very serious about creating something new that works. Lawyers work with search engineers to write patents. They are creating to make Google better, and aren't just writing patents to milk the system. I know bullcrap when I see it.
I have written about over 1,300 patents. They don't file frivolous patents.
Jeff » Slawski
Don't take that the wrong way – nothing frivolous at all. Just submitted, but never executed.