Google and the Pollution of the Link Graph

There has been a lot of recent talk about Google’s search results becoming more polluted with spam over time. Mathew Ingram’s article on GigaOm did a decent job summarizing it all. Also worth reading (and not linked directly in Mathew’s post) is Jeff Atwood’s post on the way this impacts StackOverflow, a popular programming Q&A site.

Most of the recent attention on search result spam revolves around sites that usurp top positions on the Search Results Page (SRP) from high-quality original content. In some cases, sites scrape the legitimate content and Google’s SRP ranks the scraped, second-hand copy higher than the original. This happened to StackOverflow, who were then forced to optimize their page titles for search result position rather than user experience in order to win back their rightful place on the SRP. In explaining the change, Jeff Atwood, StackOverflow CTO, said:

I believe the site name should come first on the home page, but SEO forces you to do the opposite.

The other (more grey) area is so-called content farms, a label more people are extending to mainstream sites as well. These sites are criticized for creating large quantities of low-quality content that is keyword-optimized and therefore search-friendly, focused on garnering search traffic rather than providing high-quality information.

These concerns about the declining utility of search results are legitimate, but what strikes me is that most of the ire is directed at the content scrapers and content farms, rather than at Google itself. This means that people view the organic system of hyperlinks between sites as a public good (like clean air or unpolluted water) and anyone who tries to exploit this system for commercial gain as evil. Google is largely viewed as the non-evil custodian of this public good.

In fact, this was the basis of my post last year, “Paid Links and the Tragedy of the Commons”. Untouched by commercial incentives, our links to other websites are a fantastic indicator of relevance, importance and influence on the web. Google used this insight to create web search that was light years ahead of anything else at the time. Now ugly commercialism is perverting those wonderful search results.

The fact that escaped me back then is that Google itself was an early example of this phenomenon. They were first in line to exploit our organic links for commercial gain with their AdWords product, which allowed advertisers to promote their products alongside search results. Then with AdSense they made it an easy, automated process to put ads on your web pages. There is no doubt that this intensified the commercialization of hyperlinks.

Now it seems that Google might be eaten by the monster they helped to create. Their business is dependent on a link graph that is unpolluted by commercial interests. But the current complaints about spam in their search results are just one sign that a single company, no matter how large and powerful, is not going to win the war against millions of spammers that are looking for every conceivable loophole to undermine the search algorithm and get to the top of the SRP.

When Google can no longer use the link graph to generate decent search results, its main revenue stream will collapse. It is not overly dramatic to say they are in a life or death battle against the tsunami of individuals and companies that seek to game their system.

I was looking for help on Google’s support forums recently and I stumbled across this question:

So I have a plan to build 4,000 mini sites in the next 3-4 months.

Will Google have a problem with this? I’m a domainer and I’m thinking of putting up content and ads instead of just parking them for better ROI.

Any thoughts?

Google’s support forums are actually a great place to see the problem in its purest form. Just spend half an hour in there to see what people are planning to do to your Internet. It ain’t pretty.

None of this constitutes a reason to be angry with Google, or even with the scrapers, farmers and schemers that are distorting relevance in the link graph for commercial gain. It does mean that Internet search is going to change into something different. The link graph is not a public good. It is just an artificial construct that will be replaced or augmented by something else.

History suggests that reigning incumbent Google is unlikely to be the one to introduce the change. Their hands will be full, fighting off wave after wave of attackers on the current model.

