Cracking Google's 'secret sauce' algorithm
Rand Fishkin knows how valuable it is for a Web site to rank high in a Google search. But even this president of a search engine optimization firm was blown away by a proposal he received at a search engine optimization conference in London last month, where he was a panelist.
The topic: Can a poker Web site rank high on a Google search using purely white hat tactics, meaning no spamming, cloaking, link farms or other frowned-upon “black hat” practices? Fishkin answered yes, provided the site also added other marketing techniques and attracted some media attention.
The rest of the panel scoffed. “Don’t bring a knife to a gunfight,” one chided. After all, this is the cutthroat online gambling sector.
But one poker Web site owner was intrigued, and he later approached Fishkin. “He said, ‘If you can get us a search ranking in the top five for online poker or gambling [using white hat methods], we’ll buy that site from you for $10 million,’” recalls Fishkin, president and CEO of SEOmoz in Seattle. Intrigued but skeptical, Fishkin consulted other gambling site owners at the conference. “They said, ‘If it really does rank there, we might be interested in paying you $10 million more.’”
Turns out, a single online gambling customer brings in at least $1,000 in revenue. With a recent Google search of “Texas Holdem Poker” yielding 1.64 million results, it’s easy to see why site owners would pay millions to crack the code for Google’s PageRank algorithm — the elusive Holy Grail of online marketing.
The stakes are high for online businesses — and Google is the formidable gatekeeper between site owners and their customers. Web sites, such as kinderstart.com, have even sued Google for what they allege are deliberate de-rankings, though none have been successful to date. Site owners are eager to get their hands on the 75 percent of free Google traffic that is not affected by AdSense and AdWords, Google’s pay-per-click programs. With 47 percent market share among search engines and 3 billion search inquiries a month, Google is indeed king.
“Being at the top of Google is probably the most important factor in your whole marketing plan online,” says Chris Winfield, president of 10e20 LLC, a global search marketing and Web solutions company in New York. “Nothing comes close to being able to match it with people looking for what you do.”
Deceptive black hat tactics run rampant among highly competitive Web sites, but they are now under the watchful eye of Google’s spam group, which identifies deceptive practices and then quashes the problem, sometimes by devaluing the site’s ranking or relegating it to the supplemental index, which effectively means “no priority.” Google, however, says it takes steps to help sites identify and fix the problem so they can “apply for re-inclusion.” As a result, pursuing legitimate white hat tactics to raise a search engine ranking has become a top priority.
Google acknowledges the influence that its algorithms have in the Web world, but officials also say that — just like Spider-Man — with great power comes great responsibility. No decision to devalue or omit a Web site is made without the algorithm behind it.
“Just because somebody doesn’t like Google, it certainly is no reason to take any action [against them],” says Matt Cutts, senior engineer and Google’s de facto liaison to the webmaster community. “We care about spam and the quality of the results. That’s purely about whether the site is abiding by our quality guidelines.”
But even playing by the rules can be frustrating. Winfield has even cried foul to Google. In a search for translation services, “it amazes me that five out of 10 results were for the same Web site, and it was completely irrelevant. When I see things like that, it boggles my mind,” Winfield says.
But he’s quick to add that Google was “great for feedback” to an inquiry about the issue. “They do listen. It can be frustrating that they control so much, but they do care.” Still, a Google search on March 9 for “translation services” showed that two of the 10 results were for the same site that Winfield questioned. The aroma of secret sauce wafts through the air.
What’s in the secret sauce?
PageRank is Google’s trademarked process where a numeric value represents how important a page is on the Web. But that’s only part of the formula. There is also the value placed on each link to that Web site. The secret sauce, much like many recipes in the food world, is a matter of how much of each ingredient is being used: the “weighting” of each piece. While there’s a bevy of information on the Web on the primary parts of the algorithm and what marketers or site owners should do to increase their rank, Google remains elusive on most of the 200 factors it uses to score pages and decide which page goes to the top of the results.
“There isn’t one answer anymore. The majority of factors we don’t talk about,” Cutts says. “A lot of people have theories, but we don’t usually confirm the theories.”
Cutts freely offers clues such as placing keywords in the title, headings and even the URL, and keeping related words close together. He points to many tips that Google offers on its site to raise a site’s search ranking, as well as Web forums and conferences for communicating. But some say Google’s tips are often ambiguous.
“We deploy all the techniques that Google approves of and they have given those measures out on their guideline pages,” says Atul Gupta, president and CEO of RedAlkemi, a search engine marketing and Web development firm in Chandigarh, India. “But not all of it is in plain and simple language that anyone can understand. For example, they’ll say, ‘We will reward links from old pages’ [that have important information]. Then in the next sentence, they say, ‘We’ll reward pages that are freshly updated because they are newsworthy.’ So one has to figure out what are they really wanting and how much are they wanting.”
Page link mysteries
Even mathematicians familiar with the equations used to create the PageRank algorithm struggle with other non-numeric factors. David Austin, a math professor at Grand Valley State University in Allendale, Mich., who published a paper on cracking Google’s algorithm, says the secret sauce is really a popularity contest wrapped in linear algebra.
“It’s like you’re having a popularity contest and you think everybody gets a vote, so I can vote for as many people as I want to,” Austin says. “So if I vote for 10 people, I give everyone 1/10th of a vote. So who wins that popularity contest?”
But then Google goes further. “They take a second pass through it, and look at who voted for who,” he explains. Google assigns a value to the importance of the site that casts the vote (or links to a site), and that site can pass on its popularity and importance to the site it linked to.
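The two passes Austin describes can be illustrated with a toy calculation. The sketch below is not Google's actual implementation: the four-page link graph, the function name `pagerank`, the fixed iteration count, and the damping factor of 0.85 (the value used in the originally published PageRank formulation) are all assumptions chosen for illustration. Each page splits its vote evenly among the pages it links to, and the weight of that vote depends on the page's own current score.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Toy PageRank by repeated vote passing (illustrative sketch only).

    links maps each page to the list of pages it links to.
    damping and iterations are assumed values, not Google's settings.
    """
    # Collect every page that appears as a source or a target.
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)

    # Start with every page equally important.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page keeps a small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page that votes for k pages gives each 1/k of its vote,
                # weighted by the voter's own current importance.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its vote over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: D links to C, but nobody links to D.
toy_web = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],
}

ranks = pagerank(toy_web)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(page, round(score, 3))
```

In this toy graph, page C collects votes from three pages and comes out on top, while page D, which receives no votes, keeps only the baseline score. Page A ranks second even though only one page links to it, because that single vote comes from the most important page.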
Gupta chisels away at the PageRank algorithm simply by looking at what the No. 1 ranked sites are doing. “We have identified 250 parameters that Google studies to rank a site,” he says. “We’ve got labs where people are constantly monitoring the impact of each. But the bird’s-eye view is, how can we make a site simply perform well in the natural course?”
Cracking the code — the hard part is in the execution
Google’s Cutts says that creating a high-ranking Web site is really quite easy. Just focus on the customer and create compelling content that’s buzzworthy, vital, or that provides some sort of service or resource that a reader would want to bookmark. In other words, pretend Google isn’t even there.
He points to comparisons between two English-to-Japanese translation Web sites. One site was very much like a brochure and just four or five pages. The higher-ranking site included a tutorial about how to write your name in Japanese, as well as information about different Japanese dialects. “It’s that kind of creativity and information that hooks people,” Cutts says. “Those kinds of compelling content are exactly the things that help you crack the code at Google.”
Pro Acoustics USA, an acoustic equipment and consulting and design service company in Harker Heights, Texas, followed Google’s guidelines and worked with RedAlkemi to redesign its Web site for higher page rankings. It has doubled traffic every year for the past four years. “We probably have 100 keywords we work at all the time … and I think we have a good handle on it,” says Emery Kertesz, president and CEO, who launched the site in 2003.
Kertesz says the key to a high page rank is to correctly present relevant information, including title tags, meta tags and links. His site usually ranks among the top 10 sites in his category. But he won’t rest on his current ranking.
“Nobody is very satisfied with their ranking,” Kertesz says. “More is better, higher is better. Everybody’s in a death battle for money and ranking.”
As for the offer of $10 million or more to develop a poker site that ranks in the top five of Google’s results, Fishkin said, “We’re still weighing it, but the general sentiment is that the effort and time required would force us to abandon a lot of other projects and clients, and even then it’s a bit of a gamble [no pun intended], so we are leaning against taking the offer.”