A Rexblog Guessay: How Google might apply PageRank to measure the relative value of links pushed through Twitter


[Note: A guessay is an essay composed mainly of guesses.]

In a post on Friday (when rumors were swirling about the possibility of Google acquiring Twitter), I suggested that the acquisition of Twitter could add tremendous value to Google search results by adding real-time data from our (all Twitter users’) collective stream-of-consciousness. I suggested (as have many, many others) that Google’s PageRank algorithms could benefit greatly from all of the real-time linking that millions of Twitter users do.

In my post, I included a line that made some people think I was suggesting that the number of followers a Twitter user has is an indication of the authority Google might award links shared (or, as Dave Winer terms it, “pushed”) by that user. I got a few e-mails from people disagreeing with what they thought I was saying: “The number of followers one has is not necessarily a measurement of someone’s link-sharing skills,” is typical of what the e-mailers said.

First things first: Despite implying it, I neither said nor meant that the number of followers indicates authority. Indeed, I wholeheartedly agree that the number of followers one has on a Twitter account is NOT, in and of itself, a measure of authority. Lots of followers on Twitter may be an indication of authority; however, it is more likely a measurement of popularity, celebrity, strategy or an anointment from Twitter’s management.

So no, if Google owned Twitter, the PageRank of links would not be based on such an easy-to-game or manipulate (or what would be called “optimized” by people seeking fees to manipulate it) thing as having lots of followers. If that were so, a tweet-turbo’d Google would be handing over search results to @the_real_shaq and whoever ghost-tweets for @britneyspears.

No. If Google owned Twitter, there would be a great mystery surrounding exactly how Google measures linking authority on Twitter, and likely it would merely turn the whole thing over to its pigeons to figure out. And by pigeons, of course, I mean Google’s secret-sauce PageRank algorithms, which the company describes this way:

PageRank reflects our view of the importance of web pages by considering more than 500 million variables and 2 billion terms. Pages that we believe are important pages receive a higher PageRank and are more likely to appear at the top of the search results. PageRank also considers the importance of each page that casts a vote, as votes from some pages are considered to have greater value, thus giving the linked page greater value. We have always taken a pragmatic approach to help improve search quality and create useful products, and our technology uses the collective intelligence of the web to determine a page’s importance.

I (and anyone else who writes using lots of hyperlinks) have served as one of those Google voters for a long time. So I have developed my personal theories about what goes into PageRank algorithms. (There’s a whole industry called Search Engine Optimization that, figuratively speaking, sacrifices virgins to please the Google PageRank Algorithm Gods.) Based on my vast experience (translation: guesses and theories), here are some factors I think Google would apply in determining the relative value of a link pushed via Twitter, if Google owned Twitter.

1. The number of followers your followers have would be more important than the number of followers you have. And that measure would cascade out several levels. This is somewhat akin to the “strength of schedule” factor in the BCS formula.

2. Extreme ratios of followers to following and vice versa would cause authority to fall. Discounting (or ignoring) high follower/following users would lessen the influence (i.e., ability to game) of celebrity Twitterists. And kicking out high following/follower users would undermine Twitter spam efforts.

3. User accounts that crank out high numbers of tweets would be discounted unless the tweets are posted from various third-party clients. In other words, high-volume tweeting would be assumed to be automated unless there are markers indicating the user is an actual human named Scoble or Brogan.

4. Tweeting about a limited number of topics would probably be rewarded. I am not a fan of directories of Twitter users. I find them an easy target for gaming schemes. However, I have one thing to praise about a recent entry to the Twitter directory category, WeFollow.com. It requires a Twitter user to do something akin to declaring a major. To be listed, you must limit where your Twitter feed will appear to three categories. For example, I limited my account @r to smallbusiness, nashville and another category I’ve forgotten. Seeing WeFollow force Twitter users into defining the category of their “authority” makes me think Google would likely tweak the PageRank algorithm to anticipate the categories of links in which Twitter users might have authority, rather than giving any link they add on any topic the same weight.

5. While “re-tweeting” and “replies” may appear to be indicators of authority, I think they would be discounted due to factors related to “celebrity” or automation.

6. The real golden goose (or golden pigeon) that Google would use to juice up its algorithms would be actual clicks. Google knows more about measuring, analyzing and making money from click-throughs than any other company will ever know. If, in a scenario that was merely rumor on Friday, Google bought Twitter, it would throw lots of resources into understanding which Twitter users generate the most click-throughs on links they “push” out via Twitter.
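In keeping with the guessay spirit, here is a rough sketch of how factors 1 and 2 might look in code. It is a guess on top of a guess, not anything Google or Twitter has described: the function names (follower_authority, ratio_discount), the damping value, the iteration count and the tolerance are all made up for illustration. The first function runs a plain PageRank-style calculation over a "who follows whom" graph, so a follow from a well-followed account counts for more than a follow from an unknown one. The second shrinks the weight of accounts whose follower-to-following ratio is lopsided in either direction.

```python
import math


def follower_authority(follows, damping=0.85, iterations=50):
    """PageRank-style score over a follow graph (illustrative only).

    follows maps each user to the set of users they follow.
    A follow is treated as a vote for the followed account.
    """
    users = set(follows)
    for followed in follows.values():
        users.update(followed)
    users = sorted(users)
    n = len(users)

    score = {u: 1.0 / n for u in users}
    for _ in range(iterations):
        # Every user starts each round with the same small baseline.
        nxt = {u: (1.0 - damping) / n for u in users}
        for u in users:
            targets = follows.get(u, ())
            if targets:
                # Split this user's score among the accounts they follow.
                share = damping * score[u] / len(targets)
                for t in targets:
                    nxt[t] += share
            else:
                # A user who follows nobody spreads their score evenly.
                share = damping * score[u] / n
                for t in users:
                    nxt[t] += share
        score = nxt
    return score


def ratio_discount(followers, following, tolerance=10.0):
    """Multiplier in (0, 1] that shrinks as the ratio gets more extreme.

    tolerance is a hypothetical knob: ratios between 1/tolerance and
    tolerance are left alone; anything beyond that is discounted.
    """
    ratio = (followers + 1) / (following + 1)  # +1 avoids dividing by zero
    skew = abs(math.log10(ratio))
    limit = math.log10(tolerance)
    if skew <= limit:
        return 1.0
    return limit / skew


# Toy graph: a and b follow c, and c follows a. Nobody follows b.
toy = {"a": {"c"}, "b": {"c"}, "c": {"a"}}
scores = follower_authority(toy)
for user in sorted(scores, key=scores.get, reverse=True):
    print(user, round(scores[user], 3))
```

In the toy graph, c comes out on top because two accounts follow it, a comes second because its one follower is the top-ranked account, and b trails because nobody follows it. Multiplying a user's score by ratio_discount would then knock down both the celebrity with a million followers who follows ten people and the spammer who follows a million people and is followed by ten.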

Sidenote 1: Dave Winer demonstrates and explains a means of measuring the relative clickiness of links he pushes out via Twitter.

Sidenote 2: Here’s a Greasemonkey script (and a demo from Doc Searls) that shows related “tweets” on a Google search results page. This is not exactly what I’m talking about in this post, but it is cool nevertheless.

