Web stats are broken -- so you'd better have brass knuckles

Juggling statistics
Statistics on the Internet are like water for a flower. Much of the Web needs them to survive.

But independent third-party tracking of traffic to Web sites, and of user clicks on Web page links, is deeply flawed, as developments this week at Google and elsewhere underscored. There doesn't seem to be any remedy in sight.

Web sites that rely on advertising -- including some of the most popular, such as Google, Yahoo, MySpace and YouTube -- get paid based on the amount of traffic to their sites and the number of click-throughs on their ads. Without accurate data, advertisers have no idea how much they should be paying.

The only reason the system isn't breaking down, and advertisers aren't pulling out, is that they have no choice but to play. They are making informed guesses, based on the shoddy statistics available. And Google et al. are using every strategy they can find to deal with this problem.

We were reading TechCrunch's wobbly efforts to pinpoint whether traffic to bookmarking company Del.icio.us is climbing, flattening, or plunging. In the end, TechCrunch had to give up. One seemingly reliable statistics vendor, comScore, is trumped by another, Hitwise. (Update: Here's a good description of why many of these services are different.)

We're wondering what lesson young entrepreneurs will take from this.

Some, no doubt, will decide they should be pretty brazen about how they report statistics. There is only upside to fudging the stats slightly. And you won't get caught if no one can double check, right? And if the other guy is padding his traffic numbers, well, you'd better do it too.

This could get dangerous. As it is, even without the stat problem, success on the Internet has depended on good ol' "nasty grassroots, viral campaigns, brass knuckling marketing," says David Stern, a venture capitalist at Clearstone Ventures, who has learned a lot from watching successful start-ups while at Idealab, and from watching MySpace.

Which brings us to Bebo, the social networking company that is trying to make some inroads on MySpace. Bebo's monthly visitor stats have been basically flat here in the U.S. over the past couple of years, according to proprietary (and thus no link) statistics from Majestic Research and comScore. We don't want to pick on Bebo, but in February the site told us it had overtaken MySpace in the UK. So we were surprised to see an announcement this week that Bebo had -- surprise -- just overtaken MySpace in the UK "for the first time." Looks like Bebo is just taking the statistics that suit its marketing purposes (but to be fair, we don't know the background to how the recent announcement came about).

Indeed, this may be fair game -- given that statistics are dead, and you've got to do what you've got to do -- as long as you don't go over that fuzzy line, wherever it is.

And that's where Google is fighting. The giant search engine gets billions in revenue from people clicking on ads on its main search results and on Google ads carried by large publishers. Reliable stats mean everything. The problem is, Google can't prove that certain clicks aren't fraudulent -- that is, tapped in by ambitious content owners or others who have an incentive to game the system and make more money than they should.

John Battelle, a search expert, has a good post here about how Google's tactic is to turn the tables: pointing to flaws in studies that purport to show click fraud is a problem. After all, if statistics aren't any good, you can't prove it one way or the other.

In other words, we're back to square one.

What are we saying? If you're an Internet start-up, you'd better think about how to play hardball. You don't want to cross the line. If you do, you'll get called on it sooner or later. But numbers can affect your buzz, and your revenue. Make sure you've got someone on staff who is as tough as nails, and who is prepared to do some aggressive marketing. Unfortunately, statistical uncertainty appears to hurt the small start-up sites the most.

Links to blogs that reference this entry:

From: Webanalyticsbook.com
Siliconbeat “beats” webanalytics industry
Excerpt: Today Matt Marshall at Siliconbeat wrote in his blog that webanalytics is back to square one. He points out that Hitwise, Comscore and Alexa show different numbers. Here the graphs that he was writing about: I. Comscore - Unique visitors and tota...
Tracked: August 10, 2006 2:07 PM
From: mkaz.com
Web Metrics: Measuring Blog Popularity
Excerpt: I found your entry interesting so I've added a Trackback to it on my weblog :)
Tracked: August 10, 2006 11:27 PM
From: Online Dating Insider
Web Analytics Back to Square One
Excerpt: Here's a Silicon Beat article stating webanalytics is back to square one. Found via Webanalyticsbook. Technorati Tags: web+analytics...
Tracked: August 11, 2006 7:43 AM


There's another trend going on that seems to spawn many misleading memes: the growing popularity of free, amateur measurement tools. I'm talking tools like Alexa and Google Trends, among others. They're interesting, fun, illustrative and sometimes offer good directional insights. Combine their lack of rigour with amateur analysts (like a lot of talkative bloggers) and you wind up with a ton of analyses that are misguided (albeit some good ones, too).

Max Kalehoff of Nielsen BuzzMetrics
(the guy who you went sailing with on Bounty, which used to be my home)

Max Kalehoff on August 11, 2006 7:34 AM

I think you're combining two different issues. One problem is getting accurate information from third parties (Alexa, Hitwise, Comscore, etc.) about traffic to specific sites. This is because these third parties are at best making calculations and estimates based on a fraction of the traffic. Only the site owners, who can deploy a variety of analytics tools, can get particularly accurate data - and often they're not willing to share this raw info.

The Google problem of click fraud is something totally different. Click fraud isn't an issue of incorrect statistics - the clicks are happening, the question is if they are being made by valid visitors or not.

I think you can get pretty accurate stats about your own site; it's getting accurate stats about other sites that's next to impossible -- and you're right that there don't seem to be any good solutions coming down the pipe.

Andy on August 11, 2006 1:03 PM

It's tough to even get accurate stats about your *own* site. Ever see a company switch from a log-based metrics tool to a tag-based one? Reported traffic can be cut in half - or much worse.

I've had to talk clients through that change. It's not easy explaining why 2/3 of their traffic never really existed.

Bryan on August 11, 2006 3:00 PM

Sure, internally you may have a better idea of real stats (although even there stats are not perfect), but part of my point is that a lot of buzz, and money, can be made from the uncertainty on the part of outsiders and advertisers (yes, on two different issues, traffic and click fraud).

Matt Marshall on August 11, 2006 3:06 PM

Matt - great post and some interesting commentary here.

Just a courtesy message to let you know I linked your post at the Wetjello blog, which covers anything and everything associated with online video.

Comments are welcome and encouraged!

Sarah on August 17, 2006 5:25 AM