Spam Or Ham? BISH Index Analyses (Over-) Monetization Of Websites

Advertising Network Distribution

I know this from my own experience with Google’s search results, and I am sure you have encountered the same: you search for a specific keyword, click on the first result, and are confronted with an overly “AdSense optimized” website that was created purely to generate AdSense clicks.

You might have a gut feeling that the website you’re looking at has information on the topic, but you don’t trust it because everything is plastered with ads, i.e. the information might not be as reliable as you expect it to be. But how do you measure analytically whether a website is overly monetized (“spam”) or strikes a healthy balance (“ham”) between ads and content?

Introducing the BISH Index, an analysis software I have been working on and am now releasing as an early alpha version.

Single Website Advertising Distribution

The analysis follows a set of about 20 rules, for example the placement of ads, the text/ad ratio, the number of ads, the number of words, a dialogue on page exit, or a specific theme, and condenses them into a single number: the BISH Score. The higher the score above 0, the better (ham); the lower the score below 0, the worse (spam); a score close to 0 is neutral.
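To make the idea of a rule-based score more concrete, here is a minimal sketch in Python. It is not the actual BISH implementation: the `PageStats` fields, the four rules, their weights, and the neutral band of ±10 are all assumptions chosen for illustration. It only shows how a handful of per-page rules could be summed into one signed number.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class PageStats:
    """Hypothetical per-page measurements; field names are assumptions."""
    ad_count: int
    word_count: int
    ads_above_fold: int
    has_exit_dialogue: bool


@dataclass
class Rule:
    name: str
    score: Callable[[PageStats], int]  # positive = ham-like, negative = spam-like


# Illustrative rules and weights only; the real rule set is not published here.
RULES: List[Rule] = [
    # Reward pages at or below 3 ads, penalise each additional ad.
    Rule("ad count", lambda p: 10 if p.ad_count <= 3 else -5 * (p.ad_count - 3)),
    # Text/ad ratio: at least 150 words per ad counts as healthy.
    Rule("words per ad",
         lambda p: 20 if p.word_count / max(p.ad_count, 1) >= 150 else -20),
    # Ad placement: every ad above the fold costs points.
    Rule("ads above the fold", lambda p: -10 * p.ads_above_fold),
    # A dialogue on page exit is treated as a strong spam signal.
    Rule("exit dialogue", lambda p: -30 if p.has_exit_dialogue else 0),
]


def bish_score(page: PageStats, rules: List[Rule] = RULES) -> int:
    """Sum the contributions of all rules into one signed score."""
    return sum(rule.score(page) for rule in rules)


def label(score: int, neutral_band: int = 10) -> str:
    """Map a score to ham / spam / neutral using an assumed neutral band."""
    if score > neutral_band:
        return "ham"
    if score < -neutral_band:
        return "spam"
    return "neutral"


if __name__ == "__main__":
    content_site = PageStats(ad_count=2, word_count=1200,
                             ads_above_fold=1, has_exit_dialogue=False)
    ad_farm = PageStats(ad_count=25, word_count=300,
                        ads_above_fold=6, has_exit_dialogue=True)
    for page in (content_site, ad_farm):
        s = bish_score(page)
        print(s, label(s))
```

Keeping each rule as a small function with its own weight makes it easy to add, remove, or retune individual rules as more domains are analysed.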

But there is more to it than just a simple number. At a glance you can see which ad networks the website uses, which elements are used for search engine optimisation, how the site compares to the average, whether it is a parked domain, and various other interesting metrics.
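One plausible way to find out which ad networks a page uses is to look at the hosts its scripts, iframes, and images are loaded from. The following Python sketch illustrates that idea under stated assumptions: the host-to-network table is a small hand-picked sample, and the names `AD_NETWORK_HOSTS`, `AdNetworkDetector`, and `detect_ad_networks` are made up for this example. It is not how the BISH Index itself is implemented.

```python
from collections import Counter
from html.parser import HTMLParser
from urllib.parse import urlparse

# Assumed sample mapping from host suffix to ad network; far from complete.
AD_NETWORK_HOSTS = {
    "googlesyndication.com": "Google AdSense",
    "doubleclick.net": "DoubleClick",
    "yieldmanager.com": "Yieldmanager",
}


class AdNetworkDetector(HTMLParser):
    """Counts embedded resources whose host belongs to a known ad network."""

    def __init__(self) -> None:
        super().__init__()
        self.networks: Counter = Counter()

    def handle_starttag(self, tag, attrs):
        # Only elements that typically carry ad code are inspected.
        if tag not in ("script", "iframe", "img"):
            return
        src = dict(attrs).get("src") or ""
        host = urlparse(src).hostname or ""
        for suffix, network in AD_NETWORK_HOSTS.items():
            # Match the domain itself or any of its subdomains.
            if host == suffix or host.endswith("." + suffix):
                self.networks[network] += 1


def detect_ad_networks(html: str) -> Counter:
    """Return a Counter of ad network name -> number of matching elements."""
    detector = AdNetworkDetector()
    detector.feed(html)
    return detector.networks


if __name__ == "__main__":
    sample = (
        '<script src="http://pagead2.googlesyndication.com/pagead/show_ads.js"></script>'
        '<iframe src="http://ad.doubleclick.net/adi/example"></iframe>'
        '<script src="/js/site.js"></script>'
    )
    print(detect_ad_networks(sample))
```

A static check like this misses ads that are injected by JavaScript at runtime, so the counts should be read as a lower bound.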

On Page Statistics Compared To The Average: venturebeat.com

Here are a couple of interesting facts from the first runs with the software:

  • Most websites are still monetized with Google AdSense, while high-traffic websites tend to use DoubleClick
  • The average number of ads on a domain is 3
  • Some websites have up to 25(!) ads on a single page
  • Ad networks like Yieldmanager are becoming increasingly popular
  • Huffingtonpost has a BISH score of 114 and a total of five ads on its website, served through Google Ads and DoubleClick
  • Readwriteweb has more than 21 ads on its website, the majority self-marketed with OpenAds (OpenX) technology

The BISH Index currently analyses only the front page, so advertisements on subsequent pages are not detected; covering them could be an interesting feature in the future. The more domains you feed it, the better it will become, since that gives me more data to tweak it.

Please let me know if you have any feedback on what further analysis would be interesting and if you encounter any bugs, domains not working or ad networks not being included (yet).