News & Insights


Data Discovery

Every information service wants its content to be discovered on the Internet, and billions have been made serving that need. Lately there’s been a lot of talk about the “data discovery” technology that powers services like Outbrain, Taboola, nRelate (IAC), Gravity (AOL), Disqus, Scribol, and ShareThrough. Using a combination of content recommendation algorithms or content sharing tools and simple affiliate-type marketing business models, these services have taken the Internet’s advertising-supported information services by storm.

You want 1 million new unique visitors to your site? Just pay these folks $15/M (that is, per thousand visitors) for the traffic and the traffic will come… guaranteed. Also, if you want to trade some space at the end of your articles to display teasers for third-party articles, then you can earn $1 for every thousand folks who click through to the articles. What’s not to like?

The technology offering itself is usually a “more like this” algorithm that suggests “more articles like this,” “more articles that your profile indicates you would like,” or just popular types of articles or popular articles by category. The “sharing” services are just widgets embedded in your site to make social sharing easier and less breakable, and they earn you a little incremental revenue on the side as well.
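
To make the “more like this” idea concrete, here is a minimal sketch that ranks articles by word overlap with the one being read, using cosine similarity over raw word counts. It is an illustration only: the function names, the sample articles, and the choice of plain word counts are all assumptions, and none of the services named above are known to work this way.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def more_like_this(current_id, articles, k=2):
    """Return the ids of the k articles most similar to `current_id`.

    `articles` is a hypothetical mapping of article id -> article text.
    Ties are broken alphabetically by id so the result is deterministic.
    """
    vectors = {aid: Counter(tokenize(text)) for aid, text in articles.items()}
    target = vectors[current_id]
    scored = [
        (cosine(target, vec), aid)
        for aid, vec in vectors.items()
        if aid != current_id
    ]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [aid for _, aid in scored[:k]]


# Hypothetical sample data, for illustration only.
sample_articles = {
    "a": "Open data policy in London",
    "b": "London open data event",
    "c": "Celebrity bikini shopping",
}
print(more_like_this("a", sample_articles, k=1))
```

A production recommender would likely weight rare terms more heavily, and the “profile” and “popular” variants would add reading history or click counts, but the ranking step has the same shape.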

The models are simple, the technology is easy to implement, and it’s less expensive than complex SEO efforts or buying premium US traffic from Google AdWords. And, like anything that sounds a bit too simple, there are some issues.

  • The traffic you get has a high bounce rate (i.e., one impression and gone) of around 85%, which makes it more expensive than Facebook for acquiring an active, engaged user.
  • The recommended articles can be unabashed link bait from dubious sources that can tarnish your brand (“Kim Kardashian went bikini shopping and you’ll never guess what happened next!”).
  • The teasers are ads that compete for reader attention with other advertising that you sell directly at a much higher price.
  • If you want very high volumes you need to use several of these services. For more control over the sources, traffic, and the recommended articles, you have to spend the time and money to actively manage this.
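
The bounce-rate point is easier to see with the arithmetic written out. The sketch below assumes that “$15/M” means $15 per thousand visitors and that every visitor who does not bounce counts as engaged. Both are simplifying assumptions, and the function names are hypothetical.

```python
def campaign_cost(visitors, cost_per_thousand):
    """Total spend for a given number of purchased visitors."""
    return visitors / 1000 * cost_per_thousand


def cost_per_engaged_visitor(cost_per_thousand, bounce_rate):
    """Effective cost of each visitor who stays past the first page.

    Assumes (simplistically) that every non-bouncing visitor is "engaged".
    """
    if not 0 <= bounce_rate < 1:
        raise ValueError("bounce_rate must be in the range [0, 1)")
    cost_per_visitor = cost_per_thousand / 1000
    return cost_per_visitor / (1 - bounce_rate)


# Figures from the article: 1 million visitors, $15 per thousand, 85% bounce.
total = campaign_cost(1_000_000, 15)            # $15,000 for the campaign
engaged_cost = cost_per_engaged_visitor(15, 0.85)  # about $0.10 per visitor who stays
print(f"total spend: ${total:,.0f}")
print(f"cost per engaged visitor: ${engaged_cost:.2f}")
```

At an 85% bounce rate, a nominal 1.5 cents per visitor becomes roughly 10 cents per visitor who stays, which is the number to compare against other acquisition channels.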

For these reasons some publishers prefer to handle their own data discovery the “organic” way. At a recent Open Data Institute event in London the folks behind the BBC News sites spoke about the tools they used to improve “discovery” of their content. They append subject taxonomies, geo tags, and normalized organization and people names like you’d imagine, but they also have authors associate thematic topics with the articles. Add links to underlying primary sources and you’ve got enough “hooks” on each article to make it much easier to find, both on BBC sites and via search engines. Topical monitoring also gets much more granular this way.
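
As a rough sketch of what such “hooks” could look like in data, the example below models an article record with the kinds of metadata described above and builds a lookup index from each hook to the articles carrying it. The class, field names, and sample values are hypothetical illustrations, not the BBC’s actual schema or tooling.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Article:
    """Hypothetical article record carrying discovery metadata."""
    article_id: str
    subjects: list = field(default_factory=list)         # subject taxonomy terms
    geo_tags: list = field(default_factory=list)         # place names
    organizations: list = field(default_factory=list)    # normalized org names
    people: list = field(default_factory=list)           # normalized person names
    topics: list = field(default_factory=list)           # author-assigned themes
    primary_sources: list = field(default_factory=list)  # links to source material

    def hooks(self):
        """Return every (kind, value) pair this article can be found by."""
        groups = (
            ("subject", self.subjects),
            ("geo", self.geo_tags),
            ("organization", self.organizations),
            ("person", self.people),
            ("topic", self.topics),
        )
        pairs = {(kind, v.strip().lower()) for kind, values in groups for v in values}
        # Source links are kept verbatim because URL paths can be case-sensitive.
        pairs.update(("source", url.strip()) for url in self.primary_sources)
        return pairs


def build_index(articles):
    """Map each hook to the set of article ids that carry it."""
    index = defaultdict(set)
    for article in articles:
        for hook in article.hooks():
            index[hook].add(article.article_id)
    return index


# Hypothetical sample records, for illustration only.
a1 = Article("a1", subjects=["Open Data"], geo_tags=["London"], topics=["Transparency"])
a2 = Article("a2", geo_tags=["london "], organizations=["Open Data Institute"])
index = build_index([a1, a2])
print(sorted(index[("geo", "london")]))
```

Each additional field gives an article another way to be found, and the same index supports the finer-grained topical monitoring mentioned above.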

Of course, these organic approaches will never have the immediate-gratification appeal of the content syndication platforms, but they incrementally increase engagement and pageviews while preserving the integrity of the brand. And sometimes a smaller, demographically more desirable, more engaged audience is exactly what product-oriented advertisers would like to discover.

Keep on top of the information industry 
with our ‘Data Content Best Practices’ newsletter:
