
Innovation and aggregation: 
Why news needs a bigger—
and more beneficial—‘tapeworm’

By Justin Massa

While the lines continue to blur among the quality and types of content produced by traditional media and their Web-based counterparts—including amateurs, hobbyists and start-ups—the battle over distribution is just heating up.

Wall Street Journal editor Robert Thomson recently wrote of Web sites, like Google, that aggregate content without paying fees to the content creators, “There is no doubt that certain Web sites are best described as parasites or tech tapeworms in the intestines of the Internet.”1

This criticism of aggregation without payment has merit but misses the broader point. Once traditional media operations opened themselves up to search engines and began sharing their stories in RSS feeds, the cat was out of the bag. Disconnecting their online presence from the rest of the Internet simply isn’t an option, and nearly all experiments with Internet-only subscription fees have been unsustainable.

Figuring out the model of distribution, aggregation and syndication will define the next decade of journalism, presenting at the same time incredible opportunity and enormous challenge. To use Thomson’s term, the future of media lies in building a better tapeworm.

It’s hard to talk about the future of journalism without bringing up ChicagoCrime and EveryBlock, which exemplify early innovation in aggregation and provide us with insight as to what may come next. Adrian Holovaty’s ChicagoCrime was the progenitor of the “mashup,” combining two previously separate elements into something amazing. By simply layering city crime reports over interactive Google Maps and sorting them by time and type, the site enabled anyone to view and analyze crime in their community.
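The core of that approach, bucketing geocoded reports by a field so they can be layered onto a map, takes only a few lines. A minimal sketch; the records and field names below are invented for illustration:

```python
from collections import defaultdict

# Invented sample records standing in for a city's geocoded crime feed.
reports = [
    {"type": "theft",    "hour": 22, "lat": 41.8781, "lng": -87.6298},
    {"type": "burglary", "hour": 3,  "lat": 41.8500, "lng": -87.6500},
    {"type": "theft",    "hour": 23, "lat": 41.8902, "lng": -87.6220},
]

def group_by(records, key):
    """Bucket records by a field, as a mashup does before plotting them."""
    buckets = defaultdict(list)
    for record in records:
        buckets[record[key]].append(record)
    return dict(buckets)

by_type = group_by(reports, "type")
print(len(by_type["theft"]))  # two theft reports to layer onto the map
```

The same helper, called with `"hour"`, gives the sort-by-time view; the rest of a mashup is presentation.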

The next generation of mashups took this concept to a new level. By aggregating everything from street closures to home sales to restaurant reviews around a specific location and presenting it beautifully, EveryBlock became the gold standard for presenting hyperlocal news. Similar sites now exist for a variety of other audiences: one aggregates data, maps, and home listings for homebuyers; another aggregates dozens of television networks for TV watchers. These Web sites are popular and useful because they are comprehensive, constantly innovating and expanding, and thoughtful about how they present and organize data.

This model of the distributed Web was born in 1999, when Netscape pioneered the use of RSS. Through the My Netscape portal, using the “Rich Site Summary” standard, users could create a custom homepage that aggregated information from any Web site that shared its content. The project was abandoned in 2001, after AOL’s acquisition of Netscape, but a group of open source developers picked it up and re-dubbed the standard “Really Simple Syndication.”

By 2005, Microsoft’s Outlook e-mail software and Internet Explorer as well as Web browsers Opera and Firefox had all adopted the standard. RSS became mainstream; the Chicago Tribune shared its first RSS feed in early 2005.

At first, publishers of all types of content embraced syndication. Instead of having to rely on visitors remembering to return to their sites day after day, they could push content to readers’ news programs or custom homepages. By sharing content in a standard format, publishers empowered developers to find ways to sort and remix that content in millions of different ways.
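Because RSS is plain XML, consuming a feed is trivial, which is exactly why developers could remix it so freely. A sketch using Python’s standard library and a made-up feed (a real aggregator would fetch the XML over HTTP):

```python
import xml.etree.ElementTree as ET

# A minimal, made-up RSS 2.0 feed.
FEED = """<rss version="2.0"><channel>
  <title>Example Gazette</title>
  <item><title>City budget passes</title><link>http://example.com/budget</link></item>
  <item><title>New transit line opens</title><link>http://example.com/transit</link></item>
</channel></rss>"""

# Parse the feed and pull out each story's title and link.
channel = ET.fromstring(FEED).find("channel")
stories = [(item.findtext("title"), item.findtext("link"))
           for item in channel.findall("item")]

for title, link in stories:
    print(title, "->", link)
```

An aggregator repeats this across hundreds of feeds and merges the results onto one page, which is all My Netscape did.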

But RSS proved to be only the beginning. Additional standards began to emerge with an alphabet soup of names—such as JSON, XML, GeoRSS, and KML—that focused on sharing all sorts of content beyond text. Web sites began to syndicate data sets, maps, calendars, video, audio, and images.
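The same story can travel under several of these standards at once. A sketch, with invented data, of one geotagged item rendered both as JSON and as a GeoRSS-Simple point (which puts latitude before longitude):

```python
import json

# An invented geotagged news item.
item = {"title": "Water main break", "lat": 41.88, "lng": -87.63}

# As JSON, the format favored by APIs and JavaScript clients.
as_json = json.dumps(item)

# As a GeoRSS-Simple point embedded in an RSS <item>.
as_georss = (
    "<item><title>{title}</title>"
    "<georss:point>{lat} {lng}</georss:point></item>".format(**item)
)

print(as_json)
print(as_georss)
```

The payload is identical; only the envelope changes, which is what lets one data set feed maps, calendars, and custom homepages alike.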

The trend toward a distributed Web has accelerated with the explosion of the application programming interface, or API, which enables Web sites to exchange nearly any type of information automatically. And the practice of “screen scraping” enables data on a Web site to be syndicated even when the publisher hasn’t adopted a shared standard. Nearly every type of content has been set free.
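Screen scraping needs nothing from the publisher but the page itself. A minimal sketch using Python’s built-in HTML parser to pull headlines out of a made-up page; the markup and class name are invented, and a real scraper must also respect the publisher’s terms of use:

```python
from html.parser import HTMLParser

# A made-up article-listing page with no feed or API.
PAGE = """<html><body>
<h2 class="headline">Council delays vote</h2>
<p>Some story text.</p>
<h2 class="headline">Bridge repairs begin</h2>
</body></html>"""

class HeadlineScraper(HTMLParser):
    """Collects the text inside <h2 class="headline"> tags."""
    def __init__(self):
        super().__init__()
        self.in_headline = False
        self.headlines = []

    def handle_starttag(self, tag, attrs):
        if tag == "h2" and ("class", "headline") in attrs:
            self.in_headline = True

    def handle_endtag(self, tag):
        if tag == "h2":
            self.in_headline = False

    def handle_data(self, data):
        if self.in_headline:
            self.headlines.append(data.strip())

scraper = HeadlineScraper()
scraper.feed(PAGE)
print(scraper.headlines)
```

Scraped output like this can then be re-syndicated in RSS or JSON, which is how content escapes even from sites that never chose to share it.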

Complicating matters for those hunting for a business model in this new environment, the tools to aggregate content in incredibly rich and complex ways are free. Open source content management systems are growing more powerful. This is a scary climate for many journalists. When the public can read your stories, view your pictures, and interact with your source data without ever visiting your Web site or viewing your ads, how are you supposed to pay the bills?

More than any other outlet, the New York Times has embraced this new landscape and is trailblazing a path for other publishers. Through its APIs for articles, books, movie reviews, comments, real estate, campaign contributions, and legislative records, the Times is sharing nearly its entire archive with the Web for free—with a few important caveats. Web sites using the APIs must be non-commercial (although a paid version is available for for-profit sites). In addition, content pulled through the APIs can’t be archived but must be requested anew for each user, and all content must clearly link back to and be attributed to the Times.
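Those caveats can be honored mechanically in client code. A sketch using an invented response record rather than the real API (whose endpoints and field names are not reproduced here), showing the required link-back and attribution:

```python
# An invented record standing in for one result from a newspaper's article API.
# Per the no-archiving rule, a record like this must be fetched fresh per user,
# never cached.
article = {
    "headline": "School funding overhauled",
    "snippet": "Lawmakers approved a new formula on Tuesday.",
    "url": "http://www.nytimes.com/example-path",  # invented URL
}

def render_with_attribution(record):
    """Build display HTML that clearly links back to the publisher."""
    return ('<p>{snippet} (<a href="{url}">{headline}</a>, '
            'The New York Times)</p>').format(**record)

print(render_with_attribution(article))
```

Keeping the attribution in one rendering function makes the terms easy to audit: every path to the screen goes through it.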

“One thing that big media still does have a particularly good share of … is information processing resources and archival content,” said Marshall Kirkpatrick on the ReadWriteWeb blog, explaining the power of APIs. “The newspaper is far better prepared to organize that raw information, and perhaps offer complementary content, than any individual blogger or small news publisher.”

It’s unfortunate that multiple newspapers are planning to build new devices, such as electronic-paper readers, to deliver the news. People didn’t buy newspapers because they liked a particular combination of paper and ink but because of the words and images it delivered. TV stations don’t make televisions and radio stations don’t make radios for good reason: their strengths lie in creating content rather than building devices.

Journalists should return to their roots, focusing on how they create and contextualize content through some innovative online aggregation of their own rather than building a new device or closing themselves off from the rest of the Web.

Justin Massa is the executive director/co-founder of and the Program and Technical Coordinator for NetSquared, an initiative of TechSoup Global.


Category: Essay



Mailing Address:
Community Media Workshop
at Columbia College Chicago
600 S. Michigan Ave
Chicago, IL 60605
