Wednesday, March 15, 2006

The RSS Platform concept

In some ways it is good to see that Microsoft are finally catching up on the feed aggregator idea (especially since Python and Java chaps have been banging on about this kind of thing for years). There has been a fair amount of blog activity lately in response to the IE7 beta distribution and Microsoft's RSS Platform activities.

I've also seen various snapshots of the presentation given by Amar Gandhi at PDC05 in quite a few places around the web now. Amar's presentation was about Windows Vista: Building RSS Enabled Applications [PPT] (there is also a ~130Mb version including video available here).

I have long subscribed to the idea that the way to approach syndicated feed collection is via a single collection and storage mechanism. In the past I have even created my own modest feed aggregator application (inspired by ...). This seems a remarkably similar idea to the Microsoft common feedlist concept. I eventually chose to use ROME for my multi-format aware feed collection, whereas Microsoft are implementing their own feed collection API.

I thought it was interesting that Amar's presentation made reference to the difference between time-based feeds and lists (e.g. News vs Top 10, Wish lists, Playlists, Bestsellers etc). It is good that it has been recognised that the way time-based feeds and list feeds are handled and merged for output purposes will need to differ.
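To make the distinction concrete, here is a minimal Python sketch of how the two kinds of feed might be merged differently. It is not taken from Amar's presentation or from any Microsoft API; the `Entry` class and both function names are my own hypothetical illustrations. It assumes news-style entries carry a publication date while list entries carry a rank.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class Entry:
    """Hypothetical feed entry; not modelled on any particular feed library."""
    title: str
    published: Optional[datetime] = None  # set for time-based feeds
    rank: Optional[int] = None            # set for list feeds (1 = top)


def merge_time_based(feeds: List[List[Entry]]) -> List[Entry]:
    """Interleave entries from several news-style feeds, newest first."""
    entries = [entry for feed in feeds for entry in feed]
    return sorted(entries, key=lambda e: e.published, reverse=True)


def merge_lists(feeds: List[List[Entry]]) -> List[List[Entry]]:
    """Keep each list feed separate and ordered by its own ranking."""
    return [sorted(feed, key=lambda e: e.rank) for feed in feeds]
```

The point of the sketch is only that a "river of news" view can sort everything by date, whereas a Top 10 or playlist loses its meaning if its items are interleaved with other feeds, so each list is kept intact.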

When I first started looking at my modest feed aggregator my initial thought was to use an XML database as the storage mechanism. I played around with Apache Xindice for a bit but it was just a little too awkward for me to use at that time and so I chose to use EHCache backed storage instead. I hope Microsoft have opted for an XML database backed approach, because I think it would be a mistake to concentrate too much on the RSS formats in current circulation. It would be nice to hand your XML storage mechanism a DTD or XML Schema and ask it to collect feeds that validate against those specifications. I really hope that we don't see the proliferation of too many future proprietary Microsoft XML formats (especially since MS Office will kick out XML).
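As a rough illustration of that "hand the store a specification" idea, here is a small Python sketch. Everything in it is hypothetical: `FeedStore` and `looks_like_rss2` are names I have made up, and the validator is a hand-written structural check standing in for real DTD or XML Schema validation, which would need a schema-aware XML library rather than the standard library parser used here.

```python
import xml.etree.ElementTree as ET
from typing import Callable, List


def looks_like_rss2(root: ET.Element) -> bool:
    """Stand-in for schema validation: checks the required RSS 2.0 channel elements."""
    if root.tag != "rss":
        return False
    channel = root.find("channel")
    if channel is None:
        return False
    required = ("title", "link", "description")
    return all(channel.find(name) is not None for name in required)


class FeedStore:
    """Hypothetical store that only collects documents its validator accepts."""

    def __init__(self, validator: Callable[[ET.Element], bool]):
        self.validator = validator
        self.documents: List[ET.Element] = []

    def collect(self, xml_text: str) -> bool:
        """Parse and store the document; return False if it is rejected."""
        try:
            root = ET.fromstring(xml_text)
        except ET.ParseError:
            return False  # not well-formed XML
        if not self.validator(root):
            return False  # well-formed, but not the format we asked for
        self.documents.append(root)
        return True
```

Because the validator is just a callable passed in at construction, the same store could in principle be handed a check for Atom, a playlist format, or anything else, which is the flexibility I would hope for from an XML database backed approach.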

Of course, there has been talk of embedding databases into Firefox, so why not an XML database? It could be a real shame that Oracle recently bought Sleepycat, as I think Berkeley DB XML would probably be an avenue worth exploring for Firefox integration.

It is a shame that the likes of Google, Sun, Mozilla, Oracle and Sleepycat won't have time to collectively come up with a world-beating Firefox-based RSS Platform of their own. Microsoft's IE7 looks like it might become the world's most installed feed aggregator software by default. Maybe Google could knock up a version of Google Desktop that works as a feed aggregator with database backing and a search facility.

Then again, if Microsoft's feed aggregator implementation is sufficiently flawed in IE7 then maybe there is still an opportunity for somebody! Firefox hybrids already exist (e.g. Flock) and maybe feed aggregation will become the mechanism around which future Firefox hybrids are developed.