The Rise of the Data Web

Via a tweet by Tim O’Reilly, I came across an excellent post entitled “The Rise of the Data Web” on the Dataspora Blog (which I quickly subscribed to). The author, Michael E. Driscoll, nicely summed up an important trend now under way on the web.

The web originated as a set of documents and files served up for people to read and occasionally edit. Indeed, it is this web that we all still experience as we surf from page to page reading news or looking at photos.

Underneath all this, and unseen except by software engineers, is a massive web of data. Sensors of all kinds are now measuring everything that can be quantified, from people’s location and biological functions to the environment we all share. All of this data is flowing through the network.

The web we experience will continue to be dominated by documents: e-mail, blogs, and news. And while many sites are data-centric, such as Google Maps and Yahoo Finance, it’s the web that we can’t see that is surging with data. It’s not about us; it’s about servers in the cloud mediating entire pipelines of data, only occasionally surfacing in a browser.

But this new influx of data is presenting challenges. Out of habit, engineers often approach data with a document-centric mindset. One has to look no further than the vast assortment of XML schemas for evidence of this.

Data and documents are different, however. The suggestion, and one I agree with, is that we begin to think of data as a stream “extending infinitely forward in time” which can be “split, sampled, and filtered.”
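To make the stream idea concrete, here is a minimal Python sketch of my own (it is not from Driscoll's post). It assumes a hypothetical sensor_readings() source that never ends, and the helper names filtered, sampled, and split are made up for illustration. Each one wraps the stream lazily, so nothing is ever loaded "as a document."

```python
import itertools
import random


def sensor_readings(seed=0):
    """Hypothetical unbounded stream of (tick, value) readings."""
    rng = random.Random(seed)
    for tick in itertools.count():
        yield tick, rng.uniform(0.0, 100.0)


def filtered(stream, threshold):
    """Keep only readings whose value is at or above the threshold."""
    return (item for item in stream if item[1] >= threshold)


def sampled(stream, every):
    """Keep every `every`-th reading, starting with the first."""
    return itertools.islice(stream, 0, None, every)


def split(stream, n=2):
    """Fan one stream out into n independent consumers."""
    return itertools.tee(stream, n)


# Split the stream, then filter one branch and sample the other.
raw_a, raw_b = split(sensor_readings())
high = filtered(raw_a, 90.0)
every_tenth = sampled(raw_b, 10)

# The streams are infinite, so we only ever take a finite slice.
first_high = list(itertools.islice(high, 3))
first_sampled = list(itertools.islice(every_tenth, 3))
```

The point of the sketch is that the consumer decides how much to pull. A real pipeline would read from a network source rather than a random number generator, and would likely use a message queue instead of itertools.tee, which buffers in memory whatever one branch has read and the other has not.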

Conceiving of data as streams moves us out of the realm of static objects and into the realm of signal processing. This is the domain of the living: where the web is not an archive but an organism, reacting in real-time.


Lonna Hanson
September 4, 2009 at 1:35 PM

Just had to comment. Your writing was so interesting. I had never thought of the evolution of the web.
