<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Backdrifter &#187; xml</title>
	<atom:link href="http://www.backdrifter.com/tags/xml/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.backdrifter.com</link>
	<description>The personal site of Jared Hanson</description>
	<lastBuildDate>Tue, 22 Jun 2010 22:16:08 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>The Rise of the Data Web</title>
		<link>http://www.backdrifter.com/2009/09/02/the-rise-of-the-data-web/</link>
		<comments>http://www.backdrifter.com/2009/09/02/the-rise-of-the-data-web/#comments</comments>
		<pubDate>Wed, 02 Sep 2009 23:54:07 +0000</pubDate>
		<dc:creator>Jared Hanson</dc:creator>
				<category><![CDATA[Unknown]]></category>
		<category><![CDATA[bigdata]]></category>
		<category><![CDATA[data]]></category>
		<category><![CDATA[xml]]></category>

		<guid isPermaLink="false">http://www.backdrifter.com/?p=133</guid>
		<description><![CDATA[Via a tweet by Tim O&#8217;Reilly, I came across an excellent post entitled The Rise of the Data Web, on Dataspora Blog (which I quickly subscribed to).  The author, Michael E. Driscoll, summed up nicely an important trend occurring presently on the web.
The web originated as a set of documents and files served up [...]]]></description>
			<content:encoded><![CDATA[<p>Via a <a href="http://twitter.com/timoreilly/status/3697001810">tweet</a> by <a href="http://tim.oreilly.com/">Tim O&#8217;Reilly</a>, I came across an excellent post entitled <a href="http://dataspora.com/blog/the-rise-of-the-data-web/">The Rise of the Data Web</a>, on <a href="http://dataspora.com/blog/">Dataspora Blog</a> (which I quickly subscribed to).  The author, Michael E. Driscoll, summed up nicely an important trend occurring presently on the web.</p>
<p>The web originated as a set of documents and files served up for people to read and occasionally edit.  Indeed, it is this web that we all still experience as we surf from page to page reading news or looking at photos.</p>
<p>Underneath all this, and unseen except by software engineers, is a massive web of data.  Sensors of all kinds are now measuring everything that can be quantified, from people&#8217;s location and biological functions to the environment we all share.  All of this data is flowing through the network.</p>
<p><span id="more-133"></span></p>
<blockquote><p>The web we experience will continue to be dominated by documents — e-mail, blogs, and news. And while many sites are data-centric — Google maps, Weather.com, and Yahoo finance — it’s the web that we can’t see that surging with data. It’s not about us, it’s about servers in the cloud mediating <a href="http://radar.oreilly.com/archives/2007/02/pipes-and-filte.html">entire pipelines of data</a>, only occasionally surfacing in a browser.</p></blockquote>
<p>But this new influx of data presents challenges.  Out of habit, engineers often approach data with a document-centric mindset.  One need look no further than the vast assortment of <a href="http://en.wikipedia.org/wiki/XML">XML</a> schemas for evidence of this.</p>
<p>Data and documents are different, however.  Driscoll&#8217;s suggestion, and one I agree with, is that we begin to think of data as a stream &#8220;extending infinitely forward in time&#8221; which can be &#8220;split, sampled, and filtered.&#8221;</p>
<blockquote><p>Conceiving of data as streams moves us out of the realm of static objects and into the <a href="http://mitpress.mit.edu/sicp/full-text/book/book-Z-H-24.html#%_sec_3.5">realm of signal processing</a>. This is the domain of the living: where the web is not an archive but an organism, <a href="http://radar.oreilly.com/2009/08/big-data-and-real-time-structured-data-analytics.html">reacting in real-time</a>.</p></blockquote>
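<p>To make the stream idea a little more concrete, here is a minimal Python sketch. It is not taken from Driscoll&#8217;s post: a generator stands in for an unbounded sensor feed, and the helper names (<code>readings</code>, <code>keep_if</code>, <code>every_nth</code>, <code>split</code>) are hypothetical, chosen only to mirror &#8220;split, sampled, and filtered.&#8221;</p>

```python
import itertools


def readings():
    """Hypothetical unbounded sensor feed: yields (tick, value) forever."""
    for tick in itertools.count():
        yield tick, (tick * 7) % 10


def is_high(item):
    """Predicate used by the filter below: value of 5 or more."""
    _, value = item
    return value >= 5


def keep_if(stream, predicate):
    """Filter: pass along only the items the predicate accepts."""
    return (item for item in stream if predicate(item))


def every_nth(stream, n):
    """Sample: keep every n-th item, starting with the first."""
    return itertools.islice(stream, 0, None, n)


def split(stream, copies=2):
    """Split: fan one stream out into independent iterators."""
    return itertools.tee(stream, copies)


high, sampled = split(readings())
high = keep_if(high, is_high)
sampled = every_nth(sampled, 3)

# The streams never end, so take a finite slice to look at them.
first_high = list(itertools.islice(high, 3))
first_sampled = list(itertools.islice(sampled, 3))
print(first_high)
print(first_sampled)
```

<p>Nothing here is ever loaded as a whole document. Each stage pulls one item at a time, which is what lets the same code run over a feed that never ends.</p>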
]]></content:encoded>
			<wfw:commentRss>http://www.backdrifter.com/2009/09/02/the-rise-of-the-data-web/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>oEmbed FAIL! Represent RESTfully.</title>
		<link>http://www.backdrifter.com/2008/05/09/oembed-fail-represent-restfully/</link>
		<comments>http://www.backdrifter.com/2008/05/09/oembed-fail-represent-restfully/#comments</comments>
		<pubDate>Fri, 09 May 2008 22:12:14 +0000</pubDate>
		<dc:creator>Jared Hanson</dc:creator>
				<category><![CDATA[Unknown]]></category>
		<category><![CDATA[http]]></category>
		<category><![CDATA[json]]></category>
		<category><![CDATA[oembed]]></category>
		<category><![CDATA[rest]]></category>
		<category><![CDATA[xml]]></category>
		<category><![CDATA[xrds]]></category>

		<guid isPermaLink="false">http://www.backdrifter.com/2008/05/09/oembed-fail-represent-restfully/</guid>
		<description><![CDATA[oEmbed is the latest standard to come out of the Web 2.0 developer crowd.  While I&#8217;m typically a huge advocate of standards, and think much of the prior work has been superb, oEmbed fails on almost every level.  So much so, in fact, that I&#8217;m writing this post to declare it worthless, and [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://oembed.com/">oEmbed</a> is the latest standard to <a href="http://www.readwriteweb.com/archives/oembed_open_format.php">come out</a> of the Web 2.0 developer crowd.  While I&#8217;m typically a huge advocate of standards, and think much of the prior work has been superb, oEmbed fails on almost every level.  So much so, in fact, that I&#8217;m writing this post to declare it worthless, and suggest an alternative.</p>
<p>Upon first seeing the spec, I immediately <a href="http://twitter.com/jaredhanson/statuses/807531511">tweeted</a>:</p>
<blockquote><p>
oEmbed completely ignores REST principles. GET with a Content-Type to the original resource. Benefit: a single URL per resource. C&#8217;mon!
</p></blockquote>
<p>Allow me to further explain&#8230;</p>
<p><span id="more-95"></span></p>
<p>To begin with, let&#8217;s dive into the <a href="http://oembed.com/">specification</a>, and take note of the things that immediately raise red flags.</p>
<blockquote><p>
An oEmbed exchange occurs between a consumer and a provider. A consumer wishes to show an embedded representation of a third party resource on their own web site, such as a photo or an embedded video. A provider implements the oEmbed API to allow consumers to fetch that representation.
</p></blockquote>
<p>Hmm&#8230; that sounds suspiciously like an <a href="http://en.wikipedia.org/wiki/HTTP">HTTP</a> GET request.  Why the need to refer to this as &#8220;the oEmbed API&#8221;?!?  Something suspect must be about to go down.</p>
<blockquote><p>
Configuration for oEmbed is very simple. Providers must specify one or more URL scheme and API endpoint pairs. The URL scheme describes which URLs provided by the service may have an embedded representation. The API endpoint describes where the consumer may request representations for those URLs.
</p></blockquote>
<p>Wait&#8230; what?  This oEmbed thing requires two <a href="http://en.wikipedia.org/wiki/URL">URL</a>s to fetch different representations of the same resource?  Have we not learned even the most rudimentary <a href="http://en.wikipedia.org/wiki/Representational_State_Transfer">REST</a> principles in the past few years?  Here&#8217;s a quick lesson: a single resource can have multiple representations.  One URL, multiple data formats.  Once that is understood, the need for an oEmbed API endpoint drops away entirely.</p>
<p>Even so, let&#8217;s be pragmatic and use oEmbed, since some respectable providers have already implemented support for it. (Which raises further questions, but I&#8217;ll let that slide.)  To start, we need to get the address of the API endpoint.  That mechanism is not specified, so we&#8217;ll have to resort to hardcoding endpoints.  <a href="http://en.wikipedia.org/wiki/XRDS">XRDS</a> could be useful here, but that introduces further complications.  Besides, there is a better way.</p>
<p><b>oEmbed with REST to the Rescue!</b></p>
<p>Since the URL to the original resource must be known, and an oEmbed representation is desired, the oEmbed API endpoint is completely unnecessary.  Just request the oEmbed representation from the resource directly.<br />
<code><br />
GET /photos/bees/2362225867/ HTTP/1.1<br />
Host: flickr.com<br />
Accept: application/oembed+xml<br />
</code></p>
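<p>For illustration, the same request can be built in a few lines of Python with the standard library. This is only a sketch: <code>oembed_request</code> is a hypothetical helper, and the media types are the ones proposed in this post, not registered MIME types. The request is constructed and inspected but never sent.</p>

```python
import urllib.request

# Media types proposed in this post; neither is a registered MIME type.
OEMBED_XML = "application/oembed+xml"
OEMBED_JSON = "application/oembed+json"


def oembed_request(resource_url, media_type=OEMBED_XML):
    """Build a GET for the resource itself, asking for its oEmbed form."""
    return urllib.request.Request(
        resource_url,
        headers={"Accept": media_type},
        method="GET",
    )


request = oembed_request("http://flickr.com/photos/bees/2362225867/")

# Inspect what would go on the wire; nothing is sent here.
print(request.get_method(), request.selector)
print("Host:", request.host)
print("Accept:", request.get_header("Accept"))
```

<p>Passing the object to <code>urllib.request.urlopen</code> would issue the call. The point is that the only input is the URL of the photo page itself, with no second endpoint to look up.</p>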
<p>Flickr could then respond with our embed-ready resource.<br />
<code><br />
HTTP/1.1 200 OK<br />
Content-Type: application/oembed+xml<br />
&nbsp;<br />
&lt;oembed&gt;<br />
&lt;version&gt;1.0&lt;/version&gt;<br />
&lt;type&gt;photo&lt;/type&gt;<br />
&lt;title&gt;Bacon Lollys&lt;/title&gt;<br />
&lt;author_name&gt;bees&lt;/author_name&gt;<br />
&lt;author_url&gt;http://www.flickr.com/photos/bees/&lt;/author_url&gt;<br />
&lt;cache_age&gt;3600&lt;/cache_age&gt;<br />
&lt;provider_name&gt;Flickr&lt;/provider_name&gt;<br />
&lt;provider_url&gt;http://www.flickr.com/&lt;/provider_url&gt;<br />
&lt;width&gt;500&lt;/width&gt;<br />
&lt;height&gt;375&lt;/height&gt;<br />
&lt;url&gt;http://farm4.static.flickr.com/3040/2362225867_4a87ab8baf.jpg&lt;/url&gt;<br />
&lt;/oembed&gt;<br />
</code></p>
<p>See that! There is absolutely zero requirement for the oEmbed API.  All we need to do is define a couple of data formats and the associated <a href="http://en.wikipedia.org/wiki/MIME_type">MIME types</a>.  While there is evidence of duplicative effort, the oEmbed formats look reasonable and have both <a href="http://en.wikipedia.org/wiki/XML">XML</a> and <a href="http://en.wikipedia.org/wiki/JSON">JSON</a> output.  I propose we define them as <code>application/oembed+xml</code> and <code>application/oembed+json</code>.</p>
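<p>On the provider side, &#8220;one resource, multiple representations&#8221; could look roughly like the Python sketch below. The <code>REPRESENTATIONS</code> registry and both serializer functions are hypothetical, and only a handful of fields from the example response are included.</p>

```python
import json
import xml.etree.ElementTree as ET

# One resource, described once; field names follow the example response.
PHOTO = {
    "version": "1.0",
    "type": "photo",
    "title": "Bacon Lollys",
    "width": 500,
    "height": 375,
}


def to_xml(fields):
    """Serialize the fields as children of an oembed root element."""
    root = ET.Element("oembed")
    for name, value in fields.items():
        ET.SubElement(root, name).text = str(value)
    return ET.tostring(root, encoding="unicode")


def to_json(fields):
    """Serialize the same fields as a JSON object."""
    return json.dumps(fields)


# Hypothetical registry: proposed media type to serializer.
REPRESENTATIONS = {
    "application/oembed+xml": to_xml,
    "application/oembed+json": to_json,
}

xml_body = REPRESENTATIONS["application/oembed+xml"](PHOTO)
json_body = REPRESENTATIONS["application/oembed+json"](PHOTO)
print(xml_body)
print(json_body)
```

<p>Both bodies come from the same underlying record, so supporting another format means adding one entry to the registry, not another endpoint.</p>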
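<p>Error handling is equally trivial.  If a provider does not support the requested oEmbed representation, it can return a standard HTTP <a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.7">406 Not Acceptable</a> status code to the client.  The client could then fail over to screen scraping or proprietary APIs as necessary.  Since HTTP also allows a server to ignore the Accept header and send its default representation, the client should check the response&#8217;s Content-Type before treating the body as oEmbed.</p>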
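<p>A provider&#8217;s side of that negotiation might be sketched as follows. This is deliberately simplified and is not how any particular server implements it: <code>negotiate</code> and <code>SUPPORTED</code> are hypothetical names, q-values are ignored, and only exact media types or the <code>*/*</code> wildcard are matched.</p>

```python
SUPPORTED = ["text/html", "application/oembed+xml", "application/oembed+json"]


def negotiate(accept_header, supported=SUPPORTED):
    """Return (status, media_type) for a request's Accept header.

    Simplified on purpose: q-values are ignored and only an exact type
    or the */* wildcard is matched, in the order the client listed them.
    """
    if not accept_header:
        return 200, supported[0]
    for entry in accept_header.split(","):
        media_type = entry.split(";")[0].strip().lower()
        if media_type == "*/*":
            return 200, supported[0]
        if media_type in supported:
            return 200, media_type
    return 406, None


print(negotiate("application/oembed+xml"))
print(negotiate("application/oembed+json, text/html;q=0.5"))
# A provider that only serves HTML cannot satisfy an oEmbed-only request.
print(negotiate("application/oembed+xml", supported=["text/html"]))
```

<p>A client that receives the 406 knows immediately that this provider has no oEmbed representation for the resource and can move on to its fallback.</p>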
<p>I hope I&#8217;ve outlined my case against oEmbed effectively.  If anyone has any questions or concerns, drop a line in the comments.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.backdrifter.com/2008/05/09/oembed-fail-represent-restfully/feed/</wfw:commentRss>
		<slash:comments>13</slash:comments>
		</item>
	</channel>
</rss>
