<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Cantorva Limited &#187; rdfa</title>
	<atom:link href="http://cantorva.com/blog/tag/rdfa/feed/" rel="self" type="application/rss+xml" />
	<link>http://cantorva.com/blog</link>
	<description></description>
	<lastBuildDate>Mon, 11 Jan 2010 20:36:26 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Hints on browsing embedded RDFa data as data</title>
		<link>http://cantorva.com/blog/2009/07/01/hints-on-browsing-embedded-rdfa-data-as-data/</link>
		<comments>http://cantorva.com/blog/2009/07/01/hints-on-browsing-embedded-rdfa-data-as-data/#comments</comments>
		<pubDate>Wed, 01 Jul 2009 23:20:49 +0000</pubDate>
		<dc:creator>Simon</dc:creator>
				<category><![CDATA[Guides]]></category>
		<category><![CDATA[rdfa]]></category>

		<guid isPermaLink="false">http://cantorva.com/blog/?p=79</guid>
		<description><![CDATA[I want to share a few notes about viewing the embedded data from RDFa pages, as a sort of mini-guide for anyone interested.
The thing to get out of the way upfront is that the easiest thing to extract looks the ugliest and is often hard to follow. It's worth taking a few precautions to avoid [...]]]></description>
			<content:encoded><![CDATA[<p>I want to share a few notes about viewing the embedded data from RDFa pages, as a sort of mini-guide for anyone interested.</p>
<p>The thing to get out of the way upfront is that the <em>easiest</em> thing to extract looks the <em>ugliest</em> and is often hard to follow. It&#8217;s worth taking a few precautions to avoid the horror of machine-generated RDF/XML. So, install the <a href="http://dig.csail.mit.edu/2007/tab/" target="_blank">Tabulator Firefox extension</a> from MIT and find the button labelled &#8220;N3&#8221; &#8211; it looks like a dense network icon. Hit that for a compact text-based view, and un-toggle the default tabular view as screen space requires. The default is the loose network icon.</p>
<p>To actually extract the data, use <a href="http://www.w3.org/2007/08/pyRdfa/" target="_blank">the RDFa Distiller service</a>. Put in a URL and this service gives you ugly RDF/XML by default, but the Tabulator extension comes to the rescue. With Tabulator, hitting &#8220;Go&#8221; gives you &#8211; unsurprisingly &#8211; a table, and switching completely to N3 is just two clicks.</p>
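<p>As a rough illustration, a Distiller request can also be assembled by hand. This is only a sketch in Python: the query parameter names and values are copied from Distiller links as they looked in 2009 and should be treated as assumptions, since the service may have changed; <code>distiller_url</code> is a made-up helper name.</p>

```python
from urllib.parse import urlencode

# Endpoint as it appeared in 2009-era Distiller links (assumption: may have moved).
DISTILLER = "http://www.w3.org/2007/08/pyRdfa/extract"


def distiller_url(page_url: str, output_format: str = "pretty-xml") -> str:
    """Build a Distiller request URL for the page at page_url (hypothetical helper)."""
    params = {
        "uri": page_url,            # the RDFa page to distil
        "format": output_format,    # "pretty-xml" is the only value seen in those links
        "warnings": "false",
        "parser": "lax",
        "host": "xhtml",
        "space-preserve": "true",
        "submit": "Go!",
    }
    # safe="!" keeps the literal "Go!" rather than percent-encoding the "!"
    return DISTILLER + "?" + urlencode(params, safe="!")


print(distiller_url("http://feelitlive.com/events/2009/7/3/W2/2UH/Hyde+Park/Blur"))
```

<p>Opening the printed URL in a browser with Tabulator installed should then behave much like pressing &#8220;Go&#8221; on the Distiller form.</p>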
<p>In table mode, Tabulator will pick up and cache labels for things as it goes along and will use the last bit of the URL if it doesn&#8217;t have a label yet. If your URLs look ugly, then the view in Tabulator will look ugly &#8211; hopefully your URLs are pretty.</p>
<p>Pretty can also be bad, especially if the unique part of a URL is at the front. For example:</p>
<ul>
<li>http://feelitlive.com/events/2009/7/3/W2/2UH/Hyde+Park/Blur#event</li>
<li>http://feelitlive.com/events/2009/7/4/HA9/0WS/Wembley+Stadium/Take+That#event</li>
<li>http://feelitlive.com/events/2009/7/5/SW7/2AP/Royal+Albert+Hall/The+Killers#event</li>
</ul>
<p>will all be rendered as &#8220;event&#8221; &#8211; very confusing! If this happens, switching to N3 may be the way forward.<span class="attribute-value"> Part of the problem seems to be that Tabulator does not read RDFa on its own, which makes it harder to access the RDF and harder for Tabulator to calculate good labels. Apparently the next version will read RDFa &#8211; great.</span></p>
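<p>The labelling problem is easy to reproduce. The sketch below is not Tabulator&#8217;s actual code, just a Python approximation of &#8220;use the last bit of the URL&#8221;; both function names are hypothetical.</p>

```python
from urllib.parse import urlsplit, unquote_plus


def fallback_label(uri: str) -> str:
    """Approximate the 'last bit of the URL' rule: fragment first, else last path segment."""
    parts = urlsplit(uri)
    if parts.fragment:
        return unquote_plus(parts.fragment)
    segments = [s for s in parts.path.split("/") if s]
    return unquote_plus(segments[-1]) if segments else uri


def path_label(uri: str) -> str:
    """Alternative rule: ignore the fragment and always use the last path segment."""
    segments = [s for s in urlsplit(uri).path.split("/") if s]
    return unquote_plus(segments[-1]) if segments else uri


uris = [
    "http://feelitlive.com/events/2009/7/3/W2/2UH/Hyde+Park/Blur#event",
    "http://feelitlive.com/events/2009/7/4/HA9/0WS/Wembley+Stadium/Take+That#event",
    "http://feelitlive.com/events/2009/7/5/SW7/2AP/Royal+Albert+Hall/The+Killers#event",
]

for uri in uris:
    # First column: what the fragment-first rule yields; second: the path-based alternative.
    print(fallback_label(uri), "|", path_label(uri))
```

<p>The first rule gives &#8220;event&#8221; for all three URLs, while the path-based alternative gives &#8220;Blur&#8221;, &#8220;Take That&#8221; and &#8220;The Killers&#8221;.</p>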
<p><strong><span class="attribute-value">Those links again:</span></strong></p>
<ul>
<li><a href="http://dig.csail.mit.edu/2007/tab/" target="_blank">Tabulator Firefox extension</a></li>
<li><a href="http://www.w3.org/2007/08/pyRdfa/" target="_blank">RDFa Distiller service</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://cantorva.com/blog/2009/07/01/hints-on-browsing-embedded-rdfa-data-as-data/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Moving forward on Search Monkey</title>
		<link>http://cantorva.com/blog/2009/05/11/moving-forward-on-search-monkey/</link>
		<comments>http://cantorva.com/blog/2009/05/11/moving-forward-on-search-monkey/#comments</comments>
		<pubDate>Mon, 11 May 2009 22:15:56 +0000</pubDate>
		<dc:creator>Simon</dc:creator>
				<category><![CDATA[News]]></category>
		<category><![CDATA[hcalendar]]></category>
		<category><![CDATA[rdfa]]></category>
		<category><![CDATA[search monkey]]></category>

		<guid isPermaLink="false">http://cantorva.com/blog/?p=47</guid>
		<description><![CDATA[I've moved forward marginally with my Search Monkey presentation app, and backwards a little as well.]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve moved forward marginally with my Search Monkey presentation app, and backwards a little as well.</p>
<p>The backward step was that some of the DataRSS content in the Yahoo Index appears to have vanished; this broke the examples in the <a href="http://cantorva.com/blog/2009/04/03/vcal-rdfa-and-search-monkey/" target="_blank">previous post</a> and on <a href="http://gallery.search.yahoo.com/application?smid=WCi.s" target="_blank">the Search Monkey site</a>. I don&#8217;t think there is a lot a lowly web site operator like me can do to recover that, but I hope coverage will continue to be at least patchy. I look forward to the day they just make the RDF available directly; I can see this being much more scalable for them and certainly easier for me.</p>
<p>At the same time I have made a little progress with adding hCalendar support. The level of adoption for hCalendar makes this compelling, though the lack of precision does mean it&#8217;s fundamentally restricted. I will not, for example, be able to enhance results for pages that mention more than one event, even if one event is clearly (to a human) the primary topic of the page. Choosing meaningful graphics is also circuitous.</p>
<p>On the topic of graphics, it is often not possible to depict an event that has not yet occurred. You might choose a graphic for purely aesthetic reasons, use an inconsistent rationale that is difficult to capture, or abstract away the reasoning into another software module, making it unavailable to the UI (as in my case). This scenario seems to require a specific predicate, which I will coin and document when I get around to it. A microformats equivalent is a non-starter as I do not have the inclination to make official representations to standards bodies for work of speculative value (see below).</p>
<p>In total then, the roadmap will be something like this (in order of priority):</p>
<ol>
<li>Basic support for hCalendar (actually tested on <a href="http://upcoming.yahoo.com/event/2488827/" target="_blank">upcoming</a>), with simple safeguards for the ambiguous cases.</li>
<li>Support for RDFa enhanced pages mentioning multiple events, using foaf:primaryTopic to disambiguate.</li>
<li>Support for cases where the event itself can be previewed as a commercial proposition without requiring excess detail in the data, using the new predicate unless I can find one.</li>
<li>Support for event ontology actor and factor and associated foaf:depiction triples such that a factor or actor can be reliably depicted in the result.</li>
<li>Support for the organiser and logo <a href="http://microformats.org/wiki/hcalendar" target="_blank">hCalendar</a> and <a href="http://microformats.org/wiki/hcard" target="_blank">hCard</a> properties. This is useless to me as the interesting entity for music events is almost never the organiser and never the attendees [ unless you are on a date <img src='http://cantorva.com/blog/wp-includes/images/smilies/icon_wink.gif' alt=';-)' class='wp-smiley' />  ], but it&#8217;s feasible so deserves a modicum of attention. I don&#8217;t think photo is worth supporting as that would immediately break on conferences.</li>
</ol>
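<p>To make the foaf:primaryTopic item concrete, here is a minimal Python sketch of the disambiguation step. It is not the Search Monkey app itself: the function name is hypothetical, the triples are plain tuples rather than anything Yahoo supplies, and the event class URI is an assumption about which calendar vocabulary a page uses.</p>

```python
FOAF_PRIMARY_TOPIC = "http://xmlns.com/foaf/0.1/primaryTopic"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
# Assumption: events are typed with the RDF Calendar Vevent class.
VEVENT = "http://www.w3.org/2002/12/cal/icaltzd#Vevent"


def pick_event(triples, page_uri, event_class=VEVENT):
    """Return the one event to enhance the result with, or None if still ambiguous."""
    events = {s for s, p, o in triples if p == RDF_TYPE and o == event_class}
    if len(events) == 1:
        # Only one event on the page: nothing to disambiguate.
        return next(iter(events))
    # Several events: keep only those the page declares as its primary topic.
    topics = {o for s, p, o in triples if s == page_uri and p == FOAF_PRIMARY_TOPIC}
    candidates = events.intersection(topics)
    return next(iter(candidates)) if len(candidates) == 1 else None


page = "http://example.org/gig"  # hypothetical page and event URIs
triples = [
    (page + "#main", RDF_TYPE, VEVENT),
    (page + "#support", RDF_TYPE, VEVENT),
    (page, FOAF_PRIMARY_TOPIC, page + "#main"),
]
print(pick_event(triples, page))
```

<p>Returning None for the ambiguous case is the &#8220;simple safeguard&#8221;: the search result is simply left unenhanced.</p>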
<p>I&#8217;m realistic about the importance of this project for making money as there is no predictable way to ensure large-scale adoption of the plugin and the difference in click-through rates is also unpredictable. As a result, I&#8217;ll be moving this forward primarily as a hackspace project rather than a commercial one, albeit commercially motivated. Doing it on hack evenings is a simple way to box off an amount of time.</p>
<p>Progress will be slow for other reasons: I have to locate implementations of the relevant vocabularies, trawl the Yahoo Index for fully crawled examples and possibly establish some test cases. This will all take time and much of it will be out of my control.</p>
]]></content:encoded>
			<wfw:commentRss>http://cantorva.com/blog/2009/05/11/moving-forward-on-search-monkey/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>UK Gov shares its data cheaply, avoiding change</title>
		<link>http://cantorva.com/blog/2009/04/24/uk-gov-shares-its-data-cheaply-avoiding-change/</link>
		<comments>http://cantorva.com/blog/2009/04/24/uk-gov-shares-its-data-cheaply-avoiding-change/#comments</comments>
		<pubDate>Fri, 24 Apr 2009 14:41:22 +0000</pubDate>
		<dc:creator>Simon</dc:creator>
				<category><![CDATA[News]]></category>
		<category><![CDATA[cheap]]></category>
		<category><![CDATA[integration]]></category>
		<category><![CDATA[rdfa]]></category>
		<category><![CDATA[seo]]></category>
		<category><![CDATA[sharing]]></category>

		<guid isPermaLink="false">http://cantorva.com/blog/?p=43</guid>
		<description><![CDATA[No, not a story of intrusion and data-loss, this is data the government should be sharing &#8211; job adverts.
While apparently being in a position to know, Mark Birbeck &#8220;speculates&#8221; publicly that the UK Gov are using RDFa because:
by using RDFa to mark-up vacancies on each individual government site, it&#8217;s possible to allow each department to [...]]]></description>
			<content:encoded><![CDATA[<p>No, not a story of intrusion and data-loss, this is data the government should be sharing &#8211; job adverts.</p>
<p>While apparently being in a position to know, <a href="http://webbackplane.com/mark-birbeck/blog/2009/04/23/more-rdfa-goodness-from-uk-government-web-sites" target="_blank">Mark Birbeck</a> &#8220;speculates&#8221; publicly that the UK Gov are using RDFa because:</p>
<blockquote><p>by using RDFa to mark-up vacancies on each individual government site, it&#8217;s possible to allow each department to publish jobs however it sees fit. Many companies want to have some centralised information, not just government, but this usually involves imposing on each department some new database system or workflow. By using RDFa as the interface, each department merely needs to have the ability to publish HTML, and then they can share their data.</p>
<p>The second [reason] is that by publishing vacancies using RDFa, it&#8217;s easy for <em>third-parties</em> to &#8216;scrape&#8217; the data into their own databases, in a reliable way.</p>
<p>For example, some external company could import all vacancies for a particular region or of a particular type, and then show them on their own site.</p></blockquote>
<p>Of course, all the benefits he mentioned are provably true &#8211; just <a href="http://www.w3.org/2007/08/pyRdfa/extract?uri=http%3A%2F%2Fwww.civilservice.gov.uk%2Fjobs%2Fcareers-detail.aspx%3FJobId%3D2832&amp;format=pretty-xml&amp;warnings=false&amp;parser=lax&amp;host=xhtml&amp;space-preserve=true&amp;submit=Go!">run a page through a distiller</a> &#8211; this is not bleeding edge technology any longer, it&#8217;s merely cutting edge.</p>
]]></content:encoded>
			<wfw:commentRss>http://cantorva.com/blog/2009/04/24/uk-gov-shares-its-data-cheaply-avoiding-change/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>VCal RDFa and Search Monkey</title>
		<link>http://cantorva.com/blog/2009/04/03/vcal-rdfa-and-search-monkey/</link>
		<comments>http://cantorva.com/blog/2009/04/03/vcal-rdfa-and-search-monkey/#comments</comments>
		<pubDate>Fri, 03 Apr 2009 01:18:22 +0000</pubDate>
		<dc:creator>Simon</dc:creator>
				<category><![CDATA[Feel It Live]]></category>
		<category><![CDATA[Releases]]></category>
		<category><![CDATA[innovation]]></category>
		<category><![CDATA[rdfa]]></category>
		<category><![CDATA[search monkey]]></category>
		<category><![CDATA[vcal]]></category>
		<category><![CDATA[yahoo]]></category>

		<guid isPermaLink="false">http://cantorva.com/blog/?p=30</guid>
		<description><![CDATA[I've done what I've called a "Generic Event Information" enhanced search result format for Yahoo. It's a little thing that you can choose to add to your Yahoo search experience.]]></description>
			<content:encoded><![CDATA[<p>I just returned home from the <a href="http://upcoming.yahoo.com/event/2173930/" target="_blank">3rd London hack evening</a>, which is a highly productive get together of, well, nerds in Islington. With a nod to the <a href="http://www.brepettis.com/blog/2009/3/3/the-cult-of-done-manifesto.html" target="_blank">done manifesto</a>, I&#8217;ve continued on into the night and done what I&#8217;ve called a<a href="http://gallery.search.yahoo.com/application?smid=WCi.s" target="_blank"> &#8220;Generic Event Information&#8221; enhanced search result format</a> for Yahoo. It&#8217;s a little thing that you can choose to <a href="http://gallery.search.yahoo.com/application?smid=WCi.s" target="_blank">add to your Yahoo</a> search experience.</p>
<p>One of the challenges with this work is that Yahoo has not given these widgets a particularly easy name to throw around, but here&#8217;s a picture to help explain things:</p>
<p><a href="http://cantorva.com/blog/wp-content/uploads/2009/04/generic-event-info-ultravox-example.png"><img class="alignnone size-full wp-image-31" title="generic event info Ultravox example" src="http://cantorva.com/blog/wp-content/uploads/2009/04/generic-event-info-ultravox-example.png" alt="generic event info Ultravox example" width="690" height="115" /></a></p>
<p>The text appears in a <a href="http://uk.search.yahoo.com/search?p=ultravox+roundhouse+london+live" target="_blank">search result for Ultravox in London at the Roundhouse</a> and as you can see it includes the usual snippet of text (or the event summary, if it&#8217;s longer) plus some very simple name-value pairs &#8211; &#8220;Location&#8221; and &#8220;Starts&#8221;, which is altered to &#8220;Started&#8221; for events that have already started.</p>
<p>The idea is to answer the basic <em>when </em>and <em>where </em>questions common to events and also to allow users to quickly scan through search results and exclude events that they cannot attend and focus on those that they can attend, which is obviously those that they aren&#8217;t already missing! Therefore the &#8220;Starts&#8221; vs &#8220;Started&#8221; thing.</p>
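<p>The label logic amounts to a single comparison. A minimal Python sketch, purely to illustrate the rule (it is not the Search Monkey app itself, and the function name and dates are invented):</p>

```python
from datetime import datetime, timezone


def start_label(start: datetime, now: datetime) -> str:
    """Return "Started" once the start time has passed, otherwise "Starts"."""
    return "Started" if now >= start else "Starts"


# Hypothetical search time and event start times, all in UTC.
now = datetime(2009, 4, 3, 1, 18, tzinfo=timezone.utc)
upcoming = datetime(2009, 4, 30, 19, 0, tzinfo=timezone.utc)
missed = datetime(2009, 4, 1, 19, 0, tzinfo=timezone.utc)

print(start_label(upcoming, now))  # Starts
print(start_label(missed, now))    # Started
```

<p>Both datetimes need to be timezone-aware (or both naive), otherwise Python refuses to compare them.</p>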
<h2>&#8220;Generic&#8221;</h2>
<p>I can&#8217;t see a single person outside of the RDFa fan-club using this plugin unless it supports a wide variety of different sites and subject areas. There is no advantage to me in putting out something that only works for FeelItLive.com, yet it&#8217;s only FeelItLive.com that I&#8217;m aware of that is publishing VCal RDFa and Yahoo do not allow wildcards. I want people to use it, because I want them to focus their attention on my search results.</p>
<p>Chicken, meet Egg.</p>
<p>So, I&#8217;m doing two things:</p>
<ul>
<li>Labelling the thing generic &#8211; there is no point even mentioning FIL in the promotional blurb. It can only confuse matters.</li>
</ul>
<ul>
<li>Inviting anyone and everyone to attach a comment to this post letting me know where they have published VCal RDFa. If they do that, I&#8217;ll do my best to support as many sites as Yahoo will allow.</li>
</ul>
<p>I&#8217;ll also extend the same invitation to users of similar vocabularies <strong>(Edit: </strong>iCal, Event ontology, event microformats etc), though VCal is the one recommended by Yahoo so I&#8217;ll have to see if anything else is supported.</p>
]]></content:encoded>
			<wfw:commentRss>http://cantorva.com/blog/2009/04/03/vcal-rdfa-and-search-monkey/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>RDFa is smoking</title>
		<link>http://cantorva.com/blog/2009/03/27/rdfa-is-smoking/</link>
		<comments>http://cantorva.com/blog/2009/03/27/rdfa-is-smoking/#comments</comments>
		<pubDate>Fri, 27 Mar 2009 13:59:50 +0000</pubDate>
		<dc:creator>Simon</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[myspace]]></category>
		<category><![CDATA[rdfa]]></category>

		<guid isPermaLink="false">http://cantorva.com/blog/?p=21</guid>
		<description><![CDATA[In a bizarre twist that proves, if nothing else, that data is data for everybody, Paris Hilton has become RDFa enabled.]]></description>
			<content:encoded><![CDATA[<p>In a bizarre twist that proves, if nothing else, that data is data for everybody, <a href="http://lists.w3.org/Archives/Public/public-rdfa/2009Mar/0055.html" target="_blank">Paris Hilton has become RDFa enabled</a>.</p>
<p>Stripping her down to <a href="http://www.w3.org/2007/08/pyRdfa/extract?uri=http%3A%2F%2Fwww.myspace.com%2Fparishilton&amp;format=pretty-xml&amp;warnings=false&amp;parser=lax&amp;host=xhtml&amp;space-preserve=true&amp;submit=Go!" target="_blank">her triples</a> we find out <a href="http://friends.myspace.com/index.cfm?fuseaction=invite.addfriend_verify&amp;friendID=6459682" target="_blank">how we can add her to our network</a>, find <a href="http://viewmorepics.myspace.com/index.cfm?fuseaction=user.viewAlbums&amp;friendID=6459682">photos</a>, that she and I are already in the same extended network (well maybe in my dreams) and that she&#8217;s feeling productive right now (is that the same as being hot?).</p>
<p>More usefully, her friend count (203,412) and last login date (Tuesday) are both published in the only <a href="http://rdfa.info/2008/10/16/rdfa-is-a-w3c-recommendation/" target="_blank"><em>de jure</em> machine readable meta-data format</a> that we know &#8211; for sure &#8211; can publish just about any fact.</p>
]]></content:encoded>
			<wfw:commentRss>http://cantorva.com/blog/2009/03/27/rdfa-is-smoking/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
