Command Line Semantic Web with Redland

I gave a ‘lightning’ talk (actually about 15 minutes long), Command Line Semantic Web with Redland, on 15th March 2010 at the Semantic Web Austin Meetup during SXSW at Texas Coworking, Austin, TX, USA. Today I recorded it as a screencast and put it online.

The embedded Vimeo version is below (best to view full screen to see all the text), but you can also get alternate hosted and downloadable versions (iPhone, 3GP, Full size) from my site.

Command Line Semantic Web With Redland from Dave Beckett on Vimeo.

Flickcurl C API to Flickr 1.17 Released

In the last few days I released version 1.17 of my Flickcurl C library interface to the Flickr API. It adds complete support for three recent sets of new APIs.

Added 15 new functions for the new Stats API calls announced 2010-03-03:
flickr.stats.getCollectionDomains, flickr.stats.getCollectionReferrers, flickr.stats.getCollectionStats, flickr.stats.getPhotoDomains, flickr.stats.getPhotoReferrers, flickr.stats.getPhotosetDomains, flickr.stats.getPhotosetReferrers, flickr.stats.getPhotosetStats, flickr.stats.getPhotoStats, flickr.stats.getPhotostreamDomains, flickr.stats.getPhotostreamReferrers, flickr.stats.getPhotostreamStats, flickr.stats.getPopularPhotos and flickr.stats.getTotalViews.

Added 8 new functions for the new People and “photos of” people API calls announced 2010-01-21:
flickr.photos.people.add, flickr.photos.people.delete, flickr.photos.people.deleteCoords, flickr.photos.people.editCoords, flickr.photos.people.getList and flickr.people.getPhotosOf.

Added 3 new functions for the new, unannounced (and seemingly incomplete) Gallery API calls:
flickr.galleries.addPhoto, flickr.galleries.getList and flickr.galleries.getListForPhoto.

Updated the flickcurl(1) utility to support the new gallery, people photos and stats API calls.

See the Release Notes for full details.

Get it at: http://download.dajobe.org/flickcurl/flickcurl-1.17.tar.gz (GPL2 / LGPL2 / Apache2.)

This is what I do for fun between releasing Redland RDF libraries, more of which soon…

Rasqal 0.9.18 RDF Query Library Released

Update: you want 0.9.19, not 0.9.18, after a package configuration issue was found. Links fixed.

This release of Rasqal adds draft syntax support for the SPARQL 1.1 Update language being developed by the W3C SPARQL Working Group. The SPARQL 1.1 Update W3C Working Draft of 2010-01-26 introduces the first syntax design with some uncertainties and gray areas still present (no grammar spec section yet). I added what I thought would work, avoiding the ambiguous WITH forms where everything is optional. Since this is draft work, this extra parsing is only done when the ‘laqrs’ query language syntax is chosen. LAQRS stands for LAQRS adds to Querying RDF in SPARQL.

This is just syntax and API support in Rasqal, so it means you can prepare the update queries, but there is no code to execute them. The API gives access to the decoded SPARQL Update operations (INSERT, DELETE with or without DATA) and graph operations (CLEAR, DROP etc.). There is still more to do: the syntax will change in later drafts, and there is no API to stream triple inserts/deletes during parsing, which is needed to handle uploading and downloading large triple blocks. That would require a rewrite of the SPARQL parser to use a different technology than flex+bison (maybe lemon, maybe Ragel) as well as new APIs.

Rasqal has several things to finish for SPARQL 1.0 support (UNION and nested OPTIONALs don’t work) but the recent rewrite of the query engine internals should make other SPARQL 1.1 parts, such as aggregate functions and nested queries, a lot easier to do than with the old query engine. I will probably remove the old query engine from the codebase soon.

The second substantial change is a set of APIs moved from private to public in rasqal.h to enable the construction of query result sets and query result set rows (rasqal_row) via the public API. This allows query results to be read from a syntax or constructed by API, as well as serialized to result formats, without any query being executed. With this addition, Rasqal can provide SPARQL results syntax support for other applications that may have created query results via a different method. It can read query results from the SPARQL XML format (the standard format), and write or serialize them to SPARQL XML, SPARQL JSON, CSV, TSV and an ASCII Table format. This functionality is all available via Triplr where you can make HTTP GET URLs for saved queries.
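To make the idea of building and serializing results without running a query concrete, here is a minimal sketch in Python. It is not Rasqal's C API: the names `make_results`, `to_csv` and `to_sparql_json` are hypothetical, and the example data is invented. Only the general shape of the SPARQL JSON results document (a `head` with `vars`, and `results` with `bindings`) follows the W3C format.

```python
import csv
import io
import json

def make_results(variables, rows):
    """Build a result set by hand: variable names plus rows of terms.

    Each term is a (type, value) tuple such as ("uri", "http://...") or
    ("literal", "text"), or None for an unbound variable.
    """
    for row in rows:
        if len(row) != len(variables):
            raise ValueError("row width does not match variable count")
    return {"variables": list(variables), "rows": [list(r) for r in rows]}

def to_csv(results):
    """Serialize to CSV: a header row of variable names, then one row per result."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(results["variables"])
    for row in results["rows"]:
        writer.writerow(["" if term is None else term[1] for term in row])
    return out.getvalue()

def to_sparql_json(results):
    """Serialize to the general shape of the SPARQL JSON results format."""
    bindings = []
    for row in results["rows"]:
        binding = {}
        for name, term in zip(results["variables"], row):
            if term is None:
                continue  # unbound variables are simply left out
            kind, value = term
            binding[name] = {"type": kind, "value": value}
        bindings.append(binding)
    doc = {"head": {"vars": results["variables"]},
           "results": {"bindings": bindings}}
    return json.dumps(doc, indent=2)

# Invented example data: no query engine is involved at any point.
example = make_results(
    ["name", "page"],
    [
        [("literal", "Dave"), ("uri", "http://example.org/dave")],
        [("literal", "Anon"), None],
    ],
)
print(to_csv(example))
print(to_sparql_json(example))
```

The point of the sketch is the separation: the result set is a plain data structure, and each output format is just a function over it, so results produced by some other system can be written out the same way.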

The final change is in the area of resilience. The functions in the public API have been updated so that when invalid or NULL pointers are given, the functions return failure or NULL / false rather than try to use the pointer and probably crash. Hopefully I caught all of them. The release testing (as usual) included valgrind memory leak checking of all of the 100s of tests and there were no leaks or buffer overruns found.

This is also the first Rasqal release since switching to GIT as the source control for the Redland libraries so the source pointers have moved to git.librdf.org where details of how to check it out can be found.

So in summary, the main changes in this release are:

  • 0.9.19: Fix rasqal.pc to Requires raptor again.
  • Add initial draft parsing and API (NOT execution) support for SPARQL 1.1 Update W3C Working Draft of 2010-01-26.
  • Add public APIs (row, results, result formatter, variables table) so that query results can be built, read and written without a query.
  • Add API resilience checks for invalid NULL pointer arguments.
  • Many other bug fixes and improvements were made.

Fixed Issues:

  • 0000320: Add a void* user_data field to rasqal_variable
  • 0000323: Official MIME Type for JSON isn’t text/json
  • 0000343: Mime type for ‘table’ results format is text/plan
  • 0000345: MIME Type and URI for TSV and CSV
  • 0000347: rasqal linking fix

See the Rasqal 0.9.19 Release Notes for the full details of the changes.

Download: http://download.librdf.org/source/rasqal-0.9.19.tar.gz

Raptor 1.4.21 released – Raptor 2 GIT work

I just released version 1.4.21 of my Raptor RDF parsing / serialising library to the world. This release is just bug fixes:

  • RDFa parser buffer management problems were fixed.
  • The Turtle parser and serializers now use QNames correctly as required by the specification.
  • The RDF/XML parser now resets correctly to detect duplicate rdf:IDs when a parser object is reused.
  • A few other minor bug and build fixes were made.
  • Fixed reported issues: 0000318, 0000319, 0000326, 0000331, 0000332 and 0000337

This is the first release since switching to GIT as the source control for the Redland libraries. The above release is on branch ‘raptor1’ in the new Redland GIT.

In parallel to this is the ongoing Raptor 2 ABI/API updating, which is cleaning up 10 years of API and internal cruft. GIT is really helping speed up this work: the branching, staging/index and stash concepts it supports allow false paths to be managed. The results can be seen on branch ‘master’ of raptor.

The updating is going well in the sense that the make distcheck test suite passes, but there are still things to decide, including:

  • Rename all raptor_CLASS_copy copy constructors to something else: either raptor_new_CLASS_from_CLASS (also used in raptor – Doh!) or to raptor_CLASS_addref which signifies better that it just adds a reference to the object, it’s a shallow copy, not a deep one.
  • Unify raptor_world, rasqal_world and librdf_world – which might help share classes between the libraries. Not sure if this is a good idea yet.
  • Add a graph term to the (subject, predicate, object) triple returned from parsing. I am probably going to do this.
  • Turn the raptor_locator object into more of a log (like librdf_log) or exception object, with inner log/exceptions.
  • Improve the callback interface that passes error, warning etc. messages to user code.
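The naming question in the first item comes down to what a "copy" actually does. As a rough illustration (in Python rather than C, and not Raptor's code), here is a sketch of the difference between adding a reference and making a deep copy. The `Term` class and the `addref`, `deep_copy` and `release` helpers are all hypothetical.

```python
import copy

class Term:
    """Hypothetical reference-counted object, standing in for a C struct."""
    def __init__(self, value):
        self.value = value
        self.usage = 1  # one reference, held by whoever created it

def addref(term):
    """'addref' style: no new object, just one more owner of the same one."""
    term.usage += 1
    return term

def deep_copy(term):
    """Deep copy: an independent object with its own data and its own count."""
    return Term(copy.deepcopy(term.value))

def release(term):
    """Drop one reference; returns True when the last owner has let go
    (the point at which C code would free the memory)."""
    term.usage -= 1
    return term.usage == 0

original = Term(["http://example.org/subject"])
shared = addref(original)       # same object, usage is now 2
separate = deep_copy(original)  # new object, usage is 1

shared.value.append("changed")  # visible through 'original', not 'separate'
```

A name ending in `_copy` suggests the behaviour of `deep_copy` here, while the existing functions behave like `addref`, which is the mismatch the rename would remove.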

I need to decide at what point to roll out an alpha release of Raptor 2, which will probably be numbered 1.9.0. Some of the above possibilities might be worth putting in a later alpha release.

This can all be seen in the GIT repository which includes instructions for checkout at git.librdf.org.

RDF Syntaxes 2.0

I’ve been diligently ignoring the RDF 2.0 threads on the semantic-web interest list, especially on Syntax since I’ve been there before (Modernising Semantic Web Markup). Firstly I’d endorse what Jeremy Carroll says about the features.

I think I’m qualified as an expert on RDF graph serializations / syntax since:

and I implemented all of the above plus GRDDL, RDFa (via librdfa), Atom and RSS*es, RDF/JSON, … in Raptor

People moan about RDF/XML and have for years. I even wrote down in great detail the flaws in Modernising Semantic Web Markup. Over all that time nobody has come up with a credible and complete XML syntax alternative that stuck, even myself. Let me summarize the ones I know:

  • TriX: had little takeup
  • RXR: ditto
  • GRIT: new, but flawed since it can only represent trees (no named bnodes)

The fundamental problem I think with using XML to write down graphs is:

People looking at XML expect they are looking at a hierarchical Tree.

So writing a Graph in an XML Tree is always going to fail the simplicity test. This might come from using the XML DOM or looking at HTML, XHTML, but it’s pretty embedded in the mind.

Right now I’d dismiss any XML format as a “simple” or “obvious” way to write down RDF graphs that will be accepted by new users.

(Aside: There’s also a technical argument that no XML format can ever represent all RDF graphs since RDF allows Unicode codepoints that are not allowed in XML).
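A quick way to see the codepoint restriction, sketched with Python's standard XML parser (which implements XML 1.0): a control character such as U+0001 is rejected even when written as a character reference, while a tab is accepted. The `parses_as_xml` helper and the `literal` element name are just for illustration.

```python
import xml.etree.ElementTree as ET

def parses_as_xml(text):
    """Return True if the text is accepted as a well-formed XML 1.0 document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# U+0009 (tab) is an allowed XML 1.0 character.
tab_ok = parses_as_xml("<literal>tab: &#9;</literal>")

# U+0001 is not an allowed XML 1.0 character, so a literal containing it
# has no direct representation in the document.
ctrl_ok = parses_as_xml("<literal>ctrl: &#1;</literal>")

print(tab_ok, ctrl_ok)
```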

Now this isn’t a problem just with XML; it’s also true of other non-XML formats that are serial hierarchical documents. That includes formats like JSON, which cannot even represent anything that is not a tree out of the box, since it has no ID/REF mechanism.
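The tree limitation is easy to demonstrate with a sketch using Python's standard `json` module. The two-node graph, the `_:alice` / `_:bob` labels and the `try_nested_json` helper are made up for the example. Nesting a graph with a cycle fails, and the usual workaround is to invent your own identifiers and flatten the graph into a list of statements.

```python
import json

# A tiny graph with a cycle: alice knows bob, and bob knows alice.
alice = {"name": "Alice"}
bob = {"name": "Bob", "knows": alice}
alice["knows"] = bob

def try_nested_json(node):
    """Try to write the graph as nested JSON objects; None if it cannot be done."""
    try:
        return json.dumps(node)
    except ValueError:
        # The json module refuses circular references: nested JSON is a tree.
        return None

nested = try_nested_json(alice)

# Workaround: make up node labels and write flat subject/predicate/object rows.
triples = [
    ["_:alice", "name", "Alice"],
    ["_:bob", "name", "Bob"],
    ["_:alice", "knows", "_:bob"],
    ["_:bob", "knows", "_:alice"],
]
flat = json.dumps(triples)

print(nested)
print(flat)
```

The flat form works, but only because the labels act as a hand-rolled ID/REF layer on top of JSON, and a reader has to know that convention to see the graph.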

Of course, apart from having dealt with RDF/XML, I also invented Turtle (based on the N3 syntax, simplified) and, although it’s a non-XML syntax, it does seem to be in the sweet spot for users understanding it, without having the hierarchical document expectation. Yes, Turtle is close to JSON/python in syntax design space but this doesn’t seem to have been a problem.

So I’m happy with how Turtle turned out and that should be the focus of RDF syntax formats for users. It does need an update and I’ll probably work on that whether or not a new syntax is part of some future working group – I have a pile of fixes to go in. Adding named graphs (TRIG) might be the next step for this if it was a standard.

It may be there is a need for a better machine format, but please don’t mix them. Also, machines can read Turtle RDF :)

Consider these stream-of-consciousness RDF syntax thoughts as the basis of my position paper for the W3C RDF Next Steps workshop.