[redland-dev] State of Redland RDF Libraries 2008-02
Dave Beckett
dave at dajobe.org
Tue Feb 19 00:51:30 GMT 2008
State of Redland RDF Libraries 2008-02
Redland was born 2000-08. Happy 7.5th birthday!
This is a review of 2007 (roughly) since I reported in the State of
Redland 2007-02 on 2007-02-18. It covers:
* Review of Redland users, current state, development
* Redland challenges, tasks including work already underway and my
future ideas
* Changes: how I want to change the project
(This is on the web at http://librdf.org/2008/02/18-state/)
1. Redland Users
Redland is made available by several Linux, Unix and other open source
projects such as:
* Debian (sarge onwards)
* Fedora (FC4 onwards). In 2007 FC6 onwards added librdf.
* FreeBSD Ports
* Gentoo
* Macports. Added all packages by Oct 2007
* Mandriva (9.1 onwards). In 2007 Mandriva Cooker gained all
packages.
* SUSE (9.2 onwards)
* Ubuntu (breezy onwards)
and the libraries are also used inside other applications and services
such as, for example:
* ActiveRDF ruby RDF
* Amaya web browser and HTML Editor
* Ardour digital audio workstation
* Hydrogen simple drum machine/step sequencer
* Joost somewhere
* Morla RDF graphical editor
* Nepomuk KDE semantic desktop app
* Redland C++ API by Sebastian Faubel. Oct 2007.
* Soprano QT based RDF framework. 2007.
* Storyist commercial story writing app. 2007
* Venus python feed aggregator
* many Yahoo! web sites behind the scenes
* ... but I am not keeping track of these very well in the
applications list ...
2. State of the Packages
My summary of the high-level state of the packages is:
Raptor syntax parsing and serializing: libraptor
Mature. The API is growing a little since some internal parts
(SAX2 API) are getting pushed into the public API. There are
also some new features and syntaxes being added, portability
fixes and more rarely, actual bug fixes.
Rasqal query parsing, executing: librasqal
Under development. The current API is unstable and being
deliberately broken in the next release. The query engine is not
complete enough to execute SPARQL and that is still the priority
for 1.0.
Redland RDF API and triple stores: librdf
Mature. Some API change is happening and the storages are
getting improvements. Mostly updates from Raptor and Rasqal plus
bug fixes.
Language Bindings to Perl, PHP, Python and Ruby
Mature. Removed the C#, Java and Tcl bindings in 2007 as
promised since I was not going to maintain them.
3. Development
In 2007, each of the packages has seen the following releases and major
changes:
Raptor 1.4.15 - 1.4.16 (2 releases)
+ GRDDL support was completed and passed the test cases.
+ Improved XML and URI error handling
+ Updated Turtle parser for Turtle 2007-09-11
+ Added a TRiG parser
+ Many low-memory situation improvements
Rasqal 0.9.14 - 0.9.15 (2 releases)
+ Updated the SPARQL syntax support to match the W3C
Recommendation.
+ Query engine supports all SPARQL datatypes and evaluation
rules.
+ Added LAQRS syntax extensions
+ Many low-memory situation improvements
Redland 1.0.6 - 1.0.7 (2 releases)
+ A new transactions API was added, implemented for the MySQL and
SQLite storages
+ Added an optional modular storage configuration to load storage
modules on demand
+ A new query results formatter class was added
+ Many low-memory and resource allocation failure improvements
+ Many bug fixes
Language Bindings 1.0.6.1 - 1.0.7.1 (2 releases)
+ Removed Tcl, Java and C# bindings as promised
+ Many updates to the Python and Ruby Bindings
+ Many bug fixes
In 2007 Lauri Aalto became a new committer and made a lot of changes to
the libraries: the low-memory and resource allocation failure handling
mentioned above, portability fixes for non-gcc compilers and Win32, and
other bug fixes and improvements.
The redland mailing lists are now (early 2008) archived by gmane.org
and The Mail Archive. You can read them on their web sites at:
http://gmane.org and http://www.mail-archive.com/
4. Challenges
The main challenge continues to be to make the project more scalable.
Although I package the source code, I only really deal with Debian
binary packages since, as can be seen above, others are working on
distribution-specific packages, which is good.
The loss of SourceForge's compile farm was tragic since it means there
is no automated way to test cross-platform compatibility.
I noticed that although 2007 had 9 releases, the previous year there
were 15. This is mainly a consequence of me being busier and not
developing code as part of my day job. Fewer releases is not necessarily
bad as the packages mature, but it can mean a long wait for bug fixes to
get out.
My main goal to deal with these could be summarised as:
* Try harder to encourage more shared development
5. Tasks
5.1 General tasks
More of a wishlist than an ordered list:
* Think about a License change to Apache2 only.
* Make Redland turn SPARQL into underlying SQL queries when possible.
* Start the Redland (librdf) API tutorial.
* Create some documentation to explain the libraries' structure and
relationships.
* Consider not shipping Raptor and Rasqal inside the Redland tarball
* Create documentation on the data flow inside the libraries
* Figure out whether to keep writing manual pages as well as gtkdoc.
(DRY)
* The demos need to be updated and the changes made put back into
subversion.
* A SPARQL protocol endpoint demo would be good to have
DRY = Don't Repeat Yourself
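
To illustrate the SPARQL-to-SQL wish above, here is a minimal sketch of
the general idea only. It assumes a hypothetical single table
triples(subject, predicate, object), which is not Redland's actual
storage schema, and the function names bgp_to_sql and run_demo are made
up for this example. It turns a list of triple patterns into one SQL
self-join so the database does the matching instead of the query engine
fetching one pattern at a time.

```python
import sqlite3

# Hypothetical sketch: translate a basic graph pattern into one SQL query
# over an ASSUMED table triples(subject, predicate, object).
# This is not Redland's real schema or API.


def bgp_to_sql(patterns):
    """patterns: list of (s, p, o) strings; terms starting with '?' are
    variables. Variable names are assumed to be simple alphanumeric
    words, since they are used as SQL column aliases."""
    if not patterns:
        raise ValueError("need at least one triple pattern")
    first_use = {}   # variable -> column where it was first bound
    where = []       # SQL conditions
    params = []      # constants, passed as bound parameters
    for i, triple in enumerate(patterns):
        for column, term in zip(("subject", "predicate", "object"), triple):
            ref = "t%d.%s" % (i, column)
            if term.startswith("?"):
                if term in first_use:
                    # Repeated variable: join on equality with first use.
                    where.append("%s = %s" % (ref, first_use[term]))
                else:
                    first_use[term] = ref
            else:
                where.append("%s = ?" % ref)
                params.append(term)
    columns = ", ".join(
        "%s AS %s" % (ref, var[1:]) for var, ref in first_use.items()
    ) or "1"
    tables = ", ".join("triples t%d" % i for i in range(len(patterns)))
    sql = "SELECT %s FROM %s" % (columns, tables)
    if where:
        sql += " WHERE " + " AND ".join(where)
    return sql, params


def run_demo():
    """Run a two-pattern query against an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE triples (subject TEXT, predicate TEXT, object TEXT)"
    )
    conn.executemany(
        "INSERT INTO triples VALUES (?, ?, ?)",
        [
            ("alice", "foaf:knows", "bob"),
            ("bob", "foaf:name", "Bob"),
            ("carol", "foaf:name", "Carol"),
        ],
    )
    sql, params = bgp_to_sql(
        [("?person", "foaf:knows", "?friend"),
         ("?friend", "foaf:name", "?name")]
    )
    return conn.execute(sql, params).fetchall()
```

The "when possible" in the task matters: this sketch only covers plain
triple patterns with shared variables. Anything like OPTIONAL, UNION or
FILTER would need more than a simple self-join.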
5.2 Pending stuff
There are several tasks already in progress either sitting in a patch,
in Subversion or underway separately.
* A new schema for the SQLite store: me (patch emailed to
redland-dev)
* Object-based PHP5 bindings: Yahoo! (pending)
* Two JSON serializers for Raptor (svn)
* AVL Trees improvements added to Raptor that should make RDF/XML and
Turtle serializers faster (svn)
* Rasqal API and ABI change announced for future 0.9.16 (partially in
svn)
* Rasqal can read result sets from the SPARQL query results XML (svn)
5.3 Raptor tasks
* Plan for Raptor 2 API/ABI change
* Focus should be bug fixes
5.4 Rasqal tasks
* Rasqal 1.0: when Rasqal can execute complete SPARQL
+ Make SPARQL OPTIONALs work
+ Make SPARQL GROUP work
+ Make SPARQL UNION work
* Write a query optimiser
* Add a way to declare extension functions
* Look into language extensions
* Address query engine denial of service:
+ limit query wall clock time
+ limit triple pattern matches
+ callback to allow application to abort queries?
+ limit memory use?
+ limit sorting of results?
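
The denial of service limits above could be sketched roughly as follows.
This is an illustration under assumptions, not Rasqal's API: the names
QueryBudget, QueryAborted and match_pattern are hypothetical. It shows a
single budget object that enforces a triple pattern match limit, a wall
clock limit and an application abort callback at the point where each
match is produced.

```python
import time

# Hypothetical sketch of query limits; none of these names exist in Rasqal.


class QueryAborted(Exception):
    """Raised when a query exceeds its budget or the application aborts it."""


class QueryBudget:
    def __init__(self, max_seconds=None, max_matches=None,
                 should_abort=None, clock=time.monotonic):
        self.max_seconds = max_seconds    # wall clock limit, or None
        self.max_matches = max_matches    # triple pattern match limit, or None
        self.should_abort = should_abort  # application callback, or None
        self.clock = clock
        self.started = clock()
        self.matches = 0

    def charge_match(self):
        """Account for one triple pattern match and enforce all limits."""
        self.matches += 1
        if self.max_matches is not None and self.matches > self.max_matches:
            raise QueryAborted("triple pattern match limit exceeded")
        if (self.max_seconds is not None
                and self.clock() - self.started > self.max_seconds):
            raise QueryAborted("wall clock limit exceeded")
        if self.should_abort is not None and self.should_abort():
            raise QueryAborted("aborted by application")


def match_pattern(triples, pattern, budget):
    """Yield triples matching pattern (None is a wildcard), charging the
    budget once per match."""
    for triple in triples:
        if all(want is None or want == got
               for want, got in zip(pattern, triple)):
            budget.charge_match()
            yield triple
```

Checking the limits per match keeps the cost of enforcement small and
means a runaway query stops at the next match. Memory and sorting limits
would need separate accounting and are not covered here.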
5.5 Redland librdf tasks
* Improve the performance of the storages
5.6 Bindings tasks
* Split the single language bindings package into one package per
binding. That would be: Perl, PHP5, Python and Ruby
* Make the Perl binding into a CPAN installable tarball - partially
done but not entirely working
6. Future Ideas
6.1 New Version Control System
This is the same as last year's New Version Control System idea and
although it is not urgent, I'm favouring GIT right now with the main
issue that it's got a steep learning curve compared to anything else.
6.2 Raptor Version 2
I also discussed this break-the-binary-API release last year in Raptor
Version 2. I can see it being started once the focus on Rasqal 1.0 is
over, which should happen in 2008. There are several cleanups that need
to be done.
7. Changes
In order to encourage more help with Redland, I'm proposing this:
Five good patches get you commit access.
(after Brian Aker but I'm slightly more cautious)
Plus I have started a Redland development blog at
http://blog.librdf.org/
Thanks for reading.
Dave Beckett, http://www.dajobe.org/
California, USA, 2008-02-18