[redland-dev] Python/SQLite Model.load() performance

Dave Beckett dave at dajobe.org
Sat Nov 4 05:06:36 UTC 2006


Matt Chaput wrote:
> Hi,
> 
> I downloaded the Redland python bindings (1.03 binary package for
> Windows from the download section of the website).
> 
> I tried importing the RDF XML dump from my current solution into a
> Redland model backed by a SQLite store.
> 
> i.e.
> 
> >>> s = RDF.Storage(storage_name="sqlite", name="test.db",
> ...                 option_string="")
> >>> m = RDF.Model(s)
> >>> m.load("test.rdf")
> 
> (BTW there's a bug in the documentation where it doesn't mention that
> option_string is required, and the error message you get if you don't
> include it is needlessly cryptic ("illegal arguments"))

I've added a few more words about this to the documentation (in RDF.py).

> I realize that the SQLite store is optimized for space (which is
> fantastic, and exactly the reason I'm thinking of switching to Redland),
> but importing 4.4MB of XML data took _over an hour_ on a fast machine.
> Meanwhile, the CPU never broke 4% the entire time, which makes me
> suspect I hit some kind of bug.
> 
> Anyone know what might be going on? Has anyone else used Python and
> SQLite with Redland lately and seen or not seen anything like this?

There are a lot of improvements and fixes to the sqlite store in
Subversion, including a flag to turn off PRAGMA synchronous = full
and better use of transactions.  These might help you substantially,
especially the storage option synchronous='off'.
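
To illustrate why those two changes matter (very low CPU use usually
means the time is going on waiting for the disk), here is a minimal
sketch using Python's standard sqlite3 module rather than Redland
itself. The table name, the helper names timed_insert and compare, and
the row count are all made up for the illustration; this is not the
store's actual code, only the same two SQLite knobs applied directly.

```python
import os
import shutil
import sqlite3
import tempfile
import time


def timed_insert(path, rows, synchronous, one_transaction):
    """Insert rows into a fresh table; return (elapsed_seconds, row_count)."""
    if synchronous not in ("OFF", "NORMAL", "FULL"):
        raise ValueError("unexpected synchronous mode: %r" % synchronous)
    # isolation_level=None selects autocommit mode, so the only
    # transactions are the ones opened explicitly with BEGIN below.
    conn = sqlite3.connect(path, isolation_level=None)
    try:
        # PRAGMA values cannot be bound as parameters, hence the check above.
        conn.execute("PRAGMA synchronous = %s" % synchronous)
        conn.execute("CREATE TABLE triples (s TEXT, p TEXT, o TEXT)")
        start = time.perf_counter()
        if one_transaction:
            conn.execute("BEGIN")
        for row in rows:
            # Without an open transaction every INSERT is committed
            # on its own.
            conn.execute("INSERT INTO triples VALUES (?, ?, ?)", row)
        if one_transaction:
            conn.execute("COMMIT")
        elapsed = time.perf_counter() - start
        count = conn.execute("SELECT COUNT(*) FROM triples").fetchone()[0]
    finally:
        conn.close()
    return elapsed, count


def compare(n=100):
    """Time n inserts under two configurations; return {label: (secs, rows)}."""
    rows = [("s%d" % i, "p", "o%d" % i) for i in range(n)]
    configs = [
        ("FULL, commit per insert", "FULL", False),
        ("OFF, one transaction", "OFF", True),
    ]
    workdir = tempfile.mkdtemp()
    results = {}
    try:
        for index, (label, sync, batched) in enumerate(configs):
            path = os.path.join(workdir, "test%d.db" % index)
            results[label] = timed_insert(path, rows, sync, batched)
    finally:
        shutil.rmtree(workdir, ignore_errors=True)
    return results


if __name__ == "__main__":
    for label, (elapsed, count) in compare().items():
        print("%-26s %d rows in %.3fs" % (label, count, elapsed))
```

The actual timings depend heavily on the disk and operating system, so
none are quoted here. Note that synchronous = OFF trades durability for
speed: a crash or power loss during the load can leave the database
file damaged, which is usually acceptable for a bulk import that can
simply be rerun.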

I'm considering making the sqlite store the default persistent one
and shipping it with redland so it's always available.

Dave

