[redland-dev] Python/SQLite Model.load() performance
Dave Beckett
dave at dajobe.org
Sat Nov 4 05:06:36 UTC 2006
Matt Chaput wrote:
> Hi,
>
> I downloaded the Redland python bindings (1.03 binary package for
> Windows from the download section of the website).
>
> I tried importing the RDF XML dump from my current solution into a
> Redland model backed by a SQLite store.
>
> i.e.
>
> >>> s = RDF.Storage(storage_name="sqlite", name="test.db",
> ...                 option_string="")
> >>> m = RDF.Model(s)
> >>> m.load("test.rdf")
>
> (BTW there's a bug in the documentation where it doesn't mention that
> option_string is required, and the error message you get if you don't
> include it is needlessly cryptic ("illegal arguments"))
I've added a few more words here (in RDF.py).
> I realize that the SQLite store is optimized for space (which is
> fantastic, and exactly the reason I'm thinking of switching to Redland),
> but importing 4.4MB of XML data took _over an hour_ on a fast machine.
> Meanwhile, the CPU never broke 4% the entire time, which makes me
> suspect I hit some kind of bug.
>
> Anyone know what might be going on? Has anyone else used Python and
> SQLite with Redland lately and seen or not seen anything like this?
There are a lot of improvements and fixes to the sqlite store in Subversion,
including a flag to turn off PRAGMA synchronous = full and better
transaction handling. These might help you substantially, especially
the storage option synchronous='off'.
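
To illustrate why those two changes matter, here is a minimal sketch at the
SQLite level. It uses Python's standard sqlite3 module directly, not Redland,
and the function name bulk_insert and the three-column "triples" table are
hypothetical, chosen only for illustration; Redland's actual schema and code
differ. It shows the general technique: relax PRAGMA synchronous and commit
a batch of inserts once rather than once per statement.

```python
import sqlite3


def bulk_insert(db_path, triples, synchronous="OFF"):
    """Insert all triples in one transaction and return the table's row count.

    Illustrative only: the table layout is an assumption, not Redland's schema.
    """
    if synchronous not in ("OFF", "NORMAL", "FULL"):
        raise ValueError("unsupported synchronous mode: %r" % (synchronous,))

    conn = sqlite3.connect(db_path)
    try:
        # With synchronous = OFF, SQLite does not wait for data to reach
        # the disk after each commit. This is faster, but a power loss or
        # OS crash during a write can corrupt the database.
        conn.execute("PRAGMA synchronous = %s" % synchronous)
        conn.execute(
            "CREATE TABLE IF NOT EXISTS triples (s TEXT, p TEXT, o TEXT)"
        )
        # "with conn" commits once when the block exits, so the whole
        # batch pays for a single commit instead of one per insert.
        with conn:
            conn.executemany(
                "INSERT INTO triples VALUES (?, ?, ?)", triples
            )
        return conn.execute("SELECT COUNT(*) FROM triples").fetchone()[0]
    finally:
        conn.close()
```

Calling bulk_insert("test.db", rows) with a list of (subject, predicate,
object) tuples loads them in a single commit. The trade-off is durability:
synchronous = OFF is reasonable for a bulk import that can be rerun, less so
for data that cannot be regenerated.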
I'm considering making the sqlite store the default persistent one
and shipping it with redland so it's always available.
Dave