[redland-dev] sqlite storage performance question

Lauri Aalto laalto at gmail.com
Mon Mar 1 10:24:25 CET 2010


Hi,

I'm not Dave but I've done some work on the sqlite storage performance
issues back in 2008.

First of all, the sqlite storage is not really optimized for performance.

Since 2006, I've added support for sqlite transactions to the storage
module. Use them whenever you can. librdf_storage_transaction_start(),
librdf_storage_transaction_commit() or the similar librdf_model
functions, depending on your level of abstraction.

Looking at the code, you can change the sqlite pragrma synchronous
setting with synchronous="x" where x is one of { off, normal, full }.
The default value is "normal" so if you want any speedup that way, the
only way is to set it to "off", with the caveats described at
http://sqlite.org/pragma.html#pragma_synchronous . If memory serves
me, it does speed things up to some degree.

Other than that, you can write up a sqlite backed storage module
yourself that is optimized for your needs. I've done some experiments
(but not to the level that I'm comfortable publishing) an alternative
sqlite storage doing e.g. the following, in no particular order:

* Denormalized the schema, moved statement data from four
uris/blanks/literals/statements tables to a single statement table to
get rid of joins and reduce the number of writes/lookups required.

* Used a simple hashing function to turn strings in
uris/blanks/literals into integers and put the integer columns first
in both the schema and queries.

* Used prepared (pre-compiled) sqlite statements with bound variables
instead of cooking the SQL in ASCII for every operation.

* Relaxed some constraints such as allowing duplicate statements to
get rid of a lookup during insertion.

* sqlite indexes didn't seem useful. They cost a lot in inserts but
bring little benefit if your schema and queries are already optimized.
Your mileage may vary.

Hope this helps,

Lauri


More information about the redland-dev mailing list