[redland-dev] What is the expected performance of a sparql Query?

Wed Aug 20 23:17:23 BST 2008

Kieron Taylor wrote:
> I have queried small triplestores with SPARQL and experienced near 
> instantaneous responses. 20,000 should be trivial for in-memory queries. 
> I suspect you're doing something slow in your query, rather than there 
> being something odd in your data model.
> 

I'm also seeing performance issues that I'm trying to understand. I'm 
working with a store that's sitting at around 37020 triples in an sqlite
backend. I've also got a larger version of that data set (739239 
triples) in a mysql backend (which is the one I'm really interested in).

Here's an example query. Query time is the time in seconds for 
query.execute() to return; execution time is the time to access (count) 
all the results.
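
A minimal sketch of how the two numbers could be measured separately. 
This is not the actual script behind the figures below: time_query is a 
hypothetical helper, and the Redland calls in the trailing comment 
(RDF.Query, query.execute(model)) are assumed to match the Python 
bindings in use.

```python
import time


def time_query(execute, *args):
    """Time a query in two phases.

    execute is any callable returning an iterable of results.
    Returns (query_time, n_results, iteration_time), times in seconds.
    """
    # Phase 1: time for the execute call itself to return.
    start = time.perf_counter()
    results = execute(*args)
    query_time = time.perf_counter() - start

    # Phase 2: time to walk (count) every result.
    start = time.perf_counter()
    n_results = sum(1 for _ in results)
    iteration_time = time.perf_counter() - start

    return query_time, n_results, iteration_time


# Assumed usage with the Redland Python bindings (not verified here):
#   import RDF
#   query = RDF.Query(sparql_text, query_language="sparql")
#   q_time, count, x_time = time_query(query.execute, model)
```

Keeping the two phases apart shows whether the cost sits in preparing 
the query or in fetching the results one by one.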

Query:
--------------------------
PREFIX ...

select ?a
where {
   ?a rdf:type dada:Annotation .
   ?a dada:type oanc:s .
}
--------------------------
sqlite:
Query time:  0.222019910812
Number of results:  798
Execution time: 9.65870690346

mysql:
Query time:  0.173084974289
Number of results:  1511
Execution time: 19.6600589752

And another with only mysql numbers:
--------------------------
select ?a
where {
   ?a dada:partof <http://localhost:8010/corpora/OANC/VOL15_1> .
   ?a rdf:type dada:Annotation .
   ?a dada:type oanc:s .
}
--------------------------
mysql:
Query time:  0.117928028107
Number of results:  713
Execution time: 35.7706859112
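
Dividing each execution time by its result count gives a rough 
per-result cost. A small sketch using only the figures quoted above; 
per_result_ms is a hypothetical helper name, and the labels are just 
shorthand for the two queries.

```python
def per_result_ms(execution_seconds, n_results):
    """Average time spent per result, in milliseconds."""
    return 1000.0 * execution_seconds / n_results


# (execution time in seconds, number of results) from the runs above
timings = {
    "sqlite, 2 patterns": (9.65870690346, 798),
    "mysql, 2 patterns": (19.6600589752, 1511),
    "mysql, 3 patterns": (35.7706859112, 713),
}

for label, (seconds, count) in timings.items():
    print("%-20s %.1f ms per result" % (label, per_result_ms(seconds, count)))
```

That works out to roughly 12 to 50 ms per result, so nearly all of the 
time goes into iterating the results rather than into query.execute() 
itself.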

So, is this a typical execution time for this kind of query? Is there 
something I can do to get more reasonable performance?

(A version of the data set (with around 2m triples) is exposed at 
http://dada1.ics.mq.edu.au/ravs/sparql if you'd like to take a look but 
it's slooow at the moment :-)

Any advice appreciated.

Steve