[redland-dev] [Rasqal RDF Query Library 0000557]: performance hit with nested OPTIONAL blocks

Mantis Bug Tracker mantis-bug-sender at librdf.org
Wed Oct 23 05:24:56 EDT 2013


The following issue has been SUBMITTED. 
====================================================================== 
http://bugs.librdf.org/mantis/view.php?id=557 
====================================================================== 
Reported By:                FND
Assigned To:                
====================================================================== 
Project:                    Rasqal RDF Query Library
Issue ID:                   557
Category:                   query engine
Reproducibility:            always
Severity:                   minor
Priority:                   normal
Status:                     new
Query Language:             SPARQL 
====================================================================== 
Date Submitted:             2013-10-23 09:24
Last Modified:              2013-10-23 09:24
====================================================================== 
Summary:                    performance hit with nested OPTIONAL blocks
Description: 
Using the Python bindings and an in-memory ("hashes") storage with a few
thousand triples, iterating over the results of a query with nested OPTIONAL
blocks is unexpectedly slow. Each iteration was timed like this:

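    from datetime import datetime
    import RDF

    # `sparql` is the query string, `graph` the model being queried
    t0 = datetime.now()
    for result in RDF.Query(sparql).execute(graph):
        t1 = datetime.now()  # time at which this result became available
        print("retrieval duration: %s" % (t1 - t0))
        t0 = t1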

While the first iteration takes barely a millisecond, subsequent iterations take
between 10 and 30 seconds (with no clear pattern).
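
For anyone trying to reproduce this, the measurement can be narrowed to the time
spent waiting for each result, excluding query construction and the loop body.
The following is only a minimal sketch: `timed_iter` is a hypothetical helper,
not part of the Redland bindings, and the list in the usage example is a
stand-in for `RDF.Query(sparql).execute(graph)`.

```python
import time

# Prefer a monotonic clock where available (Python 3); fall back to time.time.
_default_clock = getattr(time, "perf_counter", time.time)


def timed_iter(iterable, clock=_default_clock):
    """Yield (item, seconds) pairs, where seconds is the time spent
    waiting for the iterator to produce that item."""
    it = iter(iterable)
    while True:
        start = clock()
        try:
            item = next(it)
        except StopIteration:
            return
        yield item, clock() - start


if __name__ == "__main__":
    # Stand-in for the query results; any iterable works here.
    for row, seconds in timed_iter(["row 1", "row 2"]):
        print("retrieval duration: %.6f s (%s)" % (seconds, row))
```

Unlike the loop above, this starts the clock immediately before each result is
requested, so time spent printing or otherwise processing a result is not
attributed to the next retrieval.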

The SPARQL query looked like this:

    SELECT ?obs ... ?startTime ?endTime
    WHERE {
        ?obs a qb:Observation .
        ...
        OPTIONAL {
            ?obs led:temporal ?time .
            OPTIONAL { ?time dct:start ?startTime . }
            OPTIONAL { ?time dct:end ?endTime . }
        }
        ...
    }

Changing that nested OPTIONAL block to this:

        OPTIONAL { ?obs led:temporal ?time . }
        OPTIONAL { ?time dct:start ?startTime . }
        OPTIONAL { ?time dct:end ?endTime . }

... fixed the performance issue, with retrieval duration per iteration dropping
to milliseconds.

(Please excuse the lack of a reduced test case; I can supply this on request,
but figured describing it was better than not reporting it at all.)

====================================================================== 

Issue History 
Date Modified    Username       Field                    Change               
====================================================================== 
2013-10-23 09:24 FND            New Issue                                    
======================================================================


