[redland-dev] Reducing query time variance with cost based planner
Andrew Reslan
andrew.reslan at mac.com
Thu Feb 13 05:19:31 EST 2014
As a follow-up, I have been looking for a simpler solution.
I have a custom storage backed by an underlying document store. Each document represents a single subject and contains all triples for that subject.
If I have a query of the form:
select * where { ?subject ns1:predicate2 ns2:subject2 . ?subject ns1:predicate3 ns2:subject3 . ?subject ns1:predicate1 "value1" . }
From my current understanding and implementation of the custom storage, a separate call to find_statements is made on the storage for each statement, and the separate streams that come back then have to be merged by rasqal/redland.
With my custom storage, it is possible to execute a single composite query instead, one that finds the subjects matching all three statements.
This would resolve 80-90% of the slow queries we are experiencing and would cover the initial case where there is a unique match on a string literal, as the composite query would only return a single subject from the one composite call.
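To make the difference concrete, here is a minimal sketch in Python rather than against the real storage API. The in-memory STORE, the document names and the functions find_subjects, per_pattern_lookup and composite_lookup are all hypothetical stand-ins for my document store, not anything that exists in redland or rasqal.

```python
# Hypothetical in-memory stand-in for the document store: one document per
# subject, holding every (predicate, object) pair for that subject.
STORE = {
    "ns2:doc1": {
        ("ns1:predicate1", "value1"),
        ("ns1:predicate2", "ns2:subject2"),
        ("ns1:predicate3", "ns2:subject3"),
    },
    "ns2:doc2": {
        ("ns1:predicate2", "ns2:subject2"),
        ("ns1:predicate3", "ns2:subject3"),
    },
    "ns2:doc3": {
        ("ns1:predicate2", "ns2:subject2"),
    },
}

# The three statements from the query, in the order they were written.
PATTERNS = [
    ("ns1:predicate2", "ns2:subject2"),
    ("ns1:predicate3", "ns2:subject3"),
    ("ns1:predicate1", "value1"),
]


def find_subjects(store, predicate, obj):
    """One storage call per statement (a loose analogue of find_statements)."""
    return {s for s, pairs in store.items() if (predicate, obj) in pairs}


def per_pattern_lookup(store, patterns):
    """Call the storage once per statement, then merge the streams."""
    streams = [find_subjects(store, p, o) for p, o in patterns]
    rows_returned = sum(len(stream) for stream in streams)
    return set.intersection(*streams), rows_returned


def composite_lookup(store, patterns):
    """Single call: only subjects whose document matches every statement."""
    wanted = set(patterns)
    matches = {s for s, pairs in store.items() if wanted <= pairs}
    return matches, len(matches)


merged, merged_rows = per_pattern_lookup(STORE, PATTERNS)
composite, composite_rows = composite_lookup(STORE, PATTERNS)
```

Both approaches find the same subject, but the per-statement version pulls back every intermediate match before merging, while the composite version only ever returns the final answer.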
As far as I can see, this composite approach is not currently supported by the storage API.
Extending the storage API would be straightforward, and implementing the extension could be optional for each storage, but I cannot figure out if/how to identify this special case from the parse tree and pass it through to a storage that supports it.
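The detection I have in mind would look roughly like the sketch below. It is in Python and uses a made-up representation (plain tuples, with variables spelled with a leading "?"), so it says nothing about rasqal's actual parse tree structures; composite_candidates and is_var are hypothetical names. The idea is simply to group statements that share a subject variable and have a constant predicate and object.

```python
from collections import defaultdict


def is_var(term):
    """Assumption for this sketch: variables are strings starting with '?'."""
    return term.startswith("?")


def composite_candidates(bgp):
    """Split a list of (subject, predicate, object) statements into
    groups that could go to the storage as one composite call, plus the
    statements that would still be evaluated one at a time."""
    groups = defaultdict(list)
    rest = []
    for s, p, o in bgp:
        if is_var(s) and not is_var(p) and not is_var(o):
            groups[s].append((p, o))
        else:
            rest.append((s, p, o))

    composite = {}
    for var, pairs in groups.items():
        if len(pairs) > 1:
            composite[var] = pairs
        else:
            # A lone statement gains nothing from a composite call.
            rest.append((var, pairs[0][0], pairs[0][1]))
    return composite, rest


query = [
    ("?subject", "ns1:predicate2", "ns2:subject2"),
    ("?subject", "ns1:predicate3", "ns2:subject3"),
    ("?subject", "ns1:predicate1", '"value1"'),
]
composite, rest = composite_candidates(query)
```

For the query above, all three statements end up in one group keyed on ?subject, which is the case a storage with composite support could answer in a single call.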
Any pointers would be appreciated.
Andy
On 12 Feb 2014, at 10:59, Andrew Reslan <andrew.reslan at mac.com> wrote:
> Are there any mechanisms in redland/rasqal for reducing the variance in query time that depends on statement order in a SPARQL query?
>
> For example, a query that contains an exact literal match as the first statement may take a few ms to execute and return a single result.
>
> But if the statement order is changed so that the first statement returns thousands of potential matches, then the query time may increase to tens of seconds to return the same single result.
>
> I'm trying to find where in the code the execution engine walks the syntax tree, and I'm wondering if this could be modified to use a cost-based planner that could modify the execution order of statements.
>
> Andy