[redland-dev] Limited optional parameters in SPARQL?

Dave Beckett dave.beckett at bristol.ac.uk
Wed May 4 11:54:42 BST 2005


On Mon, 18 Apr 2005 16:03:25 +0100, "Phil Archer" <phil.archer at icra.org> wrote:

> Dear all,
> 
> My first post to this list so a (very) brief introduction. The organisation 
> I work for, ICRA, is working with others on creating a successor to the PICS 
> system for labelling content (largely for child protection purposes). The 
> basic method of using RDF is now pretty well tied down and we're working on 
> some tools to read the new labels which will use Redland.
> 
> With this in mind I've been using the online demo of Rasqal to run some 
> tests. The schema and method we're using allows an RDF node to have a 
> variety of properties, all of which are optional. My question arises from 
> the fact that I get different answers depending on the length of the query.
> 
> Given the data source: http://www.icra.org/test/rdftests/labels11.rdf I 
> offer 2 versions of the same query:
> 
> PREFIX label: <http://www.w3.org/2004/12/q/contentlabel#>
> SELECT ?uri ?action ?label ?classification ?management ?frequent ?several
> ?occasional ?single
> WHERE
>   (?x rdf:type label:Ruleset)
>   (?x label:rules ?o1)
>   (?o1 rdf:first ?node)
>   OPTIONAL (?node label:hasURI ?uri)
>   OPTIONAL (?node rdf:type ?action)
>   OPTIONAL (?node label:hasLabel ?label)
> 
> If I run this I get the expected result:
> 
> 1  \.jpg$ http://www.w3.org/2004/12/q/contentlabel#intersectionOf
> http://www.icra.org/test/rdftests/labels11.rdf#label_2
> 
>  2 nude http://www.w3.org/2004/12/q/contentlabel#intersectionOf
> http://www.icra.org/test/rdftests/labels11.rdf#label_2
> 
> (plus a load of empty variables because I declared more than there is data 
> for).

I have updated the demo to use the newer SPARQL syntax, so rewriting your query:

PREFIX label: <http://www.w3.org/2004/12/q/contentlabel#>
SELECT ?uri ?action ?label ?classification ?management ?frequent ?several
?occasional ?single
WHERE {
  ?x rdf:type label:Ruleset .
  ?x label:rules ?o1 .
  ?o1 rdf:first ?node .
  OPTIONAL { ?node label:hasURI ?uri } .
  OPTIONAL { ?node rdf:type ?action } .
  OPTIONAL { ?node label:hasLabel ?label }
}

tested at:

http://librdf.org/query?uri=http%3A%2F%2Fwww.icra.org%2Ftest%2Frdftests%2Flabels11.rdf&query=PREFIX+label%3A+%3Chttp%3A%2F%2Fwww.w3.org%2F2004%2F12%2Fq%2Fcontentlabel%23%3E%0D%0ASELECT+%3Furi+%3Faction+%3Flabel+%3Fclassification+%3Fmanagement+%3Ffrequent+%3Fseveral%0D%0A%3Foccasional+%3Fsingle%0D%0AWHERE+%7B%0D%0A++%3Fx+rdf%3Atype+label%3ARuleset+.%0D%0A++%3Fx+label%3Arules+%3Fo1+.%0D%0A++%3Fo1+rdf%3Afirst+%3Fnode+.%0D%0A++OPTIONAL+%7B+%3Fnode+label%3AhasURI+%3Furi+%7D+.%0D%0A++OPTIONAL+%7B+%3Fnode+rdf%3Atype+%3Faction+%7D+.%0D%0A++OPTIONAL+%7B+%3Fnode+label%3AhasLabel+%3Flabel+%7D%0D%0A%7D%0D%0A&language=sparql
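A link like the one above can be built programmatically. This is a minimal sketch, assuming only what is visible in the link itself: the endpoint http://librdf.org/query and the three parameters uri, query and language. The helper name demo_url and the short sample query are made up for illustration.

```python
from urllib.parse import urlencode

# Endpoint and parameter names are read off the demo link in this message;
# nothing else about the demo service is assumed.
DEMO_ENDPOINT = "http://librdf.org/query"


def demo_url(data_uri, query, language="sparql"):
    """Return a demo link for the given data source and query text (hypothetical helper)."""
    params = [("uri", data_uri), ("query", query), ("language", language)]
    # urlencode percent-encodes reserved characters and turns spaces into '+'.
    return DEMO_ENDPOINT + "?" + urlencode(params)


url = demo_url(
    "http://www.icra.org/test/rdftests/labels11.rdf",
    "SELECT ?s WHERE { ?s ?p ?o }",
)
print(url)
```

The same call with the full query text should give a link of the same shape as the one above, apart from how line endings in the query are encoded.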

> 
> The full query is this:
> 
> PREFIX label: <http://www.w3.org/2004/12/q/contentlabel#>
> SELECT ?uri ?action ?label ?classification ?management ?frequent ?several 
> ?occasional ?single
> WHERE
>   (?x rdf:type label:Ruleset)
>   (?x label:rules ?o1)
>   (?o1 rdf:first ?node)
>   OPTIONAL (?node label:hasURI ?uri)
>   OPTIONAL (?node rdf:type ?action)
>   OPTIONAL (?node label:hasLabel ?label)
>   OPTIONAL (?node label:hasManagementInfo ?management)
>   OPTIONAL (?node label:hasClassification ?classification)
>   OPTIONAL (?node label:hasFrequentScenes ?frequent)
>   OPTIONAL (?node label:hasSeveralScenes ?several)
>   OPTIONAL (?node label:hasOccasionalScenes ?occasional)
>   OPTIONAL (?node label:hasSingleScene ?single)
> 
> If I run this, I get a single result of
> 
> 1 nude http://www.w3.org/2004/12/q/contentlabel#intersectionOf
> http://www.icra.org/test/rdftests/labels11.rdf#label_2

In the newer SPARQL syntax:

PREFIX label: <http://www.w3.org/2004/12/q/contentlabel#>
SELECT ?uri ?action ?label ?classification ?management ?frequent ?several
?occasional ?single
WHERE {
  ?x rdf:type label:Ruleset .
  ?x label:rules ?o1 .
  ?o1 rdf:first ?node .
  OPTIONAL { ?node label:hasURI ?uri } .
  OPTIONAL { ?node rdf:type ?action } .
  OPTIONAL { ?node label:hasLabel ?label } .
  OPTIONAL { ?node label:hasManagementInfo ?management } .
  OPTIONAL { ?node label:hasClassification ?classification } .
  OPTIONAL { ?node label:hasFrequentScenes ?frequent } .
  OPTIONAL { ?node label:hasSeveralScenes ?several } .
  OPTIONAL { ?node label:hasOccasionalScenes ?occasional } .
  OPTIONAL { ?node label:hasSingleScene ?single }
}

http://librdf.org/query?uri=http%3A%2F%2Fwww.icra.org%2Ftest%2Frdftests%2Flabels11.rdf&query=PREFIX+label%3A+%3Chttp%3A%2F%2Fwww.w3.org%2F2004%2F12%2Fq%2Fcontentlabel%23%3E%0D%0ASELECT+%3Furi+%3Faction+%3Flabel+%3Fclassification+%3Fmanagement+%3Ffrequent+%3Fseveral%0D%0A%3Foccasional+%3Fsingle%0D%0AWHERE+%7B%0D%0A++%3Fx+rdf%3Atype+label%3ARuleset+.%0D%0A++%3Fx+label%3Arules+%3Fo1+.%0D%0A++%3Fo1+rdf%3Afirst+%3Fnode+.%0D%0A++OPTIONAL+%7B+%3Fnode+label%3AhasURI+%3Furi+%7D+.%0D%0A++OPTIONAL+%7B+%3Fnode+rdf%3Atype+%3Faction+%7D+.%0D%0A++OPTIONAL+%7B+%3Fnode+label%3AhasLabel+%3Flabel+%7D+.%0D%0A++OPTIONAL+%7B+%3Fnode+label%3AhasManagementInfo+%3Fmanagement+%7D+.%0D%0A++OPTIONAL+%7B+%3Fnode+label%3AhasClassification+%3Fclassification+%7D+.%0D%0A++OPTIONAL+%7B+%3Fnode+label%3AhasFrequentScenes+%3Ffrequent+%7D+.%0D%0A++OPTIONAL+%7B+%3Fnode+label%3AhasSeveralScenes+%3Fseveral+%7D+.%0D%0A++OPTIONAL+%7B+%3Fnode+label%3AhasOccasionalScenes+%3Foccasional+%7D+.%0D%0A++OPTIONAL+%7B+%3Fnode+label%3AhasSingleScene+%3Fsingle+%7D+%0D%0A%0D%0A%7D%0D%0A&language=sparql

> Hmmm...  OK, so I play around and I find that I can remove _any_ of the 
> OPTIONAL lines after (?node label:hasLabel ?label) and get the expected 
> result (i.e. both of the bindings that are present).
> 
> At first I thought it might be because the demo front end on librdf was just 
> that, a demo, but, we've now built Redland into a little application that 
> downloads the RDF instance and runs the query locally. We get a similar 
> problem there (including error messages about running out of memory) but the 
> essential point remains - more than a certain number of optional items in 
> the query produces incomplete results.

Odd that it runs out of memory.

> Is there a limit on the number of OPTIONAL items that can be in a query?

No.

> If so, is this something that is in the SPARQL spec (I can't find it) or in 
> Redland? If it's in Redland is it deliberate or a bug?

It's a bug.

I've always been a bit unhappy with the way optionals are executed and,
more than that, a bit suspicious that it isn't right.  I hope that will
change soon, as the engine needs an update in that area to deal with
group graph patterns (group/and, union), but at present I'm
concentrating on getting Rasqal working again with the newer SPARQL
syntax.
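To show why the number of OPTIONALs should not matter, here is a toy model that treats each OPTIONAL as a left join over the solutions found so far. It is only a sketch: it is not how Rasqal executes queries, and the triples, the node name n1 and the function names are made up for illustration.

```python
# Toy model: each OPTIONAL is a left join over the current solution rows.
# Not Rasqal's implementation; data and names are hypothetical.

def optional(solutions, triples, subject_var, predicate, object_var):
    """Extend each row with matches for (subject predicate ?object_var).

    A row with no match is kept unchanged, so an OPTIONAL can add
    bindings or multiply rows but can never drop a solution.
    """
    out = []
    for row in solutions:
        matches = [o for (s, p, o) in triples
                   if s == row[subject_var] and p == predicate]
        if matches:
            out.extend({**row, object_var: o} for o in matches)
        else:
            out.append(dict(row))  # variable simply stays unbound
    return out


# Made-up triples, loosely shaped like the results quoted in this thread.
triples = [
    ("n1", "hasURI", "\\.jpg$"),
    ("n1", "hasURI", "nude"),
    ("n1", "hasLabel", "label_2"),
]


def run(predicates):
    """Apply one OPTIONAL per predicate, binding a variable of the same name."""
    rows = [{"node": "n1"}]
    for pred in predicates:
        rows = optional(rows, triples, "node", pred, pred)
    return rows


short = run(["hasURI", "hasLabel"])
full = run(["hasURI", "hasLabel", "hasManagementInfo",
            "hasClassification", "hasSingleScene"])
print(len(short), len(full))
```

Under this model the longer query returns exactly the rows of the shorter one, with the extra variables left unbound, which is the behaviour Phil expected.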

Dave

