[redland-dev] Contexts and Querying
Christopher Schmidt
crschmidt at crschmidt.net
Sun Mar 13 14:38:11 GMT 2005
Hi all,
I'm looking at using contexts in a Redland model again, and I want to
outline what I see as the current problem for me, and find out whether
I'm misunderstanding, or if this is just the way it works.
When I add Statements to the Model in a context, I get one statement in
the model for every context that statement is part of. However, this has
the negative effect of creating "duplicates" of statements. For example,
if I have:
http://crschmidt.net/foaf.rdf#crschmidt foaf:nick "crschmidt" . (context test1)
http://crschmidt.net/foaf.rdf#crschmidt foaf:nick "crschmidt" . (context test2)
then when I run a query such as:
SELECT ?b ?c where (?a foaf:nick "crschmidt") (?a ?b ?c)
it takes twice as long to return the results, because it
basically has to fetch all the nodes attached to the first and the
second crschmidt. Having loaded http://crschmidt.net/foaf.rdf into the
model twice (total of ~1590 statements), the following query:
q = RDF.Query('SELECT ?c where (?a <http://xmlns.com/foaf/0.1/nick> '
              '"crschmidt") (?a <http://xmlns.com/foaf/0.1/nick> ?c)')
gives me four results, when in reality it should only give me two.
I'm assuming that it gives me the results in each context, but it seems
that it returns a result for every pairing of contexts across the two
patterns.
q = RDF.Query('SELECT ?c where '
              '(<http://crschmidt.net/foaf.rdf#crschmidt> '
              '<http://xmlns.com/foaf/0.1/nick> ?c)')
This one does what I'd expect, and only returns two results, one for each
context.
Although this is not a huge deal in relatively small datasets, when I
start working with megatriple stores, it becomes a bigger problem,
because I have to deal with a result set that is much larger than it
should be.
I guess, looking back up, what happens is that it finds ?a in two contexts,
so it has foaf.rdf#crschmidt (context 1) and foaf.rdf#crschmidt (context
2). It then runs the second pattern against each of those, returning every
pairing of contexts: (1,1), (1,2), (2,1) and (2,2).
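That guess can be sketched in plain Python. This is only a toy model and
does not use Redland at all: the quad tuples, the match() helper and the
naive_join() function are made up for illustration, and they say nothing
about how Redland's storage or query engine actually works.

```python
# Toy model only: NOT Redland's storage or query engine.
# Each entry is (subject, predicate, object, context).
NICK = "http://xmlns.com/foaf/0.1/nick"
ME = "http://crschmidt.net/foaf.rdf#crschmidt"

quads = [
    (ME, NICK, "crschmidt", "test1"),
    (ME, NICK, "crschmidt", "test2"),
]


def match(store, s=None, p=None, o=None):
    """Yield every stored quad matching the pattern; None is a wildcard."""
    for qs, qp, qo, qc in store:
        if s in (None, qs) and p in (None, qp) and o in (None, qo):
            yield (qs, qp, qo, qc)


def naive_join(store):
    """(?a foaf:nick "crschmidt") (?a foaf:nick ?c), ignoring contexts."""
    results = []
    # First pattern: one binding of ?a per stored copy of the statement.
    for a, _, _, _ in match(store, p=NICK, o="crschmidt"):
        # Second pattern: again one hit per stored copy.
        for _, _, c, _ in match(store, s=a, p=NICK):
            results.append(c)
    return results


single = list(match(quads, s=ME, p=NICK))  # the single-pattern query
rows = naive_join(quads)                   # the two-pattern query
distinct_rows = sorted(set(rows))          # collapse identical bindings

print(len(single), len(rows), len(distinct_rows))
```

In this toy model the single-pattern query sees one hit per context, while
the two-pattern query sees one row per pair of contexts, so a statement
held in n contexts would contribute n * n rows.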
Am I missing something? Is this buggy behavior, or just something I
should expect? It just seems that this makes the queries take a lot
longer - especially once you start dealing with a lot of different
contexts.
To be honest, I'm not sure I fully understand how contexts work. What
I'd like to be able to do is have one Statement, and have attached to it
a list of different contexts that the Statement exists in. Currently, it
seems as if each Statement is repeated a number of times equal to
the number of contexts it exists in.
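As a rough sketch of the layout I have in mind (plain Python, not a Redland
API; the ContextIndex class and its method names are hypothetical), each
statement would be stored once and mapped to the set of contexts it
appears in:

```python
from collections import defaultdict


class ContextIndex:
    """Hypothetical store: one copy of each triple, plus its contexts."""

    def __init__(self):
        # triple -> set of context names the triple was asserted in
        self._contexts = defaultdict(set)

    def add(self, triple, context):
        """Record that the triple exists in the given context."""
        self._contexts[triple].add(context)

    def triples(self):
        """Return each distinct triple exactly once."""
        return list(self._contexts)

    def contexts_of(self, triple):
        """Return a copy of the set of contexts holding the triple."""
        return set(self._contexts.get(triple, ()))


index = ContextIndex()
nick = ("http://crschmidt.net/foaf.rdf#crschmidt",
        "http://xmlns.com/foaf/0.1/nick",
        "crschmidt")
index.add(nick, "test1")
index.add(nick, "test2")

print(len(index.triples()), sorted(index.contexts_of(nick)))
```

With this layout a pattern match would see the statement once, and the
contexts would only be consulted when a query actually asks about them.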
Am I misunderstanding something?
Thanks.
--
Christopher Schmidt