[redland-dev] [Raptor RDF Syntax Library 0000593]: RDF/XML and RDF/XML-abbrev serialisers produce inconsistent output (and I think the latter is wrong)
Mantis Bug Tracker
mantis-bug-sender at librdf.org
Mon Feb 2 10:58:56 EST 2015
The following issue has been SUBMITTED.
======================================================================
http://bugs.librdf.org/mantis/view.php?id=593
======================================================================
Reported By: normang
Assigned To:
======================================================================
Project: Raptor RDF Syntax Library
Issue ID: 593
Category: api
Reproducibility: always
Severity: major
Priority: normal
Status: new
Syntax Name:
======================================================================
Date Submitted: 2015-02-02 07:58
Last Modified: 2015-02-02 07:58
======================================================================
Summary: RDF/XML and RDF/XML-abbrev serialisers produce
inconsistent output (and I think the latter is wrong)
Description:
Consider the following:
% cat xml.ttl
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
@prefix test: <urn:foo#>.
test:foo
test:prop "name";
test:value """<greeting>Hello, world!</greeting>"""^^rdf:XMLLiteral.
Converting this to RDF/XML produces:
% rapper -iturtle -ordfxml xml.ttl
rapper: Parsing URI file:///checkouts/me/clouds/latex-and-xmp/xml.ttl with
parser turtle
rapper: Serializing with serializer rdfxml
<?xml version="1.0" encoding="utf-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:test="urn:foo#">
<rdf:Description rdf:about="urn:foo#foo">
<test:prop>name</test:prop>
</rdf:Description>
<rdf:Description rdf:about="urn:foo#foo">
<test:value rdf:parseType="Literal"><greeting>Hello,
world!</greeting></test:value>
</rdf:Description>
</rdf:RDF>
rapper: Parsing returned 2 triples
But converting it to RDF/XML with abbreviations produces:
% rapper -iturtle -ordfxml-abbrev xml.ttl
rapper: Parsing URI file:///checkouts/me/clouds/latex-and-xmp/xml.ttl with
parser turtle
rapper: Serializing with serializer rdfxml-abbrev
<?xml version="1.0" encoding="utf-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:test="urn:foo#">
<rdf:Description rdf:about="urn:foo#foo">
<test:prop>name</test:prop>
<test:value
rdf:datatype="http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral">&lt;greeting&gt;Hello,
world!&lt;/greeting&gt;</test:value>
</rdf:Description>
</rdf:RDF>
rapper: Parsing returned 2 triples
%
Notice that in the first case, the XML literal appears as an
rdf:parseType='Literal' element containing XML elements, but in the second it
appears as a string, with the angle brackets escaped.
Unfortunately, I'm not _convinced_ I know what should be happening here, but
this _does_ appear to be at least inconsistent.
The -ordfxml output appears to be correct (comparing it with
http://www.w3.org/TR/rdf-syntax-grammar/#section-Syntax-XML-literals), and the
text of section 7.2.17 appears to indicate that there is no need for any
rdf:XMLLiteral typing.
The -ordfxml-abbrev output, however, is I think wrong, because it's _different_
XML from what appears in the -ordfxml case: in the rdfxml case, the result is a
single 'greeting' element, in the rdfxml-abbrev case, the result is a sequence
of characters, which includes (escaped) '<' and '>' characters. This does not
strictly contradict http://www.w3.org/TR/rdf11-concepts/#section-XMLLiteral
because a sequence of characters _is_ 'XML content', but it is not, I think, the
same XML content (ie, an element) that is indicated by """<greeting>Hello,
world!</greeting>"""^^rdf:XMLLiteral
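
The XML-level difference described above can be checked with a short Python sketch. It uses only the standard library's xml.etree.ElementTree. The two fragments are cut down from the outputs shown earlier (the namespace declarations are moved onto the property element, and the rdfxml-abbrev content is written with the angle brackets escaped, as the description says); the helper name `describe` is made up for illustration and is not part of Raptor.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Fragment modelled on the -ordfxml output: the literal is real XML markup.
PARSETYPE_FORM = (
    '<test:value xmlns:test="urn:foo#" xmlns:rdf="%s" '
    'rdf:parseType="Literal"><greeting>Hello, world!</greeting></test:value>'
    % RDF_NS
)

# Fragment modelled on the -ordfxml-abbrev output: angle brackets are escaped.
DATATYPE_FORM = (
    '<test:value xmlns:test="urn:foo#" xmlns:rdf="%s" '
    'rdf:datatype="%sXMLLiteral">'
    '&lt;greeting&gt;Hello, world!&lt;/greeting&gt;</test:value>'
    % (RDF_NS, RDF_NS)
)


def describe(fragment):
    """Return (child element tags, leading character data) of the property element."""
    element = ET.fromstring(fragment)
    return [child.tag for child in element], element.text


parsetype_children, parsetype_text = describe(PARSETYPE_FORM)
datatype_children, datatype_text = describe(DATATYPE_FORM)

# First form: one child element and no character data before it.
print(parsetype_children, repr(parsetype_text))
# Second form: no child elements, only character data.
print(datatype_children, repr(datatype_text))
```

This only shows what a generic XML parser sees in each property element. It says nothing about which form the RDF specifications require.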
Raptor is partly consistent here, because the -ordfxml and -ordfxml-abbrev
outputs (and -ordfxml-xmp) produce the same -oturtle output. However, the
-ordfxml and -ordfxml-abbrev outputs are inconsistent with each other. That is,
they can't both be right, and I think it's the rdfxml-abbrev output which is
wrong.
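
The round-trip comparison mentioned above could be scripted roughly as follows. This is a sketch under assumptions: it expects a `rapper` binary on the PATH, it assumes the RDF/XML parser is selected with `-irdfxml` (the report itself only shows `-iturtle`), and the function names and the intermediate file path are hypothetical.

```python
import shutil
import subprocess


def roundtrip_commands(turtle_path, serializer, rdfxml_path):
    """Build the two rapper invocations for Turtle -> RDF/XML -> Turtle."""
    to_rdfxml = ["rapper", "-iturtle", "-o" + serializer, turtle_path]
    # Assumption: "rdfxml" is the parser name for reading the result back.
    back_to_turtle = ["rapper", "-irdfxml", "-oturtle", rdfxml_path]
    return to_rdfxml, back_to_turtle


def roundtrip(turtle_path, serializer, rdfxml_path):
    """Run the round trip and return the final Turtle text."""
    if shutil.which("rapper") is None:
        raise RuntimeError("rapper not found on PATH")
    to_rdfxml, back_to_turtle = roundtrip_commands(
        turtle_path, serializer, rdfxml_path
    )
    # Capture stdout only, so the serialised document can be saved to a file.
    first = subprocess.run(to_rdfxml, check=True, capture_output=True, text=True)
    with open(rdfxml_path, "w", encoding="utf-8") as handle:
        handle.write(first.stdout)
    second = subprocess.run(
        back_to_turtle, check=True, capture_output=True, text=True
    )
    return second.stdout
```

Comparing `roundtrip("xml.ttl", "rdfxml", "a.rdf")` with `roundtrip("xml.ttl", "rdfxml-abbrev", "b.rdf")` would then show whether the two serialisations read back as the same Turtle.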
It's true that the -ordfxml output uses rdf:parseType='Literal', whereas the
-ordfxml-abbrev output uses
rdf:datatype="http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral". I _think_
these are equivalent to each other (the Note in
http://www.w3.org/TR/rdf11-concepts/#section-XMLLiteral isn't completely clear);
if so, the content should be the same.
This last point implies (again, I think) that raptor's parsing of the
-ordfxml-abbrev output is incorrect. Referring again to
<http://www.w3.org/TR/rdf11-concepts/#section-XMLLiteral>, the content
"<greeting..." is indeed in the lexical space of the rdf:XMLLiteral datatype
(good), but the lexical-to-value mapping is (the normalized version of) "a DOM
DocumentFragment node [DOM4] corresponding to the input string". The problem is
that that 'corresponding' is slightly vague, but the only thing I think it can
mean is the result of doing an XML parse on the input string (namely
"<greeting...") which produces a series of character nodes, not an element.
======================================================================
Issue History
Date Modified Username Field Change
======================================================================
2015-02-02 07:58 normang New Issue
======================================================================