[redland-dev] [Raptor RDF Syntax Library 0000593]: RDF/XML and RDF/XML-abbrev serialisers produce inconsistent output (and I think the latter is wrong)

Mantis Bug Tracker mantis-bug-sender at librdf.org
Mon Feb 2 10:58:56 EST 2015


The following issue has been SUBMITTED. 
====================================================================== 
http://bugs.librdf.org/mantis/view.php?id=593 
====================================================================== 
Reported By:                normang
Assigned To:                
====================================================================== 
Project:                    Raptor RDF Syntax Library
Issue ID:                   593
Category:                   api
Reproducibility:            always
Severity:                   major
Priority:                   normal
Status:                     new
Syntax Name:                 
====================================================================== 
Date Submitted:             2015-02-02 07:58
Last Modified:              2015-02-02 07:58
====================================================================== 
Summary:                    RDF/XML and RDF/XML-abbrev serialisers produce
inconsistent output (and I think the latter is wrong)
Description: 
Consider the following:

% cat xml.ttl
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
@prefix test: <urn:foo#>.

test:foo
    test:prop "name";
    test:value """<greeting>Hello, world!</greeting>"""^^rdf:XMLLiteral.

Converting this to RDF/XML produces:

% rapper -iturtle -ordfxml xml.ttl
rapper: Parsing URI file:///checkouts/me/clouds/latex-and-xmp/xml.ttl with parser turtle
rapper: Serializing with serializer rdfxml
<?xml version="1.0" encoding="utf-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:test="urn:foo#">
  <rdf:Description rdf:about="urn:foo#foo">
    <test:prop>name</test:prop>
  </rdf:Description>
  <rdf:Description rdf:about="urn:foo#foo">
    <test:value rdf:parseType="Literal"><greeting>Hello, world!</greeting></test:value>
  </rdf:Description>
</rdf:RDF>
rapper: Parsing returned 2 triples

But converting it to RDF/XML with abbreviations produces:

% rapper -iturtle -ordfxml-abbrev xml.ttl
rapper: Parsing URI file:///checkouts/me/clouds/latex-and-xmp/xml.ttl with parser turtle
rapper: Serializing with serializer rdfxml-abbrev
<?xml version="1.0" encoding="utf-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
   xmlns:test="urn:foo#">
  <rdf:Description rdf:about="urn:foo#foo">
    <test:prop>name</test:prop>
    <test:value rdf:datatype="http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral">&lt;greeting&gt;Hello, world!&lt;/greeting&gt;</test:value>
  </rdf:Description>
</rdf:RDF>
rapper: Parsing returned 2 triples
% 

Notice that in the first case, the XML literal appears as an
rdf:parseType='Literal' element containing XML elements, but in the second it
appears as a string, with the angle brackets escaped.
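
The difference can be made concrete by looking at what a plain XML parser sees
inside test:value in each case.  The following is only a sketch using Python's
standard xml.dom.minidom, not Raptor; the two documents are trimmed copies of
the outputs above, and the helper name value_children is an assumption of this
example.

```python
from xml.dom import minidom

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Trimmed copy of the -ordfxml output: the literal is element content.
RDFXML = f"""<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:test="urn:foo#">
  <rdf:Description rdf:about="urn:foo#foo">
    <test:value rdf:parseType="Literal"><greeting>Hello, world!</greeting></test:value>
  </rdf:Description>
</rdf:RDF>"""

# Trimmed copy of the -ordfxml-abbrev output: the literal is escaped text.
RDFXML_ABBREV = f"""<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:test="urn:foo#">
  <rdf:Description rdf:about="urn:foo#foo">
    <test:value rdf:datatype="{RDF_NS}XMLLiteral">&lt;greeting&gt;Hello, world!&lt;/greeting&gt;</test:value>
  </rdf:Description>
</rdf:RDF>"""


def value_children(document):
    """Return (kind, content) for each child node of the test:value element."""
    dom = minidom.parseString(document)
    value = dom.getElementsByTagNameNS("urn:foo#", "value")[0]
    value.normalize()  # merge any adjacent text nodes into one
    result = []
    for child in value.childNodes:
        if child.nodeType == child.ELEMENT_NODE:
            result.append(("element", child.tagName))
        elif child.nodeType == child.TEXT_NODE:
            result.append(("text", child.data))
    return result


print(value_children(RDFXML))
print(value_children(RDFXML_ABBREV))
```

At the XML level, the first document has a 'greeting' element as the only child
of test:value, while the second has a single text node whose data is the
unescaped string.  This says nothing by itself about which RDF literal each
form denotes; it only shows that the XML differs.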

Unfortunately, I'm not _convinced_ I know what should be happening here, but
this _does_ appear to be at least inconsistent.

The -ordfxml output appears to be correct (comparing it with
http://www.w3.org/TR/rdf-syntax-grammar/#section-Syntax-XML-literals), and the
text of section 7.2.17 appears to indicate that there is no need for any
rdf:XMLLiteral typing.

The -ordfxml-abbrev output, however, is I think wrong, because it is _different_
XML from what appears in the -ordfxml case: in the rdfxml case the result is a
single 'greeting' element, whereas in the rdfxml-abbrev case the result is a
sequence of characters which includes (escaped) '<' and '>' characters.  This
does not strictly contradict
http://www.w3.org/TR/rdf11-concepts/#section-XMLLiteral, because a sequence of
characters _is_ 'XML content', but it is not, I think, the same XML content
(i.e., an element) that is indicated by
"""<greeting>Hello, world!</greeting>"""^^rdf:XMLLiteral.

Raptor is partly consistent here, because the -ordfxml and -ordfxml-abbrev
outputs (and -ordfxml-xmp) produce the same Turtle when they are parsed back in
and serialised with -oturtle.  However, the -ordfxml and -ordfxml-abbrev outputs
are inconsistent with each other.  That is, they can't both be right, and I
think it's the rdfxml-abbrev output which is wrong.

It's true that the -ordfxml output uses rdf:parseType='Literal', whereas the
-ordfxml-abbrev output uses
rdf:datatype="http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral".  I _think_
these are equivalent to each other (the Note in
http://www.w3.org/TR/rdf11-concepts/#section-XMLLiteral isn't completely clear);
if so, the content should be the same.
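
Whether the content is "the same" depends on which string is taken as the
literal's lexical form.  As a rough illustration only, the sketch below parses
both candidate strings as XML content with Python's xml.dom.minidom and reports
the kinds of top-level nodes that result.  The <wrapper> root element and the
function name fragment_node_kinds are assumptions of this example; they come
from neither the RDF specifications nor Raptor.

```python
from xml.dom import minidom


def fragment_node_kinds(lexical_form):
    """Parse a string as XML content and list the kinds of top-level nodes."""
    # minidom needs a single root element, so the string is wrapped in a
    # hypothetical <wrapper> element purely for parsing convenience.
    dom = minidom.parseString(f"<wrapper>{lexical_form}</wrapper>")
    root = dom.documentElement
    root.normalize()  # merge any adjacent text nodes into one
    kinds = []
    for child in root.childNodes:
        if child.nodeType == child.ELEMENT_NODE:
            kinds.append("element")
        elif child.nodeType == child.TEXT_NODE:
            kinds.append("text")
        else:
            kinds.append("other")
    return kinds


# The string as written in the Turtle source.
unescaped = "<greeting>Hello, world!</greeting>"
# The same characters as they appear, escaped, in the rdfxml-abbrev output.
escaped = "&lt;greeting&gt;Hello, world!&lt;/greeting&gt;"

print(fragment_node_kinds(unescaped))
print(fragment_node_kinds(escaped))
```

Parsing the unescaped string gives one element, and parsing the escaped string
gives one text node.  So the question comes down to which of the two strings
should be treated as the input to the lexical-to-value mapping.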

This last point implies (again, I think) that raptor's parsing of the
-ordfxml-abbrev output is incorrect.  Referring again to
<http://www.w3.org/TR/rdf11-concepts/#section-XMLLiteral>, the content
"&lt;greeting..." is indeed in the lexical space of the rdf:XMLLiteral datatype
(good), but the lexical-to-value mapping is (the normalized version of) "a DOM
DocumentFragment node [DOM4] corresponding to the input string".  The problem is
that 'corresponding' is slightly vague, but the only thing I think it can mean
is the result of doing an XML parse on the input string (namely
"&lt;greeting..."), which produces a series of character nodes, not an element.

====================================================================== 

Issue History 
Date Modified    Username       Field                    Change               
====================================================================== 
2015-02-02 07:58 normang        New Issue                                    
======================================================================


