[redland-dev] [Raptor RDF Syntax Library 0000495]: RDFa parser produces unexpected results with CDATA sections and entity references
Mantis Bug Tracker
mantis-bug-sender at librdf.org
Sun Feb 19 16:21:58 EST 2012
The following issue has been SUBMITTED.
======================================================================
http://bugs.librdf.org/mantis/view.php?id=495
======================================================================
Reported By: normang
Assigned To:
======================================================================
Project: Raptor RDF Syntax Library
Issue ID: 495
Category: api
Reproducibility: always
Severity: major
Priority: normal
Status: new
Syntax Name: RDFa & Turtle
======================================================================
Date Submitted: 2012-02-19 21:21
Last Modified: 2012-02-19 21:21
======================================================================
Summary: RDFa parser produces unexpected results with CDATA
sections and entity references
Description:
Consider the examples below.
The results for tests content1, 2, 4 and 5 are, I think, wrong.
For content1, 2, 4 and 5, the CDATA marked section is simply omitted. Although
http://www.w3.org/TR/rdfa-syntax/ doesn't mention CDATA marked sections, there's
nothing there that seems to warrant ignoring them.
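
For comparison, here is a minimal Python sketch of what a generic XML parser does with such content. It is only an illustration, not Raptor's code path: it uses the standard library's xml.etree.ElementTree on a cut-down version of the content4 element, written here with the trailing characters as entity references so that the snippet is well-formed. The parser treats the CDATA section as ordinary character data rather than dropping it.

```python
import xml.etree.ElementTree as ET

# Cut-down version of the content4 paragraph (namespaces and
# attributes omitted for brevity; this is an illustrative input).
doc = "<p>content4: <![CDATA[cdata<>&]]> <span>not</span>&amp;&lt;&gt;</p>"

root = ET.fromstring(doc)

# Concatenate all character data in document order, which is roughly
# what one would expect a plain literal to be built from.
text = "".join(root.itertext())

print(text)  # content4: cdata<>& not&<>
```

The CDATA text survives in the concatenated string, which is what I would have expected to see in the plain literal for ns:content4.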
Tests content1, 2 and 5 produce XMLLiteral data which includes both elements and
entities. However, in each of the three cases, the Turtle output has the
characters denoted by the entities (the &<>) appearing literally in the
rdf:XMLLiteral, which makes it invalid XML; i.e., they're not escaped in any way. I
can't find anything in either http://www.w3.org/TR/REC-rdf-syntax/ (which I
suppose is the definition of rdf:XMLLiteral) or
http://www.w3.org/TeamSubmission/turtle/ that spells out what the content of an
rdf:XMLLiteral should be, but I would be surprised if invalid XML were allowed.
I don't know whether this is an RDFa parsing error or a Turtle serialisation
error.
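
One way to check the well-formedness claim is sketched below in Python. The helper name is_well_formed_fragment and the wrapper element are my own illustrative choices, and the two strings are shortened versions of the ns:content1 literal (one as emitted, one with the characters escaped). The idea is simply to wrap the literal in a dummy root element and see whether a standard XML parser accepts it.

```python
import xml.etree.ElementTree as ET


def is_well_formed_fragment(fragment: str) -> bool:
    """Return True if the fragment parses as XML element content.

    The fragment is wrapped in a dummy root element (a hypothetical
    name, used only so that mixed text/element content can be parsed).
    """
    try:
        ET.fromstring("<wrapper>" + fragment + "</wrapper>")
        return True
    except ET.ParseError:
        return False


# Shortened form of the literal as it appears in the Turtle output.
as_emitted = 'content1: <span xmlns="http://www.w3.org/1999/xhtml">not</span>&<>'

# The same literal with the three characters escaped.
escaped = 'content1: <span xmlns="http://www.w3.org/1999/xhtml">not</span>&amp;&lt;&gt;'

print(is_well_formed_fragment(as_emitted))  # False
print(is_well_formed_fragment(escaped))     # True
```

This only tests XML well-formedness of the lexical form; it says nothing about which component (the RDFa parser or the Turtle serialiser) ought to be doing the escaping.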
Steps to Reproduce:
% cat /tmp/try.xml
<?xml version='1.0' encoding='utf-8'?>
<!DOCTYPE html PUBLIC '-//W3C//DTD XHTML+RDFa 1.0//EN'
'http://www.w3.org/MarkUp/DTD/xhtml-rdfa-1.dtd'>
<html xmlns='http://www.w3.org/1999/xhtml' xmlns:ns='urn:ns#'
xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'>
<head>
<title property='ns:title'>T</title>
<meta about='' property='ns:abstract' content='Abstract &lt;&gt;&amp;%' />
</head>
<body>
<!-- for cases below, see http://www.w3.org/TR/rdfa-syntax/ Sect. 6.3.1.3 -->
<!-- explicit XMLLiteral @datatype -->
<p property='ns:content1'
datatype='rdf:XMLLiteral'
>content1: <![CDATA[cdata<>&]]> <span>not</span>&amp;&lt;&gt;</p>
<!-- no @datatype, presence of elements implies it -->
<p property='ns:content2'
>content2: <![CDATA[cdata<>&]]> <span>not</span>&amp;&lt;&gt;</p>
<!-- no @datatype, but no XML elements, so plain literal -->
<p property='ns:content3'
>content3: plain content</p>
<!-- explicit empty @datatype, so interpreted as a plain literal -->
<p property='ns:content4'
datatype=''
>content4: <![CDATA[cdata<>&]]> <span>not</span>&amp;&lt;&gt;</p>
<!-- basically same as content2 above -->
<div property='ns:content5'
><p>content5: <![CDATA[cdata<>&]]>
<span>not</span>&amp;&lt;&gt;</p></div>
</body></html>
% rapper -irdfa -oturtle /tmp/try.xml
rapper: Parsing URI file:///tmp/try.xml with parser rdfa
rapper: Serializing with serializer turtle
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix : <http://www.w3.org/1999/xhtml> .
@prefix ns: <urn:ns#> .
<file:///tmp/try.xml>
ns:abstract "Abstract <>&%" ;
ns:content1 "content1: <span xmlns=\"http://www.w3.org/1999/xhtml\"
xmlns:ns=\"urn:ns#\"
xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\">not</span>&<>"^^rdf:XMLLiteral
;
ns:content2 "content2: <span xmlns=\"http://www.w3.org/1999/xhtml\"
xmlns:ns=\"urn:ns#\"
xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\">not</span>&<>"^^rdf:XMLLiteral
;
ns:content3 "content3: plain content" ;
ns:content4 "content4: not&<>" ;
ns:content5 "<p xmlns=\"http://www.w3.org/1999/xhtml\" xmlns:ns=\"urn:ns#\"
xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\">content5:
<span>not</span>&<></p>"^^rdf:XMLLiteral ;
ns:title "T" .
rapper: Parsing returned 7 triples
% rapper --version
2.0.4
%
======================================================================
Issue History
Date Modified Username Field Change
======================================================================
2012-02-19 21:21 normang New Issue
======================================================================