[redland-dev] [Redland librdf RDF API 0000404]: Parsing errors when parsing RDFa

Mantis Bug Tracker mantis-bug-sender at librdf.org
Thu Dec 2 12:00:00 CET 2010


The following issue has been SUBMITTED. 
====================================================================== 
http://bugs.librdf.org/mantis/view.php?id=404 
====================================================================== 
Reported By:                normang
Assigned To:                
====================================================================== 
Project:                    Redland librdf RDF API
Issue ID:                   404
Category:                   api
Reproducibility:            always
Severity:                   major
Priority:                   normal
Status:                     new
====================================================================== 
Date Submitted:             2010-12-02 10:59
Last Modified:              2010-12-02 10:59
====================================================================== 
Summary:                    Parsing errors when parsing RDFa
Description: 
Consider the attached file and its output, included below, for a build with
librdf 1.0.12 and raptor 1.9.1.

I believe this output is incorrect in the following ways:

* Each of the RDFa parses is done with the same base URI given to
librdf_parser_parse_string_into_model, namely <urn:base1#>, and tests 0 and 1
provide a <base> element with <urn:foo#> in the HTML.  The resulting RDF (for
about='') should therefore be about <urn:base1#>, except in tests 0 and 1,
where it should be about <urn:foo#>.  However, tests 1, 2 and 4 are about
<urn:base1>, and test 0 is about <urn:foo>.  That is, the trailing '#' has been
lost, and the <base> element has been ignored in test 1.

* Tests 3 and 4 should be identical, but test 3 is parsed with a garbage base
URI.  It appears that single quotes confuse the XML parser!

* The parsing is done with a new model and (memory) storage each time.  The
output is done with librdf_serializer_serialize_model_to_file_handle, but it
seems to remember the contents of the previous parses, so that test 4, for
example, has the results of all five tests in it.  The display using
librdf_model_as_stream doesn't show this, nor does serialising with the
"ntriples" serialiser, so this appears specific to the "turtle" serialiser.
(Is it perhaps dumping the storage rather than the model?)




librdf version 1.0.12; raptor version 1.9.1
--------
Test 0:
  (<urn:foo> , <http://purl.org/dc/terms/abstract> , "Abstract 1")
@base <urn:base0#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

<urn:foo>
    <http://purl.org/dc/terms/abstract> "Abstract 1" .

--------
Test 1:
  (<urn:base1> , <http://purl.org/dc/terms/abstract> , "Abstract 2")
@base <urn:base1#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

<urn:base1>
    <http://purl.org/dc/terms/abstract> "Abstract 2" .

<urn:foo>
    <http://purl.org/dc/terms/abstract> "Abstract 1" .

--------
Test 2:
  (<urn:base1> , <http://purl.org/dc/terms/abstract> , "Abstract 3")
@base <urn:base2#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

<urn:base1>
    <http://purl.org/dc/terms/abstract> "Abstract 2", "Abstract 3" .

<urn:foo>
    <http://purl.org/dc/terms/abstract> "Abstract 1" .

--------
Test 3:
  (<' /\u003E</head\u003E<body\u003E<h1 property=> ,
<http://purl.org/dc/terms/title> , "doc1")
@base <urn:base3#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

<' /\u003E</head\u003E<body\u003E<h1 property=>
    <http://purl.org/dc/terms/title> "doc1" .

<urn:base1>
    <http://purl.org/dc/terms/abstract> "Abstract 2", "Abstract 3" .

<urn:foo>
    <http://purl.org/dc/terms/abstract> "Abstract 1" .

--------
Test 4:
  (<urn:base1> , <http://purl.org/dc/terms/title> , "doc2")
@base <urn:base4#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

<' /\u003E</head\u003E<body\u003E<h1 property=>
    <http://purl.org/dc/terms/title> "doc1" .

<urn:base1>
    <http://purl.org/dc/terms/abstract> "Abstract 2", "Abstract 3" ;
    <http://purl.org/dc/terms/title> "doc2" .

<urn:foo>
    <http://purl.org/dc/terms/abstract> "Abstract 1" .


Steps to Reproduce: 
See attached test file

% gcc -o redland-test -I$T/librdf-2010-12-02/include{,/raptor2,/rasqal} \
    -L$T/librdf-2010-12-02/lib -lrdf -lraptor2 redland-test.c && ./redland-test


Additional Information: 
Section 3.9 of the RDFa spec <http://www.w3.org/TR/rdfa-syntax/#sec_3.9.> talks
about XHTML fragments, but doesn't mention URI fragments at all, so I don't
believe there's any special case here.
====================================================================== 

Issue History 
Date Modified    Username       Field                    Change               
====================================================================== 
2010-12-02 10:59 normang        New Issue                                    
2010-12-02 10:59 normang        File Added: redland-test.c                    
======================================================================


