[redland-dev] XML special characters

Jason Johnston redland at lojjic.net
Tue Mar 18 19:31:04 GMT 2003


I'm running into a problem when parsing and re-serializing XML Literals, 
where special characters such as ampersands and angle brackets are not 
serialized as entity references as they should be, so the result is 
invalid XML.

See the example below.  The input file and the serialized result are 
identical, except that "&amp; &gt; &lt;" in the input becomes "& > <" 
in the output.
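
For reference, the escaping expected on the character data inside the 
literal can be sketched in Perl as follows.  This is only an 
illustration, not Redland's own code: escape_xml_text is a hypothetical 
helper that is not part of RDF::Redland, and it is assumed to be applied 
to text nodes only, since running it over the whole literal would also 
escape the <doc> markup.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of RDF::Redland): escape &, < and > in
# XML character data.  Escaping > is not always required by XML, but it
# matches the entity references used in test-in.rdf.
sub escape_xml_text {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;   # ampersand first, so the entities added below are not re-escaped
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

# The text content of <doc> as it appears in test-out.rdf:
print escape_xml_text(" & > < "), "\n";
```

Applied to the text content from test-out.rdf, this should give back the 
form found in test-in.rdf.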

Any ideas?  Thanks in advance.
--Jason



== test.pl ==

use RDF::Redland;

my $storage=new RDF::Redland::Storage("hashes", "temp",
                                       "new='yes',hash-type='memory'");
my $model=new RDF::Redland::Model($storage, "");

# Parse it in:
my $parser=new RDF::Redland::Parser("raptor", "application/rdf+xml");
my $uri=new RDF::Redland::URI("file:test-in.rdf");
my $stream=$parser->parse_as_stream($uri,$uri);
while(!$stream->end) {
   $model->add_statement($stream->current);
   $stream->next;
}

# Serialize it out:
my $serializer=new RDF::Redland::Serializer("rdfxml");
$serializer->serialize_model_to_file("test-out.rdf", undef, $model);
$serializer=undef;


== test-in.rdf ==

<?xml version="1.0" encoding="utf-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
   <rdf:Description rdf:about="http://test">
     <ns0:content xmlns:ns0="http://test#" rdf:parseType="Literal">
       <doc> &amp; &gt; &lt; </doc>
     </ns0:content>
   </rdf:Description>
</rdf:RDF>


== test-out.rdf ==

<?xml version="1.0" encoding="utf-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
   <rdf:Description rdf:about="http://test">
     <ns0:content xmlns:ns0="http://test#" rdf:parseType="Literal">
       <doc> & > < </doc>
     </ns0:content>
   </rdf:Description>
</rdf:RDF>



