Unfortunately, W3C frequently chooses not to serve its documents, i.e. there is a<br>W3C policy of rejecting requests when the load is too much for their servers.<br><br>So while this makes sense in theory, in practice loading the DTD can destroy performance.<br>
<br>How W3C can get away with this, I don't really understand, but it is our current reality.<br><br>Benno<br><br><div class="gmail_quote">On Thu, Jan 5, 2012 at 1:06 PM, Richard Smith <span dir="ltr"><<a href="mailto:richard@ex-parrot.com">richard@ex-parrot.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0pt 0pt 0pt 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><br>
There was a thread last June about XML entities in RDF:<br>
<br>
<a href="http://lists.librdf.org/pipermail/redland-dev/2011-June/002306.html" target="_blank">http://lists.librdf.org/pipermail/redland-dev/2011-June/002306.html</a><br>
<br>
The conclusion seemed to be that they were rarely used and not high on the list of things to support.<br>
<br>
However, I've come across a situation in RDFa, rather than RDF, where the use of entities is much more common practice. Here is a complete test case:<br>
<br>
<?xml version="1.0" encoding="ASCII"?><br>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML+RDFa 1.0//EN"<br>
"<a href="http://www.w3.org/MarkUp/DTD/xhtml-rdfa-1.dtd" target="_blank">http://www.w3.org/MarkUp/DTD/xhtml-rdfa-1.dtd</a>"><br>
<html xmlns="<a href="http://www.w3.org/1999/xhtml" target="_blank">http://www.w3.org/1999/xhtml</a>"<br>
version="XHTML+RDFa 1.0" xml:lang="en"><br>
<head><br>
<title>Test</title><br>
</head><br>
<body><br>
<p>This page was written by<br>
<span xmlns:dc="<a href="http://purl.org/dc/elements/1.1/" target="_blank">http://purl.org/dc/elements/1.1/</a>"<br>
property="dc:creator">Jos&<u></u>eacute;</span>.</p><br>
</body><br>
</html><br>
<br>
Other than the declaration of the dc namespace, this is entirely valid XML, and similar examples are found all over the web. However, rapper parses it incorrectly because of the entity in the RDF literal:<br>
<br>
richard@nevis:~$ rapper --version<br>
2.0.6<br>
<br>
richard@nevis:~$ rapper -q -i rdfa test.html<br>
rapper: Error - - XML parser error: Entity 'eacute' not defined<br>
<file:///home/richard/test.html><br>
<<a href="http://purl.org/dc/elements/1.1/creator" target="_blank">http://purl.org/dc/elements/1.1/creator</a>> "Jos"@en .<br>
<br>
Note the missing e-acute in the name.<br>
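<br>
The same failure can be reproduced outside rapper. What follows is only a minimal sketch, assuming Python's standard xml.etree.ElementTree (built on expat) as a stand-in parser; rapper itself uses libxml2, and the names DOC, SPAN and creator are hypothetical, chosen for illustration. It shows that the entity is rejected when the DTD is never read, and that supplying the entity from a local table is enough to recover the literal without any network fetch:<br>

```python
import xml.etree.ElementTree as ET

# Cut-down version of the test case. The external DTD is named in the
# DOCTYPE, but this parser never fetches it.
DOC = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML+RDFa 1.0//EN" '
    '"http://www.w3.org/MarkUp/DTD/xhtml-rdfa-1.dtd">\n'
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    '<body><p><span property="dc:creator">Jos&eacute;</span></p></body>'
    '</html>'
)
SPAN = ".//{http://www.w3.org/1999/xhtml}span"


def creator(doc, entities=None):
    """Return the text of the span, or None if the parser rejects the document."""
    parser = ET.XMLParser()
    # XMLParser.entity maps entity names to their replacement text.
    parser.entity.update(entities or {})
    try:
        root = ET.fromstring(doc, parser=parser)
    except ET.ParseError:
        # Raised for an entity that has no known definition.
        return None
    return root.find(SPAN).text


without_table = creator(DOC)                       # eacute is unknown here
with_table = creator(DOC, {"eacute": "\u00e9"})    # eacute supplied locally
```

This is not a proposal for how raptor should behave, only an illustration of the two outcomes: the literal is lost when the entity is unknown and kept when a definition is available.<br>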
<br>
I think fixing this (at least for builds using libxml2) is as simple as adding the XML_PARSE_DTDLOAD flag to libxml_options in raptor_grddl.c and raptor_sax2.c. Probably it should be done by way of a new raptor option that is disabled by default, much like RAPTOR_OPTION_NO_NET is.<br>
<br>
Does this seem a worthwhile change? And would it help if I knocked up a patch for it?<br>
<br>
<br>
<br>
On an unrelated issue, the property attribute in RDFa is defined as a CURIE rather than a QName. In other words<br>
<br>
<span property="<a href="http://purl.org/dc/elements/1.1/creator" target="_blank">http://purl.org/dc/elements/1.1/creator</a>"><br>
<br>
ought to be equivalent to<br>
<br>
<span xmlns:dc="<a href="http://purl.org/dc/elements/1.1/" target="_blank">http://purl.org/dc/elements/1.1/</a>"<br>
property="dc:creator"><br>
<br>
but it seems that full URIs are not supported, only QNames. I'm not necessarily volunteering to write a patch for that as it's not inconveniencing me too much, but I thought I'd report it anyway. (The advantage of full URIs is that they can make the document valid against the DTD.)<br>
<br>
Richard<br>
</blockquote></div><br><br clear="all"><br>-- <br>Dr. M. Benno Blumenthal <a href="mailto:benno@iri.columbia.edu">benno@iri.columbia.edu</a><br>International Research Institute for climate and society<br>The Earth Institute at Columbia University<br>
Lamont Campus, Palisades NY 10964-8000 (845) 680-4450 <br>