[redland-dev] Entities in RDFa

Benno Blumenthal benno at iri.columbia.edu
Thu Jan 5 13:25:55 EST 2012


Unfortunately w3c frequently chooses not to serve their documents, i.e.
there is a w3c policy of rejecting requests when the load is too much for
their servers.

So while this makes sense in theory, in practice loading the DTD can
destroy performance.

How w3c can get away with this, I don't really understand, but it is our
current reality.
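
One way to get the entity resolved without ever asking w3.org for the DTD is
to hand the parser a local entity table. The following is only a minimal
sketch of that idea, using Python's standard library rather than raptor or
libxml2. The helper names (parse_with_local_entities, creator_literal) are
made up for illustration, and the document is Richard's test case with the
XML declaration and head dropped for brevity:

```python
import xml.etree.ElementTree as ET
from html.entities import name2codepoint

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Richard's test case, shortened. The DOCTYPE still names the W3C DTD,
# but nothing below fetches it.
DOC = """<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML+RDFa 1.0//EN"
    "http://www.w3.org/MarkUp/DTD/xhtml-rdfa-1.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"
      version="XHTML+RDFa 1.0" xml:lang="en">
  <body>
    <p>This page was written by
      <span xmlns:dc="http://purl.org/dc/elements/1.1/"
            property="dc:creator">Jos&eacute;</span>.</p>
  </body>
</html>
"""


def parse_with_local_entities(text):
    """Parse XHTML, resolving named entities from a local table.

    Hypothetical helper: it fills the parser's entity map from Python's
    built-in HTML entity list, so no network request is needed.
    """
    parser = ET.XMLParser()
    parser.entity.update(
        {name: chr(cp) for name, cp in name2codepoint.items()}
    )
    parser.feed(text)
    return parser.close()


def creator_literal(root):
    """Return the text of the span carrying property="dc:creator"."""
    span = root.find(f".//{{{XHTML_NS}}}span[@property='dc:creator']")
    return span.text


if __name__ == "__main__":
    print(creator_literal(parse_with_local_entities(DOC)))
```

The same trade-off would apply to a raptor option: resolving entities from a
local copy or catalog avoids depending on the w3c servers, at the cost of
shipping the entity definitions yourself.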

Benno

On Thu, Jan 5, 2012 at 1:06 PM, Richard Smith <richard at ex-parrot.com> wrote:

>
> There was a thread last June about XML entities in RDF:
>
>  http://lists.librdf.org/pipermail/redland-dev/2011-June/002306.html
>
> The conclusion seemed to be that they were rarely used and not high on the
> list of things to support.
>
> However, I've come across a situation in RDFa instead of RDF where the use
> of entities is much more common practice. Here is a complete test case:
>
>  <?xml version="1.0" encoding="ASCII"?>
>  <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML+RDFa 1.0//EN"
>      "http://www.w3.org/MarkUp/DTD/xhtml-rdfa-1.dtd">
>  <html xmlns="http://www.w3.org/1999/xhtml"
>        version="XHTML+RDFa 1.0" xml:lang="en">
>    <head>
>      <title>Test</title>
>    </head>
>    <body>
>      <p>This page was written by
>        <span xmlns:dc="http://purl.org/dc/elements/1.1/"
>              property="dc:creator">Jos&eacute;</span>.</p>
>    </body>
>  </html>
>
> Other than the declaration of the dc namespace, this is entirely valid XML
> and similar examples are found all over the web.  However rapper parses it
> incorrectly because of the entity in the RDF literal:
>
>  richard at nevis:~$ rapper --version
>  2.0.6
>
>  richard at nevis:~$ rapper -q -i rdfa test.html
>  rapper: Error -  - XML parser error: Entity 'eacute' not defined
>  <file:///home/richard/test.html>
>    <http://purl.org/dc/elements/1.1/creator>
>    "Jos"@en .
>
> Note the missing e-acute in the name.
>
> I think fixing this (at least for builds using libxml2) is as simple as
> adding the XML_PARSE_DTDLOAD flag to libxml_options in raptor_grddl.c and
> raptor_sax2.c. Probably it should be done by way of a new raptor option
> that by default is disabled, much like RAPTOR_OPTION_NO_NET is.
>
> Does this seem a worthwhile change?  And would it help if I knocked up a
> patch for it?
>
>
>
> On an unrelated issue, the property attribute in RDFa is defined as a CURIE
> rather than a QName.  In other words
>
>  <span property="http://purl.org/dc/elements/1.1/creator">
>
> ought to be equivalent to
>
>  <span xmlns:dc="http://purl.org/dc/elements/1.1/"
>        property="dc:creator">
>
> but it seems that full URIs are not supported, only QNames. I'm not
> necessarily volunteering to write a patch for that as it's not
> inconveniencing me too much, but I thought I'd report it anyway.  (The
> advantage of full URIs is that they can make the document valid against the
> DTD.)
>
> Richard
>



-- 
Dr. M. Benno Blumenthal          benno at iri.columbia.edu
International Research Institute for climate and society
The Earth Institute at Columbia University
Lamont Campus, Palisades NY 10964-8000   (845) 680-4450