[redland-dev] raptor rdf/xml parsing and encoding

Dave Beckett dave at dajobe.org
Fri May 30 00:43:00 BST 2008


Sebastian Trüg wrote:
> The raptor API says that all strings (URIs and literals) are utf8. However, 
> when parsing a file with encoding UTF-8 or encoding ISO8859-1 containing a 
> literal with a German umlaut, I do not get utf8 in either case.

Can you file a bug and attach that file (or something minimal that 
demonstrates it)?

> So before searching through the raptor code and trying to figure it out by 
> myself, two questions:
> 
> - Does raptor ALWAYS produce utf8 strings?

Yes.

> 
> - Is the following code acceptable:
> 
> void raptorTriplesHandler( void* userData, const raptor_statement* triple )
> {
>    [...]
>    switch( triple->object_type ) {
>    case RAPTOR_IDENTIFIER_TYPE_LITERAL:
>        fromUtf8( (const char*)triple->object );
>    [...]
>    }
>    [...]

I don't know what that does, but every raptor (and redland) literal string
and URI string is UTF-8.  Basically, everywhere you see unsigned char*.
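
One way to narrow this down is to check the raw bytes arriving in the
handler before any conversion happens.  What follows is only a rough
sketch in plain C: is_valid_utf8 is a made-up helper name, not a raptor
function, and it assumes you pass it the literal's bytes and length
(e.g. from strlen).  An umlaut such as U+00FC should show up as the two
bytes 0xC3 0xBC; a lone 0xFC byte would mean ISO-8859-1 data got through.

```c
#include <stddef.h>

/* Hypothetical helper (not part of raptor): returns 1 if the first
 * len bytes of s are well-formed UTF-8, otherwise 0. */
static int is_valid_utf8(const unsigned char *s, size_t len)
{
    size_t i = 0;

    while (i < len) {
        unsigned char c = s[i];
        size_t need;        /* continuation bytes expected after c */
        unsigned long cp;   /* code point being decoded */
        unsigned long min;  /* smallest code point legal at this length */

        if (c < 0x80)                { need = 0; cp = c;        min = 0; }
        else if ((c & 0xE0) == 0xC0) { need = 1; cp = c & 0x1F; min = 0x80; }
        else if ((c & 0xF0) == 0xE0) { need = 2; cp = c & 0x0F; min = 0x800; }
        else if ((c & 0xF8) == 0xF0) { need = 3; cp = c & 0x07; min = 0x10000; }
        else return 0;      /* stray continuation or invalid lead byte */

        if (need > len - i - 1)
            return 0;       /* sequence is cut off at the end of input */
        i++;

        while (need--) {
            if ((s[i] & 0xC0) != 0x80)
                return 0;   /* not a 10xxxxxx continuation byte */
            cp = (cp << 6) | (s[i] & 0x3F);
            i++;
        }

        if (cp < min)                       return 0;  /* overlong form */
        if (cp > 0x10FFFF)                  return 0;  /* beyond Unicode */
        if (cp >= 0xD800 && cp <= 0xDFFF)   return 0;  /* surrogate */
    }
    return 1;
}
```

If a check like this passes on the bytes in triple->object, the problem
is more likely in how the string is converted afterwards than in what
the parser produced.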

Dave

