[redland-dev] raptor rdf/xml parsing and encoding
Sebastian Trüg
strueg at mandriva.com
Fri May 30 09:38:19 BST 2008
On Friday 30 May 2008 01:43:00 Dave Beckett wrote:
> Sebastian Trüg wrote:
> > The raptor API says that all strings (URIs and literals) are utf8.
> > However, when parsing a file with encoding UTF-8 or encoding ISO8859-1
> > containing a literal with a German umlaut, I do not get utf8 in either
> > case.
>
> Can you file a bug and attach that file (or something minimal that
> demonstrates it)?
I did not want to do that before being sure that the problem is not of my own
making.
> > - Does raptor ALWAYS produce utf8 strings?
>
> Yes.
>
> > - Is the following code acceptable:
> >
> > void raptorTriplesHandler( void* userData, const raptor_statement* triple ) {
> > [...]
> > switch( triple->object_type ) {
> > case RAPTOR_IDENTIFIER_TYPE_LITERAL:
> > fromUtf8( (const char*)triple->object );
> > [...]
> > }
> > [...]
>
> I don't know what that does, but every raptor (& redland) literal string
> and URI string is UTF-8. Everywhere you see unsigned char*,
> basically.
Think of "fromUtf8" as some function that takes UTF-8 data. The question is
just whether I handle the raptor_statement parameter correctly. The raptor
documentation is pretty thin, and the examples do not help either.
For example, the documentation says: "Representation of RDF triples inside
Raptor. They are a sequence of three raptor_identifier"
But that does not seem to be true, since raptor_statement does not use
raptor_identifier. So I am confused.
And until I am sure that I am using the API correctly, I don't want to dig into
the problem.
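One way to rule out a self-made problem is to inspect the bytes the handler
receives before they are passed on. The following is only a minimal sketch in
plain C and is not part of raptor; is_valid_utf8 is a hypothetical helper name.
It relies on the fact that "ü" is the two bytes 0xC3 0xBC in UTF-8 but the
single byte 0xFC in ISO 8859-1, and a lone 0xFC is never valid UTF-8.

```c
/* Hypothetical helper, not a raptor function.
 * Returns 1 if the NUL-terminated byte string s is well-formed UTF-8,
 * otherwise 0. */
int is_valid_utf8(const unsigned char *s)
{
  while (*s) {
    unsigned char c = *s++;
    int extra;          /* continuation bytes still expected */
    unsigned long cp;   /* code point decoded so far */
    unsigned long min;  /* smallest code point allowed for this length */

    if (c < 0x80)
      continue;         /* plain ASCII byte */
    else if ((c & 0xE0) == 0xC0) { extra = 1; cp = c & 0x1F; min = 0x80; }
    else if ((c & 0xF0) == 0xE0) { extra = 2; cp = c & 0x0F; min = 0x800; }
    else if ((c & 0xF8) == 0xF0) { extra = 3; cp = c & 0x07; min = 0x10000; }
    else
      return 0;         /* stray continuation byte or invalid lead byte */

    while (extra-- > 0) {
      /* Each continuation byte must look like 10xxxxxx; this also
       * rejects a string that ends in the middle of a sequence. */
      if ((*s & 0xC0) != 0x80)
        return 0;
      cp = (cp << 6) | (unsigned long)(*s++ & 0x3F);
    }

    /* Reject overlong encodings, surrogates and out-of-range values. */
    if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
      return 0;
  }
  return 1;
}
```

In the literal case of the handler, calling
is_valid_utf8( (const unsigned char*)triple->object ) and logging a result of 0
would suggest that the bytes are already not UTF-8 at that point, rather than
being broken by later conversion code.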
Cheers,
Sebastian