[redland-dev] [rapper] Use random names for blank nodes

Nicholas Humfrey njh at aelius.com
Sun Apr 23 08:40:15 EDT 2017


Dear F,


There isn't any functionality in the rapper command line tool for 
choosing a different identifier generator.



However, libraptor does have support for custom generation of blank node 
identifiers:

     raptor_world_set_generate_bnodeid_handler()

There is some documentation for this here:
http://librdf.org/raptor/api/raptor2-section-world.html#raptor-world-set-generate-bnodeid-handler

And the default implementation is defined here:
https://github.com/dajobe/raptor/blob/master/src/raptor_general.c#L283
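
As a rough sketch of what a custom handler might look like: the function 
below has the same shape as raptor2's raptor_generate_bnodeid_handler 
typedef (a user_data pointer plus an optional user-supplied ID), but it 
is written against the C standard library only so that it compiles on 
its own. The struct bnode_state and the name my_bnodeid_handler are 
hypothetical, and returning a malloc()ed string is an assumption based 
on what the default implementation does.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical state handed to the handler through user_data. */
struct bnode_state {
    const char *prefix;     /* e.g. a per-run random string */
    unsigned long counter;  /* incremented for every generated ID */
};

/* Same shape as raptor2's raptor_generate_bnodeid_handler:
 *   unsigned char *handler(void *user_data, unsigned char *user_bnodeid)
 */
unsigned char *my_bnodeid_handler(void *user_data, unsigned char *user_bnodeid)
{
    struct bnode_state *st = (struct bnode_state *)user_data;
    int len;
    char *id;

    /* If the document supplied an ID (e.g. rdf:nodeID), keep it. */
    if (user_bnodeid)
        return user_bnodeid;

    st->counter++;

    /* First pass measures the string, second pass writes it. */
    len = snprintf(NULL, 0, "%s%lu", st->prefix, st->counter);
    if (len < 0)
        return NULL;

    /* Assumption: the caller takes ownership of a heap-allocated string. */
    id = malloc((size_t)len + 1);
    if (!id)
        return NULL;
    snprintf(id, (size_t)len + 1, "%s%lu", st->prefix, st->counter);

    return (unsigned char *)id;
}
```

With libraptor linked in, registering it would look something like 
this, where world comes from raptor_new_world() and state is a 
struct bnode_state that outlives the parse:

     raptor_world_set_generate_bnodeid_handler(world, &state, my_bnodeid_handler);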

librdf uses a separate implementation, to ensure that every blank node 
in a world is unique:
https://github.com/dajobe/librdf/blob/master/src/rdf_raptor.c#L139
https://github.com/dajobe/librdf/blob/master/src/rdf_init.c#L532



The challenge with identifier generation is ensuring that the 
identifiers are unique.
I guess you could write some kind of hash function that hashes the 
integer identifier, but a fixed hash maps the same integer to the same 
string every time, so that wouldn't result in different bnode 
identifiers in different documents. Creating a random seed also wouldn't 
guarantee it, but would significantly reduce the risk.
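
To make the random-seed idea concrete, here is a minimal sketch using 
only the C standard library. It mixes a per-run seed with the sequential 
counter through an invertible 64-bit mixing function (the constants are 
the ones used in the SplitMix64 finalizer), so two different counters 
can never produce the same label within one run, while collisions 
between runs are only unlikely, not impossible. The names mix64, 
bnode_label and random_seed are hypothetical, and reading /dev/urandom 
is an assumption about the platform.

```c
#include <stdint.h>
#include <stdio.h>
#include <time.h>

#define LABEL_LEN 14  /* 'b' followed by 13 base-36 digits */

/* Invertible 64-bit mixer: distinct inputs give distinct outputs. */
uint64_t mix64(uint64_t x)
{
    x ^= x >> 30; x *= 0xbf58476d1ce4e5b9ULL;
    x ^= x >> 27; x *= 0x94d049bb133111ebULL;
    x ^= x >> 31;
    return x;
}

/* Write a label such as "b3k09xq..." into out (LABEL_LEN + 1 bytes). */
void bnode_label(uint64_t seed, uint64_t counter, char out[LABEL_LEN + 1])
{
    static const char digits[] = "0123456789abcdefghijklmnopqrstuvwxyz";
    uint64_t v = mix64(seed ^ counter);
    int i;

    out[0] = 'b';  /* leading letter keeps the label valid in older syntaxes */
    /* 13 base-36 digits are enough to hold any 64-bit value. */
    for (i = LABEL_LEN - 1; i >= 1; i--) {
        out[i] = digits[v % 36];
        v /= 36;
    }
    out[LABEL_LEN] = '\0';
}

/* Pick a seed once per run; fall back to the clock if needed. */
uint64_t random_seed(void)
{
    uint64_t seed = 0;
    FILE *f = fopen("/dev/urandom", "rb");

    if (f) {
        if (fread(&seed, sizeof seed, 1, f) != 1)
            seed = 0;
        fclose(f);
    }
    if (seed == 0)
        seed = (uint64_t)time(NULL);
    return seed;
}
```

Calling bnode_label(seed, n, buf) with n = 1, 2, 3, ... and a seed from 
random_seed() gives labels that look random but are still cheap to 
generate. With a fixed seed the output is the same in every run, which 
is the "plain hash" case described above.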

The advantage of blank node identifiers like _:genid1 is that they 
don't pretend that there won't be conflicts between documents.


nick.



On 2017-04-21 01:56, F wrote:
> Hello,
> 
>     I'm trying to use rapper to convert a bunch of rdf/xml files to nt.
> Currently, it's assigning sequential names to blank nodes, like
> _:genid1, _:genid2, and so forth. I'd like to know if it's possible to
> use more randomized names instead (something like
> _:olret5ry67uyhgrt5zcx).
> 
> Thank you.
> 
> _______________________________________________
> redland-dev mailing list
> redland-dev at lists.librdf.org
> http://lists.librdf.org/mailman/listinfo/redland-dev


