[redland-dev] Using Python file objects

Dave Beckett dave.beckett at bristol.ac.uk
Wed Sep 14 11:28:21 BST 2005


On Wed, 2005-09-14 at 11:27 +1200, John C Barstow wrote:
> I've been working with the Python bindings quite a lot and one of the 
> frustrations I have is that I can't pass file-like objects to the 
> Redland API.  Enough so that I've taken a look at the problem.
> 
> SWIG 1.1 (and 1.3) allow Python objects to be mapped to FILE* pointers. 
> The typemap looks like this:
> 
> %typemap (python, in) FILE* {
>   if(!PyFile_Check($source)) {
>     PyErr_SetString(PyExc_TypeError, "Need a file!");
>     return NULL;
>   }
>   $target = PyFile_AsFile($source);
> }
> 
> So all we really need to do is to expose an API (in the parser and the 
> serializer, last I looked) that takes a FILE* as a parameter. Really, 
> there's no reason we couldn't refactor the existing code to delegate to 
> such a function - the existing code ultimately opens a file anyway.
> 
> Doing this would open up a lot of good opportunities in Python - 
> including Gnome-VFS and WebDAV support, without having to make any 
> substantial changes or introduce new dependencies.
> 
> Given the limited scope of the changes, I think this would be an easy 
> win. Presumably other such typemaps exist (or could be built following a 
> similar model) for other bindings as well.  Unfortunately I'm not well 
> placed to actually produce a patch at the moment; however will look at 
> doing so in the not very distant future if someone doesn't beat me to 
> it.

I've some good news and bad news.

Good news first: you can already serialize to a passed-in FILE* in two
ways.

The first way is raptor_serialize_start_to_file_handle(), which is a
wrapper around the second: a raptor_iostream* that writes to a file
handle, created by raptor_new_iostream_to_file_handle().

The simplest code is like this:

serializer = raptor_new_serializer(name)
// handle is a FILE* that is already open for writing
raptor_serialize_start_to_file_handle(serializer, base_uri, handle)
while got triples
  raptor_serialize_statement(serializer, triple)
endwhile
raptor_serialize_end(serializer)

Now, the Redland API wraps the raptor one and provides bindings to it,
but it doesn't expose all of raptor's functionality: some of it isn't
needed, and some can't be used portably across systems and bindings.

FILE* parameters are one of the tricky cases to handle portably.

In particular, later versions of Perl do not use FILE* for their I/O but
have a different abstraction called PerlIO.  I invented the
raptor_iostream* abstraction so that this kind of thing could someday be
supported, alongside its primary use case: efficient serializing to
strings.

So if you can create a raptor_iostream* around some object, you can
serialize to it.  This is done with raptor_new_iostream_from_handler():

// fill in the fields of the raptor_iostream_handler* handler with
// methods that write the data to your object. user_data is a pointer
// that is passed back to your functions.
iostream = raptor_new_iostream_from_handler(user_data, handler)

raptor_serialize_start(serializer, base_uri, iostream)
while got triples 
  raptor_serialize_statement(serializer, triple)
endwhile
raptor_serialize_end(serializer)

The above method should allow you to write to a PerlIO or other
non-standard data sink.
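
To make the handler idea concrete, here is a minimal sketch in plain C.
It is not the raptor API: sink, sink_handler, membuf and emit_triple are
made-up names that only illustrate the pattern of a callback table plus
a user_data pointer, with a FILE* sink built on the same mechanism as a
string sink.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for an iostream handler: a table of callbacks.
 * The user_data pointer is handed back to every callback.
 * These names are illustrative only and are NOT the raptor API. */
typedef struct {
  int (*write_bytes)(void *user_data, const void *buf, size_t len);
} sink_handler;

typedef struct {
  void *user_data;
  const sink_handler *handler;
} sink;

/* Generic entry point: the writer only ever sees a sink. */
static int sink_write_string(sink *s, const char *str) {
  assert(s != NULL && s->handler != NULL);
  return s->handler->write_bytes(s->user_data, str, strlen(str));
}

/* Sink 1: an in-memory buffer (the "serialize to a string" case). */
typedef struct {
  char data[256];
  size_t used;
} membuf;

static int membuf_write(void *user_data, const void *buf, size_t len) {
  membuf *m = (membuf *)user_data;
  if (len >= sizeof(m->data) - m->used)
    return 1; /* no room left for the bytes plus the trailing NUL */
  memcpy(m->data + m->used, buf, len);
  m->used += len;
  m->data[m->used] = '\0';
  return 0;
}

static const sink_handler membuf_handler = { membuf_write };

/* Sink 2: a FILE*, using exactly the same handler mechanism. */
static int file_write(void *user_data, const void *buf, size_t len) {
  return fwrite(buf, 1, len, (FILE *)user_data) != len;
}

static const sink_handler file_handler = { file_write };

/* Convenience wrapper that hides the handler behind a FILE*. */
static sink sink_from_file_handle(FILE *fh) {
  sink s = { fh, &file_handler };
  return s;
}

/* Toy writer: emits one line per triple through whatever sink it gets.
 * Returns 0 on success, non-zero if any write failed. */
static int emit_triple(sink *s, const char *subj, const char *pred,
                       const char *obj) {
  return sink_write_string(s, subj) || sink_write_string(s, " ") ||
         sink_write_string(s, pred) || sink_write_string(s, " ") ||
         sink_write_string(s, obj) || sink_write_string(s, " .\n");
}
```

The point of the design is that emit_triple() never knows what it is
writing to. Supporting a PerlIO handle or a Python file-like object
would then only need another small handler, not a new writer.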

However, that's at the raptor level.  Redland doesn't yet have an API
that allows you to write to a raptor_iostream*.  It would be something
like librdf_serialize_model_to_iostream() and I can take your message as
a request to add it.

Hmm, maybe that is bad news.


The real Bad News: this is only for serializing, i.e. writing data to an
output, and you asked about reading.

For reading from a data source, there are two choices: strings, or using
raptor_www* to read from URIs.  There is no general iostream-style
abstraction for reading at present, although I guess I could extend
raptor_iostream* to handle it.

It might be easier though for me to add a helper function to the
raptor_www class so that you can make them from FILE*.  Something like:
  www = raptor_www_new_from_file_handle(FILE* handle,
                                        raptor_uri* base_uri)

and again, pass that up to the redland interface as something like
  librdf_parser_parse_file_handle(librdf_parser* parser, FILE* handle,
                                  librdf_uri* base_uri);
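
Whatever such a function ends up being called, a FILE*-based reader
would probably boil down to a loop like the one below. This is a rough
sketch in plain C, not raptor or Redland code: feed_file_handle(),
chunk_fn and the counting consumer are made-up names, and the consumer
stands in for whatever would actually parse the bytes.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical chunk consumer: receives each block of bytes plus a flag
 * marking the final call. Not part of raptor or Redland. */
typedef int (*chunk_fn)(void *user_data, const unsigned char *buf,
                        size_t len, int is_end);

/* Read an already-open FILE* to EOF, feeding the consumer as we go.
 * Returns 0 on success, non-zero on a read error or consumer failure. */
static int feed_file_handle(FILE *fh, chunk_fn consume, void *user_data) {
  unsigned char buf[4096];
  assert(fh != NULL && consume != NULL);
  for (;;) {
    size_t n = fread(buf, 1, sizeof buf, fh);
    int is_end = (n < sizeof buf); /* a short read means EOF or error */
    if (is_end && ferror(fh))
      return 1;
    if (consume(user_data, buf, n, is_end))
      return 1;
    if (is_end)
      return 0;
  }
}

/* Example consumer: just counts bytes and calls. */
typedef struct {
  size_t bytes;
  int calls;
  int saw_end;
} counter;

static int count_chunk(void *user_data, const unsigned char *buf,
                       size_t len, int is_end) {
  counter *c = (counter *)user_data;
  (void)buf; /* a real consumer would parse these bytes */
  c->bytes += len;
  c->calls++;
  if (is_end)
    c->saw_end = 1;
  return 0;
}
```

Because the loop only needs a FILE* that is already open, the caller
decides where the data comes from, which is the property the SWIG
typemap in the original message relies on.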

OK, that's 2 x bad news :(

and this seems to have turned into a tutorial essay, oops.

Dave