[redland-dev] Redland RSS changes (was Re: Fwd: [rss-media] Re: Media extensions)

Dave Beckett dave.beckett at bristol.ac.uk
Wed Jul 27 17:30:44 BST 2005


On Tue, 2005-07-26 at 17:28 +0200, Suzan Foster wrote to the rss-media
list:

> > 
> >  I have been tweaking the rss-tag-soup parser (librdf) which I can
> > get to do a reasonable job on multiple media:content elements and
> > loose properties [2], but can't handle the <media:group> element.
> > Nor the <media:credit>, <media:category> and <media:text> elements
> > properly yet.

This is a warning: I've made huge changes to the RSS Tag Soup parser in
CVS :)  Copied to redland-dev since this is probably of interest there.

Firstly, I ripped the code into three parts - parser, serializer
(raptor_serializer_rss.c) and common (raptor_rss_common.c).  The parser
now does atom 1.0 input and rewrites atom: terms into rss: ones where
they overlap.  This is not complete, as it ignores atom:content for
now; that needs scraping from XML into a large RDF literal.  Tedious.
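
To make the atom: to rss: rewriting concrete, here is a minimal sketch of
a lookup table in C.  It is not the code in CVS: the names
sketch_term_map and sketch_rewrite_term are hypothetical, and only the
feed/channel and entry/item pairs come from this message.  The
atom:title row is an assumed overlap, included purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch: not raptor's actual rewriting code. */
typedef struct {
  const char *atom_term;
  const char *rss_term;
} sketch_term_pair;

static const sketch_term_pair sketch_term_map[] = {
  { "atom:feed",  "rss:channel" }, /* from this post: channel aka atom:feed */
  { "atom:entry", "rss:item"    }, /* from this post: rss:item aka atom:entry */
  { "atom:title", "rss:title"   }  /* ASSUMED overlap, illustration only */
};

/* Return the rss: term for an overlapping atom: term, otherwise return
 * the input pointer unchanged (terms without an overlap are left alone). */
const char *sketch_rewrite_term(const char *term) {
  size_t i;
  assert(term != NULL);
  for (i = 0; i < sizeof(sketch_term_map) / sizeof(sketch_term_map[0]); i++) {
    if (strcmp(term, sketch_term_map[i].atom_term) == 0)
      return sketch_term_map[i].rss_term;
  }
  return term;
}
```

A term with no entry in the table, such as atom:content, passes through
unchanged in this sketch.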

The serializer can now do atom 1.0 output, separate from rss1.0.  (You
will have to configure with --enable-maintainer-mode to get this
activated.)

This means you can do (rss any|atom 0.3|atom 1.0) in and (rss 1.0|atom
1.0) out.  With redland/rasqal in between you can query atom1.0 direct
with sparql - see example #10 at http://librdf.org/query/ .  Check your
mime types if you don't get what you expect.

The new common code is an internal rss_model class - this is NOT in the
public API.  The model so far is:
   1. the common items (channel aka atom:feed, image, textinput, ...)
   2. the sequence of rss:item (aka atom:entry).

Additions like the media parts, atom:link (another sequence),
atom:content and so on would best be added to the common rss_model code
and then updated by the parser (input) and read by the serializer
(output).
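
As a rough picture of that shape, the two-part model could be sketched in
C along these lines.  This is an assumption-laden illustration, not the
internal rss_model class: every sketch_* name is hypothetical, a single
channel_title field stands in for all the common items, and each item
carries only a title.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical sketch: none of these names are raptor's real internals. */
typedef struct sketch_item {
  char *title;               /* e.g. the rss:title of this item */
  struct sketch_item *next;  /* next rss:item (aka atom:entry) */
} sketch_item;

typedef struct {
  char *channel_title;  /* part 1: stands in for the common items */
  sketch_item *items;   /* part 2: head of the item sequence */
  sketch_item *last;    /* tail pointer, so appends keep feed order */
  int items_count;
} sketch_model;

/* Copy a string onto the heap; returns NULL on allocation failure. */
static char *sketch_strdup(const char *s) {
  size_t len = strlen(s) + 1;
  char *copy = malloc(len);
  if (copy)
    memcpy(copy, s, len);
  return copy;
}

void sketch_model_init(sketch_model *m) {
  assert(m != NULL);
  m->channel_title = NULL;
  m->items = NULL;
  m->last = NULL;
  m->items_count = 0;
}

/* Parser (input) side: append one item.  Returns 0 on success. */
int sketch_model_add_item(sketch_model *m, const char *title) {
  sketch_item *item = malloc(sizeof(*item));
  if (!item)
    return 1;
  item->title = sketch_strdup(title);
  if (!item->title) {
    free(item);
    return 1;
  }
  item->next = NULL;
  if (m->last)
    m->last->next = item;
  else
    m->items = item;
  m->last = item;
  m->items_count++;
  return 0;
}

/* Serializer (output) side: read item number index, NULL if out of range. */
const sketch_item *sketch_model_get_item(const sketch_model *m, int index) {
  const sketch_item *item = m->items;
  if (index < 0)
    return NULL;
  while (item && index-- > 0)
    item = item->next;
  return item;
}

/* Free everything the model owns and reset it to the empty state. */
void sketch_model_clear(sketch_model *m) {
  sketch_item *item = m->items;
  while (item) {
    sketch_item *next = item->next;
    free(item->title);
    free(item);
    item = next;
  }
  free(m->channel_title);
  sketch_model_init(m);
}
```

In this sketch an addition such as atom:link would become another
sequence field on sketch_model, filled in by the parser side and walked
by the serializer side in the same way as the items.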

Dave
