[redland-dev] Raptor 0.9.9 new API features - parsing URIs, aborting parsing

Dave Beckett dave.beckett at bristol.ac.uk
Fri Mar 28 22:44:09 GMT 2003


I forgot to describe some new API features when I released Raptor
0.9.9 earlier today. 


raptor_parse_uri

You can now parse content from any URI that an underlying WWW library
can retrieve for you.  At present that is either libcurl (almost any
URL) or libxml2 (http URLs).  To use either of these, you need the
new method:
   int raptor_parse_uri(raptor_parser* rdf_parser, raptor_uri *uri,
                        raptor_uri *base_uri)
which works like raptor_parse_file but for any URI that the WWW
library supports.  base_uri can be NULL, in which case uri is used as
the base URI.
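
As a rough illustration of the base_uri fallback only: the helper
below is made up for this sketch (effective_base_uri is not a Raptor
function) and works on plain strings rather than raptor_uri objects.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of the Raptor API: returns the URI
 * string that would serve as the base for resolving relative
 * references, following the NULL fallback rule described above. */
const char *effective_base_uri(const char *uri, const char *base_uri)
{
  assert(uri != NULL);                /* the content URI is required */
  return base_uri ? base_uri : uri;   /* NULL base means "use uri" */
}
```

One case where a separate base could matter is content fetched from a
mirror that should still be resolved against its canonical location.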

I'll be adding another API call, raptor_parse_uri_with_connection,
that lets you re-use an existing WWW connection, such as a previously
initialised curl handle.


raptor_parser_abort, er raptor_parse_abort

You can now abort parsing in the statement callback routine (or any
other) to make it stop generating triples and end as soon as it can.
Somebody asked for this, but I can't find the email.

This is useful when some system or application reason means you want
no more triples, and need the thread of control back from the
raptor_parse_uri or raptor_parse_file methods.

This is done with a new method:
  raptor_parser_abort(raptor_parser* rdf_parser, char *reason);
which I intended to allow passing back a reason for the abort, though
that part is still to be done.

However, I've realised that since it is the user code doing the
aborting, it presumably already knows why, so the reason seems rather
useless.  So I'll be promoting the alternative in the next release:

  raptor_parse_abort(raptor_parser* rdf_parser)

unless somebody can think why the former is worth keeping.
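
The general shape of aborting from inside a callback can be sketched
without Raptor at all.  Everything named toy_* below is invented for
this illustration, and where exactly Raptor checks for an abort is
not stated here; the sketch simply assumes a flag tested before each
triple is delivered.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a parser; none of these names are Raptor's. */
typedef struct toy_parser toy_parser;
typedef void (*toy_handler)(void *user_data, int triple_number);

struct toy_parser {
  toy_handler handler;  /* called once per "triple" */
  void *user_data;      /* passed through to the handler */
  int aborted;          /* set by toy_parser_abort() */
};

/* Ask the parser to stop; may be called from inside the handler. */
void toy_parser_abort(toy_parser *p)
{
  p->aborted = 1;
}

/* Deliver up to `available` triples.  The abort flag is tested before
 * each delivery, so control returns to the caller soon after an abort.
 * Returns nonzero if parsing was aborted. */
int toy_parse(toy_parser *p, int available)
{
  int i;
  assert(p != NULL && p->handler != NULL);
  p->aborted = 0;
  for(i = 1; i <= available && !p->aborted; i++)
    p->handler(p->user_data, i);
  return p->aborted;
}

/* User state: how many triples were accepted and the limit. */
typedef struct {
  toy_parser *parser;
  int seen;
  int max;
} toy_state;

/* Handler that accepts `max` triples, then aborts on the next one. */
void toy_stop_after_max(void *user_data, int triple_number)
{
  toy_state *s = (toy_state *)user_data;
  (void)triple_number;  /* unused in this sketch */
  if(s->seen == s->max) {
    toy_parser_abort(s->parser);
    return;
  }
  s->seen++;
}
```

Since the handler owns the decision to stop, it needs no reason
string from the parser, which is the argument for the one-argument
form.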


I attach an example program (now in examples/raptor_abort.c) that
uses both of these new methods and aborts parsing after the 5th
triple.  It is under the usual LGPL/GPL/MPL licenses.

Dave

======================================================================

/* -*- Mode: c; c-basic-offset: 2 -*-
 *
 * raptor_abort.c - Raptor abort example code
 *
 * $Id: raptor_abort.c,v 1.1 2003/03/28 22:40:39 cmdjb Exp $
 *
 * Copyright (C) 2003 David Beckett - http://purl.org/net/dajobe/
 * Institute for Learning and Research Technology - http://www.ilrt.org/
 * University of Bristol - http://www.bristol.ac.uk/
 * 
 * This package is Free Software or Open Source available under the
 * following licenses (these are alternatives):
 *   1. GNU Lesser General Public License (LGPL)
 *   2. GNU General Public License (GPL)
 *   3. Mozilla Public License (MPL)
 * 
 * See LICENSE.html or LICENSE.txt at the top of this package for the
 * full license terms.
 * 
 */


#ifdef HAVE_CONFIG_H
#include <config.h>
#endif

#ifdef WIN32
#include <win32_config.h>
#endif

#include <stdio.h>
#include <string.h>
#include <stdarg.h>
#ifdef HAVE_STDLIB_H
#include <stdlib.h>
#endif

/* for the memory allocation functions */
#if defined(HAVE_DMALLOC_H) && defined(RAPTOR_MEMORY_DEBUG_DMALLOC)
#include <dmalloc.h>
#endif

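/* Raptor includes */
#include <raptor.h>

/* for curl_global_cleanup() when Raptor was built with libcurl */
#ifdef RAPTOR_WWW_LIBCURL
#include <curl/curl.h>
#endif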


static void handle_statements(void *user_data, const raptor_statement *statement);
int main(int argc, char *argv[]);


typedef struct 
{
  raptor_parser *parser;
  FILE *stream;
  int count;
  int max;
  int stopped;
} my_data;


static
void handle_statements(void *user_data, const raptor_statement *statement) 
{
  my_data* me=(my_data*)user_data;
  
  me->count++;
  if(me->count > me->max) {
    fprintf(me->stream, "Reached %d statements, stopping\n", me->max);
    raptor_parser_abort(me->parser, NULL);
    /* 0.9.10 onwards? raptor_parse_abort(me->parser); */
    me->stopped=1;
    return;
  }

  fprintf(me->stream, "Saw statement %d\n", me->count);
}


int
main (int argc, char *argv[]) 
{
  raptor_parser* rdf_parser;
  raptor_uri* uri;
  my_data* me;
  const char *program;
  int rc;
  
  program=argv[0];

  if(argc != 2) {
    fprintf(stderr, "%s: USAGE [RDF-XML content URI]\n", program);
    exit(1);
  }

  raptor_init();

  me=(my_data*)malloc(sizeof(my_data));
  if(!me) {
    fprintf(stderr, "%s: Out of memory\n", program);
    exit(1);
  }

  me->stream=stderr;
  me->count=0;
  me->max=5;

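  uri=raptor_new_uri(argv[1]);
  if(!uri) {
    fprintf(stderr, "%s: Failed to create URI for %s\n", program, argv[1]);
    exit(1);
  }

  rdf_parser=raptor_new_parser("rdfxml");
  if(!rdf_parser) {
    fprintf(stderr, "%s: Failed to create rdfxml parser\n", program);
    exit(1);
  }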

  me->parser=rdf_parser;

  raptor_set_statement_handler(rdf_parser, me, handle_statements);

  me->stopped=0;
  rc=raptor_parse_uri(rdf_parser, uri, NULL);

  fprintf(stderr, "%s: Parser returned status %d, stopped? %s\n", program, rc,
          (me->stopped ? "yes" : "no"));

  free(me);
  
  raptor_free_parser(rdf_parser);

  raptor_free_uri(uri);

  raptor_finish();

#ifdef RAPTOR_WWW_LIBCURL
  curl_global_cleanup();
#endif

  return 0;
}


