[redland-dev] Model serialization bug

Dave Beckett dave.beckett at bristol.ac.uk
Tue Jan 4 14:43:20 PST 2005


On Tue, 4 Jan 2005 14:28:56 -0500
Christopher Schmidt <crschmidt at crschmidt.net> wrote:

...

> Before and after the patch, the test works fine. However, I'm still 
> getting the crash from raptor_free_statement.

I assume the other errors are fixed then?

> If I set the  MALLOC_CHECK_ environment variable to 0, the python 
> example script gets farther:
> 
> [1 crschmidt at peanut ~]$ python fail.py
> <?xml version="1.0" encoding="utf-8"?>
> <rdf:RDF xmlns:atom="http://purl.org/atom/ns#" 
> xmlns:dc="http://purl.org/dc/elements/1.1/" 
> xmlns:enc="http://purl.oclc.org/net/rss_2.0/enc#" 
> xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" 
> xmlns:rss091="http://purl.org/rss/1.0/modules/rss091#" 
> xmlns="http://purl.org/rss/1.0/"></rdf:RDF>
> 
> However, it then just sits taking up 100% CPU, and doesn't ever seem
> to finish. If I run it in GDB and kill the process, I get a bt that
> looks like:
> 
> #0  0x401ebc7e in mallopt () from /lib/libc.so.6
> #1  0x401ebb23 in mallopt () from /lib/libc.so.6
> #2  0x401ea6bf in free () from /lib/libc.so.6
> #3  0x080fc2bf in PyGrammar_RemoveAccelerators (g=0x11) at 
> Parser/acceler.c:47
> 
> I'm not at all sure this is a Redland problem at all, but I don't know
> where else to go.

That's one of Python's many bad memory uses.  Try
  valgrind --tool=memcheck python /dev/null
to see what I mean.

But I digress...

> With malloc_check_ explicitly set to 1, I get a large number of errors
> 
> like:
> 
> *** glibc detected *** free(): invalid pointer: 0x081db1b8 ***
> *** glibc detected *** free(): invalid pointer: 0x081d6338 ***
> 
> and an empty RDF/XML serialization (as copy pasted above).
> 
> With malloc_check_ set explicitly to 2 or 3, I get the same backtrace
> as I copy pasted previously.
> 
> I'm not sure what else I can do to help out here, but the patch given 
> did not solve my problems. 

Memory errors like these show up differently on each system, so that
output wasn't much help to me.

However, I found some dubious memory use in raptor_free_statement
(and elsewhere in raptor_rss.c which you aren't using here).
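
To illustrate the kind of mismatch involved, here is a minimal sketch.
It does not use Raptor's real types: toy_uri, toy_statement, the
ID_TYPE_* tags and the counters are all hypothetical stand-ins.  The
point is only that a pointer tagged as either of two URI-valued types
has to go to the URI destructor, and everything else to the plain
string free:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; these are NOT Raptor's types. */
typedef enum { ID_TYPE_RESOURCE, ID_TYPE_PREDICATE, ID_TYPE_LITERAL } id_type;

typedef struct { char *string; } toy_uri;

typedef struct {
  void *predicate;          /* a toy_uri* or a malloc'd C string */
  id_type predicate_type;   /* tag saying which of the two it is */
} toy_statement;

/* Counters so the chosen branch can be observed from outside. */
static int uri_frees = 0;
static int string_frees = 0;

static toy_uri *toy_new_uri(const char *s) {
  toy_uri *u = malloc(sizeof(*u));
  assert(u);
  u->string = malloc(strlen(s) + 1);
  assert(u->string);
  strcpy(u->string, s);
  return u;
}

static void toy_free_uri(toy_uri *u) {
  free(u->string);
  free(u);
  uri_frees++;
}

/* Both URI-valued tags are released as URIs; anything else is treated
 * as a plain string.  Checking only one of the two tags would hand a
 * URI object to the string branch. */
static void toy_free_predicate(toy_statement *st) {
  if (!st->predicate)
    return;
  if (st->predicate_type == ID_TYPE_PREDICATE ||
      st->predicate_type == ID_TYPE_RESOURCE) {
    toy_free_uri((toy_uri *)st->predicate);
  } else {
    free(st->predicate);
    string_frees++;
  }
  st->predicate = NULL;
}
```

With only the PREDICATE check, a RESOURCE-tagged predicate would take
the else branch and be freed as if it were a bare string.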

I've made an updated patch against 1.4.3 as released, attached.
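
Most of the patch is one repeated pattern: only copy the base URI when
there is one.  A rough sketch of that guard follows, assuming a copy
routine that dereferences its argument.  toy_uri, toy_element and the
toy_* functions are hypothetical and are not the Raptor API:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in: a URI is just a C string here. */
typedef char toy_uri;

/* Assumed behaviour: the copy routine dereferences its argument,
 * so it must never be called with NULL. */
static toy_uri *toy_uri_copy(const toy_uri *u) {
  size_t len = strlen(u);
  toy_uri *copy = malloc(len + 1);
  assert(copy);
  memcpy(copy, u, len + 1);
  return copy;
}

typedef struct { toy_uri *base_uri; } toy_element;

/* The element takes ownership of base_uri, which may be NULL. */
static toy_element *toy_new_element(toy_uri *base_uri) {
  toy_element *e = malloc(sizeof(*e));
  assert(e);
  e->base_uri = base_uri;
  return e;
}

static void toy_free_element(toy_element *e) {
  free(e->base_uri);   /* free(NULL) is a no-op */
  free(e);
}

/* The guard: copy only when a base URI exists, otherwise pass NULL on. */
static toy_element *toy_element_with_base(const toy_uri *base_uri) {
  toy_uri *copy = NULL;
  if (base_uri)
    copy = toy_uri_copy(base_uri);
  return toy_new_element(copy);
}
```

Calling toy_element_with_base(NULL) then gives an element with no base
URI instead of passing NULL into the copy.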

With this, 'print m' should work.  At least it works for me ;)

Dave
-------------- next part --------------
diff -urN raptor-1.4.3/raptor_general.c raptor-cvs/raptor_general.c
--- raptor-1.4.3/raptor_general.c	2005-01-03 19:43:36.000000000 +0000
+++ raptor-cvs/raptor_general.c	2005-01-04 22:19:24.000000000 +0000
@@ -2,7 +2,7 @@
  *
  * raptor_general.c - Raptor general routines
  *
- * $Id: raptor_general.c,v 1.308 2005/01/03 19:34:42 cmdjb Exp $
+ * $Id: raptor_general.c,v 1.309 2005/01/04 22:13:29 cmdjb Exp $
  *
  * Copyright (C) 2000-2005, David Beckett http://purl.org/net/dajobe/
  * Institute for Learning and Research Technology http://www.ilrt.bristol.ac.uk/
@@ -297,7 +297,8 @@
   }
 
   if(statement->predicate) {
-    if(statement->predicate_type == RAPTOR_IDENTIFIER_TYPE_PREDICATE)
+    if(statement->predicate_type == RAPTOR_IDENTIFIER_TYPE_PREDICATE ||
+       statement->predicate_type == RAPTOR_IDENTIFIER_TYPE_RESOURCE)
       raptor_free_uri((raptor_uri*)statement->predicate);
     else
       RAPTOR_FREE(cstring, statement->predicate);
diff -urN raptor-1.4.3/raptor_rss.c raptor-cvs/raptor_rss.c
--- raptor-1.4.3/raptor_rss.c	2005-01-03 19:43:36.000000000 +0000
+++ raptor-cvs/raptor_rss.c	2005-01-04 22:19:50.000000000 +0000
@@ -2,7 +2,7 @@
  *
  * raptor_rss.c - Raptor RSS tag soup parser
  *
- * $Id: raptor_rss.c,v 1.65 2005/01/03 19:14:42 cmdjb Exp $
+ * $Id: raptor_rss.c,v 1.66 2005/01/04 22:15:15 cmdjb Exp $
  *
  * Copyright (C) 2003-2004, David Beckett http://purl.org/net/dajobe/
  * Institute for Learning and Research Technology http://www.ilrt.bristol.ac.uk/
@@ -1518,7 +1518,8 @@
         if(!raptor_rss_fields_info[field].uri)
           continue;
 
-        if(s->predicate_type == RAPTOR_IDENTIFIER_TYPE_PREDICATE &&
+        if((s->predicate_type == RAPTOR_IDENTIFIER_TYPE_RESOURCE ||
+            s->predicate_type == RAPTOR_IDENTIFIER_TYPE_PREDICATE) &&
            raptor_uri_equals((raptor_uri*)s->predicate,
                              raptor_rss_fields_info[field].uri)) {
           /* found field this triple to go in 'item' so move the
@@ -1604,7 +1605,8 @@
       if(!raptor_rss_fields_info[field].uri)
         continue;
 
-      if(s->predicate_type == RAPTOR_IDENTIFIER_TYPE_PREDICATE &&
+      if((s->predicate_type == RAPTOR_IDENTIFIER_TYPE_RESOURCE ||
+          s->predicate_type == RAPTOR_IDENTIFIER_TYPE_PREDICATE) &&
          raptor_uri_equals((raptor_uri*)s->predicate,
                            raptor_rss_fields_info[field].uri)) {
         /* found field this triple to go in 'item' so move the
@@ -1804,7 +1806,9 @@
                                                   0);
 
   qname=raptor_new_qname_from_namespace_local_name(rss_serializer->rdf_nspace, (const unsigned char*)"RDF",  NULL);
-  element=raptor_new_xml_element(qname, NULL, raptor_uri_copy(base_uri));
+  if(base_uri)
+    base_uri=raptor_uri_copy(base_uri);
+  element=raptor_new_xml_element(qname, NULL, base_uri);
   rss_serializer->rdf_RDF_element=element;
 
   raptor_xml_element_declare_namespace(element, rss_serializer->rdf_nspace);
@@ -1866,7 +1870,9 @@
   if(!item->fields_count)
     return;
 
-  element=raptor_new_xml_element(raptor_qname_copy(item->node_type->qname), NULL, raptor_uri_copy(base_uri));
+  if(base_uri)
+    base_uri=raptor_uri_copy(base_uri);
+  element=raptor_new_xml_element(raptor_qname_copy(item->node_type->qname), NULL, base_uri);
   attrs=(raptor_qname **)RAPTOR_CALLOC(qnamearray, 1, sizeof(raptor_qname*));
   attrs[0]=raptor_new_qname_from_namespace_local_name(rss_serializer->rdf_nspace, (const unsigned char*)"about",  raptor_uri_as_string(item->uri));
   raptor_xml_element_set_attributes(element, attrs, 1);
@@ -1887,7 +1893,9 @@
     if(!raptor_rss_fields_info[f].uri)
       continue;
     
-    predicate=raptor_new_xml_element(raptor_qname_copy(raptor_rss_fields_info[f].qname), NULL, raptor_uri_copy(base_uri));
+    if(base_uri)
+      base_uri=raptor_uri_copy(base_uri);
+    predicate=raptor_new_xml_element(raptor_qname_copy(raptor_rss_fields_info[f].qname), NULL, base_uri);
 
     if(item->fields[f]) {
       raptor_xml_writer_raw_counted(xml_writer, (const unsigned char*)"    ", 4);
@@ -1913,11 +1921,15 @@
     raptor_xml_element* rss_items_predicate;
     int i;
     raptor_qname *rdf_Seq_qname=raptor_new_qname_from_namespace_local_name(rss_serializer->rdf_nspace, (const unsigned char*)"Seq",  NULL);
-    raptor_xml_element *rdf_Seq_element=raptor_new_xml_element(rdf_Seq_qname, NULL, raptor_uri_copy(base_uri));
+    if(base_uri)
+      base_uri=raptor_uri_copy(base_uri);
+    raptor_xml_element *rdf_Seq_element=raptor_new_xml_element(rdf_Seq_qname, NULL, base_uri);
 
     /* make the <rss:items><rdf:Seq><rdf:li /> .... </rdf:Seq></rss:items> */
 
-    rss_items_predicate=raptor_new_xml_element(raptor_qname_copy(raptor_rss_fields_info[RAPTOR_RSS_FIELD_ITEMS].qname), NULL, raptor_uri_copy(base_uri));
+    if(base_uri)
+      base_uri=raptor_uri_copy(base_uri);
+    rss_items_predicate=raptor_new_xml_element(raptor_qname_copy(raptor_rss_fields_info[RAPTOR_RSS_FIELD_ITEMS].qname), NULL, base_uri);
 
     raptor_xml_writer_raw_counted(xml_writer, (const unsigned char*)"    ", 4);
     raptor_xml_writer_start_element(xml_writer, rss_items_predicate);
@@ -1933,7 +1945,9 @@
       raptor_xml_element *rdf_li_element;
       
       rdf_li_qname=raptor_new_qname_from_namespace_local_name(rss_serializer->rdf_nspace, (const unsigned char*)"li",  NULL);
-      rdf_li_element=raptor_new_xml_element(rdf_li_qname, NULL, raptor_uri_copy(base_uri));
+      if(base_uri)
+        base_uri=raptor_uri_copy(base_uri);
+      rdf_li_element=raptor_new_xml_element(rdf_li_qname, NULL, base_uri);
       attrs=(raptor_qname **)RAPTOR_CALLOC(qnamearray, 1, sizeof(raptor_qname*));
       attrs[0]=raptor_new_qname_from_namespace_local_name(rss_serializer->rdf_nspace, (const unsigned char*)"resource",  raptor_uri_as_string(item_item->uri));
       raptor_xml_element_set_attributes(rdf_li_element, attrs, 1);
diff -urN raptor-1.4.3/raptor_serialize.c raptor-cvs/raptor_serialize.c
--- raptor-1.4.3/raptor_serialize.c	2005-01-03 19:43:36.000000000 +0000
+++ raptor-cvs/raptor_serialize.c	2005-01-04 22:19:56.000000000 +0000
@@ -2,7 +2,7 @@
  *
  * raptor_serialize.c - Serializers
  *
- * $Id: raptor_serialize.c,v 1.28 2005/01/03 19:10:24 cmdjb Exp $
+ * $Id: raptor_serialize.c,v 1.29 2005/01/04 22:16:19 cmdjb Exp $
  *
  * Copyright (C) 2004, David Beckett http://purl.org/net/dajobe/
  * Institute for Learning and Research Technology http://www.ilrt.bristol.ac.uk/
@@ -1003,7 +1003,9 @@
   
   qname=raptor_new_qname_from_namespace_local_name(context->rdf_nspace,
                                                    (const unsigned char*)"RDF",  NULL);
-  element=raptor_new_xml_element(qname, NULL, raptor_uri_copy(base_uri));
+  if(base_uri)
+    base_uri=raptor_uri_copy(base_uri);
+  element=raptor_new_xml_element(qname, NULL, base_uri);
   context->rdf_RDF_element=element;
   for(i=0; i< raptor_sequence_size(context->namespaces); i++) {
     raptor_namespace* ns=(raptor_namespace*)raptor_sequence_get_at(context->namespaces, i);
@@ -1045,6 +1047,7 @@
   raptor_xml_element* predicate_element=NULL;
   raptor_qname **attrs;
   int attrs_count=0;
+  raptor_uri* base_uri=NULL;
 
   if(statement->predicate_type == RAPTOR_IDENTIFIER_TYPE_ORDINAL) {
     predicate_ns=context->rdf_nspace;
@@ -1093,8 +1096,10 @@
   
   rdf_Description_qname=raptor_new_qname_from_namespace_local_name(context->rdf_nspace,
                                                                    (unsigned const char*)"Description",  NULL);
+  if(serializer->base_uri)
+    base_uri=raptor_uri_copy(serializer->base_uri);
   rdf_Description_element=raptor_new_xml_element(rdf_Description_qname, NULL,
-                                                 raptor_uri_copy(serializer->base_uri));
+                                                 base_uri);
 
   attrs=(raptor_qname **)RAPTOR_CALLOC(qnamearray, 3, sizeof(raptor_qname*));
   attrs_count=0;
@@ -1143,8 +1148,9 @@
   /* predicate */
   predicate_qname=raptor_new_qname_from_namespace_local_name(predicate_ns,
                                                              name,  NULL);
-  predicate_element=raptor_new_xml_element(predicate_qname, NULL,
-                                           raptor_uri_copy(serializer->base_uri));
+  if(serializer->base_uri)
+    base_uri=raptor_uri_copy(serializer->base_uri);
+  predicate_element=raptor_new_xml_element(predicate_qname, NULL, base_uri);
 
 
   /* object */
diff -urN raptor-1.4.3/raptor_uri.c raptor-cvs/raptor_uri.c
--- raptor-1.4.3/raptor_uri.c	2005-01-03 19:43:36.000000000 +0000
+++ raptor-cvs/raptor_uri.c	2005-01-04 17:31:49.000000000 +0000
@@ -2,7 +2,7 @@
  *
  * raptor_uri.c - Raptor URI resolving implementation
  *
- * $Id: raptor_uri.c,v 1.79 2005/01/03 19:33:25 cmdjb Exp $
+ * $Id: raptor_uri.c,v 1.81 2005/01/04 17:31:49 cmdjb Exp $
  *
  * Copyright (C) 2002-2004, David Beckett http://purl.org/net/dajobe/
  * Institute for Learning and Research Technology http://www.ilrt.bristol.ac.uk/
@@ -937,12 +937,17 @@
   if(!strncmp((const char*)base_detail->buffer,
               (const char*)reference_detail->buffer, prefix_len)) {
     
+    if(!base_detail->path)
+      goto buildresult;
+    
     /* Find the file name components */
     base_file = (const unsigned char*)strrchr((const char*)base_detail->path, '/');
     if(!base_file)
       goto buildresult;
     base_file++;
-    
+
+    if(!reference_detail->path)
+      goto buildresult;
     reference_file=(const unsigned char*)strrchr((const char*)reference_detail->path, '/');
     if(!reference_file)
       goto buildresult;
@@ -1329,6 +1334,9 @@
   failures += assert_uri_to_relative("http://example.com/base/foo", "http://example.com/base/#foo", ".#foo");
   failures += assert_uri_to_relative("http://example.com/base/foo", "http://example2.com/base/bar", "http://example2.com/base/bar");
   failures += assert_uri_to_relative("http://example.com/base/one?path=/should/be/ignored", "http://example.com/base/two?path=/should/be/ignored", "two?path=/should/be/ignored");
+  failures += assert_uri_to_relative("http://example.org/base#", "http://www.foo.org", "http://www.foo.org");
+  failures += assert_uri_to_relative("http://example.org", "http://a.example.org/", "http://a.example.org/");
+  failures += assert_uri_to_relative("http://example.org", "http://a.example.org", "http://a.example.org");
 
   return failures ;
 }

