[redland-dev] [Raptor RDF Parsing and Serializing Library 0000309]: Raptor starts to consume large amounts of memory processing uniparc.rdf (raptor_avltree_sprout)

Mantis Bug Tracker mantis-bug-sender at librdf.org
Mon Aug 10 13:41:03 CEST 2009


The following issue has been SUBMITTED. 
====================================================================== 
http://bugs.librdf.org/mantis/view.php?id=309 
====================================================================== 
Reported By:                AlisdairO
Assigned To:                
====================================================================== 
Project:                    Raptor RDF Parsing and Serializing Library
Issue ID:                   309
Category:                   api
Reproducibility:            always
Severity:                   major
Priority:                   normal
Status:                     new
Parsing/Serializing Syntax: rdfxml 
====================================================================== 
Date Submitted:             2009-08-10 11:41
Last Modified:              2009-08-10 11:41
====================================================================== 
Summary:                    Raptor starts to consume large amounts of memory
processing uniparc.rdf (raptor_avltree_sprout)
Description: 
I'm writing a program that processes RDF files and produces some stats on
them.  I've noticed that with a couple of RDF/XML files my program's heap
usage keeps increasing, despite the fact that my own code makes only a
few large allocations at the start of the program.  On further
investigation with Massif, it appears that raptor has started to consume
large quantities of memory.  I've attached the Massif file, but the
pertinent part is inline below:

97.94% (2,439,701,327B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
->36.13% (900,000,002B) 0x401632: examinerdf_new_hashtable (examineRDF_hashtable.c:19)
| ->28.91% (720,000,001B) 0x4049AC: main (examineRDF.c:555)
| |
| ->07.23% (180,000,001B) 0x404983: main (examineRDF.c:554)
|
->20.88% (520,000,000B) 0x404880: main (examineRDF.c:515)
|
->16.06% (400,000,000B) 0x401481: examinerdf_new_hashtable (examineRDF_hashtable.c:16)
| ->12.85% (320,000,000B) 0x4049AC: main (examineRDF.c:555)
| |
| ->03.21% (80,000,000B) 0x404983: main (examineRDF.c:554)
|
->12.04% (300,000,002B) 0x401553: examinerdf_new_hashtable (examineRDF_hashtable.c:18)
| ->09.64% (240,000,001B) 0x4049AC: main (examineRDF.c:555)
| |
| ->02.41% (60,000,001B) 0x404983: main (examineRDF.c:554)
|
->10.04% (250,000,000B) 0x4047C9: main (examineRDF.c:514)
|
->02.37% (58,921,080B) 0x4C2B898: raptor_avltree_sprout (raptor_avltree.c:589)
| ->01.94% (48,356,560B) 0x4C2B6DC: raptor_avltree_sprout (raptor_avltree.c:490)
| | ->01.43% (35,731,960B) 0x4C2B6DC: raptor_avltree_sprout (raptor_avltree.c:490)
| | | ->01.02% (25,407,480B) 0x4C2B6DC: raptor_avltree_sprout (raptor_avltree.c:490)
| | | | ->01.02% (25,407,480B) in 3 places, all below massif's threshold (01.00%)
| | | |
| | | ->00.41% (10,324,480B) in 1+ places, all below ms_print's threshold (01.00%)
| | |
| | ->00.51% (12,624,600B) in 1+ places, all below ms_print's threshold (01.00%)
| |
| ->00.42% (10,564,520B) in 1+ places, all below ms_print's threshold (01.00%)
|
->00.43% (10,780,243B) in 1+ places, all below ms_print's threshold (01.00%)


Now, obviously 2.37% is not that much RAM, but if I let the program run
for longer (which I didn't do under Massif since it runs so slowly), it
ends up eating all 8 GB of RAM on the system.  I get this issue with the
uniparc.rdf file (a subset of the uniprot dataset) available from
ftp://ftp.uniprot.org/pub/databases/uniprot/current_release/rdf/ . 
Initial investigation suggests I get it with uniprot.rdf too.  I don't get
it with uniref.rdf.
====================================================================== 

Issue History 
Date Modified    Username       Field                    Change               
====================================================================== 
2009-08-10 11:41 AlisdairO      New Issue                                    
2009-08-10 11:41 AlisdairO      File Added: raptor_memusage_bad                 
  
2009-08-10 11:41 AlisdairO      Parsing/Serializing Syntax => rdfxml          
======================================================================


