[redland-dev] [Raptor RDF Parsing and Serializing Library 0000290]: Parsing turtle files with lots of namespaces is very slow

Mantis Bug Tracker mantis-bug-sender at librdf.org
Mon Nov 24 19:41:57 CET 2008


The following issue has been SUBMITTED. 
====================================================================== 
http://bugs.librdf.org/mantis/view.php?id=290 
====================================================================== 
Reported By:                anonymous
Assigned To:                
====================================================================== 
Project:                    Raptor RDF Parsing and Serializing Library
Issue ID:                   290
Category:                   api
Reproducibility:            always
Severity:                   minor
Priority:                   normal
Status:                     new
Parsing/Serializing Syntax:  
====================================================================== 
Date Submitted:             2008-11-24 18:41
Last Modified:              2008-11-24 18:41
====================================================================== 
Summary:                    Parsing turtle files with lots of namespaces is very
slow
Description: 
Turtle documents with lots of @prefix headers are very slow to parse. For
example, the first 9M triples of the 25M-triple BSBM dataset take 7m58s to
parse on a 2GHz 16GB Linux machine.

This is largely down to the way namespaces are represented. A quick hack to
use a simple hashtable instead of a list cuts the parse time down to
1m47s.
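
To illustrate the idea, here is a minimal sketch of a prefix-to-URI hash
table with chained buckets. It is not the attached patch and does not
reflect Raptor's actual namespace structures; the names ns_table,
ns_table_set, ns_table_get, ns_table_clear and the bucket count are all
hypothetical. The point is only that a lookup hashes the prefix and walks
one short chain, instead of scanning a list of every declared namespace.

```c
#include <stdlib.h>
#include <string.h>

#define NS_BUCKETS 1024  /* hypothetical fixed bucket count */

/* One declared namespace: prefix -> URI, chained within a bucket. */
typedef struct ns_entry {
  char *prefix;
  char *uri;
  struct ns_entry *next;
} ns_entry;

/* Must be zero-initialised (e.g. with memset) before first use. */
typedef struct {
  ns_entry *buckets[NS_BUCKETS];
} ns_table;

/* Simple multiplicative string hash (djb2 style). */
static unsigned long ns_hash(const char *s) {
  unsigned long h = 5381;
  while (*s)
    h = h * 33 + (unsigned char)*s++;
  return h;
}

/* Portable string copy; returns NULL on allocation failure. */
static char *ns_strdup(const char *s) {
  size_t n = strlen(s) + 1;
  char *p = malloc(n);
  if (p)
    memcpy(p, s, n);
  return p;
}

/* Declare or redeclare a prefix. Returns 0 on success, 1 on failure. */
static int ns_table_set(ns_table *t, const char *prefix, const char *uri) {
  unsigned long i = ns_hash(prefix) % NS_BUCKETS;
  ns_entry *e;

  for (e = t->buckets[i]; e; e = e->next) {
    if (!strcmp(e->prefix, prefix)) {
      /* A later @prefix for the same name replaces the earlier URI. */
      char *u = ns_strdup(uri);
      if (!u)
        return 1;
      free(e->uri);
      e->uri = u;
      return 0;
    }
  }

  e = malloc(sizeof(*e));
  if (!e)
    return 1;
  e->prefix = ns_strdup(prefix);
  e->uri = ns_strdup(uri);
  if (!e->prefix || !e->uri) {
    free(e->prefix);
    free(e->uri);
    free(e);
    return 1;
  }
  e->next = t->buckets[i];  /* push onto the front of the chain */
  t->buckets[i] = e;
  return 0;
}

/* Look up a prefix; returns the URI, or NULL if it was never declared. */
static const char *ns_table_get(const ns_table *t, const char *prefix) {
  const ns_entry *e;
  for (e = t->buckets[ns_hash(prefix) % NS_BUCKETS]; e; e = e->next)
    if (!strcmp(e->prefix, prefix))
      return e->uri;
  return NULL;
}

/* Free every entry, which avoids the kind of leak mentioned below. */
static void ns_table_clear(ns_table *t) {
  size_t i;
  for (i = 0; i < NS_BUCKETS; i++) {
    ns_entry *e = t->buckets[i];
    while (e) {
      ns_entry *next = e->next;
      free(e->prefix);
      free(e->uri);
      free(e);
      e = next;
    }
    t->buckets[i] = NULL;
  }
}
```

The fixed bucket array is also why a table like this costs more memory
than a list on small files: the buckets are allocated even when only a
handful of prefixes are declared.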

A patch that implements the quick hack is attached. It passes as many tests
as before (as far as I can see), but may leak memory, and is a little more
memory hungry on small files.
====================================================================== 

Issue History 
Date Modified    Username       Field                    Change               
====================================================================== 
2008-11-24 18:41 anonymous      New Issue                                    
2008-11-24 18:41 anonymous      File Added: raptor-ns-hash.patch                
   
======================================================================


