[redland-dev] Redland scalability

Wed Aug 20 00:12:05 BST 2008

On Tue Aug 19, 2008 at 05:16:07PM -0300, Bruno Barberi Gnecco wrote:
> Hi,
>
> 	Has anyone here used Redland with a large number of triples (>10 million)?
> How does it scale?

once or twice i loaded some small dbpedia sets in that range

i can't really say how it scales today, as rasqal has seen some commits since then.

perhaps dajobe can comment on the dataloads Y! has put redland through?

proper indexing is essential at that size. with it, performance got closer to Virtuoso (which usually still won)

be aware Virtuoso is a gigantic monolithic beast with XML processing stuff, a SQL DB, etc.

on that note, i think the best solution towards more scalable semweb stuff is more modularity

eg, access to the internal rasqal set-intersection stuff, so one can do offline (ahead-of-time) aggregations guided by usage patterns
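
a rough sketch of what that could look like, in plain python. this is not rasqal's internal code or any Redland API: TinyStore, precompute and subjects_matching are made-up names, and the "usage pattern" is just a list of (predicate, object) pairs assumed to be queried often. the idea is only that the intersection is materialised once, offline, and then served from a cache.

```python
from collections import defaultdict


class TinyStore:
    """toy in-memory triple store (hypothetical, not Redland or rasqal)."""

    def __init__(self):
        self.by_po = defaultdict(set)  # (predicate, object) -> set of subjects
        self.cache = {}                # frozenset of (p, o) pairs -> subjects

    def add(self, s, p, o):
        self.by_po[(p, o)].add(s)
        self.cache.clear()  # simplest policy: any write drops all aggregates

    def _intersect(self, patterns):
        # intersect smallest set first so intermediate results stay small
        sets = sorted((self.by_po.get(pat, set()) for pat in patterns), key=len)
        if not sets:
            return set()
        result = set(sets[0])
        for other in sets[1:]:
            result &= other
        return result

    def precompute(self, patterns):
        """offline step: materialise the intersection for a known-hot pattern set."""
        key = frozenset(patterns)
        self.cache[key] = self._intersect(key)

    def subjects_matching(self, patterns):
        key = frozenset(patterns)
        if key in self.cache:          # answered from the ahead-of-time aggregate
            return set(self.cache[key])
        return self._intersect(key)    # fall back to intersecting at query time


store = TinyStore()
store.add("alice", "type", "Person")
store.add("alice", "livesIn", "Berlin")
store.add("bob", "type", "Person")

hot = [("type", "Person"), ("livesIn", "Berlin")]
store.precompute(hot)
print(store.subjects_matching(hot))
```

clearing the whole cache on every write is the crudest possible invalidation. a real store would want something finer-grained, which is exactly where access to the engine's internals would help.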

or override functions using some class-inheritance/super() technique and provide optimized SQL, etc.
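
a minimal sketch of the override idea, using python's sqlite3 module as a stand-in backend. none of this is Redland's API: TripleStore, IndexedTripleStore, find and the triples table layout are assumptions for illustration. the base class has a slow generic find(), and the subclass answers the one query shape it knows how to optimise with hand-written SQL, deferring everything else to super().

```python
import sqlite3


class TripleStore:
    """generic store (hypothetical base class, not a Redland API)."""

    def __init__(self):
        self.db = sqlite3.connect(":memory:")
        self.db.execute("CREATE TABLE triples (s TEXT, p TEXT, o TEXT)")

    def add(self, s, p, o):
        self.db.execute("INSERT INTO triples VALUES (?, ?, ?)", (s, p, o))

    def find(self, s=None, p=None, o=None):
        # generic path: pull every row and filter in python
        rows = self.db.execute("SELECT s, p, o FROM triples").fetchall()
        return [r for r in rows
                if (s is None or r[0] == s)
                and (p is None or r[1] == p)
                and (o is None or r[2] == o)]


class IndexedTripleStore(TripleStore):
    """same interface, but one hot query shape gets its own index and SQL."""

    def __init__(self):
        super().__init__()
        self.db.execute("CREATE INDEX idx_po ON triples (p, o)")

    def find(self, s=None, p=None, o=None):
        if s is None and p is not None and o is not None:
            # optimised path: let the database use the (p, o) index
            return self.db.execute(
                "SELECT s, p, o FROM triples WHERE p = ? AND o = ?", (p, o)
            ).fetchall()
        return super().find(s, p, o)  # anything else: generic behaviour


store = IndexedTripleStore()
store.add("alice", "type", "Person")
store.add("bob", "type", "Person")
store.add("alice", "livesIn", "Berlin")
print(sorted(store.find(p="type", o="Person")))
```

callers only ever see find(), so the optimised subclass can be swapped in without touching query code.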

i have since switched to a filesystem-based store, to support a flexible optimization/aggregation strategy and to remove the 'black box beast' components from the system. if god took away my FS, i'd definitely look at redland again before anything else

hope that helps