TripleCloud: An infrastructure for exploratory querying over Web-scale RDF data

Christophe Guéret, Spyros Kotoulas, Paul Groth

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

9 Scopus citations

Abstract

As the availability of large-scale RDF data sets has grown, there has been a corresponding growth in researchers' and practitioners' interest in analyzing and investigating these data sets. However, given their size and messiness, there is significant overhead in setting up the infrastructure to store and query them. In this paper, we present TripleCloud, a system that aims to lower the entry cost of exploring Web-scale RDF data sets. The system takes advantage of existing cloud-based key-value stores (e.g., BigTable, HBase) both to enable scalability and to hide the complexities of infrastructure deployment and maintenance. It layers over these key-value stores a robust query engine able to return approximate answers. We test the scalability of the approach, scaling to over 3 billion triples for complex queries. In addition to an implementation over HBase, TripleCloud runs over the Google App Engine, allowing us to perform a cost evaluation of the approach.

Original language: English
Title of host publication: Proceedings - 2011 IEEE/WIC/ACM International Joint Conferences on Web Intelligence and Intelligent Agent Technology - Workshops, WI-IAT 2011
Pages: 245-248
Number of pages: 4
State: Published - 2011
Externally published: Yes
Event: 2011 IEEE/WIC/ACM International Joint Conferences on Web Intelligence and Intelligent Agent Technology - Workshops, WI-IAT 2011 - Lyon, France
Duration: Aug 22 2011 to Aug 27 2011

Publication series

Name: Proceedings - 2011 IEEE/WIC/ACM International Joint Conferences on Web Intelligence and Intelligent Agent Technology - Workshops, WI-IAT 2011
Volume: 3

Conference

Conference: 2011 IEEE/WIC/ACM International Joint Conferences on Web Intelligence and Intelligent Agent Technology - Workshops, WI-IAT 2011
Country/Territory: France
City: Lyon
Period: 08/22/11 to 08/27/11

Keywords

  • Cloud computing
  • Key-value stores
  • RDF
  • SPARQL
