TripleProv: Efficient processing of lineage queries in a native RDF store

Marcin Wylot, Philippe Cudré-Mauroux, Paul Groth

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

36 Scopus citations

Abstract

Given the heterogeneity of the data one can find on the Linked Data cloud, being able to trace back the provenance of query results is rapidly becoming a must-have feature of RDF systems. While provenance models have been extensively discussed in recent years, little attention has been given to the efficient implementation of provenance-enabled queries inside data stores. This paper introduces TripleProv: a new system extending a native RDF store to efficiently handle such queries. TripleProv implements two different storage models to physically co-locate lineage and instance data, and for each of them implements algorithms for tracing provenance at two granularity levels. In the following, we present the overall architecture of our system, its different lineage storage models, and the various query execution strategies we have implemented to efficiently answer provenance-enabled queries. In addition, we present the results of a comprehensive empirical evaluation of our system over two different datasets and workloads. Copyright is held by the International World Wide Web Conference Committee (IW3C2).

Original language: English
Title of host publication: WWW 2014 - Proceedings of the 23rd International Conference on World Wide Web
Publisher: Association for Computing Machinery
Pages: 455-465
Number of pages: 11
ISBN (Electronic): 9781450327442
State: Published - Apr 7 2014
Externally published: Yes
Event: 23rd International Conference on World Wide Web, WWW 2014 - Seoul, Korea, Republic of
Duration: Apr 7 2014 → Apr 11 2014

Publication series

Name: WWW 2014 - Proceedings of the 23rd International Conference on World Wide Web

Conference

Conference: 23rd International Conference on World Wide Web, WWW 2014
Country/Territory: Korea, Republic of
City: Seoul
Period: 04/7/14 → 04/11/14

Keywords

  • Linked Open Data
  • Provenance Polynomials
  • Provenance Queries
  • RDF
