TY - GEN
T1 - Looking inside the black-box
T2 - 5th International Provenance and Annotation Workshop, IPAW 2014
AU - Stamatogiannakis, Manolis
AU - Groth, Paul
AU - Bos, Herbert
N1 - Publisher Copyright: © Springer International Publishing Switzerland 2015.
PY - 2015
Y1 - 2015
N2 - Knowing the provenance of a data item helps in ascertaining its trustworthiness. Various approaches have been proposed to track or infer data provenance. However, these approaches either treat an executing program as a black-box, limiting the fidelity of the captured provenance, or require developers to modify the program to make it provenance-aware. In this paper, we introduce DataTracker, a new approach to capturing data provenance based on taint tracking, a technique widely used in the security and reverse engineering fields. Our system is able to identify data provenance relations through dynamic instrumentation of unmodified binaries, without requiring access to, or knowledge of, their source code. Hence, we can track provenance for a variety of well-known applications. Because DataTracker looks inside the executing program, it captures high-fidelity and accurate data provenance.
KW - Data provenance
KW - Dynamic
KW - PROV
KW - Taint analysis
KW - Taint tracking
UR - http://www.scopus.com/inward/record.url?scp=84928778150&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-16462-5_12
DO - 10.1007/978-3-319-16462-5_12
M3 - Conference contribution
AN - SCOPUS:84928778150
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 155
EP - 167
BT - Provenance and Annotation of Data and Processes - 5th International Provenance and Annotation Workshop, IPAW 2014, Revised Selected Papers
A2 - Plale, Beth
A2 - Ludäscher, Bertram
PB - Springer Verlag
Y2 - 10 June 2014 through 11 June 2014
ER -