TY - GEN
T1 - Connecting scientific data to scientific experiments with provenance
AU - Miles, Simon
AU - Deelman, Ewa
AU - Groth, Paul
AU - Vahi, Karan
AU - Mehta, Gaurang
AU - Moreau, Luc
PY - 2007
Y1 - 2007
N2 - As scientific workflows and the data they operate on grow in size and complexity, the task of defining how those workflows should execute (which resources to use, where the resources must be in readiness for processing, etc.) becomes proportionally more difficult. While "workflow compilers", such as Pegasus, reduce this burden, a further problem arises: since specifying details of execution is now automatic, a workflow's results are harder to interpret, as they are partly due to specifics of execution. By automating steps between the experiment design and its results, we lose the connection between them, hindering interpretation of results. To reconnect the scientific data with the original experiment, we argue that scientists should have access to the full provenance of their data, including not only parameters, inputs and intermediary data, but also the abstract experiment, refined into a concrete execution by the "workflow compiler". In this paper, we describe preliminary work on adapting Pegasus to capture the process of workflow refinement in the PASOA provenance system.
AB - As scientific workflows and the data they operate on grow in size and complexity, the task of defining how those workflows should execute (which resources to use, where the resources must be in readiness for processing, etc.) becomes proportionally more difficult. While "workflow compilers", such as Pegasus, reduce this burden, a further problem arises: since specifying details of execution is now automatic, a workflow's results are harder to interpret, as they are partly due to specifics of execution. By automating steps between the experiment design and its results, we lose the connection between them, hindering interpretation of results. To reconnect the scientific data with the original experiment, we argue that scientists should have access to the full provenance of their data, including not only parameters, inputs and intermediary data, but also the abstract experiment, refined into a concrete execution by the "workflow compiler". In this paper, we describe preliminary work on adapting Pegasus to capture the process of workflow refinement in the PASOA provenance system.
UR - http://www.scopus.com/inward/record.url?scp=42449095840&partnerID=8YFLogxK
U2 - 10.1109/E-SCIENCE.2007.22
DO - 10.1109/E-SCIENCE.2007.22
M3 - Conference contribution
AN - SCOPUS:42449095840
SN - 0769530648
SN - 9780769530642
T3 - Proceedings - e-Science 2007, 3rd IEEE International Conference on e-Science and Grid Computing
SP - 179
EP - 186
BT - Proceedings - e-Science 2007, 3rd IEEE International Conference on e-Science and Grid Computing
T2 - E-Science 2007, 3rd IEEE International Conference on E-Science and Grid Computing
Y2 - 10 December 2007 through 13 December 2007
ER -