Data Integration Landscapes: The Case for Non-optimal Solutions in Network Diffusion Models

James Nevin, Paul Groth, Michael Lees

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Scopus citation

Abstract

The successful application of computational models presupposes access to accurate, relevant, and representative datasets. The growth of public data, and the increasing practice of data sharing and reuse, emphasises the importance of data provenance and increases the need for modellers to understand how data processing decisions might impact model output. One key step in the data processing pipeline is that of data integration and entity resolution, where entities are matched across disparate datasets. In this paper, we present a new formulation of data integration in complex networks that incorporates integration uncertainty. We define an approach for understanding how different data integration setups can impact the results of network diffusion models under this uncertainty, allowing one to systematically characterise potential model outputs in order to create an output distribution that provides a more comprehensive picture.
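The idea in the abstract can be illustrated with a small sketch. It is not the authors' formulation. It assumes, purely for illustration, that integration uncertainty is given as a list of candidate matches with acceptance probabilities, that a "data integration setup" is one sampled subset of accepted matches, and that the diffusion model is a simple susceptible-infected (SI) process. All names (`merge_nodes`, `si_outbreak_size`, `output_distribution`) and the toy data are hypothetical.

```python
import random
from collections import Counter


def merge_nodes(edges, accepted):
    """Merge matched entities; return the integrated edge set and a lookup function."""
    parent = {}

    def find(x):
        # Follow merge links until reaching the representative node.
        while parent.get(x, x) != x:
            x = parent[x]
        return x

    for a, b in accepted:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    merged = set()
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:  # drop self-loops created by merging
            merged.add((min(ru, rv), max(ru, rv)))
    return merged, find


def si_outbreak_size(edges, seed, beta, steps, rng):
    """Run a discrete-time SI process and return the number of infected nodes."""
    neighbours = {}
    for u, v in edges:
        neighbours.setdefault(u, set()).add(v)
        neighbours.setdefault(v, set()).add(u)
    infected = {seed}
    for _ in range(steps):
        newly = set()
        for node in sorted(infected):
            for nb in sorted(neighbours.get(node, ())):
                if nb not in infected and rng.random() < beta:
                    newly.add(nb)
        infected |= newly
    return len(infected)


def output_distribution(edges, candidates, seed, beta=0.5, steps=5,
                        samples=200, rng_seed=0):
    """Sample integration setups and collect the resulting outbreak sizes."""
    rng = random.Random(rng_seed)
    sizes = Counter()
    for _ in range(samples):
        # Assumption: each candidate match is accepted independently with probability p.
        accepted = [(a, b) for a, b, p in candidates if rng.random() < p]
        merged_edges, find = merge_nodes(edges, accepted)
        sizes[si_outbreak_size(merged_edges, find(seed), beta, steps, rng)] += 1
    return sizes


# Toy data: dataset A (a1-a2-a3), dataset B (b1-b2), one uncertain match a3 ~ b1.
edges = [("a1", "a2"), ("a2", "a3"), ("b1", "b2")]
candidates = [("a3", "b1", 0.6)]

print(output_distribution(edges, candidates, seed="a1"))
```

Reporting only the outcome for the single most likely integration would hide the spread that the returned `Counter` makes visible: in this toy setup, diffusion can reach dataset B only when the uncertain match is accepted.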

Original language: English
Title of host publication: Computational Science – ICCS 2023 - 23rd International Conference, Proceedings
Editors: Jiří Mikyška, Clélia de Mulatier, Valeria V. Krzhizhanovskaya, Peter M.A. Sloot, Maciej Paszynski, Jack J. Dongarra
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 494-508
Number of pages: 15
ISBN (Print): 9783031359941
State: Published - 2023
Externally published: Yes
Event: 23rd International Conference on Computational Science, ICCS 2023 - Prague, Czech Republic
Duration: Jul 3 2023 to Jul 5 2023

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 14073 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 23rd International Conference on Computational Science, ICCS 2023
Country/Territory: Czech Republic
City: Prague
Period: 07/3/23 to 07/5/23

Keywords

  • Complex networks
  • Data integration
  • Entity resolution
  • Network diffusion models
