Managing the academic data lifecycle: A case study of HPCC

Michael E. Payne, Linh B. Ngo, Flavio Villanustre, Amy W. Apon

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

5 Scopus citations

Abstract

Academic data can be classified into multiple categories and come from a large number of sources. Many research areas require combining data from different sources into a unified set on which analytical techniques can be applied. In this paper, the authors introduce the High Performance Computing Cluster (HPCC) as a platform to streamline the process of ingesting, curating, integrating, and transforming scholarly data from multiple sources and in varying formats, particularly when several of these datasets lack common attributes to support the integration process.

Original language: English
Title of host publication: Proceedings - 2014 IEEE International Conference on Big Data, IEEE Big Data 2014
Editors: Wo Chang, Jun Huan, Nick Cercone, Saumyadipta Pyne, Vasant Honavar, Jimmy Lin, Xiaohua Tony Hu, Charu Aggarwal, Bamshad Mobasher, Jian Pei, Raghunath Nambiar
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 22-30
Number of pages: 9
ISBN (Electronic): 9781479956654
State: Published - 2014
Externally published: Yes
Event: 2nd IEEE International Conference on Big Data, IEEE Big Data 2014 - Washington, United States
Duration: Oct 27 2014 to Oct 30 2014

Publication series

Name: Proceedings - 2014 IEEE International Conference on Big Data, IEEE Big Data 2014

Conference

Conference: 2nd IEEE International Conference on Big Data, IEEE Big Data 2014
Country/Territory: United States
City: Washington
Period: 10/27/14 to 10/30/14

Keywords

  • Academic research
  • Big data
  • Data integration
  • HPCC
  • Scalable platform
