Abstract
This paper introduces the LOD Laundromat meta-dataset, a continuously updated RDF meta-dataset that describes the documents crawled, cleaned and (re)published by the LOD Laundromat. This meta-dataset of over 110 million triples contains structural information for more than 650,000 documents (and growing). Dataset meta-data is often not provided alongside published data, is incomplete, or is incomparable given the way it was generated. The LOD Laundromat meta-dataset provides a wide range of structural dataset properties, such as the number of triples in LOD Laundromat documents, the average degree in documents, and the number of distinct Blank Nodes, Literals and IRIs. This makes it a particularly useful dataset for data comparison and analytics, as well as for the global study of the Web of Data. This paper presents the dataset, its requirements, and its impact.
Original language | English |
---|---|
Pages (from-to) | 1067-1080 |
Number of pages | 14 |
Journal | Semantic Web |
Volume | 8 |
Issue number | 6 |
State | Published - 2017 |
Externally published | Yes |