Abstract
This paper introduces the LOD Laundromat meta-dataset, a continuously updated RDF meta-dataset that describes the documents crawled, cleaned and (re)published by the LOD Laundromat. This meta-dataset of over 110 million triples contains structural information for more than 650,000 documents (and growing). Dataset meta-data is often not provided alongside published data, and when it is provided, it is often incomplete or incomparable because of the way it was generated. The LOD Laundromat meta-dataset provides a wide range of structural dataset properties, such as the number of triples in LOD Laundromat documents, the average degree in documents, and the number of distinct blank nodes, literals and IRIs. This makes it a particularly useful dataset for data comparison and analytics, as well as for the global study of the Web of Data. This paper presents the dataset, its requirements, and its impact.
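To make the structural properties named above concrete, the following is a minimal sketch of how such counts could be computed for a single document. It is not the LOD Laundromat implementation. The toy triples, the `kind` and `structural_summary` helpers, and the string conventions (`_:` prefix for blank nodes, a leading double quote for literals) are all illustrative assumptions, as is the definition of degree used here (the number of triples in which a term occurs as subject or object).

```python
from collections import Counter
from statistics import mean


def kind(term):
    """Classify a term by a simple string convention (an assumption of this sketch)."""
    if term.startswith("_:"):
        return "blank_node"
    if term.startswith('"'):
        return "literal"
    return "iri"


def structural_summary(triples):
    """Return basic structural counts for a collection of (s, p, o) string triples."""
    distinct_triples = set(triples)

    # Count each distinct term once, grouped by kind.
    terms = {term for triple in distinct_triples for term in triple}
    kinds = Counter(kind(term) for term in terms)

    # Assumed definition of degree: occurrences as subject or object.
    degree = Counter()
    for s, _p, o in distinct_triples:
        degree[s] += 1
        degree[o] += 1

    return {
        "triples": len(distinct_triples),
        "distinct_iris": kinds["iri"],
        "distinct_literals": kinds["literal"],
        "distinct_blank_nodes": kinds["blank_node"],
        "average_degree": mean(degree.values()) if degree else 0.0,
    }


# Hypothetical example document with three triples.
example = [
    ("http://ex.org/a", "http://ex.org/knows", "http://ex.org/b"),
    ("http://ex.org/a", "http://ex.org/name", '"Alice"'),
    ("_:b0", "http://ex.org/knows", "http://ex.org/a"),
]
summary = structural_summary(example)
print(summary)
```

Computing the same set of counts in the same way for every document is what makes the resulting values comparable across documents, which is the concern the abstract raises about independently generated meta-data.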
| Original language | English |
|---|---|
| Pages (from-to) | 1067-1080 |
| Number of pages | 14 |
| Journal | Semantic Web |
| Volume | 8 |
| Issue number | 6 |
| State | Published - 2017 |
| Externally published | Yes |