A dataset of biomedical entities extracted from the CORD-19 dataset (2020-08-28 and 2020-09-28 releases) using trained models from the SciSpaCy project: named entity recognition (NER) models trained against CRAFT, JNLPBA, BC5CDR, and BioNLP, and NER-and-linking (NERL) models that link entities to UMLS, MeSH, GO, HPO, and RxNorm. The extracted entities are provided as structured Parquet files. The dataset may be useful for downstream tasks such as entity linking and relationship extraction. The work was carried out using Dask on the Saturn Cloud platform as a joint effort between Elsevier Labs and Saturn Cloud.
|Original language|American English|
|State|Published - Oct 24 2020|