TY - JOUR
T1 - Co-occurrence of Cell Lines, Basal Media and Supplementation in the Biomedical Research Literature
AU - Cox, Jessica
AU - McBeath, Darin
AU - Harper, Corey A.
AU - Daniel, Ronald
N1 - Publisher Copyright:
© 2020 Jessica Cox et al., published by Sciendo.
PY - 2020/8/1
Y1 - 2020/8/1
N2 - The use of in vitro cell culture and experimentation is a cornerstone of biomedical research; however, more attention has recently been given to the potential consequences of using artificial basal media and undefined supplements. As a first step towards better understanding and measuring the impact these systems have on experimental results, we use text mining to capture typical research practices and trends around cell culture. To measure the scale of in vitro cell culture use, we analyzed a corpus of 94,695 research articles that appear in biomedical research journals published in ScienceDirect from 2000 to 2018. Central to our investigation is the observation that studies using cell culture describe conditions using the typical sentence structure of cell line, basal media, and supplemented compounds. Here we tag our corpus with a curated list of basal media and the Cellosaurus ontology using the Aho-Corasick algorithm. We also processed the corpus with Stanford CoreNLP to find nouns that follow the basal media, in an attempt to identify supplements used. Interestingly, we find that researchers frequently use DMEM even if a cell line's vendor recommends less concentrated media. We see long-tailed distributions for the usage of media and cell lines, with DMEM and RPMI dominating the media, and HEK293, HEK293T, and HeLa dominating the cell lines used. Our analysis was restricted to documents in ScienceDirect, and our text mining method achieved high recall but low precision and mandated manual inspection of many tokens. Our findings document current cell culture practices in the biomedical research community, which can be used as a resource for future experimental design. No other work has taken a text mining approach to surveying cell culture practices in biomedical research.
AB - The use of in vitro cell culture and experimentation is a cornerstone of biomedical research; however, more attention has recently been given to the potential consequences of using artificial basal media and undefined supplements. As a first step towards better understanding and measuring the impact these systems have on experimental results, we use text mining to capture typical research practices and trends around cell culture. To measure the scale of in vitro cell culture use, we analyzed a corpus of 94,695 research articles that appear in biomedical research journals published in ScienceDirect from 2000 to 2018. Central to our investigation is the observation that studies using cell culture describe conditions using the typical sentence structure of cell line, basal media, and supplemented compounds. Here we tag our corpus with a curated list of basal media and the Cellosaurus ontology using the Aho-Corasick algorithm. We also processed the corpus with Stanford CoreNLP to find nouns that follow the basal media, in an attempt to identify supplements used. Interestingly, we find that researchers frequently use DMEM even if a cell line's vendor recommends less concentrated media. We see long-tailed distributions for the usage of media and cell lines, with DMEM and RPMI dominating the media, and HEK293, HEK293T, and HeLa dominating the cell lines used. Our analysis was restricted to documents in ScienceDirect, and our text mining method achieved high recall but low precision and mandated manual inspection of many tokens. Our findings document current cell culture practices in the biomedical research community, which can be used as a resource for future experimental design. No other work has taken a text mining approach to surveying cell culture practices in biomedical research.
KW - Biomedical research
KW - Cell culture
KW - Text mining
UR - http://www.scopus.com/inward/record.url?scp=85087994150&partnerID=8YFLogxK
U2 - 10.2478/jdis-2020-0016
DO - 10.2478/jdis-2020-0016
M3 - Article
SN - 2096-157X
VL - 5
SP - 161
EP - 177
JO - Journal of Data and Information Science
JF - Journal of Data and Information Science
IS - 3
ER -