Knowledge database assisted gene marker selection for chronic lymphocytic leukemia

Xixi Xiang, Yu Ping Wang, Hongbao Cao, Xi Zhang

Research output: Contribution to journal › Article › peer-review

1 Scopus citation

Abstract

Objective: To investigate whether previously curated chronic lymphocytic leukemia (CLL) risk genes could be leveraged in gene marker selection for the diagnosis and prediction of CLL. Methods: A CLL genetic database (CLL_042017) was developed through a comprehensive analysis of CLL-gene relation data, in which 753 CLL target genes were curated. Expression values for these genes were used for case-control classification of four CLL datasets, with a sparse representation-based variable selection (SRVS) approach employed for feature (gene) selection. Results were compared with those obtained using analysis of variance (ANOVA)-based gene selection approaches. Results: For each of the four datasets, SRVS selected a subset of the 753 CLL target genes that yielded significantly higher classification accuracy (100%, 100%, 93.94%, and 89.39%) than randomly selected genes. The SRVS method also outperformed ANOVA in terms of classification accuracy. Conclusion: Gene markers selected from the 753 CLL genes could enable significantly greater accuracy in the prediction of CLL. SRVS provides an effective method for gene marker selection.
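The ANOVA-based baseline named in the Methods can be illustrated with a minimal sketch. This is not the authors' implementation and it does not implement SRVS; it only shows the general idea of ranking genes by a one-way ANOVA F statistic between cases and controls and keeping the top few. The function names (`anova_f`, `select_top_genes`), the gene labels, and the expression values are all hypothetical, and the data are synthetic.

```python
import random
from statistics import mean


def anova_f(cases, controls):
    """One-way ANOVA F statistic for one gene across two groups."""
    groups = [cases, controls]
    n_total = sum(len(g) for g in groups)
    grand = mean(cases + controls)
    # Between-group sum of squares; with two groups its df is 1.
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    # Within-group sum of squares; df is n_total minus the number of groups.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ms_within = ss_within / (n_total - len(groups))
    if ms_within == 0:
        # No within-group variance: perfectly separated, or fully constant.
        return float("inf") if ss_between > 0 else 0.0
    return ss_between / ms_within


def select_top_genes(expr, labels, k):
    """Return the k gene names with the largest F statistic.

    expr:   dict mapping gene name -> list of expression values per sample
    labels: list of 1 (case) or 0 (control), aligned with the samples
    """
    scores = {}
    for gene, values in expr.items():
        cases = [v for v, y in zip(values, labels) if y == 1]
        controls = [v for v, y in zip(values, labels) if y == 0]
        scores[gene] = anova_f(cases, controls)
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Synthetic toy data (assumption): 20 noise genes plus one gene whose
# expression is shifted upward in cases.
rng = random.Random(0)
labels = [1] * 10 + [0] * 10
expr = {f"GENE_{i}": [rng.gauss(0, 1) for _ in labels] for i in range(20)}
expr["GENE_SHIFTED"] = [rng.gauss(5 if y else 0, 1) for y in labels]

selected = select_top_genes(expr, labels, k=3)
print(selected)
```

In a setting like the one described, the selected genes would then be passed to a case-control classifier and the accuracy compared across selection methods; that step is omitted here.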

Original language: English
Pages (from-to): 3358-3364
Number of pages: 7
Journal: Journal of International Medical Research
Volume: 46
Issue number: 8
State: Published - Aug 1 2018
Externally published: Yes

Keywords

  • case-control classification
  • Chronic lymphocytic leukemia (CLL)
  • disease prediction
  • gene markers
  • genetic databases
  • sparse representation
  • variable selection
