TY - JOUR
T1 - CyclinPred
T2 - A SVM-based method for predicting cyclin protein sequences
AU - Kalita, Mridul K.
AU - Nandal, Umesh K.
AU - Pattnaik, Ansuman
AU - Sivalingam, Anandhan
AU - Ramasamy, Gowthaman
AU - Kumar, Manish
AU - Raghava, Gajendra P.S.
AU - Gupta, Dinesh
N1 - Copyright 2008 Elsevier B.V., All rights reserved.
PY - 2008/7/2
Y1 - 2008/7/2
N2 - Functional annotation of protein sequences with low similarity to well-characterized protein sequences is a major challenge of computational biology in the post-genomic era. The cyclin protein family is one such important family: it consists of sequences with low sequence similarity, making the discovery of novel cyclins and the establishment of orthologous relationships amongst the cyclins a difficult task. The currently identified cyclin motifs and cyclin-associated domains do not represent all of the identified and characterized cyclin sequences. We describe a Support Vector Machine (SVM) based classifier, CyclinPred, which can predict cyclin sequences with high efficiency. The SVM classifier was trained with features of selected cyclin and non-cyclin protein sequences. The training features of the protein sequences include amino acid composition, dipeptide composition, secondary structure composition and PSI-BLAST generated Position Specific Scoring Matrix (PSSM) profiles. Results obtained from leave-one-out cross-validation (jackknife test), self-consistency and holdout tests show that the SVM classifier trained with features of the PSSM profile was more accurate than the classifiers based on any of the other features alone or on hybrids of these features. A cyclin prediction server, CyclinPred, has been set up based on the SVM model trained with PSSM profiles. CyclinPred prediction results show that the method may be used as a cyclin prediction tool, complementing conventional cyclin prediction methods.
AB - Functional annotation of protein sequences with low similarity to well-characterized protein sequences is a major challenge of computational biology in the post-genomic era. The cyclin protein family is one such important family: it consists of sequences with low sequence similarity, making the discovery of novel cyclins and the establishment of orthologous relationships amongst the cyclins a difficult task. The currently identified cyclin motifs and cyclin-associated domains do not represent all of the identified and characterized cyclin sequences. We describe a Support Vector Machine (SVM) based classifier, CyclinPred, which can predict cyclin sequences with high efficiency. The SVM classifier was trained with features of selected cyclin and non-cyclin protein sequences. The training features of the protein sequences include amino acid composition, dipeptide composition, secondary structure composition and PSI-BLAST generated Position Specific Scoring Matrix (PSSM) profiles. Results obtained from leave-one-out cross-validation (jackknife test), self-consistency and holdout tests show that the SVM classifier trained with features of the PSSM profile was more accurate than the classifiers based on any of the other features alone or on hybrids of these features. A cyclin prediction server, CyclinPred, has been set up based on the SVM model trained with PSSM profiles. CyclinPred prediction results show that the method may be used as a cyclin prediction tool, complementing conventional cyclin prediction methods.
UR - http://www.scopus.com/inward/record.url?scp=49749089288&partnerID=8YFLogxK
U2 - 10.1371/journal.pone.0002605
DO - 10.1371/journal.pone.0002605
M3 - Article
C2 - 18596929
AN - SCOPUS:49749089288
SN - 1932-6203
VL - 3
JO - PLoS ONE
JF - PLoS ONE
IS - 7
M1 - e2605
ER -