Albayrak, Aydın and Sezerman, Uğur (2012) Discrimination of thermophilic and mesophilic proteins using reduced amino acid alphabets with n-grams. Current Bioinformatics, 7 (2). pp. 152-158. ISSN 1574-8936
This is the latest version of this item.
MS Word (This is a RoMEO yellow journal: the author can archive the pre-print, i.e. the pre-refereeing version)
AlbayrakCurrentbioinformaticc..docx
Download (114kB)
Abstract
Protein thermostabilization has been the focus of recent research due to growing interest in producing enzymes that can operate at industrially beneficial temperatures. Understanding the determinants of thermostabilization at the level of sequence and structure is important for designing such enzymes. A bioinformatics approach was used to determine the extent to which reduced amino acid alphabets (RAAA) with n-grams (subsequences of length n), subjected to a t-test-based feature selection procedure, can be used to discriminate proteins from thermophiles and mesophiles. The classification performance of 65 different protein alphabets with 3 different n-gram sizes was systematically evaluated using support vector machines on a test set that contained 707 proteins from mesophilic Xylella fastidiosa and thermophilic Aquifex aeolicus. A classification accuracy of 91.796% was achieved with the Hsdm16 RAAA with 13 features: EK-ILV-ST-A-G-F-H-Q-N-R-M-W-Y. The t-test-based feature selection procedure reduced the classification time without significantly affecting classification accuracy. The overall combination of methods in this paper is useful and computationally fast for classifying protein sequences from thermophiles and mesophiles using sequence information alone.
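The feature-extraction idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`reduce_sequence`, `ngram_frequencies`, `welch_t`, `rank_features`) are hypothetical, each group is represented by its first letter as an arbitrary choice, residues outside the listed groups are assumed to be left unchanged, and Welch's t-statistic stands in for whatever t-test variant the paper used. The support vector machine step is omitted.

```python
from collections import Counter
from math import sqrt
from statistics import mean, variance

# Groups exactly as listed in the abstract. Representing each group by its
# first letter is an assumption made for this sketch.
GROUPS = "EK-ILV-ST-A-G-F-H-Q-N-R-M-W-Y".split("-")
REDUCE = {aa: group[0] for group in GROUPS for aa in group}


def reduce_sequence(seq):
    """Map each residue to its group representative.

    Residues not covered by the listed groups are kept unchanged
    (an assumption, not something the abstract specifies).
    """
    return "".join(REDUCE.get(aa, aa) for aa in seq.upper())


def ngram_frequencies(seq, n):
    """Relative frequency of each overlapping n-gram in the reduced sequence."""
    reduced = reduce_sequence(seq)
    total = len(reduced) - n + 1
    if total <= 0:
        return {}
    counts = Counter(reduced[i:i + n] for i in range(total))
    return {gram: count / total for gram, count in counts.items()}


def welch_t(a, b):
    """Welch's t-statistic for two samples (each needs at least two values)."""
    denom = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return 0.0 if denom == 0 else (mean(a) - mean(b)) / denom


def rank_features(group_a, group_b, n):
    """Rank n-grams by |t| between two groups of sequences."""
    freqs_a = [ngram_frequencies(s, n) for s in group_a]
    freqs_b = [ngram_frequencies(s, n) for s in group_b]
    grams = set().union(*freqs_a, *freqs_b)
    scores = {
        g: welch_t([f.get(g, 0.0) for f in freqs_a],
                   [f.get(g, 0.0) for f in freqs_b])
        for g in grams
    }
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)
```

In a pipeline of this kind, the top-ranked n-grams would be kept as the feature vector passed to a classifier, which is where the reduction in classification time reported in the abstract would come from.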
Item Type: | Article |
---|---|
Uncontrolled Keywords: | Amino acid composition, dipeptide, N-grams, reduced amino acid alphabets, statistically significant features, thermostability, tripeptide |
Subjects: | Q Science > Q Science (General) |
Divisions: | Faculty of Engineering and Natural Sciences > Academic programs > Biological Sciences & Bio Eng.; Faculty of Engineering and Natural Sciences |
Depositing User: | Uğur Sezerman |
Date Deposited: | 22 Jun 2012 10:02 |
Last Modified: | 31 Jul 2019 11:03 |
URI: | https://research.sabanciuniv.edu/id/eprint/19121 |
Available Versions of this Item
- Discrimination of thermophilic and mesophilic proteins using reduced amino acid alphabets with n-grams. (deposited 23 Dec 2011 10:44)
- Discrimination of thermophilic and mesophilic proteins using reduced amino acid alphabets with n-grams. (deposited 22 Jun 2012 10:02) [Currently Displayed]