Pakatcı, Kemal İsa (2008) Protein secondary structure prediction with classifier fusion. [Thesis]
PDF
PakatciIsaKemal.pdf
Download (635kB)
Official URL: http://192.168.1.20/record=b1226363 (Table of Contents)
Abstract
The number of known protein sequences is increasing very rapidly. However, determining protein structure experimentally is costly and slow, so the number of proteins with known sequence but unknown structure keeps growing. Computational methods that predict the structure of a protein from its amino acid sequence are therefore very useful. In this thesis, we focus on a special type of protein structure prediction called secondary structure prediction. The problem can be divided into two cases: some sequences can be enriched by forming multiple alignment profiles, whereas others are single sequences for which no profile can be formed. We examine different aspects of both cases. The first case is when multiple sequence alignment information exists. We introduce a novel feature extraction technique that extracts unigram, bigram and positional features from profiles using dimension reduction and feature selection techniques. We use both these novel features and regular raw features for classification. We experiment with the following types of first-level classifiers: Linear Discriminant Classifiers (LDCs), Support Vector Machines (SVMs) and Hidden Markov Models (HMMs), and we introduce a novel method that combines these classifiers. The second case is protein secondary structure prediction for single sequences. We explore different methods of training set reduction in order to increase the prediction accuracy of the previously introduced IPSSP (Iterative Protein Secondary Structure Prediction) algorithm [34]. Results show that composition-based training set reduction is useful in predicting the secondary structures of orphan proteins.
Item Type: | Thesis |
---|---|
Uncontrolled Keywords: | Protein. -- Structure. -- Secondary structure prediction. -- Protein profile. -- Protein analysis. -- Protein. -- Yapı. -- İkincil yapı kestirimi. -- Protein profili. -- Protein analizleri. |
Subjects: | T Technology > TK Electrical engineering. Electronics. Nuclear engineering > TK7800-8360 Electronics |
Divisions: | Faculty of Engineering and Natural Sciences > Academic programs > Electronics; Faculty of Engineering and Natural Sciences |
Depositing User: | IC-Cataloging |
Date Deposited: | 01 Jun 2010 16:15 |
Last Modified: | 26 Apr 2022 09:51 |
URI: | https://research.sabanciuniv.edu/id/eprint/13992 |