Computational approaches to protein structure prediction
Işık, Zerrin (2003) Computational approaches to protein structure prediction. [Thesis]
The protein folding problem, which seeks to predict the native 3D fold (shape) of a protein from its amino acid sequence, remains one of the most prominent open problems in bioinformatics. Knowing the native fold of a protein helps in understanding its function in the cell. Determining the 3D structure of the huge number of available protein sequences requires efficient computational techniques. This thesis studies computational approaches that provide new solutions for the secondary structure prediction of proteins. The 3D structure of a protein is composed of secondary structure elements: α-helices, β-sheets, β-turns, and loops. The secondary structures of a protein strongly influence the formation of its 3D structure. Two subproblems within secondary structure prediction are studied in this thesis. The first study identifies the structural class (all-α, all-β, α/β, α+β) of a protein from its primary sequence. Because secondary structures strongly influence the formation of the 3D structure, the structural class provides a rough description of a protein's 3D structure. The approach combines a statistical classification technique, Support Vector Machines (SVM), with variations of amino acid composition information. The performance results demonstrate that using neighborhood information between amino acids, together with the high classification ability of the SVM, provides a significant improvement in the structural classification of proteins. The second study predicts one of the secondary structure elements, β-turns, from the primary sequence. The formation of β-turns is thought to play as critical a role in the protein folding pathway as the other secondary structures. Hence, Hidden Markov Models (HMM) and Artificial Neural Networks (ANN) have been developed to predict the location and type of β-turns from the amino acid sequence of a protein.
Neighborhood information between β-turns and other secondary structures is introduced through the design of suitable HMM topologies. An amino acid similarity matrix is used to supply evolutionary information between proteins. Although applying HMMs together with an amino acid similarity matrix is a new approach to predicting β-turns from the protein sequence, the initial results for β-turn prediction and type classification are promising.
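As an illustration of the kind of sequence features described in the first study, the following minimal Python sketch computes the amino acid composition of a sequence and, as one possible way to capture neighborhood information between amino acids, its dipeptide (adjacent-pair) composition. The abstract does not specify the exact feature variants used in the thesis, so the dipeptide choice and the function names `composition`, `dipeptide_composition`, and `feature_vector` are assumptions made for this example only.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def _clean(seq):
    """Uppercase the sequence and drop any non-standard symbols."""
    return "".join(aa for aa in seq.upper() if aa in AMINO_ACIDS)


def composition(seq):
    """Return the 20-dimensional amino acid composition (relative frequencies)."""
    seq = _clean(seq)
    counts = Counter(seq)
    total = len(seq)
    return [counts[aa] / total if total else 0.0 for aa in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Return the 400-dimensional frequency vector of adjacent residue pairs.

    Assumption: adjacent pairs stand in for "neighborhood information";
    the thesis may define its neighborhood features differently.
    """
    seq = _clean(seq)
    pairs = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    total = max(len(seq) - 1, 0)
    return [
        pairs[a + b] / total if total else 0.0
        for a, b in product(AMINO_ACIDS, repeat=2)
    ]


def feature_vector(seq):
    """Concatenate both compositions into one 420-dimensional vector."""
    return composition(seq) + dipeptide_composition(seq)
```

Vectors of this kind could then be passed to any SVM implementation (for example, scikit-learn's `sklearn.svm.SVC`) together with structural class labels; which SVM software and kernel the thesis used is not stated in this abstract.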