Prediction of permissive insertion sites in proteins
Tayeh, Husamaldin H. A. (2013) Prediction of permissive insertion sites in proteins. [Thesis]
Domain insertion has proven very effective for creating modified proteins that can be used in a range of protein engineering applications. It alters the function of a protein by inserting one or more genes into certain domains. Proteins usually tolerate insertions only at specific sites, so identifying these permissive insertion sites is crucial for any successful insertion attempt. Normally, permissive insertion sites are determined experimentally by a genetic approach. However, an educated guess can assist in predicting potential permissive insertion sites. In this work, we introduce a method for predicting permissive insertion sites using machine learning and data mining techniques. We adopt an educated guess approach that extracts distinctive features from the amino acids surrounding the insertion site, as captured within an amino acid window. The window size is adjustable and can capture any odd number of amino acids. We used a number of amino acid features obtained from this window to train an SVM model on 135 permissive and non-permissive sites obtained from 10 different proteins. The trained model was used to predict permissive insertion sites in the outer membrane usher protein FasD, the lactose operon repressor LacI, the type II secretion system protein XpsD, and the maltose periplasmic protein MalE, achieving accuracies of 70.59%, 61.11%, 61.90%, and 90.00%, respectively.
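The windowing step described above can be illustrated with a minimal sketch. It is not the thesis implementation: the function names `extract_window` and `window_features`, the `X` padding at the sequence ends, the choice to center the window on a residue index, and the two features (mean Kyte-Doolittle hydropathy and fraction of charged residues) are all assumptions made for illustration, since the abstract does not list the features actually used.

```python
# Kyte-Doolittle hydropathy scale. Using this scale as a feature is an
# assumption; the abstract does not name the features used in the thesis.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def extract_window(sequence, site, size):
    """Return `size` residues centered on 0-based index `site`.

    The window must contain an odd number of residues. Positions that
    fall outside the sequence are padded with 'X' (a hypothetical
    convention chosen for this sketch).
    """
    if size < 1 or size % 2 == 0:
        raise ValueError("window size must be a positive odd number")
    if not 0 <= site < len(sequence):
        raise IndexError("site is outside the sequence")
    half = size // 2
    padded = "X" * half + sequence + "X" * half
    # After padding, the residue at `site` sits at index `site + half`,
    # so the slice below is centered on it.
    return padded[site:site + size]


def window_features(window):
    """Turn a window into a small numeric feature vector (illustrative)."""
    residues = [r for r in window if r in KD]  # skip padding/unknowns
    mean_hydropathy = (
        sum(KD[r] for r in residues) / len(residues) if residues else 0.0
    )
    charged_fraction = sum(r in "DEKR" for r in residues) / len(window)
    return [mean_hydropathy, charged_fraction]


if __name__ == "__main__":
    seq = "MKTAYIAKQR"  # made-up sequence, not taken from the thesis data
    window = extract_window(seq, site=4, size=5)
    print(window, window_features(window))
```

Feature vectors of this kind, one per labeled permissive or non-permissive site, would then be passed to an SVM implementation for training. Trying several odd window sizes is one way to use the adjustable window mentioned above.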