Sumonet: Deep sequential prediction of sumoylation sites

Dilekoğlu, Berke (2022) Sumonet: Deep sequential prediction of sumoylation sites. [Thesis]

Full text: 10487537.pdf (PDF, 2MB)

Abstract

SUMOylation is a reversible post-translational protein modification in which SUMOs (small ubiquitin-like modifiers) covalently attach to a specific lysine residue of the target protein. This process is vital for many cellular events such as protein binding, subcellular transport, DNA repair, and cellular signaling. Aberrant SUMOylation is linked with several diseases, including Alzheimer’s, cancer, and diabetes. Therefore, accurate identification of SUMOylation sites is essential to understanding cellular processes and the pathologies that arise when they are disrupted. In this thesis, we present three deep neural architectures, SUMOnets, that take the peptide sequence centered on the candidate SUMOylation site as input and predict whether the lysine could be SUMOylated. Each of these models, SUMOnet-1, -2 and -3, relies on a different composition of deep sequential learning architectural units, such as bidirectional gated recurrent units (biGRUs) and convolutional layers. We evaluate these models on the benchmark dataset with three different representations of the input peptide sequence. SUMOnet-3 achieves 75.8% AUPR and 87% AUC scores, corresponding to an approximately 5% improvement over the closest state-of-the-art SUMOylation predictor. We also create a challenging subset of the test data based on the absence and presence of known SUMOylation motifs. The performance of every method degrades on this subset, and that of the current methods decreases significantly, but SUMOnet-3 remains the best predictor. The SUMOnet-3 framework is available as an open source project and a Python library at https://github.com/berkedilekoglu/SUMOnet.
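The input described in the abstract, a peptide sequence centered on a candidate lysine, can be illustrated with a minimal sketch. This is not the SUMOnet library's API: the function name lysine_windows, the flank size of 10 residues (a 21-residue window), and the padding character "X" are all assumptions made for illustration, since the abstract does not state the window length or how sequence ends are handled.

```python
def lysine_windows(sequence, flank=10, pad="X"):
    """Return (position, window) pairs for every lysine (K) in a protein sequence.

    Hypothetical helper: `flank` and `pad` are illustrative assumptions,
    not values taken from the thesis.
    """
    # Pad both ends so lysines near the termini still sit at the window center.
    padded = pad * flank + sequence + pad * flank
    windows = []
    for i, residue in enumerate(sequence):
        if residue == "K":
            # Residue i sits at index i + flank in the padded string, so the
            # slice starting at i with length 2 * flank + 1 is centered on it.
            window = padded[i:i + 2 * flank + 1]
            windows.append((i + 1, window))  # report 1-based positions
    return windows


if __name__ == "__main__":
    for position, window in lysine_windows("MKAAK", flank=2):
        print(position, window)
```

Each window would then be converted into a numeric representation before being passed to a model; the abstract mentions three such representations but does not name them, so that step is left out here.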
Item Type: Thesis
Uncontrolled Keywords: Deep sequential learning. -- SUMOylation. -- Post-translational Modifications. -- CNNs, Transformers. -- Machine Learning. -- PTM.
Subjects: T Technology > TK Electrical engineering. Electronics. Nuclear engineering > TK7800-8360 Electronics > TK7885-7895 Computer engineering. Computer hardware
Divisions: Faculty of Engineering and Natural Sciences > Academic programs > Computer Science & Eng.
Faculty of Engineering and Natural Sciences
Depositing User: Dila Günay
Date Deposited: 27 Apr 2023 14:05
Last Modified: 27 Apr 2023 14:05
URI: https://research.sabanciuniv.edu/id/eprint/47187
