Sunar, Emine Ayşe and Işık, Zeynep and Pekey, Mert and Cinbiş, Ramazan Gökberk and Taştan, Öznur (2025) DARKIN: a zero-shot benchmark for phosphosite-dark kinase association using protein language models. Bioinformatics, 41 (11). ISSN 1367-4803 (Print) 1367-4811 (Online)
DARKIN.pdf
Available under License Creative Commons Attribution.
Official URL: https://dx.doi.org/10.1093/bioinformatics/btaf480
Abstract
Motivation: Protein language models (pLMs) have emerged as powerful tools for capturing the intricate information encoded in protein sequences, facilitating various downstream protein prediction tasks. With numerous pLMs available, there is a critical need for diverse benchmarks to systematically evaluate their performance across biologically relevant tasks. Here, we introduce DARKIN, a zero-shot classification benchmark designed to assign phosphosites to understudied kinases, termed dark kinases. Kinases, which catalyze phosphorylation, are central to cellular signaling pathways. While phosphoproteomics enables the large-scale identification of phosphosites, determining the cognate kinase responsible for the phosphorylation event remains an experimental challenge.

Results: In DARKIN, we prepared training, validation, and test folds that respect the zero-shot nature of this classification problem, incorporating stratification based on kinase groups and sequence similarity. We evaluated multiple pLMs using two zero-shot classifiers: a novel, training-free k-NN-based method, and a bilinear classifier. Our findings indicate that ESM, ProtT5-XL, and SaProt exhibit superior performance on this task. DARKIN provides a challenging benchmark for assessing pLM efficacy and fosters deeper exploration of under-characterized (dark) kinases by offering a biologically relevant test bed.

Availability and implementation: The DARKIN benchmark data and the scripts for generating additional splits are publicly available at: https://github.com/tastanlab/darkin

| Item Type: | Article |
|---|---|
| Additional Information: | This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
| Divisions: | Faculty of Engineering and Natural Sciences |
| Depositing User: | Öznur Taştan |
| Date Deposited: | 04 Feb 2026 15:44 |
| Last Modified: | 04 Feb 2026 15:44 |
| URI: | https://research.sabanciuniv.edu/id/eprint/53050 |

