Çandır Soydemir, Emine Beyza and Kuru, Halil İbrahim and Rattray, Magnus and Çiçek, A. Ercüment and Taştan, Öznur (2026) One-hot news: drug synergy models shortcut molecular features. Bioinformatics, 42 (3). ISSN 1367-4803 (Print) 1367-4811 (Online)
Available under License Creative Commons Attribution.
Official URL: https://dx.doi.org/10.1093/bioinformatics/btag040
Abstract
Motivation: Combinatorial drug therapy holds great promise for tackling complex diseases, but the vast number of possible drug combinations makes exhaustive experimental testing infeasible. Computational models have been developed to guide experimental screens by assigning synergy scores to drug pair–cell line combinations; they take as input structural and chemical information on drugs and molecular features of cell lines. The premise of these models is that they leverage this biological and chemical information to predict synergy measurements.

Results: In this study, we demonstrate that replacing drug and cell line representations with simple one-hot encodings results in comparable or even slightly improved performance across diverse published drug combination models. This unexpected finding suggests that current models use these representations primarily as identifiers and exploit covariation in the synergy labels. Our synthetic data experiments show that models can learn from the true features; however, when drugs and cell lines recur across drug–drug–cell triplets, this repeating structure impairs feature-based learning. While the current synergy prediction models can aid in prioritizing drug pairs within a panel of tested drugs and cell lines, our results highlight the need for better strategies to learn from intended features and to generalize to unseen drugs and cell lines.

Availability and implementation: The scripts to run the experiments are available at: https://github.com/tastanlab/ohe
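To make the one-hot replacement described in the abstract concrete, the following is a minimal sketch of how identifier-only inputs for drug–drug–cell triplets could be built. It is not taken from the authors' scripts: the function name `one_hot_triplets` and the drug and cell line labels are hypothetical, and the layout (two drug blocks followed by one cell line block) is an assumption made for illustration.

```python
def one_hot_triplets(triplets, drugs, cell_lines):
    """Encode (drug_a, drug_b, cell_line) triplets using identifiers only.

    Each row is the concatenation of three one-hot blocks:
    [drug_a block | drug_b block | cell line block].
    No chemical or molecular features are used.
    """
    drug_index = {d: i for i, d in enumerate(drugs)}
    cell_index = {c: i for i, c in enumerate(cell_lines)}
    n_drugs, n_cells = len(drugs), len(cell_lines)

    rows = []
    for drug_a, drug_b, cell in triplets:
        row = [0] * (2 * n_drugs + n_cells)
        row[drug_index[drug_a]] = 1                 # first drug block
        row[n_drugs + drug_index[drug_b]] = 1       # second drug block
        row[2 * n_drugs + cell_index[cell]] = 1     # cell line block
        rows.append(row)
    return rows


# Hypothetical panel: three drugs and two cell lines.
drugs = ["drug_A", "drug_B", "drug_C"]
cell_lines = ["cell_1", "cell_2"]
triplets = [("drug_A", "drug_B", "cell_1"), ("drug_C", "drug_A", "cell_2")]

X = one_hot_triplets(triplets, drugs, cell_lines)
```

Because each row only says which drugs and which cell line are involved, a model trained on such inputs cannot be using structural or molecular information. This is why matching performance with these inputs points to the original features acting mainly as identifiers. It also shows why the encoding cannot represent a drug or cell line outside the panel.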
| Item Type: | Article |
|---|---|
| Additional Information: | This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
| Divisions: | Center of Excellence in Data Analytics; Faculty of Engineering and Natural Sciences |
| Depositing User: | Öznur Taştan |
| Date Deposited: | 30 Apr 2026 16:38 |
| Last Modified: | 30 Apr 2026 16:38 |
| URI: | https://research.sabanciuniv.edu/id/eprint/53959 |

