Minisymposium Presentation

Leveraging Large Datasets to Assess the Potential of Machine Learning for Drug Target Prediction through Reverse Screening

Monday, June 3, 2024
15:30 - 16:00 CEST
Climate, Weather and Earth Sciences
Chemistry and Materials
Computer Science and Applied Mathematics
Humanities and Social Sciences
Engineering
Life Sciences
Physics

Description

Estimating the protein targets of compounds based on the similarity principle is a long-standing strategy in drug discovery. Building on a prior quantification of this principle, a large-scale assessment of its predictive power was performed using an unprecedentedly large external test set of more than 300’000 active small molecules, screened against a separate bioactivity set of more than 500’000 compounds. Machine learning was found to predict the correct targets, ranked with the highest probability among 2069 proteins, for more than 51% of the external molecules. The strong enrichment thus obtained demonstrates the usefulness of the approach in supporting phenotypic screens, polypharmacology, or repurposing. Moreover, the impact of the bioactivity knowledge available for each protein, in terms of the number and diversity of its actives, was investigated. This study advocates the adoption of application-oriented benchmarking strategies, together with large, high-quality, non-overlapping datasets, to prevent accidental overestimation of the predictive ability of such methods.
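
The evaluation idea described above (rank candidate proteins for each external molecule, then count how often a known target comes first) can be illustrated with a minimal sketch. This is not the presenters' method: it swaps the machine-learning model for a plain nearest-neighbour Tanimoto similarity, and the fingerprints, target names, and the helpers `tanimoto`, `rank_targets`, and `top1_success_rate` are all hypothetical toy stand-ins.

```python
from typing import Dict, List, Set, Tuple

# Assumption: a molecule is represented as a set of "on" fingerprint bits.
Fingerprint = Set[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto (Jaccard) similarity between two bit sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_targets(
    query: Fingerprint,
    actives_by_target: Dict[str, List[Fingerprint]],
) -> List[Tuple[str, float]]:
    """Score each target by the query's similarity to its closest known active."""
    scores = {
        target: max((tanimoto(query, fp) for fp in fps), default=0.0)
        for target, fps in actives_by_target.items()
    }
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


def top1_success_rate(
    test_set: List[Tuple[Fingerprint, Set[str]]],
    actives_by_target: Dict[str, List[Fingerprint]],
) -> float:
    """Fraction of test molecules whose top-ranked target is a known target."""
    hits = 0
    for query, true_targets in test_set:
        best_target = rank_targets(query, actives_by_target)[0][0]
        hits += best_target in true_targets
    return hits / len(test_set)


# Toy reference set: known actives per (hypothetical) protein target.
reference = {
    "kinase_A": [{1, 2, 3, 4}, {2, 3, 4, 5}],
    "gpcr_B": [{10, 11, 12}, {11, 12, 13, 14}],
    "protease_C": [{20, 21, 22, 23}],
}

# Toy external test set, kept disjoint from the reference molecules.
external = [
    ({1, 2, 3, 9}, {"kinase_A"}),
    ({11, 12, 13}, {"gpcr_B"}),
    ({20, 30, 31}, {"kinase_A"}),  # closest reference active belongs to protease_C
]

print(f"top-1 success rate: {top1_success_rate(external, reference):.2f}")
```

Keeping the external molecules disjoint from the reference set mirrors the non-overlapping requirement stressed in the abstract: if test molecules also appeared among the reference actives, the top-1 rate would be inflated.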

Authors