Protein-ligand binding prediction with machine learning models: current status


Drug discovery is a long journey. A new drug design project is so complex that its goal can only be achieved through well-organized cooperation among many people.

Within this process, predicting the binding affinity between a target (generally a protein) and a small compound is useful before moving on to cell-model or animal-model experiments. The aim is to find small molecules that bind tightly to a specific protein. Better binding affinity prediction helps us shortlist a set of promising molecules (lead-like compounds). Traditionally, binding affinity has been estimated by absolute binding free energy calculations, MM/GBSA, and scoring functions (in virtual screening and docking). More recently, a growing number of machine learning based methods have been developed for this prediction (Table 1).

Table 1. Current ML-based binding affinity prediction models

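| SN | Model | Year | Training set (~11k complexes) | Test set (290 complexes) | RMSE | R | Predicted quantity |
|----|-------|------|------------------------------|--------------------------|------|---|--------------------|
| 1 | RF-score | 2010 | PDBBind v2016 | v2016 core set | 1.39 | 0.8 | pKd |
| 2 | Kdeep | 2018 | PDBBind v2016 | v2016 core set | 1.27 | 0.82 | pKd |
| 3 | TopBP | 2018 | PDBBind v2016 | v2016 core set | 1.65 | 0.86 | Energy |
| 4 | Pafnucy | 2018 | PDBBind v2016 | v2016 core set | 1.42 | 0.78 | pKd |
| 5 | OnionNet | 2019 | PDBBind v2016 | v2016 core set | 1.28 | 0.82 | pKd |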
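To make the two metrics in Table 1 concrete, here is a minimal sketch of how RMSE and R could be computed for a set of predictions. It assumes that R is the Pearson correlation coefficient between measured and predicted values; the function names and the four example pKd values are made up for illustration and do not come from any of the models above.

```python
import math


def rmse(measured, predicted):
    """Root-mean-square error between measured and predicted values."""
    if len(measured) != len(predicted) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(m - p) ** 2 for m, p in zip(measured, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def pearson_r(measured, predicted):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(measured) != len(predicted) or len(measured) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    n = len(measured)
    mean_m = sum(measured) / n
    mean_p = sum(predicted) / n
    cov = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_m * var_p)


# Hypothetical pKd values, for illustration only.
measured_pkd = [4.0, 5.0, 6.0, 7.0]
predicted_pkd = [4.5, 4.5, 6.5, 6.5]

print(f"RMSE = {rmse(measured_pkd, predicted_pkd):.2f}")
print(f"R    = {pearson_r(measured_pkd, predicted_pkd):.2f}")
```

A lower RMSE means the predicted values are closer to the measurements in absolute terms, while a higher R means the model ranks and scales the complexes more consistently, so the two numbers are best read together.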



Why predict binding affinity?



What are the traditional methods?



What can we do with machine learning methods?



What datasets are available?

Two datasets are mainly used. One is BindingDB, a public database of experimentally measured binding affinities between proteins and small molecules.
The other is the PDBBind database, which collects protein-ligand complexes with experimentally determined structures together with their measured binding affinity data (Ki, Kd, or IC50).
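Most models in Table 1 predict pKd rather than the raw constant, so the affinity values from these datasets usually need to be put on a negative log scale first. The snippet below is a minimal sketch of that conversion, assuming the usual definition pK = -log10(value in mol/L); the helper name `to_pk` and the unit table are my own and not part of BindingDB or PDBBind. Note that Ki, Kd, and IC50 are different quantities (IC50 in particular depends on assay conditions), so treating them all as one label is an approximation.

```python
import math

# Conversion factors from common concentration units to mol/L.
UNIT_TO_MOLAR = {
    "M": 1.0,
    "mM": 1e-3,
    "uM": 1e-6,
    "nM": 1e-9,
    "pM": 1e-12,
}


def to_pk(value, unit="nM"):
    """Convert an affinity constant (Ki, Kd, or IC50) to its negative log10 in mol/L."""
    if unit not in UNIT_TO_MOLAR:
        raise ValueError(f"unknown unit: {unit}")
    if value <= 0:
        raise ValueError("affinity value must be positive")
    return -math.log10(value * UNIT_TO_MOLAR[unit])


# Hypothetical measurements, for illustration only.
for value, unit in [(1.0, "nM"), (10.0, "uM"), (2.5, "mM")]:
    print(f"{value} {unit} -> pK = {to_pk(value, unit):.2f}")
```

On this scale a tighter binder has a larger value: a 1 nM ligand sits near 9, while a 10 uM ligand sits near 5.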


What features could be used?



