DeltaDock: A Unified Framework for Accurate, Efficient, and Physically Reliable Molecular Docking
Abstract
Reviews and Discussion
This paper proposes DeltaDock, a model for molecular docking. DeltaDock integrates protein pocket prediction and protein-ligand binding refinement. Pocket prediction is modeled as a pocket-ligand alignment problem, with candidate pockets supplied by other pocket prediction methods. Docking is formulated as a refinement problem given an initial docking conformation. Comparisons with computational chemistry docking tools and learning-based methods show the advantages of DeltaDock. DeltaDock also runs much faster than most of the baselines.
Strengths
- The model is faster than the diffusion-based method DiffDock.
- The experimental results show that DeltaDock has a good performance in blind docking.
Weaknesses
- The testing set is not clearly described. Training and test proteins should have a maximum sequence identity of 40% to test generalization.
- In Figure 2, the compared methods are not consistent. For example, Vina appears in Figure 2 (a) but not in Figure 2 (b). This may be misleading.
- In Line 60, "a GPU-accelerated pose sampling algorithm generating high-quality initial structure" is just DSDP, which limits the novelty of the paper.
- What if you combine the pocket prediction of DeltaDock with AutoDock Vina? It is not clear whether the improvement is due to the pocket prediction or the refinement.
Questions
What is the distance threshold used for graph construction?
Limitations
Yes
Dear Reviewer, thank you for your insightful comments on our dataset, figures, and methods. Below are our responses to your questions.
For Weaknesses 1
Thank you for raising this important question. This work utilizes the established approach of training on the PDBbind time-split training set and testing on both the PDBbind time-split testing set and the PoseBusters dataset.
While the time-split strategy of PDBbind, first employed by EquiBind, aims to reflect real-world application scenarios with varying data quality, a detailed sequence identity analysis for this approach was not provided by the authors.
To address this, we employed mmseqs2 to analyze the sequence identity between the PDBbind training and testing sets. Our findings, presented in the table below, indicate an average sequence identity exceeding 0.4 for both test sets. Notably, the "unseen test set," where protein UniProt IDs are absent from the training set, exhibits a lower average sequence identity compared to the complete testing set. This observation sheds light on the significant performance drop observed for all previous GDL docking methods on the unseen test set, highlighting the increased difficulty posed by novel proteins.
| PDBbind Set | Average Sequence Identity |
|---|---|
| Train-Test Set | 0.87 |
| Train-Unseen Test Set | 0.67 |
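The rebuttal does not say how the per-set averages were aggregated. The following is a minimal sketch of one plausible procedure, not the authors' script. It assumes the MMseqs2 hits are in tab-separated form with query, target, and fractional identity in the first three columns, that each test protein is scored by its best training hit, and that test proteins with no hit count as 0. The function name `mean_max_identity` and the example IDs are hypothetical.

```python
from collections import defaultdict

def mean_max_identity(m8_lines, test_ids):
    """Average, over test proteins, of the highest identity to any training hit.

    Assumption: columns are query, target, fractional identity (0..1), ...
    Assumption: a test protein without any hit contributes 0.0.
    """
    best = defaultdict(float)
    for line in m8_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip malformed or empty lines
        query, identity = fields[0], float(fields[2])
        best[query] = max(best[query], identity)
    return sum(best.get(q, 0.0) for q in test_ids) / len(test_ids)

# Hypothetical hits, for illustration only.
hits = ["t1\ttrainA\t0.90", "t1\ttrainB\t0.40", "t2\ttrainC\t0.50"]
print(mean_max_identity(hits, ["t1", "t2", "t3"]))
```

Other aggregation choices (for example, averaging over all hits or ignoring proteins without hits) would give different numbers, so the exact protocol matters when comparing against the 0.87 and 0.67 figures.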
Turning to the PoseBusters dataset, a comprehensive sequence identity analysis was conducted by the dataset creators. They further categorized the dataset into three subsets based on sequence identity: 0-0.3, 0.3-0.95, and 0.95-1.0, containing 129, 111, and 188 data points, respectively.
The subsequent table presents a comparative analysis of docking performance across these subsets for various baseline methods. Notably, DeltaDock consistently outperforms other methods across all three subsets, demonstrating robustness against varying levels of sequence similarity.
| Method | % RMSD < 2.0 Å, [0,0.3] | % RMSD < 2.0 Å, (0.3,0.95] | % RMSD < 2.0 Å, (0.95,1] | % RMSD < 2.0 Å & PB-Valid, [0,0.3] | % RMSD < 2.0 Å & PB-Valid, (0.3,0.95] | % RMSD < 2.0 Å & PB-Valid, (0.95,1] |
|---|---|---|---|---|---|---|
| EquiBind | 0 | 2 | 5 | 0 | 0 | 0 |
| TankBind | 2 | 12 | 26 | 0 | 1 | 5 |
| DiffDock | 16 | 42 | 51 | 1 | 12 | 24 |
| DSDP | 38 | 45 | 56 | 38 | 43 | 55 |
| Vina | 39 | 50 | 44 | 38 | 50 | 43 |
| Smina | 53 | 51 | 48 | 52 | 51 | 47 |
| DeltaDock | 47 | 54 | 66 | 47 | 53 | 65 |
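The per-subset numbers above can be reproduced from per-complex results with a simple binning step. This is a minimal sketch under stated assumptions: the bin edges follow the three subsets named in the rebuttal, and each record is a hypothetical tuple of (sequence identity, ligand RMSD in Å, PoseBusters validity flag). The function names are illustrative and not from the paper.

```python
def identity_bin(identity):
    """Map a sequence identity in [0, 1] to one of the three subsets."""
    if identity <= 0.3:
        return "[0,0.3]"
    if identity <= 0.95:
        return "(0.3,0.95]"
    return "(0.95,1]"

def success_by_bin(records, cutoff=2.0):
    """records: (identity, rmsd, pb_valid) per complex.

    Returns {bin: (% RMSD < cutoff, % RMSD < cutoff and PB-valid)}.
    """
    counts = {}
    for identity, rmsd, pb_valid in records:
        key = identity_bin(identity)
        total, ok, ok_valid = counts.get(key, (0, 0, 0))
        hit = rmsd < cutoff
        counts[key] = (total + 1, ok + int(hit), ok_valid + int(hit and pb_valid))
    return {k: (100.0 * ok / n, 100.0 * okv / n) for k, (n, ok, okv) in counts.items()}

# Hypothetical records, for illustration only.
records = [(0.10, 1.2, True), (0.25, 3.0, True), (0.50, 1.8, False), (0.99, 0.9, True)]
print(success_by_bin(records))
```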
For Weaknesses 2
Thank you for your valuable suggestions. We have updated the baselines on the PoseBusters dataset (Fig. 2b) to match the baselines on the PDBbind dataset (Fig. 2a). Below, we list the docking success rate (% of RMSD < 2.0 Å) of these baselines on both datasets. We have updated this figure in the PDF we submitted.
| Dataset | EquiBind | TankBind | DiffDock | Vina | Smina | DSDP | DeltaDock |
|---|---|---|---|---|---|---|---|
| PDBbind | 5.5 | 23.4 | 38.2 | 45.0 | 46.0 | 51.6 | 56.5 |
| PoseBusters | 2.6 | 15.0 | 38.0 | 43.9 | 50.4 | 47.0 | 57.0 |
| PoseBusters (PB-Valid) | 0.0 | 2.6 | 14.0 | 43.2 | 49.8 | 46.0 | 56.0 |
For Weaknesses 3
Thank you for your valuable question. In this work, we prioritized efficiency and therefore employed DSDP for pose initialization due to its speed. While other docking methods like VINA and SMINA can provide accurate results, they require significantly more computational time.
To address your point, we conducted additional experiments using VINA and SMINA for initialization. It is important to note that our model was trained on DSDP-generated structures. Even without specific training on VINA/SMINA output, DeltaDock improves performance for every initialization method in almost all subsets (the one exception is Smina on the (0.95,1] subset with PB-Valid, 47 to 46), highlighting its effectiveness in refining poses regardless of the initial docking method used.
| Methods | % RMSD < 2.0 Å, [0,0.3] | % RMSD < 2.0 Å, (0.3,0.95] | % RMSD < 2.0 Å, (0.95,1] | % RMSD < 2.0 Å & PB-Valid, [0,0.3] | % RMSD < 2.0 Å & PB-Valid, (0.3,0.95] | % RMSD < 2.0 Å & PB-Valid, (0.95,1] |
|---|---|---|---|---|---|---|
| DSDP | 38 | 45 | 56 | 38 | 43 | 55 |
| DSDP+DeltaDock Refinement | 47 | 54 | 66 | 47 | 53 | 65 |
| Smina | 53 | 51 | 48 | 52 | 51 | 47 |
| Smina+DeltaDock Refinement | 54 | 59 | 49 | 53 | 59 | 46 |
| Vina | 39 | 50 | 44 | 38 | 50 | 43 |
| Vina+DeltaDock Refinement | 41 | 54 | 51 | 40 | 54 | 48 |
For Weaknesses 4
Thank you for raising this important point. To assess the performance gains achieved by combining Vina with DeltaDock's pocket prediction, we conducted blind docking experiments using the PDBbind dataset. The results, summarized in the table below, clearly demonstrate that incorporating DeltaDock's CPLA pocket prediction module significantly enhances Vina's blind docking accuracy.
| Methods | Time Split: % RMSD < 2 Å | Time Split: % RMSD < 5 Å | Time Split: % Centroid < 2 Å | Time Split: % Centroid < 5 Å | Time Split Unseen: % RMSD < 2 Å | Time Split Unseen: % RMSD < 5 Å | Time Split Unseen: % Centroid < 2 Å | Time Split Unseen: % Centroid < 5 Å |
|---|---|---|---|---|---|---|---|---|
| Vina | 10 | 36 | 32 | 55 | 8 | 26 | 24 | 42 |
| P2rank+Vina | 30 | 46 | 50 | 66 | 22 | 35 | 40 | 52 |
| CPLA+Vina | 35 | 55 | 61 | 80 | 28 | 46 | 55 | 76 |
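For readers unfamiliar with the four metrics in this table, the following minimal sketch shows how they are typically computed from predicted and crystal ligand coordinates. It is an illustration, not the authors' evaluation code: it assumes both poses list heavy atoms in the same order and it applies no symmetry correction, which published benchmarks often do. All function names are hypothetical.

```python
import math

def rmsd(pred, ref):
    """Plain RMSD in Å; assumes identical atom ordering, no symmetry correction."""
    squared = [math.dist(p, r) ** 2 for p, r in zip(pred, ref)]
    return math.sqrt(sum(squared) / len(squared))

def centroid(coords):
    """Unweighted mean position of a set of 3D coordinates."""
    n = len(coords)
    return tuple(sum(c[i] for c in coords) / n for i in range(3))

def centroid_distance(pred, ref):
    """Distance in Å between the centroids of the predicted and reference ligand."""
    return math.dist(centroid(pred), centroid(ref))

def percent_below(values, cutoff):
    """Share of complexes (in %) whose metric is strictly below the cutoff."""
    return 100.0 * sum(v < cutoff for v in values) / len(values)

# Hypothetical two-atom ligand shifted by 3 Å along x.
ref = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
pred = [(3.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
print(rmsd(pred, ref), centroid_distance(pred, ref))
```

The centroid metrics are more forgiving than RMSD because they only check that the ligand is placed in the right location, not that its orientation and conformation are correct.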
As for where the improvement comes from, we performed a series of in-depth analyses. This included ablation studies, discussed in Section 4.5.2, and additional experiments presented earlier. Our findings consistently indicate that both the pocket prediction module and the refinement module contribute significantly to the observed performance.
For Questions 1
Thank you for your valuable question. For the ligand graph and the protein atomic graph, the distance threshold is set to 5.0 Å. For the protein residue graph, the distance threshold is set to 30.0 Å. When constructing the protein residue graph, we follow EquiDock [1] and build a K-NN (k=10) graph.
[1]. Independent SE(3)-Equivariant Models for End-to-End Rigid Protein Docking, ICLR 2022.
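The two graph types described in this answer can be sketched as follows. This is an illustrative, quadratic-time version in plain Python, not the authors' implementation. In particular, how the 30.0 Å threshold interacts with the k=10 neighbour limit is an assumption here (neighbours beyond the cutoff are dropped), and the function names are hypothetical.

```python
import math

def radius_graph(coords, cutoff):
    """Directed edges (i, j) for every pair of distinct nodes closer than cutoff (Å)."""
    return [
        (i, j)
        for i, a in enumerate(coords)
        for j, b in enumerate(coords)
        if i != j and math.dist(a, b) < cutoff
    ]

def knn_graph(coords, k, cutoff):
    """Edges from each node to its k nearest neighbours.

    Assumption: neighbours farther than `cutoff` are discarded.
    """
    edges = []
    for i, a in enumerate(coords):
        nearest = sorted((math.dist(a, b), j) for j, b in enumerate(coords) if j != i)
        edges += [(i, j) for d, j in nearest[:k] if d < cutoff]
    return edges

# Hypothetical coordinates in Å.
atoms = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
print(radius_graph(atoms, cutoff=5.0))      # atom-level graph, 5.0 Å
print(knn_graph(atoms, k=10, cutoff=30.0))  # residue-level graph, k=10 within 30.0 Å
```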
Once again, thank you for reviewing our paper and providing valuable suggestions. Please let us know if you have any further concerns; we are happy to answer any further questions you have about our paper.
Best regards,
Paper Authors
Thanks for the rebuttal. This work is actually about pocket prediction and refining the result of an existing docking tool (DSDP), which should be made clearer. It is not proper to conflate these with a new docking framework. The PoseBusters results on varying levels of sequence similarity are different from those in the PoseBusters paper.
Dear Reviewer,
Thank you for taking the time to read our response and provide valuable feedback. Here is our response to your further concerns:
"The PoseBusters results on varying levels of sequence similarity are different from those in the PoseBusters paper."
This discrepancy likely stems from the different versions of the PoseBusters dataset used. Our initial results were based on PoseBusters v1 (428 data examples), while your analysis might be based on PoseBusters v2, a subset of v1 containing 308 data points, divided into three similarity subsets of 109, 76, and 123 data points.
All baseline performance metrics, except for VINA and SMINA, were directly extracted from the PoseBusters paper. We independently ran VINA and SMINA to obtain the generated poses for downstream refinement experiments with DeltaDock. We carefully followed the experimental settings outlined in the PoseBusters paper to ensure consistency and reproducibility.
A bounding box with a side length of 25 Å was created around the centroid of the crystal ligand. 40 poses were generated with an exhaustiveness setting of 32, and the top-ranked pose was selected. For VINA, which only accepts PDBQT files as input, we followed the PoseBusters methodology, preparing ligand PDBQT files with Meeko and protein PDBQT files with the ADFR prepare_receptor script.
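The search-box setup described above can be written down as a Vina configuration. The sketch below is an illustration under assumptions: the centroid is taken as the unweighted mean of the crystal ligand's atom coordinates, and the "40 poses" are mapped to Vina's `num_modes` option. The helper name `vina_box_config` is hypothetical; the keys it emits (`center_x`, `size_x`, `exhaustiveness`, `num_modes`) are standard Vina configuration options.

```python
def vina_box_config(ligand_coords, side=25.0, exhaustiveness=32, num_modes=40):
    """Return Vina config text for a cubic box centred on the ligand centroid.

    Assumption: centroid = unweighted mean of the ligand atom coordinates (Å).
    """
    n = len(ligand_coords)
    center = [sum(atom[i] for atom in ligand_coords) / n for i in range(3)]
    lines = [f"center_{axis} = {value:.3f}" for axis, value in zip("xyz", center)]
    lines += [f"size_{axis} = {side:.1f}" for axis in "xyz"]
    lines += [f"exhaustiveness = {exhaustiveness}", f"num_modes = {num_modes}"]
    return "\n".join(lines)

# Hypothetical two-atom ligand, for illustration only.
print(vina_box_config([(0.0, 0.0, 0.0), (2.0, 4.0, 6.0)]))
```

Because the box is centred on the crystal ligand, this setting corresponds to site-specific docking rather than blind docking.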
For clarity and comparison, the following table presents the experimental results on the PoseBusters v2 dataset. This table includes both our reproduced VINA results (denoted as VINA) and the VINA results reported in the PoseBusters paper (denoted as VINA*). As you can see, the performance of VINA, SMINA, and DSDP is comparable. While we aimed to replicate the VINA* results precisely, achieving identical performance can be challenging due to potential variations in computational environments. Importantly, DeltaDock maintains a slight overall performance advantage over VINA* (61 vs. 60 for RMSD < 2.0 Å, and 60 vs. 58 with PB-Valid).
| Dataset | EquiBind | TankBind | DiffDock | VINA | SMINA | VINA* | DSDP | DeltaDock |
|---|---|---|---|---|---|---|---|---|
| PoseBusters | 2.0 | 16.0 | 38.0 | 50 | 51 | 60 | 52 | 61 |
| PoseBusters (PB-Valid) | 0.0 | 3.3 | 12.0 | 49 | 50 | 58 | 51 | 60 |
| Method | % RMSD < 2.0 Å, [0,0.3] | % RMSD < 2.0 Å, (0.3,0.95] | % RMSD < 2.0 Å, (0.95,1] | % RMSD < 2.0 Å & PB-Valid, [0,0.3] | % RMSD < 2.0 Å & PB-Valid, (0.3,0.95] | % RMSD < 2.0 Å & PB-Valid, (0.95,1] |
|---|---|---|---|---|---|---|
| EquiBind | 0 | 1.3 | 4.1 | 0 | 0 | 0 |
| TankBind | 1.8 | 13 | 30 | 0 | 1.3 | 7.4 |
| DiffDock | 15 | 45 | 54 | 0.92 | 11 | 24 |
| DSDP | 40 | 46 | 65 | 40 | 43 | 63 |
| VINA | 43 | 55 | 52 | 42 | 54 | 52 |
| SMINA | 52 | 47 | 51 | 51 | 47 | 50 |
| VINA* | 56 | 57 | 65 | 54 | 53 | 65 |
| DeltaDock | 49 | 58 | 73 | 48 | 58 | 72 |
"This work is actually about pocket prediction and refining the result of an existing docking tool (DSDP), which should be made clearer. It is not proper to conflate these with a new docking framework."
We appreciate your point and understand the concern regarding the framing of our work. While we acknowledge it builds upon existing docking tools like DSDP, our primary contributions lie in reframing pocket prediction and developing an effective iterative refinement model that generalizes well across different docking tools. In our response, we have demonstrated its applicability to DSDP, VINA, and SMINA, showcasing its broader impact. For example, the CPLA pocket prediction model can be combined with VINA to perform blind docking, and VINA/SMINA poses can be further refined by the DeltaDock refinement model.
We believe a work should be evaluated comprehensively. While we have not developed a novel end-to-end docking framework, our framework offers a new perspective on pocket prediction and pose refinement, ultimately enhancing the accuracy and efficiency of molecular docking.
Thank you again for your valuable feedback. We hope our revised explanation addresses your concerns and provides a better understanding of our contributions. We believe our work offers valuable advancements in the field and kindly hope you can reconsider its significance.
Best regards,
Paper Authors
Dear Reviewer,
We would like to kindly ask whether our answers to your questions were satisfactory. We are happy to discuss further if you have other questions.
Best regards,
Paper Authors
This manuscript introduces DeltaDock, a novel two-stage framework for molecular docking. Like previous works based on geometric deep learning, it uses a geometric deep learning network for modeling and formulates prediction as a regression problem. The main contributions are a contrastive learning method for pocket prediction and a bi-level update for ligand pose prediction. The experiments are conducted on common datasets, and many additional studies are performed.
Strengths
- The overall framework makes sense and is valid and useful for blind molecular docking.
- The proposed contrastive learning approach for pocket prediction and the bi-level refinement module are novel contributions to the field. CPLA, in particular, reframes pocket prediction as a selection problem from a candidate set.
- Lots of small tricks designed in the framework are beneficial to hold the correct structure and also help the final prediction. The detailed ablation studies, generalization analysis, and assessment of pose validity highlight the contribution of each component.
- The experimental results are good compared to previous works.
Weaknesses
- Though the process and the framework are reasonable, the overall design is quite similar to FABind, which implements a pocket prediction module and a docking prediction module. The authors should add more comparison and describe the differences between the two works in more detail.
- As the prediction module relies on external methods/tools, this framework is not fully end-to-end. Though this is not something we must pursue, I would like to hear the authors' thoughts on this point. Why not do it in an end-to-end way? The external tools do add cost and slow down the docking process.
- The manuscript focuses on rigid protein docking, neglecting the inherent flexibility of protein side chains. Can the method get rid of the assumption that the protein side is rigid, modeling the side chain flexibility like previous works? [1, 2]
[1] Qiao, Zhuoran, et al. "State-specific protein–ligand complex structure prediction with a multiscale deep generative model." Nature Machine Intelligence 6.2 (2024): 195-208.
[2] Plainer, Michael, et al. "Diffdock-pocket: Diffusion for pocket-level docking with sidechain flexibility." (2023).
Questions
Please refer to the Weaknesses section.
Limitations
Please refer to the Weaknesses section.
Dear Reviewer, thank you for your insightful comments on our methods. Below are our responses to your questions.
For Weakness 1:
Thank you for your valuable comment. FABind is an effective method, particularly its inspiring framework that first predicts pockets and then performs docking. Recently, FABind+, an updated version of FABind, has also been proposed. We compare DeltaDock with both FABind and FABind+ to provide a thorough analysis.
Key Differences and Advantages of DeltaDock:
- Pocket Prediction: While FABind/FABind+ predict pocket residues, DeltaDock redefines pocket prediction as a pocket-ligand alignment task. This contrastive approach leverages established pocket prediction methods, leading to improved accuracy. As shown below, DeltaDock achieves higher accuracy in predicting ligand binding sites:

| Method | % of DCC < 3.0 Å | % of DCC < 4.0 Å |
|---|---|---|
| CPLA Top-1 | 54.8 | 70.0 |
| FABind/FABind+ | 42.7 | 56.5 |

- Structural Detail: FABind/FABind+ prioritize speed by focusing on residue-level structure. In contrast, DeltaDock adopts a bi-level strategy, modeling both residue and atomic-level protein structure. This enables a more accurate representation of binding interactions.
- Physical Constraints: DeltaDock incorporates physical constraints in its training loss and utilizes a fast pose correction step. This ensures the physical validity of predicted poses, a feature absent in FABind/FABind+.
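The DCC comparison above treats pocket prediction as choosing one candidate from a set and then measuring how far its center lies from the true binding site. The sketch below illustrates that evaluation under assumptions: DCC is taken to be the distance between the selected pocket center and a reference center (for example the crystal ligand centroid), and each candidate carries a hypothetical alignment score where higher means a better pocket-ligand match. It does not reproduce the CPLA model itself.

```python
import math

def top1_center(candidates):
    """candidates: list of (score, center). Assumption: higher score is better."""
    return max(candidates, key=lambda item: item[0])[1]

def dcc_success_rate(cases, cutoff):
    """cases: list of (candidates, reference_center).

    Returns the share (in %) of cases whose top-1 pocket center lies
    within `cutoff` Å of the reference center.
    """
    hits = sum(math.dist(top1_center(cands), ref) < cutoff for cands, ref in cases)
    return 100.0 * hits / len(cases)

# Hypothetical candidates and scores, for illustration only.
cases = [
    ([(0.9, (0.0, 0.0, 0.0)), (0.1, (10.0, 0.0, 0.0))], (1.0, 0.0, 0.0)),
    ([(0.2, (0.0, 0.0, 0.0)), (0.8, (10.0, 0.0, 0.0))], (1.0, 0.0, 0.0)),
]
print(dcc_success_rate(cases, cutoff=3.0))
```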
Performance Comparison:
The aforementioned differences translate into superior performance for DeltaDock on both the PDBbind and PoseBusters datasets.
PDBbind: While FABind+ demonstrates strong performance on PDBbind, particularly in achieving a high percentage of predictions with RMSD < 5 Å, DeltaDock consistently achieves higher accuracy in predicting poses with RMSD < 2 Å, a more stringent and relevant metric for accurate binding mode prediction.
| Methods | Time Split: % RMSD < 2 Å | Time Split: % RMSD < 5 Å | Time Split: % Centroid < 2 Å | Time Split: % Centroid < 5 Å | Time Split Unseen: % RMSD < 2 Å | Time Split Unseen: % RMSD < 5 Å | Time Split Unseen: % Centroid < 2 Å | Time Split Unseen: % Centroid < 5 Å |
|---|---|---|---|---|---|---|---|---|
| FABind+ | 43.5 | 71.1 | 67.5 | 84.0 | 34.7 | 63.2 | 58.3 | 77.1 |
| DeltaDock | 47.4 | 66.9 | 66.7 | 83.2 | 40.8 | 61.3 | 60.6 | 78.9 |
PoseBusters: The performance of FABind+ significantly declines on the PoseBusters dataset, which is specifically designed to challenge docking methods with poses that appear reasonable but are physically implausible. Notably, FABind+ achieves a 0% success rate when considering physical validity (PB-Valid). This suggests potential overfitting to the PDBbind dataset. DeltaDock, on the other hand, maintains high accuracy, demonstrating its robustness and ability to generalize to challenging cases.
| Method | % RMSD < 2.0 Å | % RMSD < 2.0 Å & PB-Valid |
|---|---|---|
| FABind+ | 25.0 | 0.0 |
| DeltaDock | 50.5 | 48.8 |
Conclusion:
While FABind and FABind+ are valuable contributions to the field, DeltaDock's innovative approach to pocket prediction, its consideration of detailed structural information, and the incorporation of physical constraints result in a more accurate and robust method for protein-ligand docking.
For Weakness 2:
Thank you for raising this important point regarding the choice between end-to-end and hybrid frameworks for molecular docking. While end-to-end deep learning models have gained significant traction, their direct application to molecular docking, particularly for site-specific scenarios, presents unique challenges.
For instance, blind docking methods like FABind and FABind+, while innovative, often struggle with site-specific docking. Their reliance on pre-predicted pocket information for feature embedding limits their ability to accurately model the complex interactions within a defined binding site.
DeltaDock, in contrast, adopts a hybrid approach that leverages the strengths of both deep learning and established molecular docking tools. This strategic integration offers several advantages:
- Versatility: DeltaDock seamlessly handles both blind and site-specific docking scenarios, overcoming the limitations observed in purely end-to-end methods.
- Accuracy and Generalizability: By incorporating well-established tools and physics-based principles, DeltaDock achieves superior predictive accuracy and demonstrates robust generalization capabilities across diverse datasets.
- Balance between Speed and Accuracy: While not as rapid as some end-to-end methods, DeltaDock achieves a docking time of 2-3 seconds per ligand, striking a balance between computational efficiency and predictive power. This speed, coupled with its enhanced accuracy, makes DeltaDock a practical solution for high-throughput virtual screening campaigns.
In conclusion, while the pursuit of a fully end-to-end docking framework remains a worthwhile endeavor for the field, hybrid approaches like DeltaDock offer a currently more effective solution for molecular docking.
For Weakness 3:
Thank you for your valuable suggestion. Addressing the complexities of flexible docking is undoubtedly crucial. DeltaDock's framework possesses the flexibility to accommodate such scenarios. One viable approach is integrating sampling techniques specifically designed to account for flexibility, such as DSDP-flex[1]. Furthermore, enabling protein coordinate updates during the structure refinement phase allows for greater conformational exploration. Indeed, expanding DeltaDock to directly incorporate flexible docking is a high priority in our future research endeavors.
[1]. DSDPFlex: An Improved Flexible-Receptor Docking Method with GPU Acceleration, ChemRxiv, 2023.
Once again, thank you for reviewing our paper and providing valuable suggestions. Please let us know if you have any further concerns; we are happy to answer any further questions you have.
Best regards,
Paper Authors
I appreciate the authors' hard work in the rebuttal. It has addressed all my concerns. Although the improvements are somewhat trivial, the authors have successfully demonstrated that this is a solid pipeline. I will raise my score.
Dear Reviewer,
We thank you for your insightful feedback on our manuscript. We appreciate you taking the time to thoroughly review our work and provide valuable suggestions. We are pleased that our responses have adequately addressed your initial concerns and inquiries.
The recommendations for further comparison between FABind/FABind+ and DeltaDock, as well as the insights on an end-to-end approach, will undoubtedly contribute to the clarity and comprehensiveness of our work. We will incorporate these additions into the final revision of our manuscript. Regarding flexible docking, we agree that extending DeltaDock to accommodate this paradigm represents a natural and critical next step for this research. Our ultimate goal is to develop a model capable of handling diverse docking scenarios, encompassing blind and site-specific docking, as well as rigid and flexible configurations. While further investigation is required, we hypothesize that several of our current observations, though perhaps not all, will likely remain relevant within the context of flexible docking.
Thank you again for your constructive feedback. We believe your suggestions will significantly improve the quality and impact of our work.
Best regards,
Paper Authors
A new method for molecular docking based on neural networks, called DeltaDock, is introduced. DeltaDock uses a two-step procedure. The first step is finding the binding pocket for a given ligand, which is implemented by aligning the molecular structure with the pocket embedding. The alignment is conducted with the use of contrastive learning. The second step is the placement of the ligand in the binding pocket, which is defined as a regression task on atom positions. This step includes predicting interactions on a coarse and fine level of protein representation. A fast pose correction is implemented using torsion alignment and SMINA-based energy minimization. The experimental section shows that the proposed method outperforms other state-of-the-art methods in terms of the RMSD of generated poses. Moreover, these structures are more realistic than those produced by other deep learning approaches in terms of the PoseBusters filters.
Strengths
Originality:
- The idea of pocket alignment using contrastive learning is interesting and novel in this context. This approach is usually used to predict drug-target interactions. It is interesting to see it applied to find binding pockets based on the ligand input.
- The torsion alignment proposed in this paper is a new and effective method of correcting conformations produced by regression models.
Quality:
- The experimental section contains results on two data splits and compares both classical and neural-network-based molecular docking models, which places the DeltaDock among the state-of-the-art methods.
- The average time of inference is provided for each method, which demonstrates the difference in execution time between DeltaDock and other methods.
Clarity:
- Overall, the paper is written clearly and is easy to follow.
Significance:
- The experiments show the proposed method's strong performance, outperforming other approaches in terms of pose quality and alignment with the ground-truth pose.
- The full docking protocol includes some constraints that ensure there are no clashes or torsional strain. This is important due to the recent criticism of deep-learning docking methods.
- At the same time, DeltaDock is faster than diffusion methods like DiffDock, which can further accelerate virtual screening.
Weaknesses
Originality:
- The Bi-EGMN is based on a few architectural decisions that may have influenced the model's strong performance. For example, in Equations 7 and 8, the messages are weighted by the interatomic distances. In Equations 9 and 10, skip connections are used. Have you compared this architecture with one that uses other message weights or does not use skip connections? It would be interesting to see it in an ablation study.
- FABind+ should also be described in the related work as it follows a similar approach, where first the binding pocket is predicted, and then the molecule is docked inside this binding pocket. Interestingly, the size of the binding pocket is also predicted in this improved version of FABind, which speaks to the claim that current GDL methods have “difficulties in handling large binding pockets.” Also, FABind+ shows very promising results in their paper. Now, the code of this method has been published, so it would be interesting to see this model in the comparison if possible in such a short discussion period.
Quality:
- I am wondering why the average (or median) RMSD is not provided in Table 1. Other papers, including FABind and EquiBind also provided percentiles.
Clarity:
- Figure 1b could be improved to better depict the coarse level and atom level pocket representation. I am unsure if the model block shown in between refinement steps helps in the comprehension of this figure.
- The sentence in line 31 is imprecise. These methods use binding pockets in some sense because they encode the whole protein and use it to condition the generative or predictive model.
Questions
- Is it possible to generate diverse poses with DeltaDock, e.g. by sampling different initial conformations? How diverse can these poses be, given that the networks are E(3)-equivariant?
- Do you have any mechanism to estimate the predicted pose confidence? For example, some docking models can also provide affinity prediction or likelihood of the predicted poses.
Limitations
The limitations have been described.
Dear Reviewer, thank you for your insightful comments on our paper and methods. Below are our responses to your questions.
For Originality 1
Thank you for your insightful observation and suggestion. We acknowledge that Equations 7 and 8 in the manuscript may have been misleading regarding the weighting by interatomic distances. Taking as an example, this term can be further written as . is a unit vector, whose direction is defined by . Therefore, the actual weighting of is determined by . We will revise the manuscript to explicitly clarify this point.
As for the skip connection (SK), it is an important component for DeltaDock's performance. We present the ablation study of SK in the following table. The model without SK performs worse on every metric, for example 41.9% versus 47.4% for RMSD < 2 Å on the time split.
| Methods | Time Split: % RMSD < 2 Å | Time Split: % RMSD < 5 Å | Time Split: % Centroid < 2 Å | Time Split: % Centroid < 5 Å | Time Split Unseen: % RMSD < 2 Å | Time Split Unseen: % RMSD < 5 Å | Time Split Unseen: % Centroid < 2 Å | Time Split Unseen: % Centroid < 5 Å |
|---|---|---|---|---|---|---|---|---|
| DeltaDock | 47.4 | 66.9 | 66.7 | 83.2 | 40.8 | 61.3 | 60.6 | 78.9 |
| DeltaDock w/o SK | 41.9 | 63.4 | 62.5 | 80.7 | 38.7 | 57.0 | 56.3 | 75.4 |
For Originality 2
Thanks for your valuable suggestion. FABind+ is an updated version of FABind, with improvements such as a larger model size and pocket radius prediction. Despite these improvements, FABind+ still considers residue-level structure only. In our work, we emphasize the "difficulties in handling large binding pockets" for methods that take atomic structure into consideration. Here, we analyze the pocket radius predicted by FABind+ and find that the predicted radius is generally smaller than 20.0 Å (about 93% of the data). According to the paper, a radius below 20 Å is adjusted to 20.0 Å, which indicates that the radius prediction of FABind+ only takes effect for about 7% of data points.
| Method | Predicted pocket radius (Å): mean | 25th percentile | 50th percentile | 75th percentile |
|---|---|---|---|---|
| FABind+ | 13.0 | 10.9 | 12.4 | 14.6 |
Then, we perform experiments on the PDBbind and PoseBusters datasets. On the PDBbind dataset, although DeltaDock outperforms FABind+ on % RMSD < 2 Å, FABind+ shows promising results, especially for % RMSD < 5 Å.
| Methods | Time Split: % RMSD < 2 Å | Time Split: % RMSD < 5 Å | Time Split: % Centroid < 2 Å | Time Split: % Centroid < 5 Å | Time Split Unseen: % RMSD < 2 Å | Time Split Unseen: % RMSD < 5 Å | Time Split Unseen: % Centroid < 2 Å | Time Split Unseen: % Centroid < 5 Å |
|---|---|---|---|---|---|---|---|---|
| FABind+ | 43.5 | 71.1 | 67.5 | 84.0 | 34.7 | 63.2 | 58.3 | 77.1 |
| DeltaDock | 47.4 | 66.9 | 66.7 | 83.2 | 40.8 | 61.3 | 60.6 | 78.9 |
However, on the PoseBusters dataset, the performance of FABind+ drops significantly, which may be caused by its neglect of atomic structure and physical constraints.
| Methods | % RMSD < 2.0 Å | % RMSD < 2.0 Å & PB-Valid |
|---|---|---|
| FABind+ | 25.0 | 0.0 |
| DeltaDock | 50.5 | 48.8 |
In summary, DeltaDock shows significantly more robust and promising results than FABind+.
For Quality 1
Thanks for your valuable suggestion. We acknowledge the space constraints and have opted to present a concise table with key metrics in the main manuscript. Detailed tables, encompassing all metrics, are provided in the submitted PDF. We plan to incorporate the full table into the appendix of future versions as well.
For Clarity 1
Thanks for your valuable suggestion. We agree that removing the model blocks would enhance the clarity of our framework figure, and we are actively working on this improvement. As for line 31, we want to express that these methods mainly focus on the blind docking setting, which can be used to find new druggable binding sites and explore unseen proteins. We will revise the sentence in the next iteration of our manuscript to ensure greater clarity and precision.
For Question 1
Thanks for your valuable suggestion. Previously, we directly selected the top-1 poses as the initial conformations. Here, we try to generate diverse poses by sampling different conformations. We sample top-n (n<=10) poses and select the best structures (minimum RMSD) predicted by DeltaDock-SC to calculate the docking success rate. Our findings indicate a significant improvement in the performance of the best-performing poses with an increase in the sampling number. This suggests that DeltaDock exhibits the capability to generate diverse poses.
| Samples | PoseBusters: DeltaDock-SC | PoseBusters: DeltaDock | PDBbind: DeltaDock-SC | PDBbind: DeltaDock |
|---|---|---|---|---|
| Top-1 | 57 | 56 | 48 | 47 |
| Top-2 | 64 | 63 | 57 | 55 |
| Top-3 | 66 | 65 | 60 | 56 |
| Top-4 | 69 | 68 | 61 | 57 |
| Top-5 | 70 | 69 | 61 | 57 |
| Top-6 | 71 | 70 | 62 | 58 |
| Top-7 | 73 | 71 | 62 | 58 |
| Top-8 | 74 | 72 | 64 | 58 |
| Top-9 | 74 | 72 | 64 | 59 |
| Top-10 | 74 | 73 | 64 | 59 |
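The Top-n numbers in this table are oracle results: for each complex, the best pose among the first n candidates is chosen using the ground-truth RMSD. A minimal sketch of that computation, with a hypothetical function name and illustrative data, could look like this:

```python
def top_n_success(rmsd_lists, n, cutoff=2.0):
    """Oracle Top-n success rate in %.

    rmsd_lists: for each complex, the RMSDs (Å) of its refined poses in rank order.
    A complex counts as a success if any of its first n poses is below the cutoff.
    """
    hits = sum(min(rmsds[:n]) < cutoff for rmsds in rmsd_lists)
    return 100.0 * hits / len(rmsd_lists)

# Hypothetical RMSDs for three complexes, two ranked poses each.
rmsd_lists = [[3.0, 1.5], [2.5, 4.0], [1.0, 5.0]]
print(top_n_success(rmsd_lists, n=1))
print(top_n_success(rmsd_lists, n=2))
```

Because the selection uses the crystal pose, these values are an upper bound on what any confidence or scoring model could reach when picking among the same candidates.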
For Question 2
Thanks for your insightful suggestion. Existing docking methods like DiffDock and FABind+ utilize pose confidence evaluation modules to identify optimal poses from a set of candidates. As shown in our response to Question 1, DeltaDock is capable of generating diverse poses. We therefore hypothesize that integrating a confidence model for pose selection after DeltaDock generates diverse poses could further enhance docking accuracy. Inspired by FABind+ and DiffDock, we propose training a confidence model with a classification loss that predicts whether the RMSD between a pose and the ground-truth pose is less than 2.0 Å. The training data for this model would be generated using the trained DeltaDock. We are actively pursuing the implementation of this idea. While we anticipate promising results, please allow us a few more days to finalize our experiments.
Once again, thank you for reviewing our paper and providing valuable suggestions. Please let us know if you have any further concerns; we are happy to answer any further questions you have.
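The proposed training target can be made concrete with a small sketch. This is an assumption-laden illustration, not the authors' planned implementation: each generated pose gets the label 1 if its RMSD to the ground truth is below 2.0 Å and 0 otherwise, and a hypothetical confidence model that outputs one logit per pose is trained with binary cross-entropy. The function names are illustrative.

```python
import math

def confidence_labels(rmsds, cutoff=2.0):
    """Binary targets: 1.0 if a pose is within `cutoff` Å of the ground truth."""
    return [1.0 if r < cutoff else 0.0 for r in rmsds]

def bce_with_logits(logits, labels):
    """Mean binary cross-entropy computed from raw logits.

    Uses the numerically stable form max(z, 0) - z*y + log(1 + exp(-|z|)).
    """
    total = 0.0
    for z, y in zip(logits, labels):
        total += max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))
    return total / len(labels)

# Hypothetical RMSDs of two generated poses and untrained (zero) logits.
labels = confidence_labels([1.2, 3.4])
print(labels, bce_with_logits([0.0, 0.0], labels))
```

At inference time, such a model would rank the candidate poses by predicted probability and return the top one, replacing the oracle selection used for the Top-n analysis.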
Best regards,
Paper Authors
Dear Reviewer, thank you again for your insightful and detailed review. We conducted additional analyses and provide the results below. We believe these findings offer further clarification of the paper.
Figure 1b could be improved to better depict the coarse level and atom level pocket representation. I am unsure if the model block shown in between refinement steps helps in the comprehension of this figure.
We appreciate your insightful suggestion regarding the framework figure. In response, we have revised the figure to enhance clarity. The updated figure, which incorporates your feedback by removing the model blocks and adding a visualization of fast structure correction, can be found at this anonymous link. We believe this revised version provides a clearer representation of our framework. Thank you for this valuable suggestion.
Do you have any mechanism to estimate the predicted pose confidence? For example, some docking models can also provide affinity prediction or likelihood of the predicted poses.
Thank you for your valuable suggestion about a confidence model. We are actively pursuing two avenues for incorporating confidence estimation into DeltaDock.
- On the one hand, we are developing a dedicated binary classification model for confidence estimation.
- On the other hand, we have also explored using SMINA directly as a confidence model.
The second approach leverages SMINA's scoring function to guide pose selection from a candidate list. Specifically, we applied SMINA scoring to the top-10 poses predicted by DeltaDock, selecting the pose with the highest SMINA score.
This simple strategy, denoted "Top-10-SMINA," gave mixed but encouraging results. As shown in the table, "Top-10-SMINA" improved the Top-1 docking success rate of DeltaDock on the PoseBusters dataset by 3 percentage points (56 to 59) compared to using DeltaDock alone, while DeltaDock-SC dropped slightly (57 to 55). Performance decreased on the PDBbind dataset (48 to 41 for DeltaDock-SC and 47 to 40 for DeltaDock). Even so, the gap between "Top-10-SMINA" and "Top-10-Best" highlights the potential of combining diverse pose generation with confidence-based selection to achieve better docking accuracy.
| Methods | PoseBusters (DeltaDock-SC) | PoseBusters (DeltaDock) | PDBbind (DeltaDock-SC) | PDBbind (DeltaDock) |
|---|---|---|---|---|
| Top-1 | 57 | 56 | 48 | 47 |
| Top-10-Best | 74 | 73 | 64 | 59 |
| Top-10-SMINA | 55 | 59 | 41 | 40 |
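The "Top-10-SMINA" row replaces the oracle with a score-based selector. A minimal sketch of that selection step follows, assuming the per-pose scores have already been computed externally and are passed in as plain floats; `select_by_score` and the toy values are hypothetical. Whether the best pose has the largest or the smallest value depends on the scoring convention (Vina-style affinities in kcal/mol are more favorable when more negative), so the direction is left as a flag.

```python
from typing import Sequence, TypeVar

Pose = TypeVar("Pose")


def select_by_score(
    poses: Sequence[Pose],
    scores: Sequence[float],
    higher_is_better: bool = True,  # set False for energy-like scores
) -> Pose:
    """Return the candidate pose whose externally computed score is best."""
    if len(poses) != len(scores) or not poses:
        raise ValueError("poses and scores must be non-empty and of equal length")
    pick = max if higher_is_better else min
    # Select the index of the best score, then return the matching pose.
    best_index = pick(range(len(scores)), key=lambda i: scores[i])
    return poses[best_index]


# Toy example (illustrative values only).
candidate_poses = ["pose_a", "pose_b", "pose_c"]
candidate_scores = [-7.2, -9.1, -6.5]
print(select_by_score(candidate_poses, candidate_scores, higher_is_better=False))
```

The selected pose is then evaluated against the ground truth in the same way as a Top-1 prediction.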
Future Work: We believe that integrating a robust confidence model is crucial for improving DeltaDock's performance and will be a central focus of our future work. We will continue to explore dedicated confidence models and the utilization of scoring functions like SMINA for enhanced pose selection.
Once again, thank you for reviewing our paper and providing valuable suggestions. Please let us know if you have any further concerns; we are happy to answer any further questions you have.
Best regards,
Paper Authors
Thank you for the additional results and explanations. My comments have been properly addressed. I will keep my positive score.
Dear Reviewer,
Thank you for taking the time to read our response. We appreciate your careful consideration and are pleased that our responses have adequately addressed your initial concerns and inquiries.
We particularly value your recommendations regarding ablation studies on the skip connection, further comparison between FABind+ and DeltaDock, the generation of diverse poses, and the insightful suggestions for improving the framework figure. These additions will undoubtedly enhance the clarity and comprehensiveness of our work, and we will incorporate them into the final revision. Regarding the generation of diverse poses followed by selection using a confidence model, we agree that this is an important avenue for future research. We plan to explore this approach in greater detail in subsequent work.
Thank you again for your constructive feedback. We believe your suggestions will significantly improve the quality and impact of our manuscript.
Best regards,
Paper Authors
Dear Reviewers,
Thank you again for your insightful comments and valuable suggestions, which have greatly helped us improve our work.
In the appended PDF file, we present additional experiments conducted in accordance with the reviewers' suggestions. These experiments aim to further strengthen our findings and address your valuable insights.
We sincerely hope that our responses effectively address your concerns. If not, please let us know your further concerns, and we will continue actively responding to your comments and improving our paper. We are looking forward to your further responses and comments.
Best regards,
Paper Authors
Dear Reviewers,
Please kindly review the authors' response with care, engage in discussions to clarify any uncertainties, and, if necessary, revise your comments accordingly.
Thanks, AC
The paper proposes a new AI docking method. Briefly, the method can be decomposed into three steps:
- pocket prediction;
- docking with an existing tool (DSDP in this paper);
- refinement.
The method itself is a valuable attempt for the research community. Two additional studies should be included in the next version: (i) the effectiveness of pocket prediction, for example by trying the pocket prediction method used in FABind; (ii) a table showing the performance of refinement (by redocking or fixing the pocket prediction).