PaperHub

Rationalized All-Atom Protein Design with Unified Multi-Modal Bayesian Flow

NeurIPS 2025 · Poster
Overall score: 6.4/10
4 reviewers · Ratings: 4, 4, 5, 3 (mean 3.8, min 3, max 5, std 0.7)
Novelty 2.5 · Quality 2.5 · Clarity 2.5 · Significance 2.3

OpenReview · PDF
Submitted: 2025-04-29 · Updated: 2025-10-29

Keywords: Bayesian Flow Networks, Protein Generation

Reviews and Discussion

Review (Rating: 4)

This paper proposes a novel all-atom Bayesian flow framework for SO(3) generation. It features a rationalized information-flow design that avoids the information shortcut from the side chains to the sequence in joint generation. The work provides theoretical proofs and achieves good results on multiple benchmarks.

Strengths and Weaknesses

Strengths:

  1. Introduces Bayesian flow on the rotation group by mapping SO(3) generation to antipodally symmetric hypersphere (S³) sampling, avoiding the numerical discretization errors typical of diffusion or flow-matching approaches.

  2. The authors identify and eliminate the “side-chain→sequence” shortcut problem by masking side-chain inputs in the sequence branch. Their ablation demonstrates that this design prevents degenerate learning.

  3. Both the hypersphere-to-SO(3) bijection (Proposition 1) and the vMF Bayesian update rule (Eq. 5) are stated and proved, giving the method strong mathematical grounding.

Weaknesses:

The motivation for applying Bayesian Flow Networks to protein design is important but not adequately explained. The details of the discretization errors and the problems they introduce to protein design should be discussed.

Questions

  1. The theory demands $f(-\mathbf{q})=-f(\mathbf{q})$; how does the network architecture or training enforce this?

  2. Bayesian flows avoid fine-grained ODE/SDE integration and thus should be faster or use fewer steps. Is there any efficiency comparison to diffusion baselines?

  3. How sensitive are results to the choices of $\alpha_i$ for the vMF noise, which controls how “noisy” each update is?

  4. Fully excluding side-chain inputs from sequence prediction removes the shortcut. Would this also discard useful structure like sequence correlations? Is there a balance of properly using side-chain hints?

Limitations

The limitations are discussed in the appendix.

Final Justification

I would like to keep my score of 4 for acceptance.

Formatting Concerns

NA

Author Response

Thank you for your thoughtful comments and suggestions. We greatly appreciate your feedback, which will help us improve our manuscript. Our responses to your concerns are outlined below:

W1: Discussion of discretization error and the motivation for applying BFN to protein design

  • The details of discretization error

    Because the analytical solution is unknown, ODE/SDE-based generation requires numerically solving the continuous equations, which yields only an approximation of the true continuous trajectory.

    However, at each step the solver introduces discretization error that accumulates over the $n$ steps. The local error typically scales super-linearly with the step size: the commonly used Euler method is a first-order ODE solver with $O(h^2)$ local error with respect to the step size $h$ [2] (see the numerical sketch at the end of this list).

    As for BFN, the Bayesian update is analytical given conjugate priors, which avoids such numerical approximation.

  • The problems discretization error introduces in protein design

    Unlike images which are relatively robust to noise, protein structure and function can deviate significantly when atoms' coordinates undergo small disruptions [3]. For example, in enzyme active sites, minor structural perturbations involving just a few key atoms at the angstrom scale can disrupt the geometric configuration of the active site, resulting in dramatic reduction or complete loss of enzymatic activity and catalysis [4].

    Therefore, it would be ideal to avoid such sources of systematic error.

  • Motivation for applying BFN to protein design

    • Theoretically, BFN would bypass the discretization error, resulting in higher protein design quality.
    • Practically, BFN has already demonstrated success in the general molecular generation domain, including applications with small molecules [5] and crystals [6].
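To make the discretization-error point above concrete, here is a minimal, self-contained numerical sketch (ours, not from the paper): a first-order Euler solver on an ODE with a known analytical solution, showing the accumulated error shrinking only linearly as the step size decreases.

```python
import numpy as np

def euler_solve(f, x0, t1, n_steps):
    """Integrate dx/dt = f(x) from t=0 to t=t1 with n_steps Euler steps."""
    h = t1 / n_steps
    x = x0
    for _ in range(n_steps):
        x = x + h * f(x)  # each step incurs O(h^2) local truncation error
    return x

# dx/dt = -x has the analytical solution x(t) = x0 * exp(-t).
x0, t1 = 1.0, 1.0
exact = x0 * np.exp(-t1)
for n in (10, 100, 1000):
    approx = euler_solve(lambda x: -x, x0, t1, n)
    # The accumulated (global) error is O(h): 10x more steps
    # yields roughly a 10x smaller error.
    print(f"steps={n:5d}  |error|={abs(approx - exact):.2e}")
```

An analytical Bayesian update, by contrast, has no step-size parameter to tune, which is the advantage claimed above.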

Q1: How does the network architecture $\Psi$ enforce $f(-\mathbf{q})=-f(\mathbf{q})$?

Sorry for causing confusion. We find that such antipodal equivariance, $f(-\mathbf{q})=-f(\mathbf{q})$, can be easily ensured by implementing the Invariant Point Attention (IPA) module [1] with quaternions (instead of the original rotation matrices), without extra operations or complexity. We denote this module IPAq. We provide a detailed explanation of the IPAq module, with the quaternion as the rotation operator, in Algorithm 3 in Appendix F.

Here we explain the content related to antipodal equivariance.

For each point $\mathbf{x}=(x,y,z)$ with its homogeneous representation $\mathbf{p}=[x,y,z,0]^T$, each rotation operation $\mathbf{R}\circ\mathbf{x}=\mathbf{Rx}$ based on the rotation matrix $\mathbf{R}$ is replaced by a quaternion-based rotation $\mathbf{q}\circ\mathbf{p}$, which is antipodally invariant:

$$\mathbf{q} \circ \mathbf{p} = \mathbf{q} \times \mathbf{p} \times \mathbf{q}^{-1} = (-\mathbf{q}) \times \mathbf{p} \times (-\mathbf{q})^{-1} = (-\mathbf{q}) \circ \mathbf{p}$$

Therefore, the node embedding $\mathbf{s}_i$ produced by this IPAq module is antipodally invariant, and it is mapped to an update quaternion:

$$\mathbf{q}_{\text{update}} = \text{MLP}(\mathbf{s}_i) = \text{MLP}(\text{IPAq}(\mathbf{q}_{i}, \dots))$$

We design a residual update based on this update quaternion:

$$\mathbf{q}_{i+1} = f(\mathbf{q}_{i}, \dots) = \mathbf{q}_{i} + \mathbf{q}_{i} \times \text{MLP}(\text{IPAq}(\mathbf{q}_{i}, \dots))$$

Hence, the update transformation is antipodal equivariant:

$$
\begin{aligned}
f(-\mathbf{q}_{i},\dots) &= -\mathbf{q}_{i} - \mathbf{q}_{i}\times \text{MLP}(\text{IPAq}(-\mathbf{q}_{i},\dots)) \\
&= -\mathbf{q}_{i} - \mathbf{q}_{i}\times \text{MLP}(\text{IPAq}(\mathbf{q}_{i},\dots)) \\
&= -\left(\mathbf{q}_{i} + \mathbf{q}_{i}\times \text{MLP}(\text{IPAq}(\mathbf{q}_{i},\dots))\right) \\
&= -\mathbf{q}_{i+1} = -f(\mathbf{q}_{i},\dots)
\end{aligned}
$$
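To illustrate the antipodal invariance this derivation relies on, here is a minimal numerical check (our sketch, not the authors' IPAq implementation) that a unit quaternion $\mathbf{q}$ and its negation $-\mathbf{q}$ rotate a point identically:

```python
import numpy as np

def qmul(a, b):
    """Hamilton product of quaternions given as (w, x, y, z)."""
    w1, x1, y1, z1 = a
    w2, x2, y2, z2 = b
    return np.array([
        w1*w2 - x1*x2 - y1*y2 - z1*z2,
        w1*x2 + x1*w2 + y1*z2 - z1*y2,
        w1*y2 - x1*z2 + y1*w2 + z1*x2,
        w1*z2 + x1*y2 - y1*x2 + z1*w2,
    ])

def rotate(q, v):
    """Rotate 3D point v by unit quaternion q via q * [0, v] * q^{-1}."""
    p = np.concatenate([[0.0], v])                 # homogeneous representation
    q_inv = q * np.array([1.0, -1.0, -1.0, -1.0])  # conjugate = inverse for unit q
    return qmul(qmul(q, p), q_inv)[1:]

rng = np.random.default_rng(0)
q = rng.normal(size=4); q /= np.linalg.norm(q)  # random unit quaternion
v = rng.normal(size=3)
# Antipodal invariance: q and -q encode the same rotation.
assert np.allclose(rotate(q, v), rotate(-q, v))
```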

Q2: Efficiency comparison to diffusion baselines?

Thank you for your suggestion, which will allow us to more thoroughly demonstrate the efficiency advantages of our proposed model.

We compared our SO(3) generation algorithm with the SO(3) diffusion algorithm proposed by FrameDiff [7] using the synthetic dataset experiment described in Table 5 of Appendix C. To measure generative modeling capability, we calculated the Wasserstein distance between the generated data distribution and the test data distribution. The results are shown in the table below:

| Steps | 10 | 50 | 100 | 150 | 200 |
|---|---|---|---|---|---|
| ProteoBayes | 0.290 | 0.155 | 0.115 | 0.104 | 0.099 |
| FrameDiff | 0.335 | 0.254 | 0.144 | 0.152 | 0.129 |

It can be seen that our model achieves better results with fewer steps than the diffusion baseline. It is worth noting that previous experiments in Euclidean space [5] and on periodic variables [6] have similarly demonstrated that BFNs achieve better results with fewer steps than diffusion baselines.

Q3: How sensitive are results to the choices of $\alpha_i$ for vMF noise?

We appreciate the reviewer's interest in examining the robustness of our method with respect to hyperparameters.

Our experimental results demonstrate that the outcomes are not highly sensitive to the choice of $\alpha_i$ for the vMF noise. The experiment is conducted as follows:

We adjust the scale of $\alpha_i,\ i\in[1,\dots,200]$ by modifying $\beta_1$, which represents the expected accumulated accuracy at $t=1$.

| $\beta_1$ | $\alpha_1$ | $\alpha_2$ | $\alpha_{199}$ | $\alpha_{200}$ |
|---|---|---|---|---|
| 100 | 0.478 | 0.515 | 3.116 | 3.149 |
| 1000 (used in reported results) | 0.610 | 0.656 | 30.404 | 31.295 |
| 10000 | 0.720 | 0.776 | 394.137 | 410.890 |

To evaluate the performance of different choices, we conducted experiments on peptide conformation generation using the PepBDB dataset:

| Model | $\text{RMSD}_{\text{C}_\alpha}$ ↓ | $\text{RMSD}_{\text{atom}}$ ↓ | DockQ ↑ |
|---|---|---|---|
| DiffAb | 13.96 | 13.12 | 0.236 |
| PepGLAD | 8.87 | 8.62 | 0.403 |
| ProteoBayes ($\beta_1=100$) | 3.54 | 4.69 | 0.587 |
| ProteoBayes ($\beta_1=1000$, reported) | 3.64 | 4.91 | 0.563 |
| ProteoBayes ($\beta_1=10000$) | 3.58 | 4.70 | 0.593 |

Generally, the results remain relatively stable despite significant variations in the scale of $\alpha_i$.

It is also worth noting that these alternative settings yielded slightly better results than reported because we did not perform an exhaustive hyperparameter search.
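As an illustration of what $\alpha_i$ controls, the sketch below (ours; it assumes SciPy ≥ 1.11, which provides scipy.stats.vonmises_fisher) draws vMF samples on $S^3$ at concentrations roughly matching the scales in the table above; higher accuracy means sender samples cluster more tightly around the ground-truth quaternion:

```python
import numpy as np
from scipy.stats import vonmises_fisher  # requires SciPy >= 1.11

mu = np.array([1.0, 0.0, 0.0, 0.0])  # ground-truth rotation as a unit quaternion
rng = np.random.default_rng(0)
for alpha in (0.5, 30.0, 400.0):     # roughly the alpha scales tabulated above
    samples = vonmises_fisher(mu, alpha).rvs(1000, random_state=rng)
    mean_cos = np.mean(samples @ mu)  # 1.0 would mean noiseless samples
    print(f"alpha={alpha:6.1f}  mean cosine to mu = {mean_cos:.3f}")
```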

Q4: Would fully excluding side-chain inputs from sequence prediction discard useful structure? Is there a balance of properly using side-chain hints?

  • Would this also discard useful structure like sequence correlations?

    Thank you for raising this important methodological question.

    Based on our analysis, excluding side-chain information from sequence prediction does not negatively impact the performance ceiling of sequence generation.

    This stems from the fundamentally asymmetric relationship between sequence and side chains in the data distribution: the sequence $\mathcal{S}$ serves as the primary determinant of residue types, while side-chain configurations depend on the sequence for both their atom types and their conformational specifications.

    Consequently, side-chain information is not essential for sequence prediction. On the contrary, the network must develop the capability to modify side-chain states in accordance with sequence parameters.

    Nevertheless, we recognize that when side-chain input is properly utilized, the network could leverage side-chain states to guide sequence prediction, potentially leading to better structural alignment between sequences and side chains.

  • Is there a balance of properly using side-chain hints?

    That’s a quite insightful observation and question!

    Although our current information-flow architecture has demonstrated its effectiveness in generating all-atom geometry, we believe there exists a way to properly incorporate side-chain information into sequence prediction for better performance, which we regard as promising future work.

    From a technical perspective, this is because the side-chain angles $\chi$ (or Euclidean atom coordinates) are composed of partial discrete residue-type information and partial side-chain geometric information.

    A promising approach would be to develop a properly disentangled representation for sequence and side chains, jointly modeling the two modalities while bypassing the information-shortcut challenge.

[1] Jumper, John, et al. "Highly accurate protein structure prediction with AlphaFold." Nature 596.7873 (2021): 583-589.

[2] Karras, Tero, et al. "Elucidating the design space of diffusion-based generative models." Advances in neural information processing systems 35 (2022): 26565-26577.

[3] Karplus, Martin, and Gregory A. Petsko. "Molecular dynamics simulations in biology." Nature 347.6294 (1990): 631-639.

[4] Frauenfelder, Hans, Gregory A. Petsko, and Demetrius Tsernoglou. "Temperature-dependent X-ray diffraction as a probe of protein structural dynamics." Nature 280.5723 (1979): 558-563.

[5] Song, Yuxuan, et al. "Unified generative modeling of 3d molecules with bayesian flow networks." The Twelfth International Conference on Learning Representations. 2024.

[6] Wu, Hanlin, et al. "A periodic bayesian flow for material generation." arXiv preprint arXiv:2502.02016 (2025).

[7] Yim, Jason, et al. "SE(3) diffusion model with application to protein backbone generation." arXiv preprint arXiv:2302.02277 (2023).

Comment

Dear Reviewer r2HE,

Thank you once again for your thoughtful comments and suggestions, which are immensely valuable and have helped us improve our work substantially.

As the reviewer-author discussion period is nearing its conclusion (August 8, 11:59 PM AoE), we would like to gently remind you that we would greatly appreciate your thoughts on our recent response.

In our reply, we carefully respond to your concerns including discussing the issue of discretization error in protein design, comparing our method's efficiency with diffusion-based baselines, analyzing robustness to the vMF noise parameter αi\alpha_i, and examining the role of side-chain information in sequence prediction alongside potential approaches for disentangled modeling.

If you have any further questions, suggestions, or would like us to elaborate on any part of our response, please don’t hesitate to let us know — we’d be more than happy to provide additional clarification or conduct further experiments.

Thank you again for your time and valuable feedback. We sincerely look forward to hearing from you!

Best regards,

The Authors

Comment

Thanks for the response. I have no other concerns. One further suggestion is to incorporate the content in W1 into the paper.

Comment

Thank you so much for your response and suggestion! We'll be sure to incorporate the content from W1 into the future version of our paper as suggested :)

Review (Rating: 4)

This paper presents ProteoBayes, an end-to-end framework for all-atom protein design based on a unified multi-modal Bayesian flow. The method jointly generates amino acid sequences, backbone structures, and side-chain conformations while addressing challenges related to modeling inconsistency and information leakage. A key contribution is the formulation of a Bayesian flow over SO(3), achieved by mapping rotations to unit quaternions on the hypersphere $S^3$ and enforcing antipodal symmetry to ensure valid generation. Additionally, the authors propose a rationalized information flow design that prevents side-chain torsion angles from leaking residue-type information during sequence prediction, improving generalization. Experimental results show that ProteoBayes outperforms prior methods in peptide and antibody design tasks.

Strengths and Weaknesses

Strengths

  1. Strong alignment between BFN and protein design modalities. The paper systematically applies the BFN framework to all-atom protein design, effectively modeling diverse data types: vMF distributions on $S^3$ for backbone rotations, circular distributions for side-chain torsions, and discrete BFNs for residue types. This highlights BFN's flexibility and simplicity across continuous, periodic, and discrete modalities in structural biology.

  2. Theoretically grounded modeling with principled handling of symmetry and information flow. The method extends BFN to non-Euclidean spaces via antipodal symmetry constraints, enabling valid generation over SO(3) through the unit-quaternion space $S^3$.

Weaknesses

  1. The information leakage issue stems from the authors' modeling choice rather than the task itself. By representing all residues with fixed-length $\chi$ vectors padded with zeros, the model implicitly encodes residue-specific torsion patterns into the input. This design choice introduces leakage by construction. The issue could be fundamentally avoided by using dynamic $\chi$ lengths or residue-type-based masking.

  2. The SO(3)-to-$S^3$ transformation lacks novelty. Modeling rotations with quaternions and addressing antipodal symmetry are well-established techniques. The paper does not introduce new mathematical insights, but rather applies standard tools within the Bayesian flow framework: an engineering integration rather than a theoretical contribution.

Questions

  • Why did the authors choose to represent side-chain torsion angles with a fixed-length $\chi$ vector padded to four angles, rather than using a residue-type-dependent dynamic length or masking? This design appears to introduce an artificial information leakage issue rather than reflecting an inherent modeling challenge.
  • The paper does not clearly specify the training setup: was a single universal model trained across all design tasks, or were separate task-specific models used? Please clarify the datasets, model checkpoints, and training regimes.
  • In the design setting, what inputs are provided to the model? If receptor structure and sequence are included, the current evaluation does not fully validate the framework in realistic pipeline scenarios—can the authors elaborate on this?
  • Given that the model predicts backbone positions $\mathbf{p}$ and orientations $\mathbf{R}$ (or quaternions $\mathbf{q}$), why is it necessary to explicitly model the backbone dihedral angle $\phi$? This variable may be redundant unless justified.
  • Have the authors considered refolding the generated sequences using AlphaFold 3 and computing structural quality metrics (e.g., RMSD, TM-score, binding energy)? Such an experiment could strengthen the validation of both peptide and antibody design results.

局限性

  • The framework primarily integrates existing techniques—such as quaternion-based rotation modeling and antipodal symmetry handling—without introducing substantial theoretical innovation.
  • Experimental details are insufficient, and the evaluation would benefit from folding the designed sequences using AlphaFold3 to verify structural and functional fidelity.

Final Justification

The rebuttal provided clear clarifications on the source of information leakage, demonstrating that it is partly intrinsic to the data, and added AlphaFold3 refolding experiments that strengthen validation. Details on datasets, training setups, and input configurations resolved earlier ambiguities. The SO(3)-to-quaternion projection is a genuine theoretical contribution in this context, though its novelty remains moderate given prior uses in other domains. Overall, most technical concerns were addressed, but scope limitations and moderate novelty lead me to retain my original score.

Formatting Concerns

No major formatting issues observed.

Author Response

Thank you for your thorough review and insightful suggestions. Your feedback will help us significantly improve our work. We address your concerns as follows:

W1 & Q1: About the source of information leakage: author’s modeling choice or the task itself?

We greatly appreciate your insightful observation, which raises an excellent point that we will clarify and address to improve our paper's presentation!

  • Why do we use a fixed-length $\chi$ vector rather than a dynamic length or masking?

This is because, in the sequence-structure co-design scenario, we face a fundamental constraint: sequence determination only occurs at $t=1$. Prior to this point, we work with a probability distribution $\theta^{\mathcal{S}}_t$ across 20 possible residue types for $t\in[0,1)$. Consequently, we cannot definitively establish the length of $\chi_t$ or implement appropriate masking during training. Therefore, current approaches [3, 4] remedy this by modeling the maximum possible number of random variables throughout both training and sampling.

Nevertheless, we would like to emphasize that even in the absence of this issue, the information shortcut remains an intrinsic property of the underlying protein data distribution, as detailed below:

  • The information shortcut can still exist even without paddings / with masks

We would like to further explain why the information shortcut can still exist even without padding or with masking.

This is because the side-chain torsion distributions themselves (the non-padding values) are inherently dependent on the specific residue types. Such dependency even underlies the idea of the rotamer library [5, 6], where scientists determine a set of possible side-chain conformations according to the sequence and search for the lowest-energy combinations to predict the protein conformation.

As a distinct example, the $\chi_2$ distribution of tyrosine is $\pi$-rotation-symmetric, i.e., $p^{\text{Tyr}}(\chi_2)=p^{\text{Tyr}}(\chi_2+\pi)$, while this symmetric property does not hold for isoleucine: $p^{\text{Ile}}(\chi_2) \neq p^{\text{Ile}}(\chi_2+\pi)$ [1, 2].

Therefore, the network can still utilize this sequence-sidechain correlation as an information shortcut to “predict” the residue types, even with dynamic $\chi$ lengths or masking.

Finally, we apologize for any insufficient explanations that may have caused misunderstandings, and we sincerely appreciate your thoughtful feedback.

W2 & L1: About the novelty of projecting SO(3) to $S^3$

We thank the reviewer for pointing out the literature to better highlight our contribution.

We would like to clarify that although rotation quaternions are widely used in computer graphics and other areas, how to use them correctly in generative modeling remains unclear. Our theoretical contribution primarily lies in identifying their value in constructing a conjugate Bayesian update, finding the flaw in the naive projection, i.e., $p_{\Psi}(-\mathbf{q})\ne p_{\Psi}(\mathbf{q})$, and addressing this issue with a simple yet effective approach that grounds the transformation in the generative modeling scenario. Table 4 demonstrates the necessity and effectiveness of the proposed techniques.

Q2&L2: Clarification on training and evaluation setups: datasets, model checkpoints, and training regimes

We apologize for the confusion caused and thank you for pointing out the need for a more comprehensive explanation of our methodology. Detailed information regarding our training and evaluation setups is provided below:

  • Training

    • For datasets

      For peptide design, following [9], we use two datasets: PepBench and PepBDB. For antibody design, we follow prior practice [11], utilizing the SAbDab database of antibody-antigen complexes as our training dataset, and evaluate model performance on the RAbD benchmark.

    • Model checkpoints

      We train a randomly initialized neural network for each protein design task from scratch and use every task’s own checkpoint to evaluate.

    • Training regimes

      Our training regimes follow standard settings in the field. We use the AdamW optimizer with an initial learning rate of 5e-4 that gradually decays to 5e-6. Each training task runs for 100k steps (a minimal optimizer sketch follows the evaluation details below).

  • Evaluation

    • Peptide Co-Design

      Our evaluation metrics code is sourced from the official PepGLAD and PPflow repositories.

      1. We generate 40 peptides for each receptor in the test set.
      2. After sampling, we perform energy minimization through Rosetta relaxation and calculate binding energy using PepGLAD’s scripts and settings.
      3. We compute metrics such as DockQ and $\text{RMSD}_{\text{C}_\alpha}$ for each generated sample. For design-related metrics, we rely on scripts from PPFlow. Finally, we average these design metrics across the samples and average all metrics across receptors.
    • Peptide Binding Conformation Generation

      1. We sample 10 conformations for each receptor-peptide pair in the test set.
      2. After Rosetta relaxation, we calculate binding energy for each sample.
      3. We evaluate structural accuracy using DockQ, $\text{RMSD}_{\text{C}_\alpha}$, and all-atom RMSD metrics.
    • Antibody Design

      1. We generate 64 CDR-H3 sequences for each antigen.
      2. We apply a side-chain-only relaxation protocol while keeping the backbone fixed.
      3. We evaluate both binding energy and total energy using the ref2015 score function and assess design quality through RMSD and amino acid recovery (AAR) metrics using dyMEAN’s [11] scripts.
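For concreteness, here is a minimal sketch of the stated optimization regime (AdamW, learning rate 5e-4 decaying to 5e-6, 100k steps); the placeholder model, the dummy loss, and the cosine decay shape are our assumptions, not the paper's code:

```python
import torch

model = torch.nn.Linear(64, 64)  # placeholder network, not ProteoBayes
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-4)
# Decay 5e-4 -> 5e-6 over 100k steps; the exact decay curve is assumed.
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(
    optimizer, T_max=100_000, eta_min=5e-6)

for step in range(100_000):
    x = torch.randn(8, 64)
    loss = model(x).pow(2).mean()  # dummy loss standing in for the BFN loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()
```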

If you require any additional information or clarification regarding the methodologies described above, please don't hesitate to let us know!

Q3: Elaboration on the input of the design setting

We appreciate the reviewer's feedback regarding realistic pipeline frameworks.

We understand that there may be a gap between current in-silico protein design approaches and real-world implementation requirements. However, for a fair comparison with existing in-silico methods, we maintain consistency with established methodological practices for evaluation.

  • Peptide Design

    Following [8, 9, 12], we use the receptor pocket's sequence and structure as input to our model.

  • Antibody Design

    Following [4, 11], our model takes as input the sequence and structure of the antigen-antibody complex, excluding the CDR-H3 region that is being designed.

Q4: About the usage of the variable $\phi$

We apologize for any confusion caused by our notation.

The backbone dihedral angle we model is $\psi$, which determines the position of the O atom in the backbone frame, as used in [2, 7, 8]. Figure 1 in [7] illustrates that although $\mathbf{R}/\mathbf{q}$ and $\mathbf{p}$ determine the positions of the $\text{C}_\alpha$, $\text{C}$, and $\text{N}$ atoms in the backbone, the backbone oxygen atom still has a torsional degree of freedom. Therefore, the variable $\psi$ is needed to further determine its position.

In this draft, the notation $\phi$ does not refer to the backbone dihedral angle. Instead, it denotes the surjective homomorphism $\phi:\text{SU}(2)\rightarrow\text{SO}(3)$ (defined in Section 3.2).

To avoid confusion, we will use a different notation for this mapping in the future version.

Q5 & L2: Refolding generated sequences using AlphaFold3

Thank you for your valuable suggestion on improving the quality of our evaluation.

Using the PepBench co-design results, we folded the peptide sequences designed by ProteoBayes and PepFlow, together with the reference receptor sequences, using AlphaFold3 (no MSA, for faster inference), and evaluated the structural similarity between the designed structures and the AlphaFold3-folded ones. For the structures generated by AlphaFold3, we used the same relaxation protocol and evaluated their PyRosetta binding energy:

| Model | $\text{RMSD}_{\text{C}_\alpha}$ ↓ | $\text{RMSD}_{\text{atom}}$ ↓ | TMScore ↑ | AlphaFold3 dG | Designed dG |
|---|---|---|---|---|---|
| PepFlow | 3.247 | 4.619 | 0.390 | -17.88 | -23.09 |
| ProteoBayes | 2.370 | 4.070 | 0.445 | -17.93 | -28.77 |

Our AlphaFold3 refolding experiments show that ProteoBayes outperforms PepFlow on all structural metrics (lower RMSD values, higher TMScore), confirming that ProteoBayes produces structures that align better with AlphaFold3 predictions.

[1] Zhang, Yangtian, et al. "Diffpack: A torsional diffusion model for autoregressive protein side-chain packing." NeurIPS (2023).

[2] Jumper, John, et al. "Highly accurate protein structure prediction with AlphaFold." Nature 596.7873 (2021): 583-589.

[3] Li, Jiahan, et al. "Full-atom peptide design based on multi-modal flow matching." ICML (2024).

[4] Zhu, Tian, et al. "Antibody design using a score-based diffusion model guided by evolutionary, physical and geometric constraints." ICML (2024).

[5] Dunbrack Jr, Roland L. "Rotamer libraries in the 21st century." Current opinion in structural biology 12.4 (2002).

[6] Bhuyan, Md Shariful Islam, and Xin Gao. "A protein-dependent side-chain rotamer library." BMC Bioinformatics 12.Suppl 14 (2011): S10.

[7] Yim, Jason, et al. "SE(3) diffusion model with application to protein backbone generation." ICML (2023).

[8] Li, Jiahan, et al. "Full-atom peptide design based on multi-modal flow matching." ICML (2024).

[9] Kong, Xiangzhe, et al. "Full-atom peptide design with geometric latent diffusion." NeurIPS (2024).

[10] Ye, Fei, et al. "ProteinBench: A holistic evaluation of protein foundation models." ICLR (2025).

[11] Kong, Xiangzhe, et al. "End-to-end full-atom antibody design." ICML (2023).

[12] Lin, Haitao, et al. "Ppflow: Target-aware peptide design with torsional flow matching." ICML (2024).

Comment

Thank you for the detailed and thoughtful response. I appreciate the clarifications and the additional experiments, which help address several of my concerns. While some aspects could still benefit from further refinement or validation, the overall effort to improve the manuscript is clear and commendable. I will take this into consideration and may adjust my score accordingly.

Comment

Dear Reviewer XpUT,

Thank you once again for your thorough review and insightful suggestions — your feedback is invaluable in helping us improve our work.

As the reviewer-author discussion period is ending soon (August 8, 11:59 PM AoE), we would like to gently remind you that we are eagerly awaiting your feedback on our response.

In our reply, we further clarify the question regarding the source of information shortcut (not only padding but also the side-chain itself). We also conducted additional experiments, including refolding the generated sequences using AlphaFold 3.

If you have any further questions or would like us to elaborate on any part of our response, please don’t hesitate to let us know — we would be more than happy to provide further clarification or information.

Thank you again for your time and thoughtful feedback. We sincerely look forward to hearing from you!

Best regards,

The Authors

Comment

Dear Reviewer XpUT,

Thank you for your encouragement and for acknowledging our efforts to improve the manuscript! Your feedback is greatly appreciated.

We have carefully considered your valuable suggestion that "some aspects could still benefit from further refinement or validation." To ensure we can follow your idea and possibly address this effectively, we would be very grateful if you could elaborate on this specific refinement or validation.

We are fully prepared and committed to conducting additional targeted experiments or enhancing our explanations to further strengthen the manuscript based on your guidance.

Thank you once again for your invaluable support and guidance.

Best regards,

Authors

Review (Rating: 5)

The paper proposes a Bayesian flow method for protein sequence-structure co-generation, including side chains. The authors propose a new Bayesian flow formulation for SO(3) generation, needed for protein backbones. They show that sequence-structure models that incorporate side chains can provide a shortcut for sequence prediction that negatively impacts model training. They adapt their architecture to avoid this by making the sequence part of the model blind to the current side-chain positions. The method is benchmarked on peptide and antibody (CDR-H3) design tasks, where it shows superior performance over baselines.

Strengths and Weaknesses

To the best of my knowledge, this is the first work using Bayesian flows for protein generation. The proposed formulation adapting BFNs to SO(3) looks sound and is new as far as I know. I think trying out new types of generative models for proteins is quite valuable for the field. The paper is well written and easy to follow.

The insight made about shortcut learning is quite valuable and new. It also is nicely backed by in-silico KL divergence experiments.

The experiments are quite standard in the field and reasonably extensive, including ablations of the proposed method. The antibody experiments (Table 3) could include some newer and harder baselines, like AbX (https://openreview.net/pdf?id=1YsQI04KaN), which is currently not cited. But in the end I can't see much wrong or lacking in the paper.

Questions

Are there other (simpler) solutions for the shortcut issue? For example, one could use a training mask and pad with arbitrary (or random) values that would still be further noised/denoised, just without a training loss applied to them during training. At least Equation 12 would then stop working. Such non-zero padding strategies have been used, e.g., with 'ghost' side-chain atoms in AbDiffuser (https://arxiv.org/pdf/2308.05027; it also makes sense to include it in the related work), although never with the aim of improving sequence prediction performance. It would be curious to see what fraction of the issue comes from the padding, and what fraction from the simple fact that noised non-padded angles will always carry some information about the sequence, even if it is hard to extract.

Speaking of shortcuts, why is the training performance with the shortcut so much worse in Fig. 3(b)? One would assume that a properly tuned model should still fit the training set just as well with extra (shortcut) information that makes the task easier.

Limitations

yes

Final Justification

The authors addressed the main points I raised, thus I recommend acceptance now.

Formatting Concerns

None

Author Response

We sincerely appreciate the reviewer’s recognition of our methodology and the identification of the information shortcut. We address your concerns as follows:

W1: About including AbX as a baseline

Thank you for this valuable suggestion; we will add the citation and a comparison to AbX in the future version. Our experiment including AbX is as follows:

We used the checkpoint from AbX's code repository to perform inference, relaxation (relaxing only the side chains), and metric calculation under our experimental setup. The results are as follows:

| Method | AAR ↑ | RMSD ↓ | $\text{E}_{\text{total}}$ | $\Delta\text{G}$ |
|---|---|---|---|---|
| Test Set | 100.0 | 0.00 | -16.76 | -15.33 |
| dyMEAN | 40.05 | 2.36 | 1239.29 | 612.75 |
| DiffAb | 35.04 | 2.53 | 495.69 | 489.42 |
| AbDPO | 31.29 | 2.79 | 270.12 | 116.06 |
| AbDPO++ | 36.25 | 2.48 | 338.14 | 223.73 |
| AbX | 41.27 | 2.40 | 281.29 | 237.22 |
| ProteoBayes | 42.12 | 2.27 | 54.68 | 44.12 |

Based on the table, AbX demonstrates strong performance among the baseline methods, while ProteoBayes still consistently outperforms all competitors across each evaluated metric.

Q1: Will using training masks or other/random padding values work? Are there other simpler solutions for the shortcut issue?

  • Will using training masks or other/random padding values work?

Thank you for your insightful idea. You've raised an excellent point that we will address to improve our paper's clarity in the future version. Thanks also for pointing out the relevant work AbDiffuser, which we will include in the related work.

  • For training masks, it is hard to apply an appropriate mask

    This is because, in the sequence-structure co-design scenario, we face a fundamental challenge: sequence determination only occurs at $t=1$. Prior to this point, we work with a probability distribution $\theta^{\mathcal{S}}_t$ across 20 possible residue types for $t\in[0,1)$. Consequently, we cannot definitively establish the length of $\chi_t$ or implement appropriate masking during training. Therefore, current approaches [3, 4] remedy this by modeling the maximum possible number of random variables throughout both training and sampling.

  • For random padding, it might alleviate the problem, but patterns still exist

    We acknowledge that random padding might make it more difficult to directly utilize sequence information in the side chains (making Eq. 12 stop working). However, the pattern of a fixed random distribution can still be exploited.

More importantly, we would like to further explain that:

  • The information shortcut still exists even without the padding issues.

This is because the side-chain torsion distributions themselves (the non-padding values) are inherently dependent on the specific residue types.

For example, the $\chi_2$ distribution of tyrosine exhibits $\pi$-rotation symmetry, i.e., $p^{\text{Tyr}}(\chi_2)=p^{\text{Tyr}}(\chi_2+\pi)$, while this symmetry does not exist for isoleucine: $p^{\text{Ile}}(\chi_2) \neq p^{\text{Ile}}(\chi_2+\pi)$ [1, 2].

Such dependency between residue type and side-chain conformation even forms the foundation of the rotamer library concept [3, 4], where scientists determine possible side-chain conformations based on the sequence and search for the lowest energy combinations to predict protein conformation.

Consequently, the network can still exploit this sequence-sidechain correlation as an information shortcut to "predict" residue types, regardless of whether training masks or alternative padding values are used. In the future version, we will include visualizations of torsion-angle distributions from the PepBench training dataset to illustrate this (providing images is officially forbidden during the rebuttal stage). These visualizations clearly show that side-chain torsion distributions display distinct patterns based on residue type, representing a source of information shortcut beyond padding.

Finally, we sincerely appreciate your thoughtful feedback.

  • Are there other (simpler) solutions for the shortcut issue?

We deeply appreciate your recognition of our identification of this issue.

For now, we believe there are two other possible solutions:

  1. One promising approach would be to introduce a neural network module $\Psi^{\text{detach}}(\theta^{\chi})=h^{\text{noseq}}$ that detaches and outputs sequence-agnostic features $h^{\text{noseq}}$ from the side-chain embedding $\theta^{\chi}$. This module could be adversarially trained to minimize the classification accuracy of a discriminator $\text{Classifier}(h^{\text{noseq}})$ that attempts to predict the residue type $\mathcal{S}$ from the detached sequence-agnostic features $h^{\text{noseq}}$ (see the sketch after this discussion).
  2. Another approach would be to carefully design a noise schedule or flow path for the side-chain information such that the sequence information implied by the side-chain input $\theta^{\chi}_t$ never exceeds that contained in the sequence input $\theta_t^{\mathcal{S}}$. Intuitively, this approach aims to move the crossover point in Figure 3(a) to $t=0$.

However, these two methods also have potential problems:

  1. Adversarial training is often unstable and prone to degradation. How to prevent the adversarial loss from affecting the overall optimization objective is an important issue.
  2. Such a schedule design would be ad hoc, limiting design flexibility and possibly affecting the quality of sequence or side-chain generation.

Compared to these alternatives, our current information-flow approach is more fundamental, easier to implement, and more elegant.
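As a sketch of solution 1 above, a standard gradient-reversal layer could implement the adversarial detachment; the module names, feature dimensions, and loss weighting below are hypothetical illustrations, not the paper's architecture:

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; scales gradients by -lam in the backward pass."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

# Hypothetical modules: detach_net plays the role of Psi^detach, stripping
# sequence information from side-chain embeddings theta_chi; the classifier
# tries to recover residue types from the detached features h_noseq.
detach_net = nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 64))
classifier = nn.Linear(64, 20)  # 20 residue types

def adversarial_loss(theta_chi, residue_types, lam=1.0):
    h_noseq = detach_net(theta_chi)
    # The classifier minimizes this loss; the reversed gradient simultaneously
    # pushes detach_net to make h_noseq sequence-agnostic.
    logits = classifier(GradReverse.apply(h_noseq, lam))
    return nn.functional.cross_entropy(logits, residue_types)

theta_chi = torch.randn(32, 4)              # 32 residues x 4 chi-angle features
residue_types = torch.randint(0, 20, (32,))
adversarial_loss(theta_chi, residue_types).backward()
```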

Q2: Why is the training performance with the shortcut so much worse?

This is an excellent question! Your intuition is quite reasonable: providing more information to the network should not decrease training performance. However, the optimization dynamics are not that ideal in practice.

Our explanation for this phenomenon is as follows. In a single-modality scenario, providing additional information during training would typically lower the loss.

However, in the co-design setting, the sequence prediction head leveraging the shortcut rapidly converges toward a local optimum. Additionally, the representations developed for shortcut-based prediction deviate substantially from those of the other modalities, which are genuinely attempting to predict the ground truth, resulting in gradient conflicts between the different optimization objectives.

These conflicting gradients make it difficult for the network to compress all inputs into a generally meaningful representation, ultimately leading to suboptimal performance.

This explanation is supported by the loss curve in Fig. 3(b) and the gradient-norm curves (which we will include in the future version, as updating images is officially forbidden during the rebuttal stage). During the later stages of training, networks with shortcuts consistently show significant fluctuations in the loss, accompanied by higher gradient norms. This indicates that the network is actively trying to escape the local optimum but remains trapped within it.

[1] Zhang, Yangtian, et al. "Diffpack: A torsional diffusion model for autoregressive protein side-chain packing." Advances in Neural Information Processing Systems 36 (2023): 48150-48172.

[2] Jumper, John, et al. "Highly accurate protein structure prediction with AlphaFold." nature 596.7873 (2021): 583-589.

[3] Dunbrack Jr, Roland L. "Rotamer libraries in the 21st century." Current opinion in structural biology 12.4 (2002): 431-440.

[4] Bhuyan, Md Shariful Islam, and Xin Gao. "A protein-dependent side-chain rotamer library." BMC bioinformatics 12.Suppl 14 (2011): S10.

Comment

Dear Reviewer 2sZG,

Thank you once again for your recognition and valuable insights!

As the reviewer-author discussion period concludes soon (August 8, 11:59 PM AoE), we’d like to kindly remind you that we’re eagerly awaiting your feedback on our response.

In our response, we’ve included the AbX baseline and answered your thoughtful questions regarding alternative solutions to information shortcuts, as well as the model’s performance in scenarios involving such shortcuts.

If you have any further questions or require clarification on any aspect of our work, please do not hesitate to let us know.

Thank you for your time and effort, and we truly look forward to hearing from you!

Best regards,

Authors

Comment

Thank you for your thorough rebuttal and addressing my questions. I recommend the paper for acceptance.

Comment

We sincerely appreciate your support and your recommendation for our work!

Review (Rating: 3)

The paper introduces ProteoBayes, a Bayesian Flow Network-based all-atom protein design model. It makes use of the homomorphism from SU(2) to SO(3) and maps the orientation matrix onto a 4-dimensional sphere, which makes Bayesian posterior computation tractable. Further, it resolves the issue of an information shortcut in all-atom protein generation via a novel information flow. It also achieves significant improvements on the peptide design and antibody design tasks.

Strengths and Weaknesses

Strengths

  1. The paper is logically coherent and well written.

  2. It gives a rigorous proof of the homomorphic mapping and carefully constructs an antipodal distribution on $S^3$.

  3. The work discovers and proves the existence of an information shortcut and designs an information flow that separates sequence information from side-chain torsion information.

Weaknesses

  1. This work does not report the training and sampling time costs or memory costs.

  2. It lacks a comparison with other methods in the visualization part.

  3. The paper lacks detailed information regarding the calculation of evaluation metrics, such as the specific parameters used.

Questions

  1. Does the information shortcut exist at the next time step? The sequence parameter at the current time step will meet the side-chain torsion parameter at the next time step.

  2. It is often said that BFNs are equivalent to diffusion models under certain conditions. The last paragraph of the Appendix says the BFN in this work does not involve differential terms and is discrete-time. Can the differential term be cancelled out by score matching in the diffusion model? Can a discrete-time BFN be a discrete version of a continuous-time diffusion model?

Limitations

yes

Formatting Concerns

No Paper Formatting Concerns.

Author Response

Thank you for your insightful feedback and thoughtful suggestions. Your comments will be invaluable in improving the quality of our work. Our responses to your concerns are as follows:

W1: About the training, sampling and memory cost

We appreciate the reviewer's suggestion regarding our computational resource specifications, which will further improve the quality of our paper. Our framework uses four A100 GPUs for each training task and one A100 GPU for inference. Below we provide comprehensive details regarding the time and memory requirements for both training and sampling (these specifications will be incorporated into the revised manuscript):

| Task | Phase | Batch Size | Time | GPU Memory | RAM |
|---|---|---|---|---|---|
| Peptide Design (PepBench) | Training | 40 x 4 | ~55 h | ~25663 x 4 MB | ~3216 x 4 MB |
| Peptide Design (PepBench) | Sampling | 40 | 1 h 33 min | ~3294 MB | ~3122 MB |
| Antibody Design | Training | 12 x 4 | ~51 h | ~45237 x 4 MB | ~3297 x 4 MB |
| Antibody Design | Sampling | 128 | 4 h 53 min | ~49362 MB | ~3846 MB |

W2: About the comparison with other methods in the visualization part

We are truly grateful for your valuable suggestion, which will enhance the quality of our paper by providing readers with a clearer understanding of our approach's advantages. However, in accordance with official policy:

"Because of known concerns on identity leakage, we prohibit using any links in the rebuttal, including but not limited to anonymous or non-anonymous URL links, or updating your submitted GitHub repository. We understand that 1) and 2) effectively prevent using image/video or other rich media as a communication protocol between reviewers and authors. We are sorry about that."

We are unfortunately unable to provide the requested visual comparison at the rebuttal stage.

Nevertheless, we promise to include the visual comparison with existing methods in the future version of the paper.

W3: Detailed information on calculating evaluation metrics

Thank you for pointing out the need for a more detailed explanation of our evaluation process. Comprehensive information on our evaluation metric calculations is provided below. During the training stage, for each task (peptide co-design, peptide binding conformation generation, and antibody co-design), we re-train a randomly initialized model on the task-specific training set (PepBench, SAbDab).

  • Peptide Co-Design

    The procedure for calculating metrics follows the approach outlined in [1] and [2]. Specifically, our evaluation metrics code is sourced from the official PepGLAD and PPflow repositories.

    1. We sample $N_{\text{samples}}=40$ peptides based on the binding pocket of each receptor in the test set, as described in Algorithm 2, Appendix D, with 200 steps.
    2. After peptide sampling, we perform Rosetta relaxation (with parameter -relax:default_repeats 2) and use the InterfaceAnalyzer function in PyRosetta (the version we use is 2024.35) to compute the binding energy using the PepGLAD’s script in its official repository.
    3. We then use the cal_metrics function in PepGLAD's script to calculate DockQ and $\text{RMSD}_{\text{C}_\alpha}$.
    4. For design-related metrics, we rely on scripts from PPFlow to calculate validity, sequence similarity, and structure similarity. Finally, we average these design metrics across the $N_{\text{samples}}$ samples and average the metrics across receptors.
  • Peptide Binding Conformation Generation

    The procedure for calculating metrics follows the methodology described in [1].

    1. We sample $N_{\text{samples}}=10$ peptides based on the binding pocket of each receptor and peptide sequence in the test set.
    2. After sampling, we conduct Rosetta relaxation and use the InterfaceAnalyzer function in PyRosetta to compute the binding energy following the same protocol as PepGLAD’s script.
    3. Additionally, we use the cal_metrics function in PepGLAD's script to calculate DockQ, $\text{RMSD}_{\text{C}_\alpha}$, and $\text{RMSD}_{\text{atom}}$.
  • Antibody Design

    This process follows the procedure described in [3].

    1. We sample $N_{\text{samples}}=64$ CDR-H3 sequences for each antigen in the test set.
    2. Next, we apply the relaxation protocol from [3], which relaxes only the side-chain atoms while keeping the backbone atoms fixed.
    3. Subsequently, we use the InterfaceAnalyzer and the create_score_function('ref2015') function in PyRosetta to compute both the binding energy and total energy.
    4. To calculate RMSD and AAR, we use the cal_metrics function in dyMEAN’s [8] script.
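A condensed sketch of the scoring step described above (it assumes a local PyRosetta installation; the input PDB path and the interface string "H_A" are hypothetical placeholders, not taken from the paper):

```python
import pyrosetta
from pyrosetta.rosetta.protocols.analysis import InterfaceAnalyzerMover

pyrosetta.init("-mute all")
pose = pyrosetta.pose_from_pdb("relaxed_design.pdb")  # hypothetical input file

# Total energy under the ref2015 score function.
scorefxn = pyrosetta.create_score_function("ref2015")
total_energy = scorefxn(pose)

# Binding energy (dG) across a hypothetical heavy-chain/antigen interface.
iam = InterfaceAnalyzerMover("H_A")
iam.apply(pose)
binding_energy = iam.get_interface_dG()
print(f"E_total={total_energy:.2f}  dG={binding_energy:.2f}")
```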

If you require additional information or clarification on any of the methodologies described above, please do not hesitate to let us know!

Q1: The possibility of an information shortcut in the next time step?

Thank you for your insightful question! We find this question particularly meaningful and worthy of discussion.

We understand your concern: during sampling, given the dependency chain $\theta^{\chi}_{i-1} \overset{\Psi}{\rightarrow} \hat{\mathbf{p}}_i \overset{h}{\rightarrow} \theta^{\mathbf{p}}_{i} \overset{\Psi}{\rightarrow} \hat{\mathcal{S}}_{i+1}$, it appears that the sequence prediction $\hat{\mathcal{S}}_{i+1}$ might depend on side-chain information from two time steps prior, $\theta^{\chi}_{i-1}$, via the backbone position $\hat{\mathbf{p}}_i$.

However, this does not actually create an information shortcut learning problem for the following reasons:

  • During training, the loss function is based solely on a single network prediction, $\theta^{\chi}_{i-1} \overset{\Psi}{\rightarrow} \hat{\mathbf{p}}_i \rightarrow \mathcal{L}^{\mathbf{p}}$, rather than on multiple chained network predictions. Therefore, the network is unable to learn the correlation across two steps.
  • During sampling, although the network $\Psi$ may potentially utilize the sequence information from the previous step, the network parameters are not updated (meaning the network cannot learn the side-chain-sequence relationship across two steps), so from this perspective no information leakage occurs.

Q2: Discussion of the equivalence between BFN and diffusion model

That's a great point to discuss, and we are very happy to share our views on different generative models!

  • Can the differential term be cancelled out by score matching in the diffusion model?

    We humbly believe that the differential term cannot be cancelled out, because the definition of the ground-truth score $\nabla_{\mathbf{x}} \log p(\mathbf{x})$ in score matching still involves taking the derivative of a probability density function, while the Bayesian update function $h$ does not include derivative terms.

  • Can a discrete-time BFN be a discrete version of a continuous-time diffusion model?

    From the perspective of marginal distributions, we believe a discrete-time BFN can be approximated by carefully constructing a discretization of an SDE that shares the same marginal distribution $p_t(\mathbf{x})$, as in the approach of [4], particularly for Gaussian-distribution-related BFNs.

    However, we must emphasize that sharing the same marginal distribution does not establish exact equivalence between BFNs and diffusion models as generative modeling approaches, because:

    1. BFN offers a discretization-free and analytical approach to precisely update states according to the noised sample and its accuracy based on Bayes' theorem, whereas diffusion models require carefully designed discretization schedules [5, 6].
    2. SDEs cannot fully capture all Bayesian flows. The SO(3) Bayesian flow used in this paper incorporates accumulated accuracy parameters $\kappa_i$ that are inherently stochastic and cannot be fully represented within a conventional SDE framework.
    3. Sharing the same marginal distribution does not guarantee equivalent generation quality, as seen with SDEs and ODEs: although an SDE and its corresponding probability-flow ODE share the same marginal distribution [7], their generation quality differs in practice [5].

[1] Kong, Xiangzhe, et al. "Full-atom peptide design with geometric latent diffusion." Advances in Neural Information Processing Systems 37 (2024): 74808-74839.

[2] Lin, Haitao, et al. "Ppflow: Target-aware peptide design with torsional flow matching." arXiv preprint arXiv:2405.06642 (2024).

[3] Ye, Fei, et al. "ProteinBench: A Holistic Evaluation of Protein Foundation Models." The Thirteenth International Conference on Learning Representations.

[4] Xue, Kaiwen, et al. "Unifying bayesian flow networks and diffusion models through stochastic differential equations." arXiv preprint arXiv:2404.15766 (2024).

[5] Karras, Tero, et al. "Elucidating the design space of diffusion-based generative models." Advances in neural information processing systems 35 (2022): 26565-26577.

[6] Tong, Vinh, et al. "Learning to discretize denoising diffusion odes." arXiv preprint arXiv:2405.15506 (2024).

[7] Song, Yang, et al. "Score-based generative modeling through stochastic differential equations." arXiv preprint arXiv:2011.13456 (2020).

[8] Kong, Xiangzhe, et al. "End-to-end full-atom antibody design." Proceedings of the 40th International Conference on Machine Learning. 2023.

Comment

Due to the limited word count during the rebuttal period, we would like to provide more details here in response to your W3 regarding the peptide design evaluation metrics in Table 1, as used in [1, 2]:

  1. Validity: This metric measures the ratio of chemically valid peptides by assessing bond lengths. A bond is considered valid if its length is within ±0.5 Å of the ideal value. For instance, the bond between the generated nitrogen (N) and carbon (C) atoms is valid if the distance falls within the range of [1.34 - 0.5, 1.34 + 0.5] Å.
  2. Novelty: Novelty is determined based on both the structure and sequence of the generated peptides. A peptide is considered novel if both its TM-score and sequence overlap with the reference peptide are below 0.5 [2]. We compute the average value across the generated peptides for each receptor and then take the average across all receptors.
  3. Diversity: Diversity is quantified by calculating the average pairwise diversity, i.e., the product $(1-\text{TM-score})(1-\text{SeqOL})$, across the $N_{\text{samples}}$ generated peptides for each receptor. The resulting values are then averaged across all receptors.

We only evaluate diversity and novelty for peptides that are valid when computing the V&Div and V&Novel metrics, in order to avoid false positives.
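A small sketch of the diversity and novelty definitions above (ours; tm_score and seq_overlap are hypothetical callables returning similarities in [0, 1], standing in for TM-score and sequence-overlap computations):

```python
from itertools import combinations

def diversity(peptides, tm_score, seq_overlap):
    """Average pairwise (1 - TM-score) * (1 - SeqOL) over generated peptides."""
    pairs = list(combinations(peptides, 2))  # requires at least two peptides
    vals = [(1 - tm_score(a, b)) * (1 - seq_overlap(a, b)) for a, b in pairs]
    return sum(vals) / len(vals)

def novelty(generated, reference, tm_score, seq_overlap):
    """Fraction of generated peptides whose TM-score and sequence overlap
    with the reference are both below 0.5."""
    hits = [tm_score(p, reference) < 0.5 and seq_overlap(p, reference) < 0.5
            for p in generated]
    return sum(hits) / len(hits)
```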

[1] Lin, Haitao, et al. "PPFLOW: Target-Aware Peptide Design with Torsional Flow Matching." International Conference on Machine Learning. PMLR, 2024.

[2] Lin, Yeqing, and Mohammed AlQuraishi. "Generating novel, designable, and diverse protein structures by equivariantly diffusing oriented residue clouds." Proceedings of the 40th International Conference on Machine Learning. 2023.

Comment

Dear Reviewer R3Sb,

Thank you once again for your valuable comments and insights!

As the reviewer-author discussion period is nearing its end (August 6 11:59pm AoE), we would like to kindly remind you that we are eagerly awaiting your feedback on our response.

In our response, we've included the details you asked about - from training and memory costs to the evaluation metrics calculations. We've also answered your insightful questions about information shortcuts and the relationship between BFN and diffusion models.

If you have any further questions or require clarification on any aspect of our work, please do not hesitate to let us know.

Thank you for your time and effort, and we truly look forward to hearing from you!

Best regards,

Authors

Comment

Dear Reviewer R3Sb,

Thank you once again for your valuable comments and insights!

With the discussion period deadline approaching (August 8, 11:59pm AoE), we would be very grateful to receive your thoughts on our response.

In our response, we have thoroughly replied all your questions and concerns, providing the requested details on training and memory costs, clarifying evaluation metrics, and answering your questions on information shortcuts and the BFN-diffusion model relationship.

If you have any further questions, suggestions, or would like us to elaborate on any part of our response, please don’t hesitate to let us know — we’d be more than happy to provide additional clarification or conduct further experiments.

Thank you for your time and consideration, and we sincerely look forward to your feedback.

Best regards,

Authors

Comment

Dear Reviewer R3Sb,

Thank you once again for your valuable time and insightful comments.

We are writing to kindly follow up on the rebuttal we submitted. As the discussion period deadline is now less than 24 hours away (ending on August 8, 11:59pm AoE), we would be very grateful to receive your thoughts on our response.

In our rebuttal, we have thoroughly addressed your concerns and answered all your questions, providing the requested details on training costs and clarifying the evaluation metrics.

If you have any further questions or would like us to elaborate on any part of our response, please don’t hesitate to let us know. We would be more than happy to provide additional clarification or experiment as possible.

Thank you for your time and consideration. We sincerely look forward to your feedback.

Best regards,

Authors

Comment

We would like to express our appreciation to the reviewers for their insightful feedback and suggestions. Below, we outline the additional experiments and discussions we have conducted during rebuttal phase:

Additional experiments

  1. Refolding generated sequences using AlphaFold3 to further validate the generated structures (To Reviewer XpUT)

    We refolded our generated sequences using AlphaFold3 to provide further structural validation. Key metrics like RMSD, TM-Score, and PyRosetta binding energy (dG) show that ProteoBayes produces structures with better AlphaFold3 alignment than PepFlow.

  2. Efficiency comparison to diffusion baselines to prove the efficiency of the proposed Bayesian flow (To Reviewer r2HE)

    We compared our SO(3) generation algorithm with FrameDiff's SO(3) diffusion algorithm using the synthetic dataset experiment in Table 5 of Appendix C. Our model achieves superior results with fewer steps than diffusion baselines.

  3. Sensitivity of the results to the choice of $\alpha_i$ for vMF noise (To Reviewer r2HE)

    We investigated the model's sensitivity to the vMF noise parameter $\alpha_i$. Our analysis demonstrates that the results are stable even under significant changes to the noise scale, proving the robustness of our approach.

  4. Adding AbX as a Baseline (To Reviewer 2sZG)

    ProteoBayes still consistently outperforms all competitors across each evaluated metric.

Further Discussion and Clarifications

  • About the Information Shortcut and the Proposed Information Flow

    1. Will using training masks or other/random padding values work? (To Reviewer XpUT, 2sZG)

      We clarify that the information shortcut is an intrinsic property of the protein data distribution, which persists even when using training masks or alternative padding values.

    2. Will the proposed information flow discard useful structures? (To Reviewer r2HE)

      The exclusion of side-chain information during sequence prediction does not limit the performance of sequence generation, since the sequence is the primary determinant of the side chains and does not itself require side-chain information.

    3. Does information shortcut exist in the next time step? (To Reviewer R3Sb)

      We clarify that the network cannot learn relationships across two time steps under single-step training, which prevents the formation of learning shortcuts.

  • Detailed Information on training and sampling procedures (To Reviewer R3Sb, XpUT)

    We provide detailed information about training procedures, sampling methods, model inputs, metrics calculations, and computational costs in our rebuttal. These aspects largely follow established practices in the field.

  • Further discussion of the advantage of BFN for protein design (To Reviewer R3Sb, r2HE)

    We discuss the advantage of BFN over other protein design methods: its conjugate Bayesian update eliminates the discretization errors that are unavoidable in ODE/SDE-based approaches. This type of systematic error can significantly impact the activity of designed proteins.

  • Explanation of how the network architecture enforces $f(-\mathbf{q})=-f(\mathbf{q})$ (To Reviewer r2HE)

  • Clarification on the necessity of modeling backbone dihedral angles (To Reviewer XpUT)

We humbly believe that these added components sufficiently address the reviewers' concerns and further improve the overall quality of our paper.

Final Decision

The paper proposes a unified all-atom protein design framework, introducing a Bayesian flow network for backbone orientations and preventing information shortcuts (side-chain torsion angles leaking residue-type information during sequence prediction) via a rationalized information flow. The reviewers generally appreciate the paper as the first work to apply Bayesian flow networks to protein generation.

Reviewer R3Sb is the only negative reviewer. The reviewer's concerns were about lack of details, lack of visualizations, and some technical confusion. Since reviewer R3Sb did not follow up during the rebuttal phase, I checked the discussions and can verify that the concerns are resolved, given that the authors provided the promised visualizations in the final manuscript.

Overall, I do not see additional critical concerns about the paper. I recommend acceptance.