Generating GFlowNets as You Wish with Diffusion Process
Abstract
Reviews and Discussion
This paper explores parameter generation for GFlowNets, as GFlowNets require high costs to train, e.g., sampling an exponential number of trajectories. To generate parameters, this paper proposes a two-fold method: (1) generating a latent representation of parameters via reverse diffusion given the structural information of an environment and (2) decoding this representation into parameters. The overall method is similar to a prior study [1], but extends it to consider the condition specifying the environment. The experiments show that the proposed generative method can adapt to unseen environmental structures.
[1] Wang et al., Neural Network Parameter Diffusion
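For context, a minimal sketch of the two-stage pipeline summarized above, assuming a DDPM-style sampler; all module names and dimensions here are hypothetical, not the paper's actual architecture:

```python
import torch
import torch.nn as nn

# Hypothetical sizes: autoencoder latent width, flattened GFlowNet
# parameter count, environment-condition width, diffusion steps.
LATENT_DIM, PARAM_DIM, COND_DIM, T = 64, 10_000, 16, 1000

decoder = nn.Sequential(nn.Linear(LATENT_DIM, 512), nn.ReLU(),
                        nn.Linear(512, PARAM_DIM))      # stage (2)
eps_model = nn.Sequential(nn.Linear(LATENT_DIM + COND_DIM + 1, 256),
                          nn.ReLU(), nn.Linear(256, LATENT_DIM))

betas = torch.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bars = torch.cumprod(alphas, dim=0)

@torch.no_grad()
def generate_params(env_cond: torch.Tensor) -> torch.Tensor:
    """Stage (1): reverse diffusion in latent space conditioned on the
    environment encoding; stage (2): decode to flattened parameters."""
    z = torch.randn(1, LATENT_DIM)
    for t in reversed(range(T)):
        t_emb = torch.full((1, 1), t / T)
        eps = eps_model(torch.cat([z, env_cond, t_emb], dim=-1))
        coef = betas[t] / torch.sqrt(1.0 - alpha_bars[t])
        z = (z - coef * eps) / torch.sqrt(alphas[t])    # DDPM mean step
        if t > 0:
            z = z + torch.sqrt(betas[t]) * torch.randn_like(z)
    return decoder(z)
```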
Strengths
- This work is the first to investigate parameter generation for GFlowNets, which can also be naturally extended to reinforcement learning applications.
- The proposed method demonstrates that generated parameters can achieve similar or superior performance compared to parameters obtained from conventional training of GFlowNets.
Weaknesses
My primary concerns arise from doubts regarding the practical utility of the proposed methods.
- About motivation. This paper argues that generating parameters given environmental information, e.g., state dimensions, makes it easy to obtain parameters for new environments. However, one might consider defining a forward policy conditioned on such information and training it directly. Given this alternative, what benefits does parameter generation provide?
- About extensibility. Although the method considers environment-specific structural information, e.g., state dimension in a hyper-grid, it seems challenging to incorporate most components of GFlowNets in practice. For example, when specifying environment information, how should one handle different reward functions, e.g., addressing different properties, or different action spaces, e.g., fragment-based and reaction-based transitions?
- About experiments. The considered tasks are too limited to show the usefulness of the proposed method (only eight different structures for a hyper-grid are considered). Furthermore, most of the experiments only consider a relatively simple task, i.e., a hyper-grid, and simple structural variations, i.e., changes in dimensions, without exploring more advanced tasks, e.g., RNA sequence generation, or complex environmental variations. Although the experiments consider a molecular generation task, its setup is not specified.
- About results. In Tables 4 and 5, the performance improvements seem too minor. In particular, in Table 4, it is hard to understand why N=2 and N=5 yield similar generalization performance. Can you clarify more details on this?
Questions
- What is I in Table 3?
Dear Reviewer h5Ma, Thank you very much for your feedback. We greatly appreciate your comments and have learned a lot from them.
Weaknesses
W1: The benefits of parameter generation compared to defining a forward policy conditioned on environmental information.
The core motivation behind our approach lies in the flexibility and efficiency of parameter generation. While defining such a conditional forward policy is feasible, it can become computationally expensive in scenarios with high-dimensional state spaces and complex environmental structures. Our method leverages parameterized probabilistic samplers (as discussed in Section 3.2) to generalize across environments effectively. We explain the advantages of our method over this alternative as follows:
- Reduces the computational burden by decoupling policy training from environment-specific configurations.
- Provides adaptability to unseen environments by learning transferable parameters, which is critical for applications such as molecular design and structural optimization.
- Builds upon GFlowNet principles by enabling scalable and structured exploration across diverse environments, as outlined in Bengio et al. (2021).
By adopting parameter generation, we focus on generalization, a significant advantage when scaling to environments with varying dimensions and configurations.
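To make the contrast concrete, here is a hypothetical sketch of the two interfaces; neither is our exact code, and `generate_params` refers to the illustrative diffusion sampler sketched earlier:

```python
import torch
import torch.nn as nn

STATE_DIM, N_ACTIONS, COND_DIM = 8, 4, 16

# Reviewer's alternative: a single policy conditioned on the environment
# descriptor c, trained jointly across all environments.
conditional_policy = nn.Sequential(
    nn.Linear(STATE_DIM + COND_DIM, 128), nn.ReLU(),
    nn.Linear(128, N_ACTIONS))          # logits for P_F(a | s, c)

# Our setting: a generator maps c directly to policy weights, so deploying
# in a new environment is one forward pass with no gradient steps.
def deploy(env_cond: torch.Tensor, policy: nn.Module) -> nn.Module:
    flat = generate_params(env_cond)    # illustrative sampler from above
    # (assumes PARAM_DIM matches the policy's parameter count)
    torch.nn.utils.vector_to_parameters(flat.squeeze(0),
                                        policy.parameters())
    return policy                       # ready-to-use forward policy
```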
W2: The reviewer suggests that incorporating more complex components of GFlowNets, such as diverse reward functions and action spaces, may present challenges.
Our method currently focuses on demonstrating the feasibility of parameter generation within a well-defined scope (e.g., hyper-grid environments). Its generalization ability is evidence that different reward functions can be accommodated, especially across different tasks:
- Reward Functions: The GFlowNet formulation in our approach can accommodate task-specific reward functions (e.g., fragment-based, reaction-based). For instance, in molecular design, we highlight the use of tailored conditional embeddings in Section 3.4 to adapt to specific reward structures.
- Action Spaces: The latent diffusion model utilized for parameter generation (Section 2.3) supports diverse action representations. For example, actions in molecular synthesis (e.g., fragment addition) can be encoded within the same probabilistic framework.
We acknowledge that scaling to more complex spaces, such as hierarchical or dynamic state-action pairs, is an avenue for future work, as noted in the discussion. Thank you for your valuable suggestion again.
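As an illustration only, one way such a conditional embedding could encode an environment (the paper's actual embedding scheme may differ; all names here are hypothetical):

```python
import torch
import torch.nn as nn

class EnvEmbedding(nn.Module):
    """Map raw environment descriptors (state dimension, grid size,
    reward-variant id, action-space id) to a fixed-width condition vector
    consumed by the latent diffusion model."""
    def __init__(self, cond_dim: int = 16):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(4, 64), nn.ReLU(),
                                 nn.Linear(64, cond_dim))

    def forward(self, state_dim, grid_size, reward_id, action_space_id):
        x = torch.tensor([[state_dim, grid_size, reward_id,
                           action_space_id]], dtype=torch.float32)
        return self.net(x)

# e.g., a 4-dimensional hypergrid of side 8, reward variant 0,
# grid-style action space 0
cond = EnvEmbedding()(4, 8, 0, 0)   # shape: (1, 16)
```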
W3: Experiments: the experiments are limited to relatively simple tasks (e.g., hyper-grids and structural variations) and request more advanced use cases.
Thank you for your suggestions. We agree that more complex tasks can provide additional insights. However, the chosen synthetic tasks allow us to:
- Precisely evaluate generalization across structural variations in a controlled setting, as illustrated in Table 2 and Figure 1.
- Validate the adaptability of our method across distinct environments, which is critical for parameterized probabilistic samplers.
- Demonstrate the robustness of GFlowNet-inspired sampling in diverse configurations without confounding factors from real-world noise.
Additionally, we included a molecular generation task (Section 3.4) to illustrate real-world applicability. We aim to expand on this in future iterations by incorporating tasks such as RNA sequence generation or combinatorial optimization.
W4: The performance improvements in Tables 4 and 5 appear minor, and the reviewer requests clarification regarding the similar generalization between N=2 and N=5.
Thank you for your detailed observation. The performance improvements reported in Tables 4 and 5 reflect the consistent generalization capability of our approach across environments. The similarity in results for N=2 and N=5 arises from the intrinsic efficiency of the parameter generation framework, which effectively captures shared patterns in the GFlowNet structures (Section 3.3). This highlights the scalability of the method to different environmental dimensions.
Questions
Q1: What does I in Table 3 mean?
I in our experiment denotes the iteration number in the training process; we have updated this in our revision.
The authors proposed using a VAE to learn the latent space of GFlowNet parameters, followed by using DMs to generate the GFlowNet parameters. They used different synthetic data for the evaluation.
Strengths
If the method proves effective in real scenarios, it could help improve training in situations where GFlowNet struggles; however, the current experiments do not support this outcome.
Weaknesses
- The experiments are based solely on synthetic data, which does not strongly support most of the claims, such as those in Figure 1.
- While the authors mention where GFlowNet training struggles—such as with increasing trajectory length in the abstract—they do not clarify whether they successfully addressed these issues.
- The novelty is somewhat limited, as it relies on an existing latent diffusion model without any modifications.
Questions
- What are the specific distinctions that make adapting existing parameter generation methods challenging for GFlowNet parameters?
- The dataset preparation for the AE component lacks clarity. Could you please elaborate on this? How many models did you generate? How many samples were created? Which datasets were used for this purpose?
- Including additional real-world datasets would be greatly appreciated.
Dear Reviewer JPtD,
Thank you so much for your thoughtful feedback and the time you dedicated to reviewing our work. To save time and provide clarity, we summarize our responses below.
Weaknesses
W1: Experiments conducted solely on synthetic data and connection with Figure 1
The use of synthetic data was a deliberate choice to allow precise control over evaluating the core capabilities of our method. However, we also conducted experiments on molecular generation with real-world datasets and extended our analysis. For future revisions, we plan to add more real-world datasets to further enhance the work.
Regarding Figure 1:
- Efficiency: Figure 1 demonstrates GenFlowNet's time usage advantage, which is corroborated by the computational cost reductions shown in Tables 1 and 2.
- Accuracy and Low Divergence: Empirical L1 loss, JS divergence, and KL divergence in Tables 1 and 2 validate that the generated parameters closely match ground-truth probability distributions.
- Diversity: Diversity is showcased in Figure 5, highlighting the variability in parameter matrices that support downstream applications.
- Generalization: GenFlowNet's ability to generalize to unseen structures is validated in Table 2 and Figures 4c and 4d, illustrating its adaptability across tasks.
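For reference, the three metrics above compare the empirical state-visit distribution of a generated-parameter sampler against the ground-truth target distribution; a minimal sketch with made-up inputs:

```python
import numpy as np

def kl(p, q, eps=1e-12):
    """KL(p || q) over two discrete distributions."""
    p, q = p + eps, q + eps
    return float(np.sum(p * np.log(p / q)))

def js(p, q):
    """Jensen-Shannon divergence: symmetrized, bounded KL."""
    m = 0.5 * (p + q)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def empirical_l1(p, q):
    """Mean absolute difference between the two distributions."""
    return float(np.mean(np.abs(p - q)))

# p: normalized visit counts from the generated sampler;
# q: ground-truth distribution R(x)/Z on the same states (made-up here).
p = np.array([0.1, 0.4, 0.5])
q = np.array([0.2, 0.3, 0.5])
print(js(p, q), kl(p, q), empirical_l1(p, q))
```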
W2: Lack of clarity on GFlowNet training struggles with increasing trajectory length
We address this by evaluating performance across different trajectory lengths in:
- Hypergrid Task: Complexity increases with factors such as state dimension and size, leading to longer trajectories.
- Molecular Generation: This task involves deeper GFlowNet structures and more intricate challenges. Both examples demonstrate how increasing complexity naturally results in longer trajectories and highlight the method's robustness.
W3: Limited novelty
The novelty of our method lies in introducing parameter generation to the probabilistic sampler, offering a fresh perspective in this domain. Beyond using a latent diffusion model, we:
- Developed tailored conditional embeddings to align with task-specific needs.
- Validated the method’s generalization ability across multiple tasks, showing its effectiveness in a variety of settings.
Questions
Q1: Specific distinctions from existing parameter generation methods
Our approach has several key distinctions tailored to GFlowNet settings:
- Structured State-Action Spaces: GFlowNets require parameterization for trajectory-based generation processes while satisfying flow-matching constraints, which traditional generative models are not designed for.
- Environmental Variability and Generalization: Addressing challenges highlighted in prior work (Bengio et al., 2021; Jain et al., 2022), our method uses task-specific embeddings to dynamically adapt to varying environments efficiently.
- Scalability: By incorporating tailored conditional embeddings, we align parameterization with task geometry, making our method scalable for high-dimensional inputs, as supported by recent studies (Peebles et al., 2022; Erkoç et al., 2023).
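For concreteness, the trajectory-balance form of the flow-matching constraint (Malkin et al., 2022) that generated parameters must implicitly satisfy, in a minimal form with illustrative tensors:

```python
import torch

def trajectory_balance_loss(log_Z, log_pf, log_pb, log_reward):
    """L_TB(tau) = (log Z + sum_t log P_F(s_{t+1} | s_t)
                    - log R(x) - sum_t log P_B(s_t | s_{t+1}))^2"""
    return (log_Z + log_pf.sum(-1) - log_reward - log_pb.sum(-1)).pow(2)

# one trajectory of length 5 with a scalar log-partition estimate
loss = trajectory_balance_loss(torch.tensor(1.0),
                               torch.randn(5), torch.randn(5),
                               torch.tensor(0.3))
```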
Q2: Dataset preparation for the AE component
We clarify the details of dataset preparation:
- Number of Models: 200 models per GFlowNet structure were used for AE training.
- Number of Samples: Each GFlowNet structure contains 200 models, resulting in 800 samples across 4 structures.
- Dataset: The hypergrid task was used to train and evaluate the AE component.
Further details are included in the appendix (lines 789–794).
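A hypothetical sketch of how such an autoencoder training set can be assembled from trained checkpoints; the counts follow the numbers above, but the helper itself is illustrative:

```python
import torch

def build_ae_dataset(checkpoint_paths):
    """Flatten each trained GFlowNet checkpoint into one training vector.
    With 200 models per structure and 4 structures, this yields 800
    samples overall (stacking assumes checkpoints share an architecture,
    so in practice one tensor is built per structure)."""
    vecs = []
    for path in checkpoint_paths:
        state_dict = torch.load(path, map_location="cpu")
        vecs.append(torch.cat([p.flatten() for p in state_dict.values()]))
    return torch.stack(vecs)   # shape: (num_models, num_parameters)
```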
Q3: Inclusion of real-world datasets
While this work includes results on molecular generation tasks (Section 4.4), we aim to expand experiments in future iterations to cover more real-world applications such as RNA sequence generation and combinatorial optimization.
We appreciate your valuable feedback and are happy to address any further questions. Your support helps improve our work, and we thank you again for your time and effort.
Best regards,
Authors
This paper presents a novel idea to generate GFlowNets, which are deep learning models that hierarchically generate sequential actions in parameter space. They use an autoencoder to create a latent mapping of parameters and use a conditional diffusion model in the latent space to generate a proper latent representation of parameters. This method enables generalization over the parameter space, allowing us to obtain new GFlowNets without training.
Strengths
- Very novel idea (more like crazy idea).
Weaknesses
There are several concerns:
- Scalability may be limited.
- The motivation is unclear—why is this approach necessary?
- The empirical results do not reflect real-world performance.
Questions
- Can this method be used to measure uncertainty from a Bayesian perspective? I'm asking because this work seems to be connected with AutoML and hypernetworks.
- Can you provide a clearer motivation for why we need this?
- Is this method scalable to large-scale tasks that require very complex parameterizations (e.g., LLMs, text-to-image models) within GFlowNets?
Dear Reviewer Wwiu, Thank you again for the time and effort you devoted to our work. Given the limited time available, and to save the reviewer's time, we summarize our responses here.
W1: Scalability may be limited
Thank you for your comment. We address scalability in the following aspects:
- Scalability in Current Design: GenFlowNet’s scalability is demonstrated through diverse tasks, such as generalizing across unseen GFlowNet structures (Section 3.3, hypergrid tasks) and tasks (e.g., molecule generation in Section 3.4). By adopting a diffusion-based parameter generation approach, GenFlowNet circumvents the iterative training process of conventional GFlowNets, which scales poorly for large datasets. Empirical results in Table 9 further validate its efficiency in adapting to increasing trajectory lengths and state dimensions.
- Future Directions for Scalability: We acknowledge that further enhancements in scalability are necessary for larger datasets and more complex structures. Future work will incorporate distributed and parallel inference mechanisms into the framework to support real-world, large-scale applications.
W2: Necessity of our method
Thank you for raising this point. The necessity of GenFlowNet is underpinned by the challenges of existing GFlowNet training paradigms and the advantages provided by our approach:
- Challenges in Traditional GFlowNet Training:
- High Computational Costs: Iterative training is resource-intensive, especially for tasks with long trajectories or high-dimensional states (Bengio et al., 2021; Malkin et al., 2022).
- Scalability Issues: Training becomes prohibitive with larger state spaces or complex reward functions (Deleu et al., 2022).
- Benefits of GenFlowNet:
- Efficient Parameter Generation: GenFlowNet generates ready-to-use parameters, reducing computational overhead (Section 3.2, Figure 4).
- Adaptability Across Tasks: Supports rapid adaptation to diverse GFlowNet structures and tasks (e.g., hypergrid tasks and molecule generation).
- Time Efficiency: Tables 1, 3, and 9 highlight significant reductions in computational time without compromising accuracy, making it ideal for tasks requiring rapid iteration.
GenFlowNet bridges the gap between theoretical advancements in GFlowNet sampling and practical application by offering a computationally efficient, training-free alternative. This is especially valuable in fields like molecular discovery and combinatorial optimization, where rapid prototyping is critical.
W3: Lack of real-world performance
Thank you for your concern. While our experiments are primarily synthetic, we address real-world applicability in two ways:
- Simulating Real-World Conditions: The hypergrid task is parameterized to introduce diverse state dimensions and trajectory lengths (Section 3.3), mimicking real-world complexities.
- Application to Real-World Data: The molecule generation task leverages real-world chemical datasets, demonstrating the framework’s potential in practical domains.
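For readers unfamiliar with the benchmark, the hypergrid reward typically follows the formulation of Bengio et al. (2021), with sparse high-reward modes near the grid corners; a minimal sketch with illustrative constants R0, R1, R2:

```python
import numpy as np

def hypergrid_reward(x, H, R0=1e-3, R1=0.5, R2=2.0):
    """Standard hypergrid reward: modes sit near the corners of a
    D-dimensional grid of side H; longer trajectories arise as D and H
    grow."""
    z = np.abs(np.asarray(x, dtype=float) / (H - 1) - 0.5)
    r = R0
    r += R1 * float(np.all((0.25 < z) & (z <= 0.5)))
    r += R2 * float(np.all((0.3 < z) & (z < 0.4)))
    return r

print(hypergrid_reward([7, 0], H=8))   # corner state in a 2D grid: 0.501
```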
Q1: Measure uncertainty from a Bayesian perspective?
Thank you for this insightful question. GenFlowNet shares similarities with hypernetworks, which output model weights conditional on input features. This structured generative process facilitates uncertainty quantification in domains such as AutoML and model exploration.
- Bayesian Context: GenFlowNet introduces stochasticity in its parameter generation process, enabling uncertainty measurement in both sampling distributions and model outputs. This is particularly relevant for tasks where model uncertainty is critical.
- Practical Utility: This stochastic modeling approach aligns with Bayesian principles, providing a robust framework for uncertainty estimation in complex tasks.
Q2: Clearer motivation
Thank you for the suggestion. The motivation for GenFlowNet lies in addressing critical inefficiencies of traditional GFlowNets:
- Reducing Training Overhead: Eliminates iterative optimization, enabling deployment in time-sensitive tasks.
- Expanding Applications: Facilitates GFlowNet use in domains like AutoML and molecular discovery, which require rapid prototyping.
- Scalability and Efficiency: Retains accuracy in high-dimensional tasks while significantly reducing computational costs (Section 3.3, Figure 4).
Q3: Applicability to LLMs or text-to-image tasks?
While GenFlowNet demonstrates robustness for complex GFlowNet structures, extending its scalability to large-scale models like LLMs or text-to-image tasks may require additional adaptations. This remains an exciting direction for future research.
We sincerely appreciate the reviewers’ valuable feedback and will continue refining our work based on these insights. Thank you for your time and consideration.
Thank you for the responses.
However, I still have concerns regarding the practicality and scalability of the proposed method, even after reading them. I don't really understand why directly generating in parameter space is more efficient than traditional training methods, which seems very counterintuitive to me. Additionally, experiments conducted on grid worlds and small molecules are not representative of real-world applications. While I acknowledge that many early GFN research benchmarks used these settings, the idea of directly generating in parameter space should be further validated on larger-scale tasks. This includes applications like LLM reasoning [1] and diffusion model fine-tuning [2], where GFN has demonstrated practical utility.
Therefore, I maintain my decision to reject.
[1] Hu, Edward J., et al. "Amortizing intractable inference in large language models." arXiv preprint arXiv:2310.04363 (2023).
[2] Venkatraman, Siddarth, et al. "Amortizing intractable inference in diffusion models for vision, language, and control." arXiv preprint arXiv:2405.20971 (2024).
Dear Reviewer Wwiu,
Thank you for your thoughtful feedback and for raising important concerns about the practicality and scalability of our proposed method. We appreciate the opportunity to address these points and clarify our contributions.
The Efficiency of Parameter-Space Generation vs. Traditional Training:
While traditional training methods require optimizing models from scratch for each task or structure, our method leverages a training-free paradigm that directly generates GFlowNet parameters tailored to downstream tasks. This approach significantly reduces computational costs, particularly when generalizing to unseen tasks or structures, as it bypasses the resource-intensive iterative optimization steps inherent in traditional methods. The parameter generation process is particularly advantageous in scenarios with limited computational resources or where rapid adaptation is required.
Scalability and Real-World Applications:
We agree that additional experiments on larger-scale, real-world tasks (e.g., LLM reasoning or diffusion model fine-tuning) would further validate our method's utility. While our current evaluation focuses on benchmarks like grid worlds and molecular design—consistent with early GFlowNet research—we view these as essential stepping stones to demonstrating the method's fundamental capabilities. Expanding to more complex applications is a priority for future work, and we thank you for highlighting this direction.
Broader Impacts and Practical Utility:
The proposed method offers a foundation for improving the scalability and efficiency of GFlowNet-based approaches. By enabling rapid parameter generation without additional training, we aim to empower the broader adoption of GFlowNets in diverse domains, including those you mentioned.
Thank you again for your valuable suggestions. We will incorporate these insights into the next phase of our work and continue to refine and extend our methodology. Your feedback is instrumental in shaping the future direction of this research.
Sincerely,
Authors
The authors propose a method to generate GFlowNets based on previously trained policy. Their method (GenFlowNet) condenses policy parameters using an autoencoder. Then, it employs a latent conditional diffusion process in the latent space to create new policy parameters conditioned on an encoding of the target architecture.
Strengths
- To the best of my knowledge, the first work on developing generalizable initializations of parameters for GFlowNets.
Weaknesses
- Experiments lack proper description. For instance, are different rewards used to train the autoencoder in the hypergrid task? Or do you fix the same reward parameters for all GFlowNets comprising the training set? Is the aim solely to create new architectures for the same reward that has been learned before? I have several questions regarding the experimental setup below. If the method cannot generalize to unseen rewards, I don't see how it can be useful.
- While the hypergrid task and molecule generation are challenging ones, the experimental suite is rather slim compared to other recent works in the GFlowNet literature.
- There are limitations that the authors do not properly address. For instance, how does GenFlowNet behave for varying rewards? How does it fare when the forward policies are not MLPs?
- No error bars or standard deviations.
Questions
- What is the unit for "Time usage" in Table 1? Seconds, hours?
- In line 307, the authors highlight the "superior performance in sampling distribution accuracy". This seems like an overstatement, given the very small gaps in Table 1 and the lack of uncertainty measurements.
- The authors state GenFlowNets enable the generation of accurate GFlowNets without training. However, Figure 4 makes it look like the parameters generated by GenFlowNets are used as initialization. Is this correct? Please elaborate.
- It seems odd that a training-free GFlowNet would perform better than a trained one (assuming the latter is properly trained). Could you share a rationale for this?
- For Section 3.3, are the GFlowNets drawn from the GenFlowNet trained for the hypergrid task? Please provide more details.
Dear Reviewer iPE3, Thank you again for the time and effort you devoted to our work. Given the limited time available, and to save the reviewer's time, we summarize our responses here.
Weaknesses
W1: Fixed reward functions in training GenFlowNet
We appreciate this insightful comment. In the current paper, we use a fixed set of reward parameters across all GFlowNets to evaluate the consistency and efficiency of parameter generation. This design isolates the evaluation of the parameter generation component by eliminating noise from variable reward functions. However, we agree that generalization to unseen rewards is critical.
To address this, we will include an experiment in the revised manuscript where the auto-encoder is trained on varying reward functions to further validate the flexibility and generalizability of our method.
W2: Slim experimental suite compared to recent GFlowNet works
Thank you for highlighting this. Our experimental suite prioritizes a proof of concept by focusing on two challenging and diverse tasks:
- Structured Synthetic Data: Hypergrid tasks are valuable for testing trajectory balance and parameter optimization under controlled conditions.
- Real-World Applicability: Molecule generation tasks illustrate GenFlowNet's relevance for real-world applications, including drug discovery.
We plan to expand this suite to include additional tasks such as:
- Protein structure prediction
- Combinatorial optimization
- Large-scale multi-agent simulations
These extensions will demonstrate the scalability and versatility of GenFlowNet.
W3: Varying rewards and forward policies
Thank you for your suggestion.
Varying Rewards: Our experiments already evaluate varying rewards across tasks. However, exploring varying reward functions within the same task is beyond the current scope. We will address this in future work.
Non-MLP Policies: While we focus on MLP-based policies due to their prevalence in GFlowNet literature, we acknowledge the importance of exploring non-MLP architectures. Future work will consider convolutional and graph-based policy models to extend GenFlowNet's applicability to spatial and graph-structured tasks.
W4: Lack of error bars or standard deviation
Thank you for pointing this out. We conducted evaluations with 10 repetitions and reported the averages in the manuscript. Below are additional results with best, average, and median performance for the hypergrid task:
| Structure | JS Divergence (best/avg/median) | KL Divergence (best/avg/median) | Empirical L1 Loss (best/avg/median) |
|---|---|---|---|
| Structure 1 | 0.674/0.675/0.677 | 7.275/7.276/7.275 | 3.097e-05/3.099e-05/3.099e-05 |
| Structure 2 | 0.685/0.685/0.686 | 7.942/7.945/7.943 | 5.803e-06/5.805e-05/5.804e-05 |
| Structure 3 | 0.641/0.644/0.643 | 10.421/10.422/10.422 | 0.001/0.001/0.001 |
| Structure 4 | 0.636/0.637/0.637 | 9.463/9.467/9.466 | 3.000e-04/3.000e-04/3.000e-04 |
Questions
Q1: What is the unit for "time usage"?
The "Time usage" in Table 1 is measured in seconds. This clarification has been added to the updated manuscript.
Q2: Highlighting "superior performance" and uncertainty quantification
We acknowledge that the claim of "superior performance" is primarily based on time usage reduction, as shown in Table 1. While improvements in sampling accuracy (e.g., KL divergence and L1 loss) are modest, the significant efficiency gains in computational time justify emphasizing this aspect.
Uncertainty quantification is reflected in the best, average, and median results provided above. Additional results are included in the supplementary material (lines 855–857).
Q3: Initialization in experiments
In Figure 4, the parameters generated by GenFlowNet serve as initializations for GFlowNet models without fine-tuning. This training-free approach generates high-quality parameters, reducing the need for iterative training and enabling faster deployment.
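"Without fine-tuning" here means the loaded policy is rolled out directly; a minimal trajectory-sampling sketch for the hypergrid, with hypothetical names and a policy assumed to output D+1 action logits (D coordinate increments plus a stop action):

```python
import torch

@torch.no_grad()
def sample_trajectory(policy, D=4, H=8):
    """Roll out a generated-parameter policy with no training step."""
    s = torch.zeros(D)
    traj = [s.clone()]
    while True:
        logits = policy(s.unsqueeze(0)).squeeze(0)   # D+1 action logits
        logits[:D][s >= H - 1] = -float("inf")       # mask illegal moves
        a = torch.distributions.Categorical(logits=logits).sample().item()
        if a == D:                                   # stop action chosen
            return traj
        s[a] += 1.0
        traj.append(s.clone())
```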
Q4: Why does the training-free method outperform trained ones?
The training-free method benefits from the diverse and generalizable training dataset, which encompasses a wide range of GFlowNet structures. This enables GenFlowNet to learn representations that generalize well to unseen tasks. As a result, GenFlowNet-initialized models often start with better parameters, reducing the need for extensive optimization and achieving competitive or superior performance.
Q5: Details about Section 3.3 (unknown structures in hypergrid tasks)
In Section 3.3, GenFlowNet generates parameters for previously unseen structures in the hypergrid task. This demonstrates its ability to generalize across tasks and structures not encountered during training, showcasing its adaptability to diverse applications.
Dear ACs and reviewers,
We sincerely appreciate the time and effort provided by all reviewers and ACs. In particular, we are encouraged that Reviewer iPE3 finds ours to be "the first work on developing generalizable initializations of parameters for GFlowNets," that Reviewer Wwiu considers the proposed method "a very novel idea," that Reviewer JPtD notes that, if proven effective in real scenarios, the method "could help improve training in situations where GFlowNet struggles," and that Reviewer h5Ma acknowledges that the generated parameters "can achieve similar or superior performance" compared to those obtained from conventional training of GFlowNets.
We addressed each of the reviewers' comments individually, and we will continue to refine our work based on the valuable feedback provided.
Thanks,
Authors of submission 3398
We will improve it.