We appreciate your thoughtful feedback. Hopefully, the answers below sufficiently address your concerns. Please let us know if something else needs further clarification or if we missed something.

Weaknesses

Critical missing details for phylogenetic inference. What are the states? What are the actions?

A: In the phylogeny experiments, a state is represented as a forest. Initially, each leaf belongs to a different singleton tree. An action consists of picking two trees and joining their roots to a newly added node. The generative process finishes when all nodes are connected in a single tree. We acknowledge that this description was missing from the manuscript; we have added it to Appendix C.1, along with an illustrative figure.
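
For concreteness, the state and action space described above can be sketched in a few lines of Python. This is only a minimal illustration: the representation (leaves as strings, internal nodes as frozensets of their two children) and all function names are assumptions made for this example, not the code used in our experiments.

```python
from itertools import combinations


def initial_state(leaves):
    """Start with a forest in which every leaf is its own singleton tree."""
    return frozenset(leaves)


def actions(state):
    """An action picks two distinct trees of the current forest."""
    return list(combinations(state, 2))


def step(state, pair):
    """Join the roots of the two chosen trees under a newly added node."""
    left, right = pair
    new_tree = frozenset((left, right))  # unordered pair of children
    return (state - {left, right}) | {new_tree}


def is_terminal(state):
    """Generation stops once the forest has collapsed into a single tree."""
    return len(state) == 1


def leaves_of(tree):
    """Collect the leaf labels below a (sub)tree."""
    if isinstance(tree, str):
        return {tree}
    return set().union(*(leaves_of(child) for child in tree))
```

Under this sketch, a forest with k trees admits k(k-1)/2 actions, and a terminal state over n leaves is always reached after exactly n-1 actions.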

It should be noted that an alternative approach to the problem in the Bayesian posterior setting is to train a single GFlowNet with a stochastic reward, as done in [Deleu et al., UAI 2022] and [Deleu et al., NeurIPS 2023]

A: Thanks for pointing this out. We are unsure how the two references relate to stochastic rewards, since both focus on causal structure learning. If they were not a typo, could you elaborate?

Nonetheless, we could use Zhang et al.'s (ICML 2023) scheme for training GFlowNets in environments with stochastic rewards drawn from the distribution

in which is a categorical distribution over the clients . By assigning a weight to the th client's reward, , Proposition 1 of Zhang et al. ensures that the trained GFlowNet would sample an object with probability proportional to

The downside is that this scheme requires many communication steps between the clients and the server. This is precisely the bottleneck FC-GFlowNets are designed to avoid: they require only a single communication step between the clients and the server. We have added a short version of this discussion to the related work (Appendix D).

However, there are still comparisons to be made, such as using different training objectives for the client models (TB or CB).

A: Thanks for the suggestion. In principle, FC-GFlowNets are agnostic to how the local models are trained, as long as each client provides forward and backward policies. We have now included Table 2 (below) in Appendix C, comparing the performance of FC-GFlowNets when clients are trained with TB vs. CB. The two settings perform almost identically: the L₁ errors differ by at most 0.001 and the Top-800 scores coincide.

| Method | Grid World: L₁ ↓ | Grid World: Top-800 ↑ | Multisets: L₁ ↓ | Multisets: Top-800 ↑ | Sequence: L₁ ↓ | Sequence: Top-800 ↑ |
|---|---|---|---|---|---|---|
| FC-GFlowNet (CB) | 0.038 (± 0.016) | -6.355 (± 0.000) | 0.130 (± 0.004) | 27.422 (± 0.000) | 0.005 (± 0.002) | -1.535 (± 0.000) |
| FC-GFlowNet (TB) | 0.039 (± 0.006) | -6.355 (± 0.000) | 0.131 (± 0.018) | 27.422 (± 0.000) | 0.006 (± 0.005) | -1.535 (± 0.000) |

The "contrastive loss" is claimed as original, but in fact it is not. I suggest that the authors revise the discussion on this in section 3.3 and in the claimed contributions

A: Thank you again for pointing out the relationship between the contrastive loss (CL) and the log-partition variance loss (VL) of Zhang et al. (ICLR 2023), which we had overlooked. Indeed, CL equals twice VL in expectation. As you pointed out, however, CL and VL use different estimators of the variance (up to a positive multiplicative constant). Additionally, our balance condition (CB) is itself novel and essential to deriving the federated loss (Corollary 1), which is the basis of our method. We have updated the introduction, section 3.2 (which introduces the CB), and related works (Appendix D) to acknowledge this work.
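
The factor of two follows from the identity E[(X - X')²] = 2 Var(X) for independent, identically distributed X and X'. The short check below illustrates it under simplifying assumptions: we treat CL as the expected squared difference of a per-trajectory score between two independently sampled trajectories, and VL as the variance of that same score. The scores and probabilities are made-up toy values, and the function names are hypothetical.

```python
from itertools import product


def contrastive_loss(scores, probs):
    """E[(s(tau) - s(tau'))^2] for two independently sampled trajectories."""
    pairs = list(zip(scores, probs))
    return sum(p * q * (s - t) ** 2 for (s, p), (t, q) in product(pairs, repeat=2))


def variance_loss(scores, probs):
    """Var[s(tau)] under the same trajectory distribution."""
    mean = sum(p * s for s, p in zip(scores, probs))
    return sum(p * (s - mean) ** 2 for s, p in zip(scores, probs))


# Toy per-trajectory scores and sampling probabilities (illustrative values only).
scores = [-1.5, 0.2, 0.9, 2.4]
probs = [0.1, 0.4, 0.3, 0.2]

cl = contrastive_loss(scores, probs)
vl = variance_loss(scores, probs)
assert abs(cl - 2 * vl) < 1e-12  # CL equals twice VL in expectation
```

The check covers only the population quantities; in practice the two losses are computed from different finite-sample estimators, which is where they differ.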