Beyond Directed Acyclic Computation Graph with Cyclic Neural Network
Abstract
Reviews and Discussion
The authors propose a biologically plausible neural architecture referred to as the cyclic NN. The defining characteristics of the cyclic NN are: (1) each neuron is parametrised as a linear layer, i.e. N to M rather than N to 1 mapping, (2) each (computational) neuron is trained locally, using the forward-forward algorithm (backpropagation across layers does not take place), (3) neuronal information is accumulated by a parameterised “readout” layer to allow for downstream tasks such as classification. As a result of the proposed architecture, the connections can be cyclic - i.e., directed acyclic graph (DAG) structure typical of NNs is not enforced.
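As a rough illustration of this setup, a minimal sketch of a locally trained computational neuron and a readout layer is given below. This is a reading aid, not the authors' implementation; the goodness-based loss, the `threshold` value, and all variable names are assumptions borrowed from Hinton's forward-forward formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ComputationalNeuron(nn.Module):
    """One 'neuron' = a full linear layer (N -> M outputs), trained locally."""
    def __init__(self, in_dim, out_dim, threshold=2.0, lr=1e-3):
        super().__init__()
        self.fc = nn.Linear(in_dim, out_dim)
        self.threshold = threshold  # goodness threshold (assumption, as in Hinton 2022)
        self.opt = torch.optim.Adam(self.parameters(), lr=lr)

    def forward(self, x):
        # Length-normalise the input so only its direction carries information,
        # as in the forward-forward algorithm.
        x = x / (x.norm(dim=1, keepdim=True) + 1e-8)
        return F.relu(self.fc(x))

    def local_step(self, x_pos, x_neg):
        """Forward-forward style local update: no gradient crosses neuron boundaries."""
        x_pos, x_neg = x_pos.detach(), x_neg.detach()   # enforce locality
        g_pos = self.forward(x_pos).pow(2).sum(dim=1)   # goodness of positive data
        g_neg = self.forward(x_neg).pow(2).sum(dim=1)   # goodness of negative data
        loss = (F.softplus(self.threshold - g_pos) +
                F.softplus(g_neg - self.threshold)).mean()
        self.opt.zero_grad(); loss.backward(); self.opt.step()
        return loss.item()

# A separately trained readout layer accumulates all neuron outputs, e.g. for
# 3 neurons with 128-dimensional outputs and a 10-class task:
readout = nn.Linear(3 * 128, 10)
```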
Strengths
Originality: The paper proposes a graph structure over MLPs, which is simple yet effective. The proposed architecture elegantly elevates layer-based neural models to a structure that inherently includes both recurrency and ensembling. As such, a higher degree of biological plausibility is achieved.
Quality and clarity: The paper is concise and clear, with very minor typos and grammatical mistakes. The ideas are elegant and simple, with potentially significant impact on the field. The code is made available.
Significance: The shift from DAGs to recurrent architectures is imminent; as such, the paper is quite timely. The significance is diminished by the fact that the authors do not acknowledge any of the work on recurrent neural networks in their study. If the proposed model can be properly contextualised, I would be willing to accept the proposed method as more significant.
Weaknesses
The closest existing NN architecture that is not a DAG is a recurrent NN (RNN). A plethora of research exists on RNNs, yet the authors do not mention this paradigm in the paper. How are the cyclic NNs different from RNNs? A critical discussion of this point is necessary. Similarly, a more expressive neuron can be compared to a memory block of a long short-term memory (LSTM) network. How does the proposed computational neuron differ from a gated neuron? Section 3.6 discusses how the cycle can be unrolled and interpreted as arbitrary depth - which is exactly the argument for the adoption of recurrent architectures. The similarity is quite striking and cannot be ignored.
The authors compare their proposed cyclic NNs to traditional DAG architectures. I think a comparison to other biologically plausible architectures would be more applicable, e.g. liquid neural networks. Where do the cyclic NNs fit in the context of existing biologically plausible NNs? Section 5.1 briefly lists existing localised training algorithms, but does not properly put the proposed method in the context.
Another lacking comparison to existing methods is that to ensembling. Each neuron in the cyclic NN is essentially a one-layer MLP. Each MLP learns to differentiate between patterns. Then, the decisions of multiple MLPs are accumulated by the readout layer to make the final prediction. Isn’t this a form of weighted ensembling of MLPs?
Questions
How does the proposed method differ from the multitude of recurrent architectures?
It is not clear how the parameters of the computational neurons and the readout layer are optimised. Is gradient descent employed? Please explain and/or provide the update equation for the weights.
Table 1 lists standard deviations. How many runs were used per each setup?
Page 2, line 79: “without waiting gradients” -> “without waiting for gradients”
Page 7, line 305: “Theme 4” - do you mean Section?
Page 7, line 318: “there is a neural network consists…” - grammatically incorrect, please re-write.
Page 7, line 324: origional -> original (typo)
We would like to express our sincere thanks for the reviewer’s constructive feedback. Based on your review, we have made corresponding changes within the paper and our rebuttals are provided below:
Rebuttal to Weakness 1 and Question 1:
Concern:
The distinction between RNNs and Cyclic NN is unclear.
Response:
To highlight the distinctive characteristics of Cyclic NN, we have added a new Section A.4 in the appendix, along with Figure 5, which provides a clear illustration comparing Cyclic NN with RNNs.

Comparison with RNNs: Recurrent structures such as RNNs, LSTMs, and GRUs focus on the recurrence of the same computation block. In contrast, our Cyclic NN emphasizes cyclic communication between different computation blocks, as highlighted in red in Figure 5. The presence of cyclic structures enables us to build neural networks with any graph structure beyond directed acyclic graphs (DAGs): while recurrent structures can be seen as self-loops over a single computation block, Cyclic NN allows far more flexible network structures. We propose the computational neuron to increase the capacity of each computation block; as a more expressive design, an LSTM cell or a gated neuron can also serve as the computational neuron in Cyclic NN.
Rebuttal to Weakness 2:
To better contextualize our proposed Cyclic NN within the current literature, we have replaced the related-work subsection on "Graph Generators" with one on "Artificial Neural Networks", which better positions Cyclic NN. The newly added related work is given below:
"Artificial neural networks (ANNs) have evolved through various paradigms, each suited to specific tasks and data structures. Feedforward neural networks (MLPs, CNNs, Transformers, etc.) form the foundational class of ANNs. These models are characterized by their layer-by-layer processing, making them effective for structured data tasks. Recurrent neural networks (RNNs) and their variants, such as Long Short-Term Memory and Gated Recurrent Units, introduced recurrent loops, enabling temporal modeling for sequential data. Graph Neural Networks (Graph Convolutional Networks, Graph Attention Networks, Graph Isomorphism Networks, etc.) extend neural computation to graph-structured data. While GNNs support message passing between nodes, they are typically constrained by acyclic computational graphs. Recently, there have also been new ANN designs inspired by biological nervous systems. Liquid neural networks adapt dynamically to changing inputs, exhibiting flexible, real-time computation inspired by biological intelligence. Spiking Neural Networks mimic the communication pattern of biological neurons with discrete spike events instead of continuous activations.
Cyclic NN is the first to focus on topological similarity with biological neural networks by introducing cyclic structures within ANNs. It represents a transformative departure from these existing paradigms by removing the Directed Acyclic Graph (DAG) constraint. Inspired by the flexible and dynamic nature of biological neural systems, Cyclic NN introduces cyclic connections between neurons, enabling richer information flow. This design achieves enhanced expressiveness, biological plausibility, and flexibility."
Rebuttal to Weakness 3:
To answer this question, we conducted ensembling experiments with MLPs as an additional baseline. The experimental results are listed as follows:
| Train | Graph | MNIST | NewsGroup | IMDB |
|---|---|---|---|---|
| MLP-Ensemble | - | 1.91±0.21 | 45.35±0.84 | 17.36±0.23 |
| BP | Chain* | 1.77±0.16 | 42.11±0.92 | 17.16±0.19 |
| FF | Chain | 1.83±0.20 | 43.88±0.28 | 18.75±0.92 |
| BP | Chain | 1.74±0.11 | 38.85±0.42 | 17.27±0.13 |
| FF | Cycle | 1.80±0.14 | 43.54±0.41 | 18.97±0.49 |
| FF | WSGraph | 1.70±0.17 | 38.28±0.13 | 17.93±0.28 |
| FF | BAGraph | 1.64±0.08 | 38.41±0.14 | 18.20±0.67 |
| FF | Complete | 1.54±0.05 | 38.266±0.06 | 17.58±0.20 |
We can observe that the MLP-Ensemble does not perform well on any of the datasets. Although the readout layer in Cyclic NN can be viewed as ensembling information from multiple MLPs, the core design of Cyclic NN is enabling cyclic structures among MLPs, which distinguishes it from ensembling methods. As the table shows, ensembling MLPs alone does not produce good results, but we obtain the best performance by building cyclic structures among the MLPs and ensembling the information with the readout layer. This also validates the importance of the proposed cyclic structures.
Rebuttal to Question 2:
All parameters within computational neurons and the readout layer are optimized using gradient descent. Based on your suggestion, we have added the update equation in Section 3.4.1 and Section 3.4.2 within our paper to make this point clearer.
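For concreteness, the kind of local update meant here can be sketched as follows, assuming a Hinton-style goodness loss for each computational neuron and a cross-entropy loss for the readout; the notation is illustrative and the exact symbols in the revised Sections 3.4.1 and 3.4.2 may differ.

```latex
% Sketch of the per-neuron forward-forward loss (goodness = squared activity,
% \theta a threshold) and the gradient-descent updates; notation is illustrative.
\mathcal{L}_{\mathrm{FF}}(W) =
      \log\!\bigl(1 + e^{\,\theta - g_{\mathrm{pos}}(W)}\bigr)
    + \log\!\bigl(1 + e^{\,g_{\mathrm{neg}}(W) - \theta}\bigr),
\qquad
g(W) = \bigl\lVert \mathrm{ReLU}(W x) \bigr\rVert_2^2

% Each computational neuron is updated locally; the readout is trained on its own loss:
W \leftarrow W - \eta\, \nabla_{W} \mathcal{L}_{\mathrm{FF}}(W),
\qquad
W_{r} \leftarrow W_{r} - \eta\, \nabla_{W_{r}} \mathcal{L}_{\mathrm{CE}} .
```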
Rebuttal to Question 3:
As stated in Appendix A.2, we report the mean and standard deviation over 20 runs with different random seeds for all experiments.
We have also made the corresponding changes and proofread the paper again to eliminate the typos. We would like to express our thanks for your careful review and detailed feedback. Let us know if there are any other concerns regarding Cyclic NN :).
I appreciate the comparison to ensembling MLPs. I still feel that the link to RNNs should come out stronger and earlier in the main body of the paper rather than being pushed out to the appendices.
This paper proposes a new design of artificial neural networks. The novelty lies in the fact that they don't have a directed acyclic graph (DAG) structure. This is a fundamental innovation because the training of neural networks nowadays depends on the DAG structure so that the gradient of the global loss function can be computed. To support the new architecture, the authors follow the forward-forward algorithm proposed by Hinton (2022), where local losses are used to train individual neurons and the final classifier. The authors demonstrate experiment results, which for the first time suggest that forward-forward training can outperform standard back-propagation training.
Strengths
- This paper proposes a ground-breaking, innovative idea to build neural networks.
- The idea is backed by attractive experimental results.
- The proposed neural network, Cyclic NN, is a step forward toward a drastically different paradigm of machine learning models that are more biologically sound.
Weaknesses
Some technical details are unclear. See the following "Questions" section.
Additionally, it would be informative to experimentally compare the proposed architecture with GNNs due to their similarities. Every node in the GNN takes the same input feature and the GNN uses a readout layer similar to the readout in Cyclic NN. In this case, the main difference between a GNN and a Cyclic NN is that the GNN uses the same matrix for every node in a layer and uses different matrices for different layers, while the Cyclic NN uses different matrices for different nodes. In this regard, GNN is more parameter efficient. Of course, the training method is fundamentally different. Which architecture performs better?
Questions
The main question surrounds Eqn (4), which causes confusion when the reader tries to connect it with the inner while loop of Algorithm 1, Figure 2(b) as a special case, and the discussions about unrolling in Section 3.6.
In Eqn (4), the neuron input depends on the outputs of the adjacent neurons. When t=0, the outputs of the adjacent neurons are not yet known. So how is line 6 of Algorithm 1 computed?
Do the authors ignore the outputs of the adjacent neurons when t=0?
If so, we further look into the for loop of Algorithm 1. This loop iterates over the neurons. For a later neuron, if it is adjacent to an earlier neuron, would the input of the later neuron use the output of the earlier neuron from the last round (before the for loop) or from the current round (inside the for loop)?
In either answer to the above question, it appears that every neuron has one parameter matrix (as opposed to a few). The inner while loop of Algorithm 1 updates this parameter T times. Is this understanding correct?
If correct, then the steps of the inner while loop of Algorithm 1 do not propagate information across hops of the graph. Rather, information is propagated at most one hop away, no matter how big T is. The inner while loop is more like running T optimization steps rather than propagating information in a T-layer GNN.
If the above understanding is correct, then the unrolling in Figure 3 does not make sense. Consider the two blocks on the right of Figure 3. They are the same neuron at different training stages: the first takes the parameter value obtained after line 8 of Algorithm 1 at one propagation step, while the second takes the value at a later step. This is very different from unrolling an RNN, where the RNN cell uses the same parameter values at different times.
If the above understanding is correct, then the discussion of the expressive power in Section 3.6 is dubious, because the neural network does not have a depth like in a usual neural network.
Now let us get back to Eqn (4). The neuron input includes the input representation, but this does not seem to be the case in Figure 2(b). If one considers the architecture in Figure 2(b) a special case of Cyclic NN, then following the convention in Figure 2(c) and (d), the black arrows that chain the neurons should be red arrows instead. Moreover, the input should have a black arrow pointing to every neuron.
Do the authors really mean Figure 2(b) to be in the current form, or in the edited form elaborated above? For FF-Chain, do the authors mean the current Figure 2(b) or the edited form? What about BP-Chain and BP-Chain*?
We would first like to thank the reviewer for acknowledging our novelty and contribution and for foreseeing the future impact of our proposed Cyclic NN. Our rebuttals to your weaknesses and questions are listed below. We hope we are able to address your concerns about this paper.
Rebuttal to Weakness:
Concern:
The differences between Cyclic NN and graph neural networks are not clear.
Response:
To highlight the distinctions between Cyclic NN and graph neural networks, we have added a new Section A.4 in the appendix, along with Figure 5, which provides a clear illustration comparing Cyclic NN and GNNs.
Comparison with GNNs: In GNNs (such as GCNs, recurrent GNNs, and GATs), the graph is the input to the network, and the aim is to learn representations for each node. Typically, DAG-structured computations are used within the model, like the linear layers in GCNs. GNNs serve as encoders for nodes within graphs, with the graph structure acting as the model's input. In Cyclic NN, however, the input is not constrained to graphs; it can be an image, for example, and the Cyclic NN encodes this input into a representation. Here, the graph structure refers to the encoder itself. Thus, Cyclic NN is fundamentally different from GNNs.
The rebuttal answers my questions. I am happy with it.
Rebuttal to Question:
We would like to provide the rebuttal based on the list of your questions.
Q1: In Eqn (4), the neuron input depends on the outputs of the adjacent neurons. When t=0, the outputs of the adjacent neurons are not yet known. So how is line 6 of Algorithm 1 computed? Do the authors ignore the outputs of the adjacent neurons when t=0?
A1: We initialize every computational neuron's output to 0 when t=0. Thus, we pass a tensor of 0s on line 6 of Algorithm 1 when t=0, which effectively ignores the outputs of adjacent neurons.
Q2: For a later neuron, if it is adjacent to an earlier neuron, would the input of the later neuron use the output of the earlier neuron from the last round (before the for loop) or from the current round (inside the for loop)?
A2: This is a very good question. We also faced this problem when designing the training algorithm. We ultimately adopted the output of the earlier neuron from the last round (before the for loop) for propagation, as we found training to be more stable compared to the other choice. If we used the current-round result, the computational neuron updates would depend on the update order, which we also want to avoid, as there is no reason to pre-define an update order, especially on a cyclic graph structure.
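A small sketch of this synchronous scheme is given below. Variable names are hypothetical; outputs are initialised to zero at t=0 as described in A1, and every neuron reads only last-round outputs, so the result does not depend on the update order within a round.

```python
import torch

def propagate(neurons, adjacency, x, T):
    """Synchronous propagation over a (possibly cyclic) neuron graph.

    neurons:   list of modules, one per computational neuron
    adjacency: adjacency[i] = indices of neurons feeding neuron i
    x:         batch of input representations
    T:         number of propagation (and local training) steps
    """
    out_dims = [n.fc.out_features for n in neurons]          # assumes each neuron has a .fc layer
    outputs = [torch.zeros(x.size(0), d) for d in out_dims]  # all outputs are zero at t = 0
    for t in range(T):
        new_outputs = []
        for i, neuron in enumerate(neurons):
            # Concatenate the raw input with LAST-round outputs of adjacent neurons.
            neigh = [outputs[j] for j in adjacency[i]]
            inp = torch.cat([x] + neigh, dim=1)
            new_outputs.append(neuron(inp))   # a local training step could also go here
        outputs = new_outputs                 # commit all outputs at once
    return torch.cat(outputs, dim=1)          # concatenated features for the readout layer
```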
Q3: The inner while loop of Algorithm 1 updates this parameter T times. Is this understanding correct?
A3: Yes, this understanding is correct. We update each computational neuron T times.
Q4: If correct, then the steps of the inner while loop of Algorithm 1 do not propagate information across hops of the graph. Rather, information is propagated at most one hop away, no matter how big T is. The inner while loop is more like running T optimization steps rather than propagating information in a T-layer GNN.
A4: Here we would like to argue that information is propagated across hops of the graph, similar to the message-passing mechanism in GNNs. In each step, a computational neuron receives information from its neighbors and produces a new output for the next step's propagation. Its parameters are updated by only one training step, which keeps them nearly the same, so its output still carries its neighbors' information. In the next round, its output (carrying the current round's neighbor information) is propagated again to reach farther neighbors. In other words, we run T optimization steps, and at the same time the information is propagated to T-hop neighborhoods.
This also answers the following question: as the information is propagated further, the unrolling in Figure 3 is reasonable, because the propagated output is split into more linear regions by the ReLU activation.
Q5: If the above understanding is correct, then the discussion of the expressive power in Section 3.6 is dubious, because the neural network does not have a depth like in a usual neural network.
A5: We would like to thank the reviewer for pointing out the dubious analysis in Section 3.6. After careful consideration, we have replaced the analysis in Section 3.6 to better match our model, and we provide a new Figure 3 to help understand it. Our revised analysis is as follows (it is best read together with the newly replaced Figure 3):
"To analyze the impact of the proposed cyclic structure on the network's expressiveness, we compare two scenarios: one without cyclic connections and another with cyclic connections, as illustrated in Figure 3(a) and (b), respectively. In the absence of a cyclic structure, as shown in Figure 3(a), the network depth remains fixed, determined solely by the number of layers. However, when a cyclic structure is introduced, as depicted in Figure 3(b), the model depth effectively increases with the propagation step t. Specifically, at t=0, the output of each layer corresponds to a depth of 1, as it directly incorporates information from the input. At t=1, each layer aggregates two types of information: depth-0 information directly from the input and depth-1 information propagated from neighboring computational neurons, resulting in a maximum depth of 2. As t increases, the depth of the information available to each layer grows proportionally, enhancing the network's expressiveness. The cyclic structure increases the model's effective depth through iterative propagation, allowing the network to achieve greater expressiveness without additional parameters."
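In symbols, the point of the revised analysis can be summarised as follows (a sketch; h_v^(t), f_v, and N(v) are illustrative notation, not necessarily the paper's):

```latex
% Effective depth of neuron v's output after t propagation steps,
% where N(v) denotes the in-neighbours of v in the neuron graph:
h_v^{(0)} = f_v(x), \qquad
h_v^{(t)} = f_v\!\bigl(x,\ \{\, h_u^{(t-1)} : u \in N(v) \,\}\bigr)
\;\;\Longrightarrow\;\;
\operatorname{depth}\bigl(h_v^{(t)}\bigr) \le t + 1 .
```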
Q6: Do the authors really mean Figure 2(b) to be in the current form, or in the edited form elaborated above? For FF-Chain, do the authors mean the current Figure 2(b) or the edited form? What about BP-Chain and BP-Chain*?
A6: Figure 2(b) illustrates the model structure in Hinton's paper [1]. FF-Chain uses the current Figure 2(b), which exactly reflects the model structure in [1]. BP-Chain* is illustrated in Figure 2(a): it builds the network layer by layer and trains with a global cross-entropy loss. BP-Chain uses the structure of Figure 2(b), but its gradient is obtained from the cross-entropy loss rather than the forward-forward loss.
[1] Hinton, G. (2022). The forward-forward algorithm: Some preliminary investigations. arXiv preprint arXiv:2212.13345.
Thank you for your constructive feedback, which greatly improves the paper's quality. We hope we have cleared all of your concerns. Let us know if there are any remaining concerns :).
The paper introduces Cyclic Neural Networks (Cyclic NNs), a design paradigm that extends neural network computation beyond sequential, layer-by-layer connections. Inspired by the complex, cyclic connectivity observed in biological neural networks, the authors propose allowing neurons in ANNs to form connections in any graph-like structure, including cycles.
Strengths
- The paper is well-presented with helpful visualization and a relatively clear description of the method and experiment.
- Different computational-graph shapes are experimented with for Cyclic NN, which provides insight into how the graph structure affects the performance of the model.
Weaknesses
- The innovation of the paper is limited. Directed acyclic computation only applies to relatively simple feedforward neural networks. Alternative computational patterns, such as recurrent and graph-shaped ones, fall under their respective categories (recurrent neural networks and graph neural networks). The resulting Cyclic NN may be succinctly captured by a recurrent type of GNN.
- The experimental results and comparisons are limited. The improvement of the approach is only supported in the case of a complete cyclic NN graph. The baseline comparison is limited to a feed-forward network.
Questions
- Related to W1, How is GOMLP different from a recurrent graph neural network with the same computational pattern?
- Since local learning is used to avoid training recurrent connections globally, how stable is the training of the model under different graph configurations?
- What is the full asymptotic complexity of the model? In Section 3.5, only the terms relevant to the shape of the graph are discussed.
We would like to thank the reviewer for providing constructive feedback on our proposed cyclic neural network. To clarify our strengths and make the contribution clearer, we conducted additional experiments to answer the reviewer's questions. Our rebuttals are provided as follows:
Rebuttal to Weakness 1 (W1) and Question 1 (Q1):
Concern:
The differences between our proposed Cyclic Neural Network (Cyclic NN) and recurrent neural networks (RNNs) or graph neural networks (GNNs) are unclear.
Response:
To highlight the distinctive characteristics of Cyclic NN, we have added a new Section A.4 in the appendix, along with Figure 5, which provides a clear illustration comparing Cyclic NN with RNNs and GNNs.
Comparison with RNNs:
Recurrent structures like RNNs, LSTMs, and GRUs focus on the recurrence of the same computation block. In contrast, our Cyclic NN emphasizes cyclic communication between different computation blocks, as highlighted in red in Figure 5. The presence of cyclic structures enables us to build neural networks with any graph structure beyond directed acyclic graphs (DAGs). While recurrent structures can be seen as self-loops over a single computation neuron, Cyclic NN allows for more flexible network structures beyond self-loops.
Comparison with GNNs:
In GNNs (such as GCNs, recurrent GNNs, and GATs), the graph is the input to the network, aiming to learn representations for each node. Typically, DAG-structured computations are used within the model, like the linear layers in GCNs. GNNs serve as encoders for nodes within graphs, with the graph structure acting as the model's input. However, in Cyclic NN, the input is not constrained to graphs; it can be an image, for example, and the Cyclic NN encodes this input into a representation. Here, the graph structure refers to the encoder itself within the Cyclic NN. Therefore, our Cyclic NN is distinct from both RNNs and GNNs (including recurrent GNNs). The newly added Figure 5 provides a clearer illustration of these differences.
Rebuttal to Weakness 2 (W2):
Concern:
The improvement of Cyclic NN might not be generalizable across different graph structures.
Response:
The improvements of Cyclic NN are also observed with the WSGraph and BAGraph structures. As shown in Table 1, the widely used DAG-structured BP-Chain* method achieves an error rate of 1.77 on the MNIST dataset. In comparison, Cyclic NN with WSGraph and BAGraph achieves error rates of 1.70 and 1.64, respectively, both surpassing the current DAG solution. This demonstrates that the improvement of our approach is consistent across different types of Cyclic NN graphs. Our core contribution lies in introducing cyclic structures among different computation blocks; to ensure a fair comparison among all training methods, we adopted the same structure as feed-forward networks. To further validate the effectiveness of Cyclic NN, we added an ensemble method that combines multiple linear layers as a baseline. The experimental results, shown below, indicate that Cyclic NN still performs the best among all methods, which underscores the advantages of incorporating cyclic structures within the model.
| Train | Graph | MNIST | NewsGroup | IMDB |
|---|---|---|---|---|
| MLP-Ensemble | - | 1.91±0.21 | 45.35±0.84 | 17.36±0.23 |
| BP | Chain* | 1.77±0.16 | 42.11±0.92 | 17.16±0.19 |
| FF | Chain | 1.83±0.20 | 43.88±0.28 | 18.75±0.92 |
| BP | Chain | 1.74±0.11 | 38.85±0.42 | 17.27±0.13 |
| FF | Cycle | 1.80±0.14 | 43.54±0.41 | 18.97±0.49 |
| FF | WSGraph | 1.70±0.17 | 38.28±0.13 | 17.93±0.28 |
| FF | BAGraph | 1.64±0.08 | 38.41±0.14 | 18.20±0.67 |
| FF | Complete | 1.54±0.05 | 38.266±0.06 | 17.58±0.20 |
Rebuttal to Question 2 (Q2):
Concern:
Clarity on the stability and effectiveness of localized optimization in training.
Response:
To address this, we have added a new Section A.5 in the appendix, which presents the training curves for different graph structures. We plotted the feed-forward (FF) loss, classifier loss, and error rate changes over training epochs. Observations indicate that for all graph structures and datasets, the decrease in losses and error rates is stable and steady. Localized optimization focuses on optimizing parameters at a local level without propagating updates across layers, helping to mitigate gradient vanishing or exploding issues commonly encountered in global optimization.
Rebuttal to Question 3 (Q3):
Concern:
The need to provide the full asymptotic complexity of the model.
Response:
We appreciate this suggestion. In Section 3.5, we have added a paragraph illustrating the full asymptotic complexity of the proposed GOMLP model: "Consider the example of GOMLP and examine the time complexity of each computation neuron. The maximum complexity for a computation neuron arises when it receives information from all other computation neurons; the total time complexity of GOMLP is obtained by summing this cost over all computation neurons."
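Under standard assumptions (N computational neurons, hidden width d, T propagation steps, and a complete graph in the worst case), the asymptotic cost reads roughly as follows; this is an illustrative reconstruction, not the paper's exact expression.

```latex
% Worst case: each computation neuron receives the d-dimensional outputs of all
% other N-1 neurons, so its linear layer maps an O(N d)-dim input to a d-dim output.
\text{cost per neuron} = O\!\bigl((N d)\cdot d\bigr) = O(N d^{2}),
\qquad
\text{total cost of GOMLP} = O\!\bigl(T \cdot N \cdot N d^{2}\bigr) = O(T N^{2} d^{2}).
```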
We hope that these clarifications address your concerns and highlight the contributions of our work more effectively. Let us know if you have any further questions. We would be very happy to improve our paper based on your suggestions :).
Thanks for addressing my questions. The rebuttal has clarified most of my concerns. I have adjusted the score.
Thank you for your thoughtful feedback and for taking the time to review my paper. I truly appreciate your acknowledgment that the rebuttal addressed most of your concerns.
I noticed that the score associated with your review has not yet been updated. As the review deadline is approaching, I wanted to kindly remind you in case it was overlooked. I understand how busy things can get, and I greatly appreciate your efforts in ensuring the review process runs smoothly.
Please let me know if there is any further clarification or additional information I can provide to assist.
Thank you again for your valuable time and support.
Dear Reviewer ZQMN: As we have addressed most of your concerns, we kindly remind you to adjust your score accordingly, as the deadline is approaching.
As the deadline is approaching today, we would like to kindly remind you to adjust the rating score as indicated in your feedback. Your review and rating are extremely important to us, and we sincerely appreciate the time and effort you have dedicated to evaluating our submission.
Please let us know if there are any issues or further clarifications needed from our side.
Thank you again for your valuable contribution to the review process.
This paper proposes a novel NN framework with an architecture and training paradigm akin to Hinton's 2022 forward-forward approach. The empirical results on classical classification benchmarks look very promising. All reviewers have been moved towards favoring acceptance. Looking at the paper, I am convinced that publishing it in its current form will hurt its impact. It is not super clear what is going on, and one does not have to dig deep to find inaccuracies. For example, Eq. (6) is called a cross entropy. Also, symbols are used in a pretty non-standard way, such as denoting the output of a function with a ReLU activation by p, and so on.
So the content is worth accepting, but the authors need more time to make it accessible to the scientific community.
Additional Comments on Reviewer Discussion
None.
Reject