Dear Reviewer s7zj:

Thank you for your comments. Please find our responses to your concerns below.

Weakness 1: The computational overhead is not well discussed in the paper.

Answer 1: Most Neural Radiance Field works adopt a coarse-to-fine sampling strategy, using two independent network modules to handle the points sampled at the coarse and fine stages, respectively. We reuse these two network branches as the target and online branches so as to introduce as few additional parameters as possible. Therefore, the only additional parameters introduced by our MRVM are the lightweight projector and predictor, which bring negligible additional computational overhead (only +0.2% parameters compared to the baseline model). Please refer to Table 4 for more details. Moreover, since the projector and predictor are discarded at inference time, the inference speed is the same as that of the original baseline methods.
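To make the accounting above concrete, the following is a minimal sketch of how the relative parameter overhead of two small heads can be estimated. It is not our implementation: the helper names, the backbone size, and the layer widths are hypothetical placeholders chosen only for illustration, and the resulting number is not meant to reproduce Table 4.

```python
def mlp_param_count(layer_sizes):
    """Count weights and biases of a fully connected MLP with the given layer widths."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


def relative_overhead(backbone_params, extra_heads):
    """Ratio of the parameters added by the extra heads to the backbone parameters."""
    extra = sum(mlp_param_count(head) for head in extra_heads)
    return extra / backbone_params


# Hypothetical sizes, for illustration only (not the values used in the paper).
BACKBONE_PARAMS = 10_000_000   # assumed parameter count of the coarse + fine branches
PROJECTOR = [64, 64, 64]       # assumed two-layer projector
PREDICTOR = [64, 64, 64]       # assumed two-layer predictor

training_overhead = relative_overhead(BACKBONE_PARAMS, [PROJECTOR, PREDICTOR])
# The projector and predictor are dropped at inference, so no heads are counted.
inference_overhead = relative_overhead(BACKBONE_PARAMS, [])
```

Under these assumed sizes the two heads add well under 1% of the backbone parameters during pretraining, and nothing at inference, which is the qualitative point made above.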

Weakness 2: The author may need to clarify the role of mask-based pretraining more clearly in the paper.

Answer 2: We agree with the reviewer's comment. Thank you for the valuable suggestion; we will clarify this point in our revision.

Question 1: Can the authors provide more details on the computational efficiency of MRVM-NeRF?

Answer 3: Please refer to Answer 1.

Question 2: Can the function take additional inputs besides the latent feature like geometrical information?

Answer 4: Theoretically, the function can take additional geometrical inputs, such as position and view direction, if the predictor network structure is modified. Whether this would bring a greater performance gain is worth exploring in future work. In fact, geometrical information is already introduced at the beginning of the generalizable NeRF framework. In Section 3.1, the input latent embedding already contains position and direction information: the position and view direction are projected into a latent embedding after positional encoding, which is then added to the latent feature to produce the input vector. We apologize for omitting these details and will clarify them in our revision.
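As a rough illustration of how position and direction enter the input embedding, the sketch below encodes both, projects them with a single linear layer, and adds the result to a feature vector. It rests on assumptions: the encoding is a common NeRF-style sinusoidal variant, the two encodings are concatenated before one shared projection, and all function and parameter names are hypothetical. The exact formulation is the one given in Section 3.1.

```python
import math


def positional_encoding(x, num_freqs):
    """Raw coordinates followed by sin/cos of each coordinate at frequencies 2^k."""
    out = list(x)
    for k in range(num_freqs):
        freq = 2.0 ** k
        out.extend(math.sin(freq * v) for v in x)
        out.extend(math.cos(freq * v) for v in x)
    return out


def linear(weights, bias, x):
    """Apply y = W x + b, with W given as a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def build_input(feature, position, direction, weights, bias, num_freqs=4):
    """Add the projected geometric embedding to the feature, element by element."""
    encoded = (positional_encoding(position, num_freqs)
               + positional_encoding(direction, num_freqs))
    embedding = linear(weights, bias, encoded)
    if len(embedding) != len(feature):
        raise ValueError("embedding and feature must have the same length")
    return [f + e for f, e in zip(feature, embedding)]
```

With `num_freqs=4`, each 3D vector is encoded into 27 values, so the hypothetical projection here takes 54 inputs. Because the geometric embedding is added rather than concatenated, the feature dimension is unchanged.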