Q1-1: The novelty and contribution of the proposed binarization is limited as most of the design is quite straightforward. The block residual is something new to me.

A1-1: Thank you for your feedback. We kept the design deliberately simple so that BiDRN is easy to follow. Despite this simplicity, our experiments show that the proposed components are effective and yield large improvements over other BNNs. Because it is built on simple and effective principles, BiDRN can serve as a foundation for binarized HMR.

Q1-2: The designed residual block is quite specific for ResNet. This limits the scope of the proposed binarized dual residual block.

A1-2: Thank you for your feedback. Although our residual block replaces the ResNet blocks in the main experiments, it can be applied to other CNN-based backbones. For example, we have also migrated BiDRB to MobileNet, as shown in Table 6 of the Appendix.

Method	Params (M)	OPs (M)	MPJPE
MobileNetv1 (full-precision)	3.2	583.3	176.5
BNN	0.2	17.9	338.3
BiDRN (Ours)	0.2	17.9	188.3

We binarized MobileNet using BNN and our proposed BiDRN, and evaluated the models on a 2D keypoint prediction task. With the same parameter and OP counts, BiDRN reduces MPJPE from 338.3 (BNN) to 188.3, which highlights the effectiveness and versatility of our approach.
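For convenience, the ratios implied by the table above can be recomputed with a short script. This is only an illustrative sketch: the dictionary layout and the `summarize` helper are hypothetical names introduced here (not part of our released code), and the numbers are copied verbatim from the table.

```python
# Numbers copied from the MobileNet table above. The data layout and the
# helper name `summarize` are illustrative assumptions, not released code.
ROWS = {
    "MobileNetv1 (full-precision)": {"params_m": 3.2, "ops_m": 583.3, "mpjpe": 176.5},
    "BNN": {"params_m": 0.2, "ops_m": 17.9, "mpjpe": 338.3},
    "BiDRN (Ours)": {"params_m": 0.2, "ops_m": 17.9, "mpjpe": 188.3},
}
BASELINE = "MobileNetv1 (full-precision)"


def summarize(rows, baseline):
    """Return compression ratios and the MPJPE gap of each method vs. the baseline."""
    ref = rows[baseline]
    out = {}
    for name, r in rows.items():
        if name == baseline:
            continue
        out[name] = {
            "params_ratio": ref["params_m"] / r["params_m"],  # how many times fewer params
            "ops_ratio": ref["ops_m"] / r["ops_m"],           # how many times fewer OPs
            "mpjpe_gap": r["mpjpe"] - ref["mpjpe"],           # error added by binarization
        }
    return out


if __name__ == "__main__":
    for name, s in summarize(ROWS, BASELINE).items():
        print(f"{name}: {s['params_ratio']:.1f}x fewer params, "
              f"{s['ops_ratio']:.1f}x fewer OPs, MPJPE gap {s['mpjpe_gap']:+.1f}")
```

Both binarized models use about 16x fewer parameters and about 32.6x fewer OPs than the full-precision MobileNetv1. The MPJPE gap to full precision is 161.8 for BNN and 11.8 for BiDRN.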

Q1-3: The experiment is conducted with comparison to other standard binarization methods, which are designed for general networks. This comparison is kind of unfair as the proposed binarization only works for this particular model, or maybe a broader ResNet-like architecture. A comparison to other binarization/quantization methods for networks in 3D human reconstruction or ResNet-like models is needed.

A1-3: Thank you for your suggestion. We add more explanations and experiments below.

The compared BNNs are designed for CNNs in general, and most of them are evaluated only on ResNet in their original experiments.
Our BiDRN is not restricted to ResNet-like architectures; it can be adapted to any CNN backbone. Since the compared BNNs are also proposed for CNNs, we consider the comparison fair.
To the best of our knowledge, BiDRN is the first work on binarized 3D human reconstruction, and no other binarization method targets this area. Nevertheless, following your suggestion, we compare with one more binarization method, BBCU. Although it is not designed for 3D human reconstruction, it has also been applied to ResNet-like models (SRResNet).

Method	Params (M)	OPs (G)	MPVPE (All)	MPVPE (Hand)	MPVPE (Face)
BBCU	22.04	2.68	164.9	89.7	50.3
BiDRN (Ours)	17.22	2.50	118.3	70.8	37.6

The results show that BiDRN outperforms BBCU by a large margin on all three MPVPE metrics while using fewer parameters and operations.
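The relative differences behind this comparison can be checked with the small sketch below. The numbers are copied from the BBCU vs. BiDRN table above; `relative_reduction` and the dictionary keys are hypothetical names used only for this illustration.

```python
# Values copied from the BBCU vs. BiDRN table above. The helper
# `relative_reduction` and the key names are illustrative assumptions.
BBCU = {"params_m": 22.04, "ops_g": 2.68,
        "mpvpe_all": 164.9, "mpvpe_hand": 89.7, "mpvpe_face": 50.3}
BIDRN = {"params_m": 17.22, "ops_g": 2.50,
         "mpvpe_all": 118.3, "mpvpe_hand": 70.8, "mpvpe_face": 37.6}


def relative_reduction(reference, candidate):
    """Percent reduction of each metric of `candidate` relative to `reference`."""
    return {k: 100.0 * (reference[k] - candidate[k]) / reference[k]
            for k in reference}


if __name__ == "__main__":
    for metric, pct in relative_reduction(BBCU, BIDRN).items():
        print(f"{metric}: {pct:.1f}% lower than BBCU")
```

Relative to BBCU, BiDRN lowers MPVPE by about 28% (All), 21% (Hand), and 25% (Face), with about 22% fewer parameters and about 7% fewer OPs.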

Q1-4: It is not very clear why the proposed binarization method targets the Hand4Whole method. There are other (latest) methods (with better performance) in the field.

A1-4: Thank you for your feedback. When we started this project, Hand4Whole was the SOTA method. Nevertheless, BiDRN can be adapted to more recent methods. We added an experiment that adapts BiDRN to the recent method Multi-HMR [ref1]. Due to time and resource limits, we have not yet finished training. However, after training for the same number of iterations and evaluating on EHF and Bedlam, BiDRN clearly surpasses BNN, showing that it generalizes to other methods in the field.

Method	#Iterations	EHF PVE	EHF PAPVE	Bedlam PVE	Bedlam PAPVE
BNN	70,000	344.9	168.2	292.4	151.1
BiDRN (Ours)	70,000	302.7	137.1	270.2	126.5

[ref1] Multi-HMR: Multi-Person Whole-Body Human Mesh Recovery in a Single Shot, ECCV, 2024.

Q1-5: Latency. While the OPs are provided to give a rough idea of the number of operations, it is also necessary to see the change in speed with the proposed binarization.

A1-5: Thank you for your suggestion. We provide the latency comparison in Table 7 of the Appendix, where BiDRN achieves a substantial speedup over the full-precision method.