Q3. Add experiments about replacing SD-map with online HD-map predictions from models.

Thank you for your valuable suggestion. We ran the suggested experiments, using HD maps predicted by models such as MapTR and MapTRv2-CL as inputs to ESDMotion. The tables below present the results on an anchor-free model and an anchor-based model.

Results with HiVT:

Maps	minADE	minFDE	MR
SD map (GT)	0.3134	0.6662	0.0737
HD map (MapTR)	0.3147	0.6671	0.0740
HD map (MapTRv2)	0.3114	0.6692	0.0727

Results with DenseTNT:

Maps	minADE	minFDE	MR
SD map (GT)	0.7597	1.3105	0.1523
HD map (MapTR)	0.7616	1.3139	0.1533
HD map (MapTRv2)	0.7529	1.3088	0.1517

The results indicate that ESDMotion performs similarly with predicted HD maps and with SD maps. This finding shows that ESDMotion effectively reduces the performance gap between using SD maps and HD maps, validating the efficacy of the proposed SD-map-oriented designs.
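To make "similar" concrete, the relative differences can be computed directly from the two tables above. The following is a minimal sketch; the dictionary layout and the helper names `relative_gap` and `max_abs_gap` are ours for illustration and are not part of the ESDMotion code.

```python
# Values copied from the two Q3 tables (lower is better for every metric).
# Each tuple is (minADE, minFDE, MR).
RESULTS = {
    "HiVT": {
        "SD map (GT)": (0.3134, 0.6662, 0.0737),
        "HD map (MapTR)": (0.3147, 0.6671, 0.0740),
        "HD map (MapTRv2)": (0.3114, 0.6692, 0.0727),
    },
    "DenseTNT": {
        "SD map (GT)": (0.7597, 1.3105, 0.1523),
        "HD map (MapTR)": (0.7616, 1.3139, 0.1533),
        "HD map (MapTRv2)": (0.7529, 1.3088, 0.1517),
    },
}
METRICS = ("minADE", "minFDE", "MR")
BASELINE = "SD map (GT)"


def relative_gap(sd, hd):
    """Signed relative change (in percent) of the HD-map row w.r.t. the SD-map row."""
    return tuple(100.0 * (h - s) / s for s, h in zip(sd, hd))


def max_abs_gap(results):
    """Largest absolute relative change over all backbones, map sources, and metrics."""
    worst = 0.0
    for rows in results.values():
        sd = rows[BASELINE]
        for name, values in rows.items():
            if name == BASELINE:
                continue
            worst = max(worst, max(abs(g) for g in relative_gap(sd, values)))
    return worst


if __name__ == "__main__":
    for backbone, rows in RESULTS.items():
        for name, values in rows.items():
            if name == BASELINE:
                continue
            gaps = relative_gap(rows[BASELINE], values)
            summary = ", ".join(f"{m}: {g:+.2f}%" for m, g in zip(METRICS, gaps))
            print(f"{backbone} / {name}: {summary}")
    print(f"largest absolute gap: {max_abs_gap(RESULTS):.2f}%")
```

With the numbers reported above, every relative difference between the SD-map and predicted-HD-map settings stays below 1.5%.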

We have added the above discussion and experiments to Table 1 of our updated paper. Thank you for the advice, which helps better illustrate the advantages of ESDMotion.

Q4. What will the trajectory prediction performance be like if the online predicted SD-map is used as input instead of the offline SD-map?

Good point, and thank you for the advice!

We conducted experiments using the SD map predicted by MapTRv2 as input for four methods. With the predicted SD map, ESDMotion achieves performance similar to that with ground-truth SD maps or predicted HD maps. In contrast, for the Base and Unc baselines, the predicted SD map leads to significant performance degradation. For the BEVPred baseline, the predicted SD map improves on ground-truth SD maps alone (on all three metrics with HiVT, and on minADE with DenseTNT), but the results remain significantly worse than those obtained with HD maps.
These results demonstrate the effectiveness of the proposed designs for SD-map-oriented motion prediction. ESDMotion accounts for the relative inaccuracies inherent to SD maps, and this robustness also mitigates the errors introduced by predicted SD maps, ensuring reliable performance under such conditions.

Results with HiVT:

Models	Maps	minADE	minFDE	MR
Base/Unc/BEVPred	GT SD map	0.3998	0.8207	0.0918
Base	Predicted SD map	0.4429	0.9165	0.0986
Unc	Predicted SD map	0.4285	0.9007	0.0956
BEVPred	Predicted SD map	0.3904	0.7690	0.0741
ESDMotion++	GT SD map	0.3114	0.6692	0.0727
ESDMotion++	Predicted SD map	0.3147	0.6671	0.0740

Results with DenseTNT:

Models	Maps	minADE	minFDE	MR
Base/Unc/BEVPred	GT SD map	1.2117	1.9849	0.2776
Base	Predicted SD map	1.3692	2.2417	0.3937
Unc	Predicted SD map	1.3020	2.1364	0.3738
BEVPred	Predicted SD map	1.1940	2.0029	0.3285
ESDMotion++	GT SD map	0.7597	1.3105	0.1523
ESDMotion++	Predicted SD map	0.7712	1.3260	0.1561
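For readers less familiar with the three metrics in these tables, the sketch below shows one common way to compute them from K candidate trajectories per agent. It is an illustration under stated assumptions, not our evaluation code: the function names are hypothetical, the minimum is taken independently for ADE and FDE, and the 2.0 m miss threshold is an assumed value that is not stated in this response.

```python
import math

# Assumption: a prediction counts as a miss when its best final-point error
# exceeds this threshold (in metres). The value is illustrative only.
MISS_THRESHOLD_M = 2.0


def displacement(p, q):
    """Euclidean distance between two (x, y) points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def agent_metrics(candidates, ground_truth):
    """Best-of-K errors for one agent.

    candidates:   list of K trajectories, each a list of (x, y) points
    ground_truth: list of (x, y) points with the same length as each candidate
    Returns (min ADE, min FDE), each minimised independently over the K candidates.
    """
    ades = [
        sum(displacement(p, g) for p, g in zip(traj, ground_truth)) / len(ground_truth)
        for traj in candidates
    ]
    fdes = [displacement(traj[-1], ground_truth[-1]) for traj in candidates]
    return min(ades), min(fdes)


def dataset_metrics(samples, miss_threshold=MISS_THRESHOLD_M):
    """Average minADE and minFDE over agents, plus the miss rate (MR).

    samples: list of (candidates, ground_truth) pairs, one per agent
    """
    per_agent = [agent_metrics(c, g) for c, g in samples]
    n = len(per_agent)
    min_ade = sum(a for a, _ in per_agent) / n
    min_fde = sum(f for _, f in per_agent) / n
    miss_rate = sum(1 for _, f in per_agent if f > miss_threshold) / n
    return min_ade, min_fde, miss_rate
```

Some benchmarks instead report the ADE of the candidate with the lowest FDE, so the exact convention should be checked against the benchmark in use.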

We have added the above discussion and experiments to Section 3.2 and Table 1 of our updated paper. Thank you for the advice, which helps better illustrate the advantages of ESDMotion.