ICLR 2025
Efficient Sequential Policy Optimization via Off-Policy Correction in Multi-Agent Reinforcement Learning
Abstract
Keywords
trust region policy optimization, multi-agent learning
Reviews and Discussion
Desk Rejected by Program Chairs
Reason for Desk Rejection
The paper is desk rejected for significant textual overlap with [1]. This decision was confirmed by multiple members of the program committee. Section 5 of this submission is a slight rewording of Section 2 in [1], Section 2.1 is nearly identical to Section 3.1 in [1], Section 4 is nearly identical to Section 5 in [1], and the abstract is a rewording of the abstract in [1]. This degree of overlap is considered plagiarism, because whole passages and phrases are copied without proper citation.
[1] Wang, X., Tian, Z., Wan, Z., Wen, Y., Wang, J., & Zhang, W. (2023). Order Matters: Agent-by-agent Policy Optimization. ICLR 2023.