PaperHub

No rating data available

ICLR 2025

Efficient Sequential Policy Optimization via Off-Policy Correction in Multi-Agent Reinforcement Learning

Submitted: 2024-09-27; Updated: 2024-10-17

Abstract

Keywords
trust region policy optimization, multi-agent learning

Reviews and Discussion

Desk Rejection

Reason for Desk Rejection

The paper is desk rejected for significant textual overlap with [1]. This decision was confirmed by multiple members of the program committee. Section 5 of this submission is a slight rewording of Section 2 in [1], Section 2.1 is nearly identical to Section 3.1 in [1], Section 4 is nearly identical to Section 5 in [1], and the abstract is a rewording of the abstract in [1]. This degree of overlap is considered plagiarism: whole passages and phrases are copied without proper citation.

[1] Wang, X., Tian, Z., Wan, Z., Wen, Y., Wang, J., & Zhang, W. (2023). Order Matters: Agent-by-agent Policy Optimization. ICLR 2023.