No rating data available
ICLR 2025
Learn With Imagination: Safe Set Guided State-wise Constrained Policy Optimization
TL;DR
Safe Set Guided State-wise Constrained Policy Optimization (S-3PO) is a novel algorithm that generates state-wise safe optimal policies with zero training violations.
Abstract
Keywords
Deep Reinforcement Learning, Safe Control, Safety Index, Zero Training Violations, Imaginary Cost
Reviews and Discussion
Desk rejected by Program Chairs
Reason for desk rejection
This paper is desk rejected because the authors are not anonymized.