Paper Hub
Drake Thomas
~Drake_Thomas1
Total papers: 1
Submissions per year: 1.0
Average rating: 6.3
Accepted: 1 / 1
Venue distribution: NeurIPS (1)
Published papers (1)
2024 (1 paper)
6.3
4
Catastrophic Goodhart: regularizing RLHF with KL divergence does not mitigate heavy-tailed reward misspecification
NeurIPS 2024
Poster
Co-authors (2)
Adrià Garriga-Alonso (1 paper)
Thomas Kwa (1 paper)