Paper Hub
Xuehai Pan
~Xuehai_Pan1
Total papers: 4
Average submissions per year: 2.0
Average rating: 5.1
Accepted: 2 / 4
Venue distribution
ICLR: 3
NeurIPS: 1
Papers (4)
2025 (1 paper)
4.3
4
Iterative Training of Language Models with Opponent Modeling for Red Teaming Data Generation
ICLR 2025
Rejected
2024 (3 papers)
7.5
4
Safe RLHF: Safe Reinforcement Learning from Human Feedback
ICLR 2024
Spotlight
6.3
4
Aligner: Efficient Alignment by Learning to Correct
NeurIPS 2024
Oral
2.5
4
Red Teaming Game: A Game-Theoretic Framework for Red Teaming Language Models
ICLR 2024
Withdrawn
Co-authors (20)
Yaodong Yang (4 papers)
Jiaming Ji (2 papers)
Fengshuo Bai (1 paper)
Hang Deng (1 paper)
Yang Han (1 paper)
Yiming Rong (1 paper)
Chengdong Ma (1 paper)
Hai Ci (1 paper)