Peter Henderson
~Peter_Henderson1
Total papers: 6
Submissions per year: 3.0
Average rating
Acceptance: 6/6
Venue distribution
ICLR: 5
COLM: 1
Published papers (6)
2025: 4 papers
4
Safety Alignment Should be Made More Than Just a Few Tokens Deep
ICLR 2025 Oral
4
Fantastic Copyrighted Beasts and How (Not) to Generate Them
ICLR 2025 Poster
4
On Evaluating the Durability of Safeguards for Open-Weight LLMs
ICLR 2025 Poster
4
SORRY-Bench: Systematically Evaluating Large Language Model Safety Refusal
ICLR 2025 Poster