Paper Hub
Tinghao Xie
~Tinghao_Xie1
Total papers: 5
Submissions per year: 2.5
Average rating: 6.3
Accepted: 5 / 5
Venue distribution
ICLR: 5
Published papers (5)
2025
3 papers
6.8
4
SORRY-Bench: Systematically Evaluating Large Language Model Safety Refusal
ICLR 2025
Poster
5.0
4
Fantastic Copyrighted Beasts and How (Not) to Generate Them
ICLR 2025
Poster
6.5
4
On Evaluating the Durability of Safeguards for Open-Weight LLMs
ICLR 2025
Poster
2024
2 papers
6.3
4
BaDExpert: Extracting Backdoor Functionality for Accurate Backdoor Input Detection
ICLR 2024
Poster
7.0
4
Fine-tuning Aligned Language Models Compromises Safety, Even When Users Do Not Intend To!
ICLR 2024
Oral
Co-authors (20)
Peter Henderson: 4 papers
Prateek Mittal: 4 papers
Xiangyu Qi: 4 papers
Luxi He: 3 papers
Yangsibo Huang: 3 papers
Danqi Chen: 2 papers
Boyi Wei: 2 papers
Ruoxi Jia: 2 papers