Yi Zeng
~Yi_Zeng3
10
Total papers
5.0
Submissions per year
Average rating
Acceptance: 7/10
Venue distribution
ICLR
8
NeurIPS
2
Papers (10)
2025: 7 papers
4
AIR-BENCH 2024: A Safety Benchmark based on Regulation and Policies Specified Risk Categories
ICLR 2025 Spotlight
4
SCOPE: Scalable and Adaptive Evaluation of Misguided Safety Refusal in LLMs
ICLR 2025 Rejected
4
AutoRedTeamer: An Autonomous Red Teaming Agent Against Language Models
ICLR 2025 Rejected
4
Mind Control through Causal Inference: Predicting Clean Images from Poisoned Data
ICLR 2025 Poster
4
SORRY-Bench: Systematically Evaluating Large Language Model Safety Refusal
ICLR 2025 Poster
4
AutoRedTeamer: Autonomous Red Teaming with Lifelong Attack Integration
NeurIPS 2025 Poster
4
MMDT: Decoding the Trustworthiness and Safety of Multimodal Foundation Models
ICLR 2025 Poster