Impact Index

62.81/100

Top 4.1%

Site-wide rank #2,618

Papers published: 11

Average rating: 6.0

Average output: 3.7 papers/year

Yi Zeng

PhD student @ Virginia Tech · OpenReview

Research Areas

Adversarial Machine Learning · AI and Security

SpecEval: Evaluating Model Adherence to Behavior Specifications

ICLR 2026 · Rejected

AIR-BENCH 2024: A Safety Benchmark based on Regulation and Policies Specified Risk Categories

ICLR 2025 · Spotlight

MMDT: Decoding the Trustworthiness and Safety of Multimodal Foundation Models

ICLR 2025 · Poster

AutoRedTeamer: Autonomous Red Teaming with Lifelong Attack Integration

NeurIPS 2025 · Poster

SORRY-Bench: Systematically Evaluating Large Language Model Safety Refusal

ICLR 2025 · Poster

Mind Control through Causal Inference: Predicting Clean Images from Poisoned Data

ICLR 2025 · Poster

SCOPE: Scalable and Adaptive Evaluation of Misguided Safety Refusal in LLMs

ICLR 2025 · Rejected

AutoRedTeamer: An Autonomous Red Teaming Agent Against Language Models

ICLR 2025 · Rejected

Collaborators (20)

PhD advisor · 7 papers