Influence Index

52.36/100

Top 7.4%

Site-wide rank: #4,759

Papers published: 11

Average rating: 5.4

Average output: 3.7 papers/year

Andy Zou

PhD student @ CMU, Carnegie Mellon University · United States · OpenReview

Research areas

ml safety · ai safety · alignment · robustness · monitoring · transparency

Comparing AI Agents to Cybersecurity Professionals in Real-World Penetration Testing

ICLR 2026 · Poster

TrustGen: A Platform of Dynamic Benchmarking on the Trustworthiness of Generative Foundation Models

ICLR 2026 · Poster

Safety Pretraining: Toward the Next Generation of Safe AI

NeurIPS 2025 · Poster

AgentHarm: A Benchmark for Measuring Harmfulness of LLM Agents

ICLR 2025 · Poster

Tamper-Resistant Safeguards for Open-Weight LLMs

ICLR 2025 · Poster

Transferable Adversarial Attack on Vision-enabled Large Language Models

ICLR 2025 · Withdrawn

Which Network is Trojaned? Increasing Trojan Evasiveness for Model-Level Detectors

ICLR 2025 · Withdrawn

Collaborators (20)

Matt Fredrikson

PhD advisor · 7 papers

PhD advisor · 6 papers