Andy Zou
~Andy_Zou1
Total papers: 9
Submissions per year: 4.5
Average rating
Accepted: 4/9
Venue distribution
ICLR: 7
NeurIPS: 2
Published papers (9)
2025 (5 papers)
4
Which Network is Trojaned? Increasing Trojan Evasiveness for Model-Level Detectors
ICLR 2025, Withdrawn
4
Transferable Adversarial Attack on Vision-enabled Large Language Models
ICLR 2025, Withdrawn
4
Safety Pretraining: Toward the Next Generation of Safe AI
NeurIPS 2025, Poster
4
AgentHarm: A Benchmark for Measuring Harmfulness of LLM Agents
ICLR 2025, Poster
6
Tamper-Resistant Safeguards for Open-Weight LLMs
ICLR 2025, Poster
2024 (4 papers)
4
Robustness Evaluation of Proxy Models against Adversarial Optimization
ICLR 2024, Rejected
5
Improving Alignment and Robustness with Circuit Breakers
NeurIPS 2024, Poster
3
Enhancing Neural Network Transparency through Representation Analysis
ICLR 2024, Rejected
4
How Hard is Trojan Detection in DNNs? Fooling Detectors With Evasive Trojans
ICLR 2024, Rejected