Impact Index

95.86/100

Top 0.2%

Site-wide rank: #142

Papers published: 43

Average rating: 5.4

Average output: 14.3 papers/year

Chaowei Xiao

Assistant Professor @ Johns Hopkins University · United States · OpenReview

Research Areas

machine learning · security

Doxing via the Lens: Revealing Location-related Privacy Leakage on Multi-modal Large Reasoning Models

ICLR 2026 Poster

ARMOR: Aligning Secure and Safe Large Language Models via Meticulous Reasoning

ICLR 2026 Poster

A2ASecBench: A Protocol-Aware Security Benchmark for Agent-to-Agent Multi-Agent Systems

ICLR 2026 Poster

Concept Concentration for Faithful Representation Intervention

ICLR 2026 Rejected

TrustGen: A Platform of Dynamic Benchmarking on the Trustworthiness of Generative Foundation Models

ICLR 2026 Poster

Computer-Use Agent Frameworks Can Expose Realistic Risks Through Tactics, Techniques, and Procedures

ICLR 2026 Rejected

PRISM: Robust VLM Alignment with Principled Reasoning for Integrated Safety in Multimodality

ICLR 2026 Rejected

Mitigating Indirect Prompt Injection via Instruction-Following Intent Analysis

ICLR 2026 Rejected

ReasAlign: Reasoning Enhanced Safety Alignment against Prompt Injection Attack

ICLR 2026 Withdrawn

VLA-Risk: Benchmarking Vision-Language-Action Models with Physical Robustness

ICLR 2026 Rejected

SAFEVISION: Efficient Image Guardrail with Robust Policy Adherence and Explainability

ICLR 2026 Rejected

AutoDAN-Turbo: A Lifelong Agent for Strategy Self-Exploration to Jailbreak LLMs

ICLR 2025 Spotlight

T-Stitch: Accelerating Sampling in Pre-Trained Diffusion Models with Trajectory Stitching

ICLR 2025 Poster

Robust Representation Consistency Model via Contrastive Denoising

ICLR 2025 Poster

EIA: Environmental Injection Attack on Generalist Web Agents for Privacy Leakage

ICLR 2025 Poster

DRIFT: Dynamic Rule-Based Defense with Injection Isolation for Securing LLM Agents

NeurIPS 2025 Poster

JailDAM: Jailbreak Detection with Adaptive Memory for Vision-Language Model

COLM 2025 Poster

Sample-specific Noise Injection for Diffusion-based Adversarial Purification

ICML 2025 Poster

DataGen: Unified Synthetic Dataset Generation via Large Language Models

ICLR 2025 Poster

Can Watermarks be Used to Detect LLM IP Infringement For Free?

ICLR 2025 Poster

LeanAgent: Lifelong Learning for Formal Theorem Proving

ICLR 2025 Poster

Sample-specific Noise Injection for Diffusion-based Adversarial Purification

ICLR 2025 Rejected

SafeVision: Efficient Image Guardrail with Robust Policy Adherence and Explainability

ICLR 2025 Rejected

Benchmarking Vision Language Model Unlearning via Fictitious Facial Identity Dataset

ICLR 2025 Poster

Prompt Injection Benchmark for Foundation Model Integrated Systems

ICLR 2025 Rejected

MuirBench: A Comprehensive Benchmark for Robust Multi-image Understanding

ICLR 2025 Poster

MetaAgent: Automatically Constructing Multi-Agent Systems Based on Finite State Machines

ICML 2025 Poster

Can Editing LLMs Inject Harm?

ICLR 2025 Rejected

MetaAgent: Automatically Building Multi-Agent System based on Finite State Machine

ICLR 2025 Rejected

AutoHijacker: Automatic Indirect Prompt Injection Against Black-box LLM Agents

ICLR 2025 Rejected

Collaborators (20)

Collaborator · 12 papers

Patrick McDaniel

Anima Anandkumar