Influence Index

88.52/100

Top 0.7%

Site-wide rank: #434

Papers published: 20

Average rating: 6.4

Average annual output: 6.7 papers/year

Prateek Mittal

Full Professor @ Princeton University · United States · OpenReview

Research Areas

Secure & Trustworthy Cyberspace · Adversarial Machine Learning · Privacy-Preserving Machine Learning

Can Small Training Runs Reliably Guide Data Curation? Rethinking Proxy-Model Practice

ICLR 2026 · Poster

Adversarial Déjà Vu: Jailbreak Dictionary Learning for Stronger Generalization to Unseen Attacks

ICLR 2026 · Poster

MURMUR: Using Cross-User Chatter to Break Collaborative Language Agents

ICLR 2026 · Rejected

Red-Teaming NSFW Image Classifiers as Text-to-Image Safeguards

ICLR 2026 · Withdrawn

Safety Alignment Should be Made More Than Just a Few Tokens Deep

Capturing the Temporal Dependence of Training Data Influence

Data Shapley in One Training Run

ReliabilityRAG: Effective and Provably Robust Defense for RAG-based Web-Search

NeurIPS 2025 · Poster

SORRY-Bench: Systematically Evaluating Large Language Model Safety Refusal

ICLR 2025 · Poster

Privacy Auditing of Large Language Models

ICLR 2025 · Poster

On Evaluating the Durability of Safeguards for Open-Weight LLMs

ICLR 2025 · Poster

Adapting to Evolving Adversaries with Regularized Continual Robust Training

ICML 2025 · Poster

Instructional Segment Embedding: Improving LLM Safety with Instruction Hierarchy

ICLR 2025 · Poster

Certifiably Robust RAG against Retrieval Corruption Attacks

ICLR 2025 · Rejected

Lottery Ticket Adaptation: Mitigating Destructive Interference in LLMs

ICLR 2025 · Withdrawn

Collaborators (20)

Jiachen T. Wang

PhD Advisee · 7 papers

PhD Advisee · 6 papers

PhD Advisee · 6 papers

PhD Advisee · 5 papers

PhD Advisee · 5 papers

Peter Henderson