Niloofar Mireshghallah
~Niloofar_Mireshghallah1
Total papers: 13
Submissions per year: 6.5
Average rating
Accepted: 9/13
Venue distribution
ICLR: 6
COLM: 5
NeurIPS: 2
Papers (13)
2025: 8 papers
4
A False Sense of Privacy: Evaluating Textual Data Sanitization Beyond Surface-level Privacy Leakage
ICLR 2025 (Rejected)
3
Leveraging Set Assumption for Membership Inference in Language Models
ICLR 2025 (Rejected)
4
AI as Humanity’s Salieri: Quantifying Linguistic Creativity of Language Models via Systematic Attribution of Machine Text against Web Text
ICLR 2025 (Oral)
3
ParaPO: Aligning Language Models to Reduce Verbatim Reproduction of Pre-training Data
COLM 2025 (Poster)
4
Exploring the Limits of Strong Membership Inference Attacks on Large Language Models
NeurIPS 2025 (Poster)
3
The Surprising Effectiveness of Membership Inference with Simple N-Gram Coverage
COLM 2025 (Poster)
4
HAICOSYSTEM: An Ecosystem for Sandboxing Safety Risks in Human-AI Interactions
ICLR 2025 (Rejected)
4
HAICOSYSTEM: An Ecosystem for Sandboxing Safety Risks in Interactive AI Agents
COLM 2025 (Poster)
2024: 5 papers
4
Can LLMs Keep a Secret? Testing Privacy Implications of Language Models via Contextual Integrity Theory
ICLR 2024 (Spotlight)
4
Trust No Bot: Discovering Personal Disclosures in Human-LLM Conversations in the Wild
COLM 2024 (Poster)
4
Do Membership Inference Attacks Work on Large Language Models?
COLM 2024 (Poster)
4
Misusing Tools in Large Language Models With Visual Adversarial Examples
ICLR 2024 (Rejected)
4
WildTeaming at Scale: From In-the-Wild Jailbreaks to (Adversarially) Safer Language Models
NeurIPS 2024 (Poster)