PaperHub

Yejin Choi

OpenReview ID: ~Yejin_Choi1

Total papers: 59
Avg. submissions per year: 29.5
Avg. review score: 6.3
Accepted: 44/59
Conference distribution: ICLR 32 · COLM 14 · NeurIPS 9 · ICML 4

Papers (59)

2025 (37 papers)

From Explicit CoT to Implicit CoT: Learning to Internalize CoT Step by Step
  ICLR 2025 · Rejected · avg. score 4.8 (4 reviews)

Trust or Escalate: LLM Judges with Provable Guarantees for Human Agreement
  ICLR 2025 · Oral · avg. score 8.0 (4 reviews)

DailyDilemmas: Revealing Value Preferences of LLMs with Quandaries of Daily Life
  ICLR 2025 · Spotlight · avg. score 7.3 (4 reviews)

Can Language Models Reason about Individualistic Human Values and Preferences?
  ICLR 2025 · Withdrawn · avg. score 4.5 (4 reviews)

The HALoGen Benchmark: Fantastic LLM Hallucinations and Where To Find Them
  ICLR 2025 · Withdrawn · score: -

Broken Tokens? Your Language Model can Secretly Handle Non-Canonical Tokenizations
  NeurIPS 2025 · Spotlight · avg. score 7.8 (4 reviews)

LongPerceptualThoughts: Distilling System-2 Reasoning for System-1 Perception
  COLM 2025 · Poster · avg. score 5.3 (4 reviews)

Why and How LLMs Hallucinate: Connecting the Dots with Subsequence Associations
  NeurIPS 2025 · Poster · avg. score 7.8 (4 reviews)

Explore Theory of Mind: program-guided adversarial data generation for theory of mind reasoning
  ICLR 2025 · Poster · avg. score 6.0 (4 reviews)

Diverging Preferences: When do Annotators Disagree and do Models Know?
  ICLR 2025 · Rejected · avg. score 5.5 (4 reviews)

SuperBPE: Space Travel for Language Models
  COLM 2025 · Poster · avg. score 8.0 (4 reviews)

ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models
  NeurIPS 2025 · Poster · avg. score 7.3 (4 reviews)

A False Sense of Privacy: Evaluating Textual Data Sanitization Beyond Surface-level Privacy Leakage
  ICLR 2025 · Rejected · avg. score 5.8 (4 reviews)

Diverging Preferences: When do Annotators Disagree and do Models Know?
  ICML 2025 · Poster · avg. score 5.5 (4 reviews)

Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing
  ICLR 2025 · Poster · avg. score 5.7 (3 reviews)

Pixelated Instructions: Can Multimodal Large Language Models Follow Printed Instructions in Images?
  ICLR 2025 · Rejected · avg. score 4.0 (4 reviews)

The Surprising Effectiveness of Membership Inference with Simple N-Gram Coverage
  COLM 2025 · Poster · avg. score 6.3 (3 reviews)

SimpleToM: Exposing the Gap between Explicit ToM Inference and Implicit ToM Application in LLMs
  ICLR 2025 · Rejected · avg. score 5.3 (3 reviews)

ZebraLogic: On the Scaling Limits of LLMs for Logical Reasoning
  ICML 2025 · Poster · avg. score 5.5 (4 reviews)

Verifying the Verifiers: Unveiling Pitfalls and Potentials in Fact Verifiers
  COLM 2025 · Poster · avg. score 6.8 (4 reviews)

CertainlyUncertain: A Benchmark and Metric for Multimodal Epistemic and Aleatoric Awareness
  ICLR 2025 · Poster · avg. score 6.0 (3 reviews)

WildBench: Benchmarking LLMs with Challenging Tasks from Real Users in the Wild
  ICLR 2025 · Spotlight · avg. score 7.3 (3 reviews)

Hypothesis-Driven Theory-of-Mind Reasoning for Large Language Models
  COLM 2025 · Poster · avg. score 6.3 (3 reviews)

Model Swarms: Collaborative Search to Adapt LLM Experts via Swarm Intelligence
  ICML 2025 · Poster · avg. score 7.8 (4 reviews)

SafetyAnalyst: Interpretable, Transparent, and Steerable Safety Moderation for AI Behavior
  ICML 2025 · Poster · avg. score 5.5 (4 reviews)

HAICOSYSTEM: An Ecosystem for Sandboxing Safety Risks in Interactive AI Agents
  COLM 2025 · Poster · avg. score 6.5 (4 reviews)

SafetyAnalyst: Interpretable, transparent, and steerable LLM safety moderation
  ICLR 2025 · Rejected · avg. score 3.3 (3 reviews)

Model Swarms: Collaborative Search to Adapt LLM Experts via Swarm Intelligence
  ICLR 2025 · Rejected · avg. score 6.8 (4 reviews)

X-Teaming: Multi-Turn Jailbreaks and Defenses with Adaptive Multi-Agents
  COLM 2025 · Poster · avg. score 6.0 (4 reviews)

HAICOSYSTEM: An Ecosystem for Sandboxing Safety Risks in Human-AI Interactions
  ICLR 2025 · Rejected · avg. score 6.3 (4 reviews)

Prismatic Synthesis: Gradient-based Data Diversification Boosts Generalization in LLM Reasoning
  NeurIPS 2025 · Spotlight · avg. score 8.2 (3 reviews)

Benchmarking Vision Language Model Unlearning via Fictitious Facial Identity Dataset
  ICLR 2025 · Poster · avg. score 5.4 (5 reviews)

AI Debate Aids Assessment of Controversial Claims
  NeurIPS 2025 · Poster · avg. score 6.4 (4 reviews)

CulturalBench: a Robust, Diverse and Challenging Benchmark on Measuring (the Lack of) Cultural Knowledge of LLMs
  ICLR 2025 · Rejected · avg. score 5.0 (5 reviews)

AI as Humanity’s Salieri: Quantifying Linguistic Creativity of Language Models via Systematic Attribution of Machine Text against Web Text
  ICLR 2025 · Oral · avg. score 7.0 (4 reviews)

Language Model Alignment in Multilingual Trolley Problems
  ICLR 2025 · Spotlight · avg. score 7.3 (4 reviews)

VAGEN: Reinforcing World Model Reasoning for Multi-Turn VLM Agents
  NeurIPS 2025 · Poster · avg. score 7.3 (4 reviews)

2024 (22 papers)

Quantifying Language Models' Sensitivity to Spurious Features in Prompt Design or: How I learned to start worrying about prompt formatting
  ICLR 2024 · Poster · avg. score 6.7 (3 reviews)

Data Mixture Inference Attack: BPE Tokenizers Reveal Training Data Compositions
  NeurIPS 2024 · Poster · avg. score 5.8 (4 reviews)

Don't throw away your value model! Generating more preferable text with Value-Guided Monte-Carlo Tree Search decoding
  COLM 2024 · Poster · avg. score 6.5 (4 reviews)

Trust No Bot: Discovering Personal Disclosures in Human-LLM Conversations in the Wild
  COLM 2024 · Poster · avg. score 6.5 (4 reviews)

Infini-gram: Scaling Unbounded n-gram Language Models to a Trillion Tokens
  COLM 2024 · Poster · avg. score 7.5 (4 reviews)

Making PPO even better: Value-Guided Monte-Carlo Tree Search decoding
  ICLR 2024 · Rejected · avg. score 4.0 (4 reviews)

Tuning Language Models by Proxy
  COLM 2024 · Poster · avg. score 7.5 (4 reviews)

CULTURE-GEN: Revealing Global Cultural Perception in Language Models through Natural Language Prompting
  COLM 2024 · Poster · avg. score 7.0 (4 reviews)

WildChat: 1M ChatGPT Interaction Logs in the Wild
  ICLR 2024 · Spotlight · avg. score 6.3 (4 reviews)

LUMOS: Towards Language Agents that are Unified, Modular, and Open Source
  ICLR 2024 · Rejected · avg. score 6.0 (4 reviews)

FiLM: Fill-in Language Models for Any-Order Generation
  ICLR 2024 · Rejected · avg. score 4.3 (4 reviews)

Information-Theoretic Distillation for Reference-less Summarization
  COLM 2024 · Poster · avg. score 8.3 (4 reviews)

Can LLMs Keep a Secret? Testing Privacy Implications of Language Models via Contextual Integrity Theory
  ICLR 2024 · Spotlight · avg. score 6.3 (4 reviews)

Unpacking DPO and PPO: Disentangling Best Practices for Learning from Preference Feedback
  NeurIPS 2024 · Poster · avg. score 5.0 (4 reviews)

The Unlocking Spell on Base LLMs: Rethinking Alignment via In-Context Learning
  ICLR 2024 · Poster · avg. score 5.0 (4 reviews)

Tailoring Self-Rationalizers with Multi-Reward Distillation
  ICLR 2024 · Poster · avg. score 6.4 (5 reviews)

Do Membership Inference Attacks Work on Large Language Models?
  COLM 2024 · Poster · avg. score 6.8 (4 reviews)

Phenomenal Yet Puzzling: Testing Inductive Reasoning Capabilities of Language Models with Hypothesis Refinement
  ICLR 2024 · Oral · avg. score 8.0 (4 reviews)

In Search of the Long-Tail: Systematic Generation of Long-Tail Knowledge via Logical Rule Induced Search
  ICLR 2024 · Withdrawn · avg. score 4.8 (4 reviews)

WildTeaming at Scale: From In-the-Wild Jailbreaks to (Adversarially) Safer Language Models
  NeurIPS 2024 · Poster · avg. score 5.8 (4 reviews)

PlaSma: Procedural Knowledge Models for Language-based Planning and Re-Planning
  ICLR 2024 · Poster · avg. score 6.5 (4 reviews)

The Generative AI Paradox: “What It Can Create, It May Not Understand”
  ICLR 2024 · Poster · avg. score 7.0 (4 reviews)