Liwei Jiang
~Liwei_Jiang2
16
Total papers
8.0
Submissions per year
Average rating
Acceptance: 12/16
Venue distribution
ICLR
8
COLM
5
NeurIPS
2
ICML
1
Published papers (16)
2025: 11 papers
4
Can Language Models Reason about Individualistic Human Values and Preferences?
ICLR 2025 Withdrawn
5
CulturalBench: a Robust, Diverse and Challenging Benchmark on Measuring (the Lack of) Cultural Knowledge of LLMs
ICLR 2025 Rejected
4
X-Teaming: Multi-Turn Jailbreaks and Defenses with Adaptive Multi-Agents
COLM 2025 Poster
4
DailyDilemmas: Revealing Value Preferences of LLMs with Quandaries of Daily Life
ICLR 2025 Spotlight
4
PolyGuard: A Multilingual Safety Moderation Tool for 17 Languages
COLM 2025 Poster
3
SafetyAnalyst: Interpretable, transparent, and steerable LLM safety moderation
ICLR 2025 Rejected
4
HAICOSYSTEM: An Ecosystem for Sandboxing Safety Risks in Human-AI Interactions
ICLR 2025 Rejected
4
SafetyAnalyst: Interpretable, Transparent, and Steerable Safety Moderation for AI Behavior
ICML 2025 Poster
4
HAICOSYSTEM: An Ecosystem for Sandboxing Safety Risks in Interactive AI Agents
COLM 2025 Poster
4
AI as Humanity’s Salieri: Quantifying Linguistic Creativity of Language Models via Systematic Attribution of Machine Text against Web Text
ICLR 2025 Oral
4
AI Debate Aids Assessment of Controversial Claims
NeurIPS 2025 Poster
2024: 5 papers
4
WildTeaming at Scale: From In-the-Wild Jailbreaks to (Adversarially) Safer Language Models
NeurIPS 2024 Poster
4
CULTURE-GEN: Revealing Global Cultural Perception in Language Models through Natural Language Prompting
COLM 2024 Poster
4
Phenomenal Yet Puzzling: Testing Inductive Reasoning Capabilities of Language Models with Hypothesis Refinement
ICLR 2024 Oral
4
Information-Theoretic Distillation for Reference-less Summarization
COLM 2024 Poster
4
The Generative AI Paradox: “What It Can Create, It May Not Understand”
ICLR 2024 Poster