Impact Index

64.23/100

Top 3.7%

Site-wide rank #2,390

Papers published: 26

Average rating: 4.7

Average output: 8.7 papers/year

Wenxuan Wang

Assistant Professor @ Renmin University of China · China · OpenReview

Research Areas

NLP · deep learning

5.5

DeepMath-103K: A Large-Scale, Challenging, Decontaminated, and Verifiable Mathematical Dataset for Advancing Reasoning

ICLR 2026 Poster

5.0

Castle-in-the-Air: Evaluating MLLM Visual Abilities on Human Cognitive Benchmarks

ICLR 2026 Rejected

5.0

The PIMMUR Principles: Ensuring Validity in Collective Behavior of LLM Societies

ICLR 2026 Rejected

3.5

DRQA: Dynamic Reasoning Quota Allocation for Controlling Overthinking in Reasoning Large Language Models

ICLR 2026 Withdrawn

3.5

The Social Welfare Function Leaderboard: When LLM Agents Allocate Social Welfare

ICLR 2026 Withdrawn

3.0

Curing "Miracle Steps" in LLM Math Reasoning with Rubric Rewards

ICLR 2026 Withdrawn

2.5

Benchmarking MLLM-based Web Understanding: Reasoning, Robustness and Safety

ICLR 2026 Rejected

Corresponding author

2.5

The Hunger Game Debate: On the Emergence of Over-Competition in Multi-Agent Systems

ICLR 2026 Withdrawn

2.0

Language Models Do Not Have Human-Like Working Memory

ICLR 2026 Rejected

Third author

7.3

Time-R1: Post-Training Large Vision Language Model for Temporal Video Grounding

NeurIPS 2025 Poster

6.8

Trust, But Verify: A Self-Verification Approach to Reinforcement Learning with Verifiable Rewards

NeurIPS 2025 Poster

6.8

Two Experts Are All You Need for Steering Thinking: Reinforcing Cognitive Effort in MoE Reasoning Models Without Additional Training

NeurIPS 2025 Poster

6.1

On the Resilience of LLM-Based Multi-Agent Collaboration with Faulty Agents

ICML 2025 Poster

5.8

Competing Large Language Models in Multi-Agent Gaming Environments

ICLR 2025 Poster

5.4

On the Resilience of Multi-Agent Systems with Malicious Agents

ICLR 2025 Rejected

5.3

Chain-of-Jailbreak Attack for Image Generation Models via Editing Step by Step

ICLR 2025 Rejected

First author

4.2

Refuse Whenever You Feel Unsafe: Improving Safety in LLMs via Decoupled Refusal Training

ICLR 2025 Withdrawn

Third author

3.7

Insight Over Sight? Exploring the Vision-Knowledge Conflicts in Multimodal LLMs

Collaborators (20)

Wenxuan Wang

DeepMath-103K: A Large-Scale, Challenging, Decontaminated, and Verifiable Mathematical Dataset for Advancing Reasoning

Castle-in-the-Air: Evaluating MLLM Visual Abilities on Human Cognitive Benchmarks

UI-Ins: Enhancing GUI Grounding with Multi-Perspective Instruction as Reasoning

ComboBench: Can LLMs Manipulate Physical Devices to Play Virtual Reality Games?

Towards Evaluating Fake Reasoning Bias in Language Models

The PIMMUR Principles: Ensuring Validity in Collective Behavior of LLM Societies

DRQA: Dynamic Reasoning Quota Allocation for Controlling Overthinking in Reasoning Large Language Models

The Social Welfare Function Leaderboard: When LLM Agents Allocate Social Welfare

Curing "Miracle Steps" in LLM Math Reasoning with Rubric Rewards

Benchmarking MLLM-based Web Understanding: Reasoning, Robustness and Safety

The Hunger Game Debate: On the Emergence of Over-Competition in Multi-Agent Systems

Language Models Do Not Have Human-Like Working Memory

Time-R1: Post-Training Large Vision Language Model for Temporal Video Grounding

Trust, But Verify: A Self-Verification Approach to Reinforcement Learning with Verifiable Rewards

Two Experts Are All You Need for Steering Thinking: Reinforcing Cognitive Effort in MoE Reasoning Models Without Additional Training

On the Resilience of LLM-Based Multi-Agent Collaboration with Faulty Agents

Competing Large Language Models in Multi-Agent Gaming Environments

On the Resilience of Multi-Agent Systems with Malicious Agents

Chain-of-Jailbreak Attack for Image Generation Models via Editing Step by Step

Refuse Whenever You Feel Unsafe: Improving Safety in LLMs via Decoupled Refusal Training

Insight Over Sight? Exploring the Vision-Knowledge Conflicts in Multimodal LLMs