Impact Index

92.79/100

Top 0.4%

Site-wide rank #258

Papers published: 50

Average rating: 5.1

Average annual output: 16.7 papers/year

Dong Yu

Distinguished Scientist @ Tencent AI Lab · United States · OpenReview

Research areas

speech recognition · deep learning · natural language processing · LLM

R-Zero: Self-Evolving Reasoning LLM from Zero Data

ICLR 2026 · Poster

Parallel-R1: Towards Parallel Thinking via Reinforcement Learning

ICLR 2026 · Poster

Don't Throw Away Your Pretrained Model

ICLR 2026 · Poster

DeepMath-103K: A Large-Scale, Challenging, Decontaminated, and Verifiable Mathematical Dataset for Advancing Reasoning

ICLR 2026 · Poster

RePlan: Reasoning-Guided Region Planning for Complex Instruction-Based Image Editing

ICLR 2026 · Rejected

Vision-SR1: Self-Rewarding Vision-Language Model via Reasoning Decomposition and Multi-Reward Policy Optimization

ICLR 2026 · Poster

EmoSteer-TTS: Fine-Grained and Training-Free Emotion-Controllable Text-to-Speech via Activation Steering

ICLR 2026 · Withdrawn

DeepCompress: A Dual Reward Strategy for Dynamically Exploring and Compressing Reasoning Chains

ICLR 2026 · Poster

InComeS: Integrating Compression and Selection Mechanisms into LLMs for Efficient Model Editing

ICLR 2026 · Withdrawn

On the Evolution of Language Models without Labels: Majority Drives Selection, Novelty Promotes Variation

ICLR 2026 · Rejected

CDE: Curiosity-Driven Exploration for Efficient Reinforcement Learning in Large Language Models

ICLR 2026 · Poster

WebAggregator: Scaling Complex Logical Information Aggregation for Web Agents Foundation Models

ICLR 2026 · Withdrawn

MobileGUI-RL: Advancing Mobile GUI Agent through Reinforcement Learning in Online Environment

ICLR 2026 · Rejected

Revisiting Audio-language Pretraining for Learning General-purpose Audio Representation

ICLR 2026 · Withdrawn

DeepTheorem: Advancing LLM Reasoning for Theorem Proving Through Natural Language and Reinforcement Learning

ICLR 2026 · Rejected

One Token to Fool LLM-as-a-Judge

ICLR 2026 · Rejected

CLUE: Non-parametric Verification from Experience via Hidden-State Clustering

ICLR 2026 · Withdrawn

VOGUE: Guiding Exploration with Visual Uncertainty Improves Multimodal Reasoning

ICLR 2026 · Withdrawn

LeVo: High-Quality Song Generation with Multi-Preference Alignment

NeurIPS 2025 · Poster

Thoughts Are All Over the Place: On the Underthinking of Long Reasoning Models

NeurIPS 2025 · Spotlight

Improving LLM General Preference Alignment via Optimistic Online Mirror Descent

NeurIPS 2025 · Spotlight

Trust, But Verify: A Self-Verification Approach to Reinforcement Learning with Verifiable Rewards

NeurIPS 2025 · Poster

Two Experts Are All You Need for Steering Thinking: Reinforcing Cognitive Effort in MoE Reasoning Models Without Additional Training

NeurIPS 2025 · Poster

DSBench: How Far Are Data Science Agents from Becoming Data Science Experts?

ICLR 2025 · Poster

MPS-Prover: Advancing Stepwise Theorem Proving by Multi-Perspective Search and Data Curation

NeurIPS 2025 · Poster

UniGist: Towards General and Hardware-aligned Sequence-level Long Context Compression

NeurIPS 2025 · Poster

DOTS: Learning to Reason Dynamically in LLMs via Optimal Reasoning Trajectories Search

ICLR 2025 · Poster

LongMemEval: Benchmarking Chat Assistants on Long-Term Interactive Memory

ICLR 2025 · Poster

DivScene: Benchmarking LVLMs for Object Navigation with Diverse Scenes and Objects

ICLR 2025 · Rejected

RepoGraph: Enhancing AI Software Engineering with Repository-level Code Graph

ICLR 2025 · Poster

Iterative Nash Policy Optimization: Aligning LLMs with General Preferences via No-Regret Learning

The First Few Tokens Are All You Need: An Efficient and Effective Unsupervised Prefix Fine-Tuning Method for Reasoning Models

NeurIPS 2025 · Poster

ParallelSpec: Parallel Drafter for Efficient Speculative Decoding

ICLR 2025 · Rejected

DeFine: Enhancing LLM Decision-Making with Factor Profiles and Analogical Reasoning

ICLR 2025 · Rejected

SePPO: Semi-Policy Preference Optimization for Diffusion Alignment

ICLR 2025 · Withdrawn

Do NOT Think That Much for 2+3=? On the Overthinking of Long Reasoning Models

ICML 2025 · Poster

HDFlow: Enhancing LLM Complex Problem-Solving with Hybrid Thinking and Dynamic Workflows

ICLR 2025 · Rejected

Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs

ICML 2025 · Rejected

SIaM: Self-Improving Code-Assisted Mathematical Reasoning of Large Language Models

ICLR 2025 · Withdrawn

Leopard: A Vision Language Model For Text-Rich Multi-Image Tasks

ICLR 2025 · Withdrawn

Controllable Text-to-Speech Synthesis with Masked-Autoencoded Style Representation

ICLR 2025 · Withdrawn

MultiMedia-Agent: A Multimodal Agent for Multimedia Content Generation

ICLR 2025 · Withdrawn

Video-to-Audio generation with Hidden Alignment

ICLR 2025 · Withdrawn

Collaborators (20)