Impact index

91.11/100

Top 0.5%

Site-wide rank #322

Papers published: 64

Average rating: 5.4

Annual output: 21.3 papers/year

Zhiyuan Liu

Full Professor @ Tsinghua University · China · OpenReview

Research areas

natural language processing · large language models · cross-modal learning

5.5

How Far Can Unsupervised RLVR Scale LLM Training?

ICLR 2026 Poster

5.3

InfLLM-V2: Dense-Sparse Switchable Attention for Seamless Short-to-Long Adaptation

ICLR 2026 Poster

Corresponding author

5.2

Hierarchical Semantic-Acoustic Modeling via Semi-Discrete Residual Representations for Expressive End-to-End Speech Synthesis

ICLR 2026 Poster

Corresponding author

5.0

From f(x) and g(x) to f(g(x)): LLMs Learn New Skills in RL by Composing Old Ones

ICLR 2026 Poster

4.5

LLaVA-UHD v3: Progressive Visual Compression for Efficient Native-Resolution Encoding in MLLMs

ICLR 2026 Withdrawn

4.0

SurveyBench: How Well Can LLM(-Agents) Write Academic Surveys?

ICLR 2026 Rejected

4.0

KARE-RAG: Knowledge-Aware Refinement and Enhancement for RAG

ICLR 2026 Rejected

3.5

AUTOTRITON: Automatic Triton Programming with Reinforcement Learning in LLMs

ICLR 2026 Rejected

3.5

The Entropy Mechanism of Reinforcement Learning for Reasoning Language Models

ICLR 2026 Withdrawn

3.5

RLPR: Extrapolating RLVR to General Domains without Verifiers

ICLR 2026 Rejected

3.3

StateX: Enhancing RNN Recall via Post-training State Expansion

ICLR 2026 Withdrawn

3.3

Quicksviewer: An LMM for Efficient Video Understanding via Reinforced Compression of Video Cubes

ICLR 2026 Rejected

3.0

Diversity-aware Training for Test-time Scaling

ICLR 2026 Rejected

2.5

Reflective Reinforcement Tool Learning

ICLR 2026 Withdrawn

7.8

Sparsing Law: Towards Large Language Models with Greater Activation Sparsity

ICML 2025 Poster

7.3

A*-Thought: Efficient Reasoning via Bidirectional Compression for Low-Resource Settings

NeurIPS 2025 Poster

7.2

Internet of Agents: Weaving a Web of Heterogeneous Agents for Collaborative Intelligence

ICLR 2025 Spotlight

7.0

The Overthinker's DIET: Cutting Token Calories with DIfficulty-AwarE Training

NeurIPS 2025 Poster

7.0

BlockFFN: Towards End-Side Acceleration-Friendly Mixture-of-Experts with Chunk-Level Activation Sparsity

COLM 2025 Poster

7.0

Scaling Large Language Model-based Multi-Agent Collaboration

ICLR 2025 Poster

7.0

AIR: A Systematic Analysis of Annotations, Instructions, and Response Pairs in Preference Dataset

COLM 2025 Poster

6.8

ParamMute: Suppressing Knowledge-Critical FFNs for Faithful Retrieval-Augmented Generation

NeurIPS 2025 Poster

6.5

Advancing LLM Reasoning Generalists with Preference Trees

ICLR 2025 Poster

6.4

Learning to Focus: Causal Attention Distillation via Gradient‐Guided Token Pruning

NeurIPS 2025 Poster

6.4

Multi-Agent Collaboration via Evolving Orchestration

NeurIPS 2025 Poster

6.3

WorkflowLLM: Enhancing Workflow Orchestration Capability of Large Language Models

ICLR 2025 Poster

6.0

A Multi-Power Law for Loss Curve Prediction Across Learning Rate Schedules

ICLR 2025 Poster

6.0

Stuffed Mamba: Oversized States Lead to the Inability to Forget

COLM 2025 Poster

6.0

RAG-DDR: Optimizing Retrieval-Augmented Generation Using Differentiable Data Rewards

ICLR 2025 Poster

6.0

VisRAG: Vision-based Retrieval-augmented Generation on Multi-modality Documents

ICLR 2025 Poster

5.8

Locret: Enhancing Eviction in Long-Context LLM Inference with Trained Retaining Heads

ICLR 2025 Rejected

Corresponding author

5.7

Rational Decision-Making Agent with Learning Internal Utility Judgment

ICLR 2025 Poster

5.5

Optima: Optimizing Effectiveness and Efficiency for LLM-Based Multi-Agent System

ICLR 2025 Rejected

5.5

Free Process Rewards without Process Labels

ICML 2025 Poster

5.5

Proactive Agent: Shifting LLM Agents from Reactive Responses to Active Assistance

ICLR 2025 Poster

5.3

Sparsing Law: Towards Large Language Models with Greater Activation Sparsity

ICLR 2025 Rejected

4.8

AgentMonitor: A Plug-and-Play Framework for Predictive and Secure Multi-Agent Systems

ICLR 2025 Withdrawn

4.6

Improving Zero-Shot Generalization of Instruction Tuning by Data Arrangement

ICLR 2025 Withdrawn

3.6

Stuffed Mamba: State Collapse and State Capacity of RNN-Based Long-Context Modeling

Collaborators (20)

Zhiyuan Liu

KG-Infused RAG: Augmenting Corpus-Based RAG with External Knowledge Graphs

DeepThinkVLA: Enhancing Reasoning Capability of Vision-Language-Action Models

Test-Time Exploration in Unknown Environments

CPMöbius: Iterative Coach–Player Reasoning for Data-Free Reinforcement Learning

Process Reinforcement through Implicit Rewards

Query Routing over Multimodal Knowledge Bases for Retrieval-Augmented Reasoning

Evidence-Guided Multi-Image Reasoning in Visual Retrieval-Augmented Generation
