Impact Index

75.09/100

Top 1.9%

Site-wide rank: #1,205

Papers published: 21

Average rating: 5.7

Average annual output: 7.0 papers/year

Xu Han

Assistant Professor @ Tsinghua University · China · OpenReview

Research Interests

natural language processing · pre-trained language model · large language model

5.3

InfLLM-V2: Dense-Sparse Switchable Attention for Seamless Short-to-Long Adaptation

ICLR 2026 Poster

4.5

A*-Thought: Efficient Reasoning via Bidirectional Compression for Low-Resource Settings

NeurIPS 2025 Poster

Third author

7.0

BlockFFN: Towards End-Side Acceleration-Friendly Mixture-of-Experts with Chunk-Level Activation Sparsity

COLM 2025 Poster

Third author

6.0

Stuffed Mamba: Oversized States Lead to the Inability to Forget

COLM 2025 Poster

6.0

VisRAG: Vision-based Retrieval-augmented Generation on Multi-modality Documents

ICLR 2025 Poster

5.8

Locret: Enhancing Eviction in Long-Context LLM Inference with Trained Retaining Heads

ICLR 2025 Rejected

Third author

5.3

Sparsing Law: Towards Large Language Models with Greater Activation Sparsity

ICLR 2025 Rejected

Third author

3.6

Stuffed Mamba: State Collapse and State Capacity of RNN-Based Long-Context Modeling

Collaborators (20)

Xu Han

InfLLM-V2: Dense-Sparse Switchable Attention for Seamless Short-to-Long Adaptation

Process Reinforcement through Implicit Rewards

AUTOTRITON: Automatic Triton Programming with Reinforcement Learning in LLMs

StateX: Enhancing RNN Recall via Post-training State Expansion

Sparsing Law: Towards Large Language Models with Greater Activation Sparsity

A*-Thought: Efficient Reasoning via Bidirectional Compression for Low-Resource Settings

BlockFFN: Towards End-Side Acceleration-Friendly Mixture-of-Experts with Chunk-Level Activation Sparsity

Stuffed Mamba: Oversized States Lead to the Inability to Forget

VisRAG: Vision-based Retrieval-augmented Generation on Multi-modality Documents

Locret: Enhancing Eviction in Long-Context LLM Inference with Trained Retaining Heads

Sparsing Law: Towards Large Language Models with Greater Activation Sparsity

Stuffed Mamba: State Collapse and State Capacity of RNN-Based Long-Context Modeling