Impact Index

99.19/100

Top 0.1%

Site-wide rank #16

Papers published: 85

Average rating: 5.6

Average annual output: 28.3 papers/year

Dahua Lin

Associate Professor @ The Chinese University of Hong Kong · Hong Kong, China · OpenReview

Research Areas

Large Language Models · Deep Learning · Computer Vision

Achieving Olympia-Level Geometry Large Language Model Agent via Complexity Boosting Reinforcement Learning

ICLR 2026 Poster

RoboInter: A Holistic Intermediate Representation Suite Towards Robotic Manipulation

ICLR 2026 Poster

MMSI-Bench: A Benchmark for Multi-Image Spatial Intelligence

ICLR 2026 Poster

Beyond Fixed: Training-Free Variable-Length Denoising for Diffusion Large Language Models

ICLR 2026 Poster

From Pixels to Words -- Towards Native Vision-Language Primitives at Scale

ICLR 2026 Poster

STAR-Bench: Probing Deep Spatio-Temporal Reasoning as Audio 4D Intelligence

ICLR 2026 Poster

Visual Self-Refine: A Pixel-Guided Paradigm for Accurate Chart Parsing

ICLR 2026 Poster

DiCache: Let Diffusion Model Determine Its Own Cache

ICLR 2026 Poster

SIM-CoT: Supervised Implicit Chain-of-Thought

ICLR 2026 Poster

Advancing Complex Video Object Segmentation via Progressive Concept Construction

ICLR 2026 Poster

Learning Video Generation for Robotic Manipulation with Collaborative Trajectory Control

ICLR 2026 Poster

Scaling Large Vision-Language Model RL Training via Efficient Load Balancing

ICLR 2026 Poster

ChangingGrounding: 3D Visual Grounding in Changing Scenes

ICLR 2026 Rejected

FlexLinearAttention: Compiling a Unified Abstraction into Scalable Kernels for Linear Attention

ICLR 2026 Poster

InterActHuman: Multi-Concept Human Animation with Layout-Aligned Audio Conditions

ICLR 2026 Poster

CODA: Coordinating the Cerebrum and Cerebellum for a Dual-Brain Computer Use Agent with Decoupled Reinforcement Learning

ICLR 2026 Rejected

ScaleCap: Scalable Image Captioning via Dual-Modality Debiasing

ICLR 2026 Poster

CapRL: Stimulating Dense Image Caption Capabilities via Reinforcement Learning

ICLR 2026 Poster

OPV: Outcome-based Process Verifier for Efficient Long Chain-of-Thought Verification

ICLR 2026 Rejected

SEAgent: Self-Evolving Computer Use Agent with Autonomous Learning from Experience

ICLR 2026 Withdrawn

SPARK: Synergistic Policy And Reward Co-Evolving Framework

ICLR 2026 Withdrawn

LONG-HORIZON REASONING AGENT FOR OLYMPIAD-LEVEL MATHEMATICAL PROBLEM SOLVING

ICLR 2026 Rejected

MCPVerse: An Expansive, Real-World Benchmark for Agentic Tool Use

ICLR 2026 Rejected

The Imitation Game: Turing Machine Imitator is Length Generalizable Reasoner

ICLR 2026 Poster

BoostStep: Boosting Mathematical Capability of Large Language Models via Step-aligned In Context Learning

ICLR 2026 Rejected

BridgEAD: A Vision-Language Framework for Action Modeling in End-to-End Autonomous Driving

ICLR 2026 Rejected

Phased DMD: Few-step Distribution Matching Distillation via Score Matching within Subintervals

ICLR 2026 Withdrawn

Windtalkers: Watermarking Open-Source LLMs with Ciphered-Instruction

ICLR 2026 Withdrawn

VideoRoPE: What Makes for Good Video Rotary Position Embedding?

Video World Models with Long-term Spatial Memory

NeurIPS 2025 Poster

HiFlow: Training-free High-Resolution Image Generation with Flow-Aligned Guidance

NeurIPS 2025 Poster

LOKI: A Comprehensive Synthetic Data Detection Benchmark using Large Multimodal Models

ICLR 2025 Spotlight

Predictive Inverse Dynamics Models are Scalable Learners for Robotic Manipulation

OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text

ICLR 2025 Spotlight

Imagine360: Immersive 360 Video Generation from Perspective Anchor

NeurIPS 2025 Poster

Semi-off-Policy Reinforcement Learning for Vision-Language Slow-Thinking Reasoning

NeurIPS 2025 Poster

3DTrajMaster: Mastering 3D Trajectory for Multi-Entity Motion in Video Generation

ICLR 2025 Poster

IDArb: Intrinsic Decomposition for Arbitrary Number of Input Views and Illuminations

ICLR 2025 Poster

Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning

COLM 2025 Poster

Mask-DPO: Generalizable Fine-grained Factuality Alignment of LLMs

ICLR 2025 Poster

LV-Eval: A Balanced Long-Context Benchmark with 5 Length Levels Up to 256K

COLM 2025 Poster

Hierachical Balance Packing: Towards Efficient Supervised Fine-tuning for Long-Context LLM

NeurIPS 2025 Poster

Training Language Models to Critique with Multi-Agent Feedback

ICLR 2025 Rejected

Mixing Expert Knowledge: Bring Human Thoughts Back To the Game of Go

NeurIPS 2025 Poster

OmniBal: Towards Fast Instruction-Tuning for Vision-Language Models via Omniverse Computation Balance

ICML 2025 Poster

What are the Essential Factors in Crafting Effective Long Context Multi-Hop Instruction Datasets? Insights and Best Practices

ICLR 2025 Rejected

RAR: Retrieving And Ranking Augmented MLLMs for Visual Recognition

ICLR 2025 Rejected

SongGen: A Single Stage Auto-regressive Transformer for Text-to-Song Generation

ICML 2025 Poster

MIA-DPO: Multi-Image Augmented Direct Preference Optimization For Large Vision-Language Models

ICLR 2025 Poster

LV-Eval: A Balanced Long-Context Benchmark with 5 Length Levels Up to 256K

ICLR 2025 Rejected

Deciphering Cross-Modal Alignment in Large Vision-Language Models with Modality Integration Rate

ICLR 2025 Withdrawn

SAM2Long: Enhancing SAM2 for Long Video Segmentation with a Training-Free Memory Tree

ICLR 2025 Withdrawn

VEnhancer: Generative Space-Time Enhancement for Video Generation

ICLR 2025 Rejected

Unearthing Large Scale Domain-Specific Knowledge from Public Corpora

ICLR 2025 Rejected

OMNIBAL: TOWARDS FAST INSTRUCT-TUNING FOR VISION-LANGUAGE MODELS VIA OMNIVERSE COMPUTATION BALANCE

ICLR 2025 Withdrawn

MxMoE: Mixed-precision Quantization for MoE with Accuracy and Performance Co-Design

ICML 2025 Poster

Tailor3D: Customized 3D Assets Editing and Generation with Dual-Side Images

ICLR 2025 Withdrawn

3DTopia-XL: Scaling High-quality 3D Asset Generation via Primitive Diffusion

ICLR 2025 Withdrawn

DualFocus: Integrating Macro and Micro Perspectives in Multi-modal Large Language Models

ICLR 2025 Withdrawn

BroadWay: Boost Your Text-to-Video Generation Model in a Training-free Way

ICLR 2025 Withdrawn

SongComposer: A Large Language Model for Lyric and Melody Composition in Song Generation

ICLR 2025 Rejected

Trustworthy Dataset Proof: Certifying the Authentic Use of Dataset in Training Models for Enhanced Trust

ICLR 2025 Withdrawn

Bootstrap3D: Improving Multi-view Diffusion Model with Synthetic Data

ICLR 2025 Withdrawn

PyramidDrop: Accelerating Your Large Vision-Language Models via Pyramid Visual Redundancy Reduction

ICLR 2025 Withdrawn

Collaborators (20)