Impact Index

93.25/100

Top 0.4%

Site-wide rank: #239

Papers published: 40

Average rating: 5.4

Average output: 13.3 papers/year

Linjie Li

PhD student @ University of Washington · United States · OpenReview

Research areas

Multimodal Understanding and Generation

Unfolding Spatial Cognition: Evaluating Multimodal Models on Visual Simulations

ICLR 2026 Poster

AdaReasoner: Dynamic Tool Orchestration for Iterative Visual Reasoning

ICLR 2026 Poster

OR-PRM: A Process Reward Model for Algorithmic Problem in Operations Research

ICLR 2026 Poster

EdiVal-Agent: An Object-Centric Framework for Automated, Fine-Grained Evaluation of Multi-Turn Editing

ICLR 2026 Poster

3D-CoS: A New 3D Reconstruction Paradigm Based on VLM Code Synthesis

ICLR 2026 Rejected

STITCH: Simultaneous Thinking and Talking with Chunked Reasoning for Spoken Language Models

ICLR 2026 Poster

ThinkMorph: Emergent Properties in Multimodal Interleaved Chain-of-Thought Reasoning

ICLR 2026 Poster

FullFront: Benchmarking MLLMs Across the Full Front-End Engineering Workflow

ICLR 2026 Rejected

TextAtlas5M: A Large-Scale Dataset for Long and Structured Text Image Generation

ICLR 2026 Rejected

InfoAgent: Advancing Autonomous Information-Seeking Agents

ICLR 2026 Rejected

V-MAGE: A Game Evaluation Framework for Assessing Vision-Centric Capabilities in Multimodal Large Language Models

ICLR 2026 Withdrawn

Unary Feedback as Observation: Incentivizing Self-Reflection in Large Language Models via Multi-Turn RL

ICLR 2026 Rejected

Seeing is Not Reasoning: MVPBench for Graph-based Evaluation of Multi-path Visual Physical CoT

ICLR 2026 Rejected

Shanks: Simultaneous Hearing and Thinking for Spoken Language Models

ICLR 2026 Withdrawn

Are Unified Vision-Language Models Necessary: Generalization Across Understanding and Generation

ICLR 2026 Rejected

Where do Reasoning Models Make a Difference? Follow the Reasoning Leader for Efficient Decoding

ICLR 2026 Withdrawn

Computer-Use Agents as Judges for Automatic GUI Design

ICLR 2026 Withdrawn

VCode: a Multimodal Coding Benchmark with SVG as Symbolic Visual Representation

ICLR 2026 Rejected

The Agent's Marathon: Probing the Limits of Endurance in Long-Horizon Tasks

ICLR 2026 Rejected

Can MLLMs Reason in Multimodality? EMMA: An Enhanced MultiModal ReAsoning Benchmark

MMIE: Massive Multimodal Interleaved Comprehension Benchmark for Large Vision-Language Models

SlowFast-VGen: Slow-Fast Learning for Action-Driven Long Video Generation

ICLR 2025 Spotlight

VAGEN: Reinforcing World Model Reasoning for Multi-Turn VLM Agents

NeurIPS 2025 Poster

SoTA with Less: MCTS-Guided Sample Selection for Data-Efficient Visual Reasoning Self-Improvement

NeurIPS 2025 Spotlight

ViCrit: A Verifiable Reinforcement Learning Proxy Task for Visual Perception in VLMs

NeurIPS 2025 Poster

EditRoom: LLM-parameterized Graph Diffusion for Composable 3D Room Layout Editing

ICLR 2025 Poster

Point-RFT: Improving Multimodal Reasoning with Visually Grounded Reinforcement Finetuning

NeurIPS 2025 Poster

GenXD: Generating Any 3D and 4D Scenes

ICLR 2025 Poster

CertainlyUncertain: A Benchmark and Metric for Multimodal Epistemic and Aleatoric Awareness

ICLR 2025 Poster

MMWorld: Towards Multi-discipline Multi-faceted World Model Evaluation in Videos

ICLR 2025 Poster

OmniContrast: Vision-Language-Interleaved Contrast from Pixels All at once

ICLR 2025 Rejected

Test-Time Preference Optimization: On-the-Fly Alignment via Iterative Textual Feedback

ICML 2025 Poster

Collaborators (20)

Chung-Ching Lin

Alex Jinpeng Wang

PhD advisor · 6 papers