Impact Index

96.31/100

Top 0.2%

Site-wide rank: #124

Papers published: 46

Average rating: 5.5

Average annual output: 15.3 papers/year

Limin Wang

Full Professor @ Nanjing University · China · OpenReview

Research Areas

tracking · deep learning · computer vision · action recognition · video analysis

PixNerd: Pixel Neural Field Diffusion

ICLR 2026 · Poster

CaReBench: A Fine-grained Benchmark for Video Captioning and Retrieval

ICLR 2026 · Poster

VideoChat-Flash: Hierarchical Compression for Long-Context Video Modeling

ICLR 2026 · Poster

ExpVid: A Benchmark for Experiment Video Understanding & Reasoning

ICLR 2026 · Poster

TokenSculpt: Pruning with Min-Max Spatio-Temporal Duplication for Video Grounding

ICLR 2026 · Rejected

Balancing the Experts: Unlocking LoRA-MoE for GRPO via Mechanism-Aware Rewards

ICLR 2026 · Poster

FreeRet: MLLMs as Training-Free Retrievers

ICLR 2026 · Rejected

RIVER: A Real-Time Interaction Benchmark for Video LLMs

ICLR 2026 · Poster

Learning Goal-Oriented Language-Guided Navigation with Self-Improving Demonstrations at Scale

ICLR 2026 · Rejected

UniFlow: A Unified Pixel Flow Tokenizer for Visual Understanding and Generation

ICLR 2026 · Poster

Benchmarking Visual Knowledge in Multimodal Large Language Models

ICLR 2026 · Withdrawn

Arbitrary Generative Video Interpolation

ICLR 2026 · Poster

Towards Pixel-level VLM Perception via Simple Points Prediction

ICLR 2026 · Rejected

VideoCap-R1: Enhancing MLLMs for Video Captioning via Structured Thinking

ICLR 2026 · Withdrawn

History-Aware Transformation of ReID Features for Multiple Object Tracking

ICLR 2026 · Rejected

Efficient Low-rank and Sparse Approximation and Adaptation for Large Language Models

ICLR 2026 · Withdrawn

CoMo: Learning Continuous Latent Motion from Internet Videos for Scalable Robot Learning

ICLR 2026 · Withdrawn

StreamForest: Efficient Online Video Understanding with Persistent Event Memory

NeurIPS 2025 · Spotlight

OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text

ICLR 2025 · Spotlight

VideoChat-R1.5: Visual Test-Time Scaling to Reinforce Multimodal Reasoning by Iterative Perception

NeurIPS 2025 · Poster

Eagle 2.5: Boosting Long-Context Post-Training for Frontier Vision-Language Models

NeurIPS 2025 · Poster

SPA: 3D Spatial-Awareness Enables Effective Embodied Representation

ICLR 2025 · Poster

Bootstrapping Language-Guided Navigation Learning with Self-Refining Data Flywheel

ICLR 2025 · Poster

MotionRAG: Motion Retrieval-Augmented Image-to-Video Generation

NeurIPS 2025 · Poster

LongVPO: From Anchored Cues to Self-Reasoning for Long-Form Video Preference Optimization

NeurIPS 2025 · Poster

Differentiable Solver Search for fast diffusion sampling

ICLR 2025 · Rejected

Differentiable Solver Search for Fast Diffusion Sampling

ICML 2025 · Poster

CG-Bench: Clue-grounded Question Answering Benchmark for Long Video Understanding

ICLR 2025 · Poster

Modeling Fine-Grained Hand-Object Dynamics for Egocentric Video Representation Learning

ICLR 2025 · Poster

Stochastic Layer-Wise Shuffle for Improving Vision Mamba Training

ICML 2025 · Poster

TimeSuite: Improving MLLMs for Long Video Understanding via Grounded Tuning

ICLR 2025 · Poster

RotPruner: Large Language Model Pruning in Rotated Space

ICLR 2025 · Rejected

Efficient Test-Time Prompt Tuning for Vision-Language Models

ICLR 2025 · Rejected

DMM: Building a Versatile Image Generation Model via Distillation-Based Model Merging

ICLR 2025 · Rejected

TrackMamba: Mamba-Transformer Tracking

ICLR 2025 · Withdrawn

VideoEval: Comprehensive Benchmark Suite for Low-Cost Evaluation of Video Foundation Model

ICLR 2025 · Withdrawn

Tra-MoE: Scaling Trajectory Prediction Models for Adaptive Policy Conditioning

ICLR 2025 · Withdrawn

Stochastic Layer-Wise Shuffle: A Good Practice to Improve Vision Mamba Training

ICLR 2025 · Withdrawn

Collaborators (20)

Collaborator · 11 papers