PaperHub

Dahua Lin

~Dahua_Lin1

Total papers: 57
Average submissions per year: 28.5
Average rating: 5.9
Accepted: 35/57
Venue distribution
ICLR: 34
NeurIPS: 16
ICML: 4
COLM: 3

Published papers (57)

2025 (36 papers)

6.4
5

Mask-DPO: Generalizable Fine-grained Factuality Alignment of LLMs

ICLR 2025 Poster
4.7
3

DualFocus: Integrating Macro and Micro Perspectives in Multi-modal Large Language Models

ICLR 2025 Withdrawn
4.5
4

Trustworthy Dataset Proof: Certifying the Authentic Use of Dataset in Training Models for Enhanced Trust

ICLR 2025 Withdrawn
7.5
4

Predictive Inverse Dynamics Models are Scalable Learners for Robotic Manipulation

ICLR 2025 Oral
5.0
4

VEnhancer: Generative Space-Time Enhancement for Video Generation

ICLR 2025 Rejected
8.2
4

Video World Models with Long-term Spatial Memory

NeurIPS 2025 Poster
6.6
5

IDArb: Intrinsic Decomposition for Arbitrary Number of Input Views and Illuminations

ICLR 2025 Poster
6.8
4

Imagine360: Immersive 360 Video Generation from Perspective Anchor

NeurIPS 2025 Poster
6.0
4

Mixing Expert Knowledge: Bring Human Thoughts Back To the Game of Go

NeurIPS 2025 Poster
5.0
4

Unearthing Large Scale Domain-Specific Knowledge from Public Corpora

ICLR 2025 Rejected
4.9
4

MxMoE: Mixed-precision Quantization for MoE with Accuracy and Performance Co-Design

ICML 2025 Poster
5.3
4

Deciphering Cross-Modal Alignment in Large Vision-Language Models with Modality Integration Rate

ICLR 2025 Withdrawn
6.0
6

Training Language Models to Critique with Multi-Agent Feedback

ICLR 2025 Rejected
3.8
4

Bootstrap3D: Improving Multi-view Diffusion Model with Synthetic Data

ICLR 2025 Withdrawn
4.8
4

Tailor3D: Customized 3D Assets Editing and Generation with Dual-Side Images

ICLR 2025 Withdrawn
5.3
4

SAM2Long: Enhancing SAM2 for Long Video Segmentation with a Training-Free Memory Tree

ICLR 2025 Withdrawn
5.5
3

SongGen: A Single Stage Auto-regressive Transformer for Text-to-Song Generation

ICML 2025 Poster
4.6
5

BroadWay: Boost Your Text-to-Video Generation Model in a Training-free Way

ICLR 2025 Withdrawn
4.6
5

SongComposer: A Large Language Model for Lyric and Melody Composition in Song Generation

ICLR 2025 Rejected
5.5
4

RAR: Retrieving And Ranking Augmented MLLMs for Visual Recognition

ICLR 2025 Rejected
6.8
4

Semi-off-Policy Reinforcement Learning for Vision-Language Slow-Thinking Reasoning

NeurIPS 2025 Poster
5.5
4

LV-Eval: A Balanced Long-Context Benchmark with 5 Length Levels Up to 256K

ICLR 2025 Rejected
6.0
5

Hierarchical Balance Packing: Towards Efficient Supervised Fine-tuning for Long-Context LLM

NeurIPS 2025 Poster
5.5
4

MIA-DPO: Multi-Image Augmented Direct Preference Optimization For Large Vision-Language Models

ICLR 2025 Poster
8.2
5

HiFlow: Training-free High-Resolution Image Generation with Flow-Aligned Guidance

NeurIPS 2025 Poster
6.3
4

LV-Eval: A Balanced Long-Context Benchmark with 5 Length Levels Up to 256K

COLM 2025 Poster
6.8
4

3DTrajMaster: Mastering 3D Trajectory for Multi-Entity Motion in Video Generation

ICLR 2025 Poster
5.7
3

What are the Essential Factors in Crafting Effective Long Context Multi-Hop Instruction Datasets? Insights and Best Practices

ICLR 2025 Rejected
5.0
3

OmniBal: Towards Fast Instruct-Tuning for Vision-Language Models via Omniverse Computation Balance

ICLR 2025 Withdrawn
6.0
5

OmniBal: Towards Fast Instruction-Tuning for Vision-Language Models via Omniverse Computation Balance

ICML 2025 Poster
3.0
4

PyramidDrop: Accelerating Your Large Vision-Language Models via Pyramid Visual Redundancy Reduction

ICLR 2025 Withdrawn
4.8
4

3DTopia-XL: Scaling High-quality 3D Asset Generation via Primitive Diffusion

ICLR 2025 Withdrawn
8.3
4

VideoRoPE: What Makes for Good Video Rotary Position Embedding?

ICML 2025 Oral
8.0
4

LOKI: A Comprehensive Synthetic Data Detection Benchmark using Large Multimodal Models

ICLR 2025 Spotlight
6.5
4

Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning

COLM 2025 Poster
7.5
4

OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text

ICLR 2025 Spotlight

2024 (21 papers)

5.3
3

MGF: Mixed Gaussian Flow for Diverse Trajectory Prediction

NeurIPS 2024 Poster
5.0
4

InterControl: Zero-shot Human Interaction Generation by Controlling Every Joint

NeurIPS 2024 Poster
8.0
3

Scaling Laws of RoPE-based Extrapolation

ICLR 2024 Poster
7.3
4

CriticEval: Evaluating Large-scale Language Model as Critic

NeurIPS 2024 Poster
6.3
4

ANAH-v2: Scaling Analytical Hallucination Annotation of Large Language Models

NeurIPS 2024 Poster
7.0
4

SKVQ: Sliding-window Key and Value Cache Quantization for Large Language Models

COLM 2024 Poster
5.8
4

Streaming Long Video Understanding with Large Language Models

NeurIPS 2024 Poster
7.5
4

HyperHuman: Hyper-Realistic Human Generation with Latent Structural Diffusion

ICLR 2024 Poster
5.0
4

Towards Text-guided 3D Scene Composition

ICLR 2024 Withdrawn
7.3
4

Unified Human-Scene Interaction via Prompted Chain-of-Contacts

ICLR 2024 Spotlight
5.4
5

Make-it-Real: Unleashing Large Multimodal Model for Painting 3D Objects with Realistic Materials

NeurIPS 2024 Poster
4.8
4

Shadow Alignment: The Ease of Subverting Safely-Aligned Language Models

ICLR 2024 Withdrawn
5.5
4

SEINE: Short-to-Long Video Diffusion Model for Generative Transition and Prediction

ICLR 2024 Poster
6.7
3

Prism: A Framework for Decoupling and Assessing the Capabilities of VLMs

NeurIPS 2024 Poster
7.0
4

AnimateDiff: Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning

ICLR 2024 Spotlight
6.0
4

AlchemistCoder: Harmonizing and Eliciting Code Capability by Hindsight Tuning on Multi-source Data

NeurIPS 2024 Poster
6.5
4

Are We on the Right Way for Evaluating Large Vision-Language Models?

NeurIPS 2024 Poster
4.3
4

Convolution on Your 12× Wide Feature: A ConvNet with Nested Design

ICLR 2024 Withdrawn
5.3
4

MMBench: Is Your Multi-modal Model an All-around Player?

ICLR 2024 Rejected
5.5
4

LaVie: High-Quality Video Generation with Cascaded Latent Diffusion Models

ICLR 2024 Rejected
5.3
3

InternLM-XComposer2-4KHD: A Pioneering Large Vision-Language Model Handling Resolutions from 336 Pixels to 4K HD

NeurIPS 2024 Poster