Dahua Lin
~Dahua_Lin1
57
Total papers
28.5
Submissions per year
Average rating
Accepted: 35/57
Venue distribution
ICLR
34
NeurIPS
16
ICML
4
COLM
3
Papers (57)
2025 (36 papers)
5
Mask-DPO: Generalizable Fine-grained Factuality Alignment of LLMs
ICLR 2025 Poster
3
DualFocus: Integrating Macro and Micro Perspectives in Multi-modal Large Language Models
ICLR 2025 Withdrawn
4
Trustworthy Dataset Proof: Certifying the Authentic Use of Dataset in Training Models for Enhanced Trust
ICLR 2025 Withdrawn
4
Predictive Inverse Dynamics Models are Scalable Learners for Robotic Manipulation
ICLR 2025 Oral
4
VEnhancer: Generative Space-Time Enhancement for Video Generation
ICLR 2025 Rejected
4
Video World Models with Long-term Spatial Memory
NeurIPS 2025 Poster
5
IDArb: Intrinsic Decomposition for Arbitrary Number of Input Views and Illuminations
ICLR 2025 Poster
4
Imagine360: Immersive 360 Video Generation from Perspective Anchor
NeurIPS 2025 Poster
4
Mixing Expert Knowledge: Bring Human Thoughts Back To the Game of Go
NeurIPS 2025 Poster
4
Unearthing Large Scale Domain-Specific Knowledge from Public Corpora
ICLR 2025 Rejected
4
MxMoE: Mixed-precision Quantization for MoE with Accuracy and Performance Co-Design
ICML 2025 Poster
4
Deciphering Cross-Modal Alignment in Large Vision-Language Models with Modality Integration Rate
ICLR 2025 Withdrawn
6
Training Language Models to Critique with Multi-Agent Feedback
ICLR 2025 Rejected
4
Bootstrap3D: Improving Multi-view Diffusion Model with Synthetic Data
ICLR 2025 Withdrawn
4
Tailor3D: Customized 3D Assets Editing and Generation with Dual-Side Images
ICLR 2025 Withdrawn
4
SAM2Long: Enhancing SAM2 for Long Video Segmentation with a Training-Free Memory Tree
ICLR 2025 Withdrawn
3
SongGen: A Single Stage Auto-regressive Transformer for Text-to-Song Generation
ICML 2025 Poster
5
BroadWay: Boost Your Text-to-Video Generation Model in a Training-free Way
ICLR 2025 Withdrawn
5
SongComposer: A Large Language Model for Lyric and Melody Composition in Song Generation
ICLR 2025 Rejected
4
RAR: Retrieving And Ranking Augmented MLLMs for Visual Recognition
ICLR 2025 Rejected
4
Semi-off-Policy Reinforcement Learning for Vision-Language Slow-Thinking Reasoning
NeurIPS 2025 Poster
4
LV-Eval: A Balanced Long-Context Benchmark with 5 Length Levels Up to 256K
ICLR 2025 Rejected
5
Hierarchical Balance Packing: Towards Efficient Supervised Fine-tuning for Long-Context LLM
NeurIPS 2025 Poster
4
MIA-DPO: Multi-Image Augmented Direct Preference Optimization For Large Vision-Language Models
ICLR 2025 Poster
5
HiFlow: Training-free High-Resolution Image Generation with Flow-Aligned Guidance
NeurIPS 2025 Poster
4
LV-Eval: A Balanced Long-Context Benchmark with 5 Length Levels Up to 256K
COLM 2025 Poster
4
3DTrajMaster: Mastering 3D Trajectory for Multi-Entity Motion in Video Generation
ICLR 2025 Poster
3
What are the Essential Factors in Crafting Effective Long Context Multi-Hop Instruction Datasets? Insights and Best Practices
ICLR 2025 Rejected
3
OMNIBAL: TOWARDS FAST INSTRUCT-TUNING FOR VISION-LANGUAGE MODELS VIA OMNIVERSE COMPUTATION BALANCE
ICLR 2025 Withdrawn
5
OmniBal: Towards Fast Instruction-Tuning for Vision-Language Models via Omniverse Computation Balance
ICML 2025 Poster
4
PyramidDrop: Accelerating Your Large Vision-Language Models via Pyramid Visual Redundancy Reduction
ICLR 2025 Withdrawn
4
3DTopia-XL: Scaling High-quality 3D Asset Generation via Primitive Diffusion
ICLR 2025 Withdrawn
4
VideoRoPE: What Makes for Good Video Rotary Position Embedding?
ICML 2025 Oral
4
LOKI: A Comprehensive Synthetic Data Detection Benchmark using Large Multimodal Models
ICLR 2025 Spotlight
4
Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning
COLM 2025 Poster
4
OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text
ICLR 2025 Spotlight
2024 (21 papers)
3
MGF: Mixed Gaussian Flow for Diverse Trajectory Prediction
NeurIPS 2024 Poster
4
InterControl: Zero-shot Human Interaction Generation by Controlling Every Joint
NeurIPS 2024 Poster
3
Scaling Laws of RoPE-based Extrapolation
ICLR 2024 Poster
4
CriticEval: Evaluating Large-scale Language Model as Critic
NeurIPS 2024 Poster
4
ANAH-v2: Scaling Analytical Hallucination Annotation of Large Language Models
NeurIPS 2024 Poster
4
SKVQ: Sliding-window Key and Value Cache Quantization for Large Language Models
COLM 2024 Poster
4
Streaming Long Video Understanding with Large Language Models
NeurIPS 2024 Poster
4
HyperHuman: Hyper-Realistic Human Generation with Latent Structural Diffusion
ICLR 2024 Poster
4
Towards Text-guided 3D Scene Composition
ICLR 2024 Withdrawn
4
Unified Human-Scene Interaction via Prompted Chain-of-Contacts
ICLR 2024 Spotlight
5
Make-it-Real: Unleashing Large Multimodal Model for Painting 3D Objects with Realistic Materials
NeurIPS 2024 Poster
4
Shadow Alignment: The Ease of Subverting Safely-Aligned Language Models
ICLR 2024 Withdrawn
4
SEINE: Short-to-Long Video Diffusion Model for Generative Transition and Prediction
ICLR 2024 Poster
3
Prism: A Framework for Decoupling and Assessing the Capabilities of VLMs
NeurIPS 2024 Poster
4
AnimateDiff: Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning
ICLR 2024 Spotlight
4
AlchemistCoder: Harmonizing and Eliciting Code Capability by Hindsight Tuning on Multi-source Data
NeurIPS 2024 Poster
4
Are We on the Right Way for Evaluating Large Vision-Language Models?
NeurIPS 2024 Poster
4
Convolution on Your 12× Wide Feature: A ConvNet with Nested Design
ICLR 2024 Withdrawn
4
MMBench: Is Your Multi-modal Model an All-around Player?
ICLR 2024 Rejected
4
LaVie: High-Quality Video Generation with Cascaded Latent Diffusion Models
ICLR 2024 Rejected
3
InternLM-XComposer2-4KHD: A Pioneering Large Vision-Language Model Handling Resolutions from 336 Pixels to 4K HD
NeurIPS 2024 Poster