Hanwang Zhang
~Hanwang_Zhang3
29
Total papers
14.5
Average submissions per year
Average rating
Accepted: 20/29
Venue distribution
NeurIPS
13
ICLR
12
ICML
4
Papers (29)
2025 (15 papers)
4
VR-Sampling: Accelerating Flow Generative Model Training with Variance Reduction Sampling
ICLR 2025 Withdrawn
4
Towards Debiased Source-Free Domain Adaptation
ICLR 2025 Withdrawn
4
Ca2-VDM: Efficient Autoregressive Video Diffusion Model with Causal Generation and Cache Sharing
ICML 2025 Poster
4
Learning to Animate Images from A Few Videos to Portray Delicate Human Actions
ICLR 2025 Withdrawn
4
Object Fusion via Diffusion Time-step for Customized Image Editing with Single Example
ICLR 2025 Withdrawn
4
Co-Reinforcement Learning for Unified Multimodal Understanding and Generation
NeurIPS 2025 Spotlight
5
Towards Semantic Equivalence of Tokenization in Multimodal LLM
ICLR 2025 Poster
4
Enhancing CLIP Robustness via Cross-Modality Alignment
NeurIPS 2025 Spotlight
4
3D Question Answering via only 2D Vision-Language Models
ICML 2025 Poster
4
$\mathcal{V}ista\mathcal{DPO}$: Video Hierarchical Spatial-Temporal Direct Preference Optimization for Large Video Models
ICML 2025 Poster
4
A Closer Look at Time Steps is Worthy of Triple Speed-Up for Diffusion Model Training
ICLR 2025 Withdrawn
4
Geo-3DGS: Multi-view Geometry Consistency for 3D Gaussian Splatting and Surface Reconstruction
ICLR 2025 Rejected
5
Vinci: Deep Thinking in Text-to-Image Generation using Unified Model with Reinforcement Learning
NeurIPS 2025 Poster
4
Selftok-Zero: Reinforcement Learning for Visual Generation via Discrete and Autoregressive Visual Tokens
NeurIPS 2025 Poster
5
On Path to Multimodal Generalist: General-Level and General-Bench
ICML 2025 Oral
2024 (14 papers)
5
Robust Fine-tuning of Zero-shot Models via Variance Reduction
NeurIPS 2024 Poster
4
Vitron: A Unified Pixel-level Vision LLM for Understanding, Generating, Segmenting, Editing
NeurIPS 2024 Poster
4
Momentum-accelerated Diffusion Process for Faster Training and Sampling
ICLR 2024 Rejected
4
Decoupled Kullback-Leibler Divergence Loss
ICLR 2024 Withdrawn
5
Decoupled Kullback-Leibler Divergence Loss
NeurIPS 2024 Poster
4
Enhancing Zero-Shot Vision Models by Label-Free Prompt Distribution Learning and Bias Correcting
NeurIPS 2024 Spotlight
4
Exploring Diffusion Time-steps for Unsupervised Representation Learning
ICLR 2024 Poster
4
DisCo: Disentangled Control for Realistic Human Dance Generation
ICLR 2024 Withdrawn
4
Lever LM: Configuring In-Context Sequence to Lever Large Vision Language Models
NeurIPS 2024 Poster
4
Towards Unified Multimodal Editing with Enhanced Knowledge Collaboration
NeurIPS 2024 Spotlight
4
Fine-tuning Multimodal LLMs to Follow Zero-shot Demonstrative Instructions
ICLR 2024 Spotlight
3
MVGamba: Unify 3D Content Generation as State Space Sequence Modeling
NeurIPS 2024 Poster
3
Unified Generative and Discriminative Training for Multi-modal Large Language Models
NeurIPS 2024 Poster
4
Action Imitation in Common Action Space for Customized Action Image Synthesis
NeurIPS 2024 Poster