Hengshuang Zhao
~Hengshuang_Zhao2
37
Total papers
18.5
Average submissions per year
Average rating
Accepted: 23/37
Venue distribution
ICLR
16
NeurIPS
15
ICML
6
Papers (37)
2025 (23 papers)
4
LiteReality: Graphic-Ready 3D Scene Reconstruction from RGB-D Scans
NeurIPS 2025 Poster
4
LARM: Large Auto-Regressive Model for Long-Horizon Embodied Intelligence
ICLR 2025 Withdrawn
4
Orient Anything: Learning Robust Object Orientation Estimation from Rendering 3D Models
ICML 2025 Poster
4
LARM: Large Auto-Regressive Model for Long-Horizon Embodied Intelligence
ICML 2025 Poster
4
BOOD: Boundary-based Out-Of-Distribution Data Generation
ICLR 2025 Rejected
3
Effective LLM Knowledge Learning Requires Rethinking Generalization
ICLR 2025 Rejected
4
TGDPO: Harnessing Token-Level Reward Guidance for Enhancing Direct Preference Optimization
ICML 2025 Poster
4
BOOD: Boundary-based Out-Of-Distribution Data Generation
ICML 2025 Poster
5
PlayerOne: Egocentric World Simulator
NeurIPS 2025 Oral
4
HiRes-LLaVA: Restoring Fragmentation Input in High-Resolution Large Vision-Language Models
ICLR 2025 Withdrawn
4
Seg-VAR: Image Segmentation with Visual Autoregressive Modeling
NeurIPS 2025 Poster
4
HaploVL: A Single-Transformer Baseline for Multi-Modal Understanding
ICML 2025 Poster
4
Orient Anything V2: Unifying Orientation and Rotation Understanding
NeurIPS 2025 Spotlight
4
Concerto: Joint 2D-3D Self-Supervised Learning Emerges Spatial Representations
NeurIPS 2025 Poster
5
VisionThink: Smart and Efficient Vision Language Model via Reinforcement Learning
NeurIPS 2025 Poster
4
MiCo: Multi-image Contrast for Reinforcement Visual Reasoning
NeurIPS 2025 Poster
4
VIRT: Vision Instructed Transformer for Robotic Manipulation
ICLR 2025 Withdrawn
4
VIP: Vision Instructed Pre-training for Robotic Manipulation
ICML 2025 Poster
4
ROSE: Remove Objects with Side Effects in Videos
NeurIPS 2025 Poster
4
OmniBind: Large-scale Omni Multimodal Representation via Binding Spaces
ICLR 2025 Poster
4
Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance
NeurIPS 2025 Poster
4
Tailor3D: Customized 3D Assets Editing and Generation with Dual-Side Images
ICLR 2025 Withdrawn
4
EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions
ICLR 2025 Withdrawn
2024 (14 papers)
4
InsightMapper: A closer look at inner-instance information for vectorized High-Definition Mapping
ICLR 2024 Withdrawn
3
One for All: Multi-Domain Joint Training for Point Cloud Based 3D Object Detection
NeurIPS 2024 Poster
4
Influencer Backdoor Attack on Semantic Segmentation
ICLR 2024 Spotlight
4
OpenIns3D: Snap and Lookup for 3D Open-vocabulary Instance Segmentation
ICLR 2024 Rejected
4
CT++: Complementary Co-Training for Semi-Supervised Semantic Segmentation
ICLR 2024 Withdrawn
4
LION: Linear Group RNN for 3D Object Detection in Point Clouds
NeurIPS 2024 Poster
4
LiT: Unifying LiDAR "Languages" with LiDAR Translator
NeurIPS 2024 Poster
4
SyncVIS: Synchronized Video Instance Segmentation
NeurIPS 2024 Poster
3
Grouplane: End-to-End 3D Lane Detection with Channel-Wise Grouping
ICLR 2024 Rejected
4
Depth Anything V2
NeurIPS 2024 Poster
4
Unitention: Attend a sample to the dataset
ICLR 2024 Rejected
4
UniPAD: A Universal Pre-training Paradigm for Autonomous Driving
ICLR 2024 Withdrawn
4
DriveGPT4: Interpretable End-to-end Autonomous Driving via Large Language Model
ICLR 2024 Rejected
4
Zero-shot Image Editing with Reference Imitation
NeurIPS 2024 Poster