Shanghang Zhang
~Shanghang_Zhang4
39
Total papers
19.5
Average submissions per year
Average rating
Accepted: 21/39
Venue distribution
ICLR
22
NeurIPS
12
ICML
5
Papers (39)
2025: 28 papers
5
HyperAdapter: Generating Adapters for Pre-Trained Model-Based Continual Learning
ICLR 2025 Withdrawn
4
Co$^{\mathbf{3}}$Gesture: Towards Coherent Concurrent Co-speech 3D Gesture Generation with Interactive Diffusion
ICLR 2025 Spotlight
4
Frequency-Decoupled Cross-Modal Knowledge Distillation
ICLR 2025 Withdrawn
4
Uni-Map: Unified Camera-LiDAR Perception for Robust HD Map Construction
ICLR 2025 Rejected
5
Reason-RFT: Reinforcement Fine-Tuning for Visual Reasoning of Vision Language Models
NeurIPS 2025 Poster
4
OmniArch: Building Foundation Model for Scientific Computing
ICML 2025 Poster
4
Beyond Attention or Similarity: Maximizing Conditional Diversity for Token Pruning in MLLMs
NeurIPS 2025 Poster
5
URDF-Anything: Constructing Articulated Objects with 3D Multimodal Language Model
NeurIPS 2025 Spotlight
3
Draw-and-Understand: Leveraging Visual Prompts to Enable MLLMs to Comprehend What You Want
ICLR 2025 Poster
4
Discovering Long-Term Effects on Parameter Efficient Fine-tuning
ICLR 2025 Rejected
4
Orochi: Versatile Biomedical Image Processor
NeurIPS 2025 Spotlight
5
CrayonRobo: Toward Generic Robot Manipulation via Crayon Visual Prompting
ICLR 2025 Withdrawn
4
PINNsAgent: Automated PDE Surrogation with Large Language Models
ICML 2025 Poster
4
MAVIS: Mathematical Visual Instruction Tuning with an Automatic Data Engine
ICLR 2025 Poster
3
SAN: Hypothesizing Long-Term Synaptic Development and Neural Engram Mechanism in Scalable Model's Parameter-Efficient Fine-Tuning
ICML 2025 Poster
4
EmpathyRobot: A Dataset and Benchmark for Empathetic Task Planning of Robotic Agent
ICLR 2025 Rejected
3
Self-Corrected Multimodal Large Language Model for Robot Manipulation and Reflection
ICLR 2025 Withdrawn
4
Empowering World Models with Reflection for Embodied Video Prediction
ICML 2025 Poster
4
EVA: An Embodied World Model for Future Video Anticipation
ICLR 2025 Rejected
3
$\textbf{CoCoGesture}$: Towards Coherent Co-speech 3D Gesture Generation in the Wild
ICLR 2025 Withdrawn
4
Fast-in-Slow: A Dual-System VLA Model Unifying Fast Manipulation within Slow Reasoning
NeurIPS 2025 Poster
5
SparseVLM: Visual Token Sparsification for Efficient Vision-Language Model Inference
ICML 2025 Poster
5
SparseVLM: Visual Token Sparsification for Efficient Vision Language Models Inference
ICLR 2025 Rejected
4
RoboRefer: Towards Spatial Referring with Reasoning in Vision-Language Models for Robotics
NeurIPS 2025 Poster
4
AC-DiT: Adaptive Coordination Diffusion Transformer for Mobile Manipulation
NeurIPS 2025 Poster
4
ViML: A Video, Music, Language Unified Dataset for Understanding and Generation
ICLR 2025 Withdrawn
4
SEEA-R1: Tree-Structured Reinforcement Fine-Tuning for Self-Evolving Embodied Agents
NeurIPS 2025 Poster
4
HybridVLA: Collaborative Autoregression and Diffusion in a Unified Vision-Language-Action Model
NeurIPS 2025 Rejected
2024: 11 papers
4
HUB: Enhancing Learned Optimizers via Hybrid Update-based Strategy
ICLR 2024 Rejected
5
ChatEval: Towards Better LLM-based Evaluators through Multi-Agent Debate
ICLR 2024 Poster
5
PDE-Diffusion: Physic guided diffusion model for solving partial derivative equations
ICLR 2024 Rejected
3
PromptCoT: Align Prompt Distribution via Adapted Chain-of-Thought
ICLR 2024 Withdrawn
3
Era3D: High-Resolution Multiview Diffusion using Efficient Row-wise Attention
NeurIPS 2024 Poster
4
ViDA: Homeostatic Visual Domain Adapter for Continual Test Time Adaptation
ICLR 2024 Poster
3
Unveiling the Tapestry of Consistency in Large Vision-Language Models
NeurIPS 2024 Poster
4
Split-Ensemble: Efficient OOD-aware Ensemble via Task and Model Splitting
ICLR 2024 Rejected
5
RoboMamba: Efficient Vision-Language-Action Model for Robotic Reasoning and Manipulation
NeurIPS 2024 Poster
3
Fisher-aware Quantization for DETR Detectors with Critical-category Objectives
ICLR 2024 Rejected
3
A Dataset and Benchmark for Copyright Protection from Text-to-Image Diffusion Models
ICLR 2024 Withdrawn