PaperHub

Shanghang Zhang

~Shanghang_Zhang4

Total papers: 39
Submissions per year: 19.5
Average rating: 5.6
Accepted: 21/39
Venue distribution: ICLR 22, NeurIPS 12, ICML 5

Papers (39)

2025 (28 papers)

4.8
5

HyperAdapter: Generating Adapters for Pre-Trained Model-Based Continual Learning

ICLR 2025 Withdrawn
7.5
4

Co$^{\mathbf{3}}$Gesture: Towards Coherent Concurrent Co-speech 3D Gesture Generation with Interactive Diffusion

ICLR 2025 Spotlight
4.5
4

Frequency-Decoupled Cross-Modal Knowledge Distillation

ICLR 2025 Withdrawn
6.3
4

Uni-Map: Unified Camera-LiDAR Perception for Robust HD Map Construction

ICLR 2025 Rejected
6.4
5

Reason-RFT: Reinforcement Fine-Tuning for Visual Reasoning of Vision Language Models

NeurIPS 2025 Poster
5.5
4

OmniArch: Building Foundation Model for Scientific Computing

ICML 2025 Poster
7.8
4

Beyond Attention or Similarity: Maximizing Conditional Diversity for Token Pruning in MLLMs

NeurIPS 2025 Poster
7.1
5

URDF-Anything: Constructing Articulated Objects with 3D Multimodal Language Model

NeurIPS 2025 Spotlight
5.7
3

Draw-and-Understand: Leveraging Visual Prompts to Enable MLLMs to Comprehend What You Want

ICLR 2025 Poster
4.8
4

Discovering Long-Term Effects on Parameter Efficient Fine-tuning

ICLR 2025 Rejected
7.3
4

Orochi: Versatile Biomedical Image Processor

NeurIPS 2025 Spotlight
5.2
5

CrayonRobo: Toward Generic Robot Manipulation via Crayon Visual Prompting

ICLR 2025 Withdrawn
3.8
4

PINNsAgent: Automated PDE Surrogation with Large Language Models

ICML 2025 Poster
6.5
4

MAVIS: Mathematical Visual Instruction Tuning with an Automatic Data Engine

ICLR 2025 Poster
5.5
3

SAN: Hypothesizing Long-Term Synaptic Development and Neural Engram Mechanism in Scalable Model's Parameter-Efficient Fine-Tuning

ICML 2025 Poster
6.0
4

EmpathyRobot: A Dataset and Benchmark for Empathetic Task Planning of Robotic Agent

ICLR 2025 Rejected
5.0
3

Self-Corrected Multimodal Large Language Model for Robot Manipulation and Reflection

ICLR 2025 Withdrawn
5.5
4

Empowering World Models with Reflection for Embodied Video Prediction

ICML 2025 Poster
5.8
4

EVA: An Embodied World Model for Future Video Anticipation

ICLR 2025 Rejected
4.0
3

CoCoGesture: Towards Coherent Co-speech 3D Gesture Generation in the Wild

ICLR 2025 Withdrawn
7.3
4

Fast-in-Slow: A Dual-System VLA Model Unifying Fast Manipulation within Slow Reasoning

NeurIPS 2025 Poster
5.5
5

SparseVLM: Visual Token Sparsification for Efficient Vision-Language Model Inference

ICML 2025 Poster
5.2
5

SparseVLM: Visual Token Sparsification for Efficient Vision Language Models Inference

ICLR 2025 Rejected
7.8
4

RoboRefer: Towards Spatial Referring with Reasoning in Vision-Language Models for Robotics

NeurIPS 2025 Poster
6.4
4

AC-DiT: Adaptive Coordination Diffusion Transformer for Mobile Manipulation

NeurIPS 2025 Poster
4.5
4

ViML: A Video, Music, Language Unified Dataset for Understanding and Generation

ICLR 2025 Withdrawn
6.8
4

SEEA-R1: Tree-Structured Reinforcement Fine-Tuning for Self-Evolving Embodied Agents

NeurIPS 2025 Poster
6.0
4

HybridVLA: Collaborative Autoregression and Diffusion in a Unified Vision-Language-Action Model

NeurIPS 2025 Rejected

2024 (11 papers)