Influence Index

99.22/100

Top 0.1%

Site-wide rank: #15

Papers published: 82

Average rating: 5.6

Average annual output: 27.3 papers/year

Jun Zhu

Professor @ Tsinghua University · China · OpenReview

Research Areas

machine learning

DiffusionNFT: Online Diffusion Reinforcement with Forward Process

Exploratory Diffusion Model for Unsupervised Reinforcement Learning

Part-X-MLLM: Part-aware 3D Multimodal Large Language Model

ICLR 2026 Poster

Diffusion Models as Dataset Distillation Priors

ICLR 2026 Poster

Nano3D: A Training-Free Approach for Efficient 3D Editing Without Masks

ICLR 2026 Poster

Mitigating Object Hallucination in Large Vision-Language Models through Adversarial Contrastive Finetuning

ICLR 2026 Rejected

Theoretical Analysis of Relative Errors in Gradient Computations for Adversarial Attacks with CE Loss

ICLR 2026 Rejected

Helix: Evolutionary Reinforcement Learning for Open-Ended Scientific Problem Solving

ICLR 2026 Poster

AnyPos: Automated Task-Agnostic Actions for Bimanual Manipulation

ICLR 2026 Rejected

UltraViCo: Breaking Extrapolation Limits in Video Diffusion Transformers

ICLR 2026 Poster

Large Scale Diffusion Distillation via Score-Regularized Continuous-Time Consistency

ICLR 2026 Poster

Towards Safe Reasoning in Large Reasoning Models via Corrective Intervention

ICLR 2026 Poster

SLA: Beyond Sparsity in Diffusion Transformers via Fine-Tunable Sparse–Linear Attention

ICLR 2026 Poster

Jailbreaking LLMs' Safeguard with Universal Magic Words for Text Embedding Models

ICLR 2026 Rejected

MePo: Meta Post-Refinement for Rehearsal-Free General Continual Learning

ICLR 2026 Rejected

Vidarc: Low Latency Embodied Video Diffusion Model with Closed-loop Control

ICLR 2026 Rejected

Unveiling the Basin-Like Loss Landscape in Large Language Models

ICLR 2026 Poster

NFT: Bridging Supervised Learning and Reinforcement Learning in Math Reasoning

ICLR 2026 Poster

VoiceBridge: Designing Latent Bridge Models for General Speech Restoration at Scale

ICLR 2026 Rejected

Efficient Hyperparameter Tuning via Trajectory Invariance Principle

ICLR 2026 Rejected

Vidar: Embodied Video Diffusion Model for Generalist Manipulation

ICLR 2026 Rejected

Multimodal Physical Adversarial Clothing Evades Visible-Thermal Detectors with Non-Overlapping RGB-T Pattern

ICLR 2026 Withdrawn

Exploring Aleatoric Uncertainty in Object Detection via Vision Foundation Models

ICLR 2026 Withdrawn

Towards the Worst-case Robustness of Large Language Models

ICLR 2026 Rejected

LangSAM: Language-Guided Expert Routing on SAM2 for Dense Scene Understanding

ICLR 2026 Withdrawn

AudioMoG: Guiding Audio Generation with Mixture-of-Guidance

ICLR 2026 Withdrawn

Stabilizing Gradient Descent via Second-Order Control-Theoretic Dynamics

ICLR 2026 Withdrawn

STAIR: Improving Safety Alignment with Introspective Reasoning

SageAttention2: Efficient Attention with Thorough Outlier Smoothing and Per-thread INT4 Quantization

ICML 2025 Poster

On the Optimization and Generalization of Two-layer Transformers with Sign Gradient Descent

ICLR 2025 Spotlight

Audio Super-Resolution with Latent Bridge Models

NeurIPS 2025 Poster

RIFLEx: A Free Lunch for Length Extrapolation in Video Diffusion Transformers

ICML 2025 Poster

Direct Discriminative Optimization: Your Likelihood-Based Visual Generative Model is Secretly a GAN Discriminator

ICML 2025 Spotlight

Toward Guidance-Free AR Visual Generation via Condition Contrastive Alignment

SageAttention: Accurate 8-Bit Attention for Plug-and-play Inference Acceleration

ICLR 2025 Poster

Masked Diffusion Models are Secretly Time-Agnostic Masked Models and Exploit Inaccurate Categorical Sampling

ICLR 2025 Poster

RDT-1B: a Diffusion Foundation Model for Bimanual Manipulation

ICLR 2025 Poster

ShapeLLM-Omni: A Native Multimodal LLM for 3D Generation and Understanding

NeurIPS 2025 Spotlight

Scaling Diffusion Transformers Efficiently via $\mu$P

NeurIPS 2025 Poster

SageAttention3: Microscaling FP4 Attention for Inference and An Exploration of 8-Bit Training

NeurIPS 2025 Spotlight

Robust Representation Consistency Model via Contrastive Denoising

ICLR 2025 Poster

PivotMesh: Generic 3D Mesh Generation via Pivot Vertices Guidance

ICLR 2025 Poster

FrameBridge: Improving Image-to-Video Generation with Bridge Models

ICML 2025 Poster

Visual Generation Without Guidance

ICML 2025 Poster

SpargeAttention: Accurate and Training-free Sparse Attention Accelerating Any Model Inference

ICML 2025 Poster

ReMoE: Fully Differentiable Mixture-of-Experts with ReLU Routing

ICLR 2025 Poster

A Regularized Newton Method for Nonconvex Optimization with Global and Local Complexity Guarantees

NeurIPS 2025 Poster

Oscillation-Reduced MXFP4 Training for Vision Transformers

ICML 2025 Poster

Diffusion Bridge Implicit Models

ICLR 2025 Poster

Elucidating the Preconditioning in Consistency Distillation

ICLR 2025 Poster

ManiBox: Enhancing Spatial Grasping Generalization via Scalable Simulation Data Generation

ICLR 2025 Rejected

Zero-shot Quantization for Object Detection

ICLR 2025 Rejected

When Bigger is Better: Revisiting Large-Batch Optimization in Language Model Pretraining

NeurIPS 2025 Rejected

FrameBridge: Improving Image-to-Video Generation with Bridge Models

ICLR 2025 Rejected

SparseDM: Toward Sparse Efficient Diffusion Models

ICLR 2025 Withdrawn

LUNCH: Adaptive Balancing of Continual Learning via Hyperparameter Uncertainty

ICLR 2025 Withdrawn

Collaborators (20)