Zhangyang Wang
~Zhangyang_Wang1
65
Total papers
32.5
Submissions per year
Average rating
Accepted: 43/65
Venue distribution
ICLR
39
NeurIPS
14
ICML
7
COLM
5
Papers (65)
2025 (33 papers)
4
Transformers Provably Learn Two-Mixture of Linear Classification via Gradient Flow
ICLR 2025 Poster
4
On the Provable Separation of Scales in Maximal Update Parameterization
ICML 2025 Poster
4
PIPA: Preference Alignment as Prior-Informed Statistical Estimation
ICML 2025 Poster
4
LLaMaFlex: Many-in-one LLMs via Generalized Pruning and Weight Sharing
ICLR 2025 Poster
4
One Token Embedding Is Enough to Deadlock Your Large Reasoning Model
NeurIPS 2025 Poster
4
On-the-Fly Adaptive Distillation of Transformer to Dual-State Linear Attention for Long-Context LLM Serving
ICML 2025 Poster
4
HD-Painter: High-Resolution and Prompt-Faithful Text-Guided Image Inpainting with Diffusion Models
ICLR 2025 Poster
4
Scaling Up Parameter Generation: A Recurrent Diffusion Approach
NeurIPS 2025 Poster
4
R-Sparse: Rank-Aware Activation Sparsity for Efficient LLM Inference
ICLR 2025 Poster
3
SPAM: Spike-Aware Adam with Momentum Reset for Stable LLM Training
ICLR 2025 Poster
4
SEAL: Steerable Reasoning Calibration of Large Language Models for Free
COLM 2025 Poster
4
LoX: Low-Rank Extrapolation Robustifies LLM Safety Against Fine-tuning
COLM 2025 Poster
4
Rethinking Addressing in Language Models via Contextualized Equivariant Positional Encoding
ICML 2025 Poster
4
StreamingT2V: Consistent, Dynamic, and Extendable Long Video Generation from Text
ICLR 2025 Withdrawn
3
Graph-KV: Breaking Sequence via Injecting Structural Biases into Large Language Models
NeurIPS 2025 Poster
5
CamCo: Camera-Controllable 3D-Consistent Image-to-Video Generation
ICLR 2025 Rejected
3
Fantastic Experts and How to Find Them: A Multi-Dimensional Study for Experts-Level Sparsification in Mixture-of-Experts
ICLR 2025 Rejected
4
Rethinking Addressing in Language Models via Contextualized Equivariant Positional Encoding
ICLR 2025 Rejected
4
Understanding and Mitigating Bottlenecks of State Space Models through the Lens of Recency and Over-smoothing
ICLR 2025 Poster
4
From GaLore to WeLore: How Low-Rank Weights Non-uniformly Emerge from Low-Rank Gradients
ICLR 2025 Rejected
4
OscillationInversion: Understand the structure of Large Flow Model through the Lens of Inversion Method
ICLR 2025 Withdrawn
4
Cavia: Camera-controllable Multi-view Video Diffusion with View-Integrated Attention
ICML 2025 Poster
3
HALoS: Hierarchical Asynchronous Local SGD over Slow Networks for Geo-Distributed Large Language Model Training
ICML 2025 Poster
5
SPIN-Bench: How Well Do LLMs Plan Strategically and Reason Socially?
COLM 2025 Poster
4
Cavia: Camera-controllable Multi-view Video Diffusion with View-Integrated Attention
ICLR 2025 Rejected
4
Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients
ICLR 2025 Withdrawn
4
More is Less: The Pitfalls of Multi-Model Synthetic Preference Data in DPO Safety Alignment
COLM 2025 Poster
4
4K4DGen: Panoramic 4D Generation at 4K Resolution
ICLR 2025 Spotlight
4
From Low Rank Gradient Subspace Stabilization to Low-Rank Weights: Observations, Theories, and Applications
ICML 2025 Poster
3
Can Test-Time Scaling Improve World Foundation Model?
COLM 2025 Poster
4
REPA Works Until It Doesn’t: Early-Stopped, Holistic Alignment Supercharges Diffusion Training
NeurIPS 2025 Poster
4
SAS: Simulated Attention Score
NeurIPS 2025 Poster
4
Drag-and-Drop LLMs: Zero-Shot Prompt-to-Weights
NeurIPS 2025 Poster
2024 (32 papers)
3
Training Dynamics of Transformers to Recognize Word Co-occurrence via Gradient Flow Analysis
NeurIPS 2024 Poster
3
Doubly Robust Instance-Reweighted Adversarial Training
ICLR 2024 Poster
4
Prompt-Free Diffusion: Taking “Text” out of Text-to-Image Diffusion Models
ICLR 2024 Withdrawn
4
Principled Architecture-aware Scaling of Hyperparameters
ICLR 2024 Poster
3
Data Distillation Can Be Like Vodka: Distilling More Times For Better Quality
ICLR 2024 Poster
4
Fill with Anything: High-Resolution and Prompt-Faithful Image Completion
ICLR 2024 Withdrawn
4
Polynomial Width is Sufficient for Set Representation with High-dimensional Features
ICLR 2024 Poster
3
Multi-Concept T2I-Zero: Tweaking Only The Text Embeddings And Nothing Else
ICLR 2024 Withdrawn
4
Forget-Me-Not: Learning to Forget in Text-to-Image Diffusion Models
ICLR 2024 Withdrawn
4
Social Reward: Evaluating and Enhancing Generative AI through Million-User Feedback from an Online Creative Community
ICLR 2024 Spotlight
4
AlphaPruning: Using Heavy-Tailed Self Regularization Theory for Improved Layer-wise Pruning of Large Language Models
NeurIPS 2024 Poster
4
Diffusion4D: Fast Spatial-temporal Consistent 4D generation via Video Diffusion Models
NeurIPS 2024 Poster
4
Compressing LLMs: The Truth is Rarely Pure and Never Simple
ICLR 2024 Poster
4
Efficient-3Dim: Learning a Generalizable Single-image Novel-view Synthesizer in One Day
ICLR 2024 Poster
4
Safe and Robust Watermark Injection with a Single OoD Image
ICLR 2024 Poster
3
Junk DNA Hypothesis: A Task-Centric Angle of LLM Pre-trained Weights through Sparsity
ICLR 2024 Rejected
4
(Dynamic) Prompting might be all you need to repair Compressed LLMs
ICLR 2024 Rejected
4
DP-OPT: Make Large Language Model Your Privacy-Preserving Prompt Engineer
ICLR 2024 Spotlight
4
LightGaussian: Unbounded 3D Gaussian Compression with 15x Reduction and 200+ FPS
NeurIPS 2024 Spotlight
5
($\texttt{PEEP}$) $\textbf{P}$redicting $\textbf{E}$nzym$\textbf{e}$ $\textbf{P}$romiscuity with its Molecule Mate – an Attentive Metric Learning Solution
ICLR 2024 Rejected
4
Sparse MoE as a New Treatment: Addressing Forgetting, Fitting, Learning Issues in Multi-Modal Multi-Task Learning
ICLR 2024 Rejected
6
Latent 3D Graph Diffusion
ICLR 2024 Poster
4
$\textit{Read-ME}$: Refactorizing LLMs as Router-Decoupled Mixture of Experts with System Co-Design
NeurIPS 2024 Poster
5
Expressive Gaussian Human Avatars from Monocular RGB Video
NeurIPS 2024 Poster
5
PAIR Diffusion: A Comprehensive Multimodal Object-Level Image Editor
ICLR 2024 Withdrawn
3
Found in the Middle: How Language Models Use Long Contexts Better via Plug-and-Play Positional Encoding
NeurIPS 2024 Poster
4
Large Spatial Model: End-to-end Unposed Images to Semantic 3D
NeurIPS 2024 Poster
3
Sparse Cocktail: Every Sparse Pattern Every Sparse Ratio All At Once
ICLR 2024 Rejected
3
Drag View: Generalizable Novel View Synthesis with Unposed Imagery
ICLR 2024 Withdrawn
5
Outlier Weighed Layerwise Sparsity (OWL): A Missing Secret Sauce for Pruning LLMs to High Sparsity
ICLR 2024 Rejected
4
SteinDreamer: Variance Reduction for Text-to-3D Score Distillation via Stein Identity
ICLR 2024 Rejected
4
Taming Mode Collapse in Score Distillation for Text-to-3D Generation
ICLR 2024 Withdrawn