Zhihang Yuan
~Zhihang_Yuan1
Total papers: 14
Submissions per year: 7.0
Average rating
Acceptance: 10/14
Venue distribution
ICLR: 7
NeurIPS: 3
ICML: 3
COLM: 1
Papers (14)
2025: 11 papers
4
ASVD: Activation-aware Singular Value Decomposition for Compressing Large Language Models
ICLR 2025, Rejected
4
MxMoE: Mixed-precision Quantization for MoE with Accuracy and Performance Co-Design
ICML 2025, Poster
4
MambaQuant: Quantizing the Mamba Family with Variance Aligned Rotation Methods
ICLR 2025, Poster
5
I-LLM: Efficient Integer-Only Inference for Fully-Quantized Low-Bit Large Language Models
ICLR 2025, Rejected
4
R2R: Efficiently Navigating Divergent Reasoning Paths with Small-Large Model Token Routing
NeurIPS 2025, Poster
4
SAFEx: Analyzing Vulnerabilities of MoE-Based LLMs via Stable Safety-critical Expert Identification
NeurIPS 2025, Poster
4
A Closer Look at Time Steps is Worthy of Triple Speed-Up for Diffusion Model Training
ICLR 2025, Withdrawn
4
MoEQuant: Enhancing Quantization for Mixture-of-Experts Large Language Models via Expert-Balanced Sampling and Affinity Guidance
ICML 2025, Poster
4
RWKVQuant: Quantizing the RWKV Family with Proxy Guided Hybrid of Scalar and Vector Quantization
ICML 2025, Poster
5
OSTQuant: Refining Large Language Model Quantization with Orthogonal and Scaling Transformations for Better Distribution Fitting
ICLR 2025, Poster
4
MQuant: Unleashing the Inference Potential of Multimodal Large Language Models via Full Static Quantization
ICLR 2025, Rejected