Impact Index

24.18/100

Top 32.7%

Site-wide rank #21,039

Papers published: 14

Average rating: 4.7

Average output: 4.7 papers/year

Zhefeng Wang

Researcher @ Huawei Technologies Ltd. · China · OpenReview

Research Areas

Language Model · ML System · Natural Language Processing

5.5

Direction-Magnitude Decoupling for Fast Video Generation with Flow Matching Models

ICLR 2026 · Rejected

5.0

Accelerating Large Language Model Inference via Speculative Decoding with Progressive Tree Drafting

ICLR 2026 · Withdrawn

3.3

Semantic-aware Pruning of Large Language Models via Neuron Importance Explanation

ICLR 2026 · Withdrawn

6.4

Adapprox: Memory Efficient Optimization via Adaptive Randomized Low-Rank Approximation

ICLR 2025 · Rejected

6.3

Efficiently Serving Large Multimodal Models Using EPD Disaggregation

ICML 2025 · Poster

5.5

Beware of Calibration Data for Pruning Large Language Models

ICLR 2025 · Poster

5.3

FISTAPruner: Layer-wise Post-training Pruning for Large Language Models

ICLR 2025 · Rejected

4.4

SinkQ: Accurate 2-bit KV Cache Quantization with Dynamic Sink Tracking

ICLR 2025 · Withdrawn

4.0

FASP: Fast and Accurate Structured Pruning of Large Language Models

ICLR 2025 · Withdrawn

3.0

CASD: Enhancing Generation Accuracy via Context-Aware Speculative Decoding

Collaborators (20)

Zhefeng Wang

Direction-Magnitude Decoupling for Fast Video Generation with Flow Matching Models

CaliDrop: KV Cache Compression with Query-based Calibration

RA-SpaRC: Robust Adaptation with Sparse Plus Low-Rank Compressors

Adaptive Dual-Granularity Pruning Method for Large Language Models

Accelerating Large Language Model Inference via Speculative Decoding with Progressive Tree Drafting

Semantic-aware Pruning of Large Language Models via Neuron Importance Explanation

Adapprox: Memory Efficient Optimization via Adaptive Randomized Low-Rank Approximation

Efficiently Serving Large Multimodal Models Using EPD Disaggregation

Beware of Calibration Data for Pruning Large Language Models

FISTAPruner: Layer-wise Post-training Pruning for Large Language Models

SinkQ: Accurate 2-bit KV Cache Quantization with Dynamic Sink Tracking

FASP: Fast and Accurate Structured Pruning of Large Language Models

CASD: Enhancing Generation Accuracy via Context-Aware Speculative Decoding