Impact index
-/100
Papers published: 3
Average rating: 3.3
Annual output: 3.0 papers/year
AI academic analysis

Haoyun Deng

Researcher @ Apple · United States · OpenReview
Research interests

large language models · inference optimization · efficient serving · speculative decoding · prefill–decoding disaggregation · prefix cache · kv cache · quantization · pruning · model compression · low-bit inference · scheduling · memory optimization · distributed inference systems · inference frameworks · Triton · TensorRT-LLM · vLLM · serving infrastructure