Anshumali Shrivastava
~Anshumali_Shrivastava1
11
Total papers
5.5
Submissions per year
Average rating
Accepted: 8/11
Venue distribution
ICLR
5
NeurIPS
5
ICML
1
Published papers (11)
2025 (5 papers)
6
LeanQuant: Accurate and Scalable Large Language Model Quantization with Loss-error-aware Grid
ICLR 2025 Poster
4
SpaLLM: Unified Compressive Adaptation of Large Language Models with Sketching
ICLR 2025 Rejected
4
Sketch to Adapt: Fine-Tunable Sketches for Efficient LLM Adaptation
ICML 2025 Poster
4
70% Size, 100% Accuracy: Lossless LLM Compression for Efficient GPU Inference via Dynamic-Length Float (DFloat11)
NeurIPS 2025 Poster
4
Breaking the Frozen Subspace: Importance Sampling for Low-Rank Optimization in LLM Pretraining
NeurIPS 2025 Poster
2024 (6 papers)
4
In defense of parameter sharing for model-compression
ICLR 2024 Poster
5
KV Cache is 1 Bit Per Channel: Efficient Large Language Model Inference with Coupled Quantization
NeurIPS 2024 Poster
6
HashOrder: Accelerating Graph Processing Through Hashing-based Reordering
ICLR 2024 Rejected
4
NoMAD-Attention: Efficient LLM Inference on CPUs Through Multiply-add-free Attention
NeurIPS 2024 Poster
5
SS1: Accelerating Inference with Fast and Expressive Sketch Structured Transform
NeurIPS 2024 Poster
4
Compress, Then Prompt: Improving Accuracy-Efficiency Trade-off of LLM Inference with Transferable Prompt
ICLR 2024 Rejected