Sanjiv Kumar
~Sanjiv_Kumar1
Total papers: 30
Average submissions per year: 15.0
Average rating: (no value shown)
Accepted: 22/30
Conference distribution
ICLR: 21
NeurIPS: 6
ICML: 3
Published papers (30)
2025 (18 papers)
4
LAuReL: Learned Augmented Residual Layer
ICML 2025 Poster
4
Reasoning with Latent Thoughts: On the Power of Looped Transformers
ICLR 2025 Poster
4
Scalable In-context Ranking with Generative Models
NeurIPS 2025 Poster
3
Structured Preconditioners in Adaptive Optimization: A Unified Analysis
ICML 2025 Poster
4
Mimetic Initialization Helps State Space Models Learn to Recall
ICLR 2025 Rejected
5
No more hard-prompts: SoftSRV prompting for synthetic data generation
ICLR 2025 Rejected
5
On the Role of Depth and Looping for In-Context Learning with Task Diversity
ICLR 2025 Rejected
4
Asymmetric Embedding Models for Hierarchical Retrieval: Provable Constructions and a Pretrain-Finetune Recipe
ICLR 2025 Rejected
4
Analyzing Similarity Metrics for Data Selection for Language Model Pretraining
NeurIPS 2025 Poster
4
Hierarchical Retrieval: The Geometry and a Pretrain-Finetune Recipe
NeurIPS 2025 Poster
4
Better autoregressive regression with LLMs via regression-aware fine-tuning
ICLR 2025 Spotlight
3
Faster Cascades via Speculative Decoding
ICLR 2025 Oral
4
Efficient stagewise pretraining via progressive subnetworks
ICLR 2025 Poster
3
LoRA Done RITE: Robust Invariant Transformation Equilibration for LoRA Optimization
ICLR 2025 Oral
3
Bipartite Ranking From Multiple Labels: On Loss Versus Label Aggregation
ICML 2025 Poster
4
A Little Help Goes a Long Way: Efficient LLM Training by Leveraging Small LMs
ICLR 2025 Rejected
4
Spark Transformer: How Many FLOPs is a Token Worth?
ICLR 2025 Rejected
4
Spark Transformer: Reactivating Sparsity in Transformer FFN and Attention
NeurIPS 2025 Poster
2024 (12 papers)
3
Plugin estimators for selective classification with out-of-distribution detection
ICLR 2024 Poster
4
On Bias-Variance Alignment in Deep Models
ICLR 2024 Spotlight
3
Accelerating Blockwise Parallel Language Models with Draft Refinement
NeurIPS 2024 Poster
4
Learning to Reject Meets Long-tail Learning
ICLR 2024 Spotlight
4
Think before you speak: Training Language Models With Pause Tokens
ICLR 2024 Poster
4
On the memorisation of image classifiers
ICLR 2024 Withdrawn
4
Language Model Cascades: Token-Level Uncertainty And Beyond
ICLR 2024 Poster
4
On the Inductive Bias of Stacking Towards Improving Reasoning
NeurIPS 2024 Poster
3
DistillSpec: Improving Speculative Decoding via Knowledge Distillation
ICLR 2024 Poster
4
Efficient Stagewise Pretraining via Progressive Subnetworks
ICLR 2024 Rejected
4
Two-stage LLM Fine-tuning with Less Specialization and More Generalization
ICLR 2024 Poster
3
Functional Interpolation for Relative Positions improves Long Context Transformers
ICLR 2024 Poster