Nolan Simran Dey
~Nolan_Simran_Dey1
5
Total papers
2.5
Submissions per year
Average rating
Accepted: 4/5
Venue distribution
NeurIPS
3
ICLR
2
Published papers (5)
2025: 4 papers
4
Don't be lazy: CompleteP enables compute-efficient deep transformers
NeurIPS 2025 Poster
5
Power Lines: Scaling laws for weight decay and batch size in LLM pre-training
NeurIPS 2025 Poster
3
Straight to Zero: Why Linearly Decaying the Learning Rate to Zero Works Best for LLMs
ICLR 2025 Poster
-
BLIMEY: Towards Better Routing Methods in Sparse Mixture of Experts
ICLR 2025 Withdrawn