Paper Hub
Martin Marek
~Martin_Marek1
Total papers: 1
Submissions per year: 1.0
Average rating: 7.3
Accepted: 1 / 1
Venue distribution: NeurIPS (1)
Published papers (1)
2025 (1 paper)
7.3
4
Small Batch Size Training for Language Models: When Vanilla SGD Works, and Why Gradient Accumulation is Wasteful
NeurIPS 2025
Poster
Co-authors (4)
Aditya Somasundaram: 1 paper
Andrew Gordon Wilson: 1 paper
Micah Goldblum: 1 paper
Sanae Lotfi: 1 paper