PaperHub

No rating data available

ICLR 2024

Delayed Generalization: Bridging Double Descent and Grokking

Submitted: 2023-09-24; Updated: 2024-03-26
TL;DR

We argue that grokking and double descent are better understood as similar instances of a broader phenomenon that we call Staggered Learning.

Abstract

Keywords
double descent, grokking, science of deep learning, empirical theory of deep learning, generalization, overfitting, delayed generalization, feature learning, pattern learning, representation learning

Reviews and Discussion

No reviews recorded