ICLR 2025
Steepest Descent in the Modular Norm
TL;DR
We cast optimizers like Adam and Shampoo as steepest descent methods under different norms. Generalizing this idea opens up a new design space for training algorithms.
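As a rough illustration of the steepest-descent framing in the TL;DR (the notation here is ours and need not match the paper's): for a gradient $g$ and a chosen norm $\|\cdot\|$, a steepest-descent update solves

$$\Delta w \;=\; \operatorname*{arg\,min}_{\Delta w}\; \langle g, \Delta w\rangle \;+\; \frac{\lambda}{2}\,\|\Delta w\|^2 .$$

Under the $\ell_\infty$ norm this yields a sign-descent update (Adam-like, absent momentum and EMA), while under the spectral norm on a weight matrix it yields a Shampoo-like update that orthogonalizes the gradient via its singular value decomposition; how the paper instantiates and generalizes this is best taken from the manuscript itself.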
Abstract
Keywords
Adam, Shampoo, Prodigy, optimizers, optimization, steepest descent, norms, modular norm, spectral norm
Reviews and Discussion
Withdrawal Notice by the Authors
We have decided to further hone the manuscript prior to conference submission, and are withdrawing now to avoid wasting reviewers' time.