ICLR 2025
Steepest Descent in the Modular Norm
TL;DR
We cast optimizers like Adam and Shampoo as steepest descent methods under different norms. Generalizing this idea opens up a new design space for training algorithms.
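As a rough illustration of the steepest-descent framing in the TL;DR (the notation here is ours and need not match the paper's): for a gradient $g$ and a chosen norm $\|\cdot\|$, a steepest-descent update solves

$$\Delta w \;=\; \operatorname*{arg\,min}_{\Delta w}\; \langle g, \Delta w\rangle \;+\; \frac{\lambda}{2}\,\|\Delta w\|^2 .$$

Under the $\ell_\infty$ norm this yields a sign-descent update (Adam-like, absent momentum and EMA), while under the spectral norm on a weight matrix it yields a Shampoo-like update that orthogonalizes the gradient via its singular value decomposition; how the paper instantiates and generalizes this is best taken from the manuscript itself.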
Abstract
Keywords
Adam, Shampoo, Prodigy, optimizers, optimization, steepest descent, norms, modular norm, spectral norm
Reviews and Discussion
Withdrawal Notice by the Authors
We have decided to further hone the manuscript prior to conference submission, and are withdrawing now to avoid wasting reviewers' time.