Zhiyuan Li
~Zhiyuan_Li2
18
Total papers
9.0
Submissions per year
Average rating
Accepted: 15/18
Venue distribution
ICLR
12
ICML
4
NeurIPS
2
Published papers (18)
2025: 11 papers
5
Understanding Warmup-Stable-Decay Learning Rates: A River Valley Loss Landscape View
ICLR 2025 Poster
4
Chain-of-Thought Provably Enables Learning the (Otherwise) Unlearnable
ICLR 2025 Poster
5
Shift is Good: Mismatched Data Mixing Improves Test Performance
NeurIPS 2025 Rejected
4
On Learning Verifiers and Implications to Chain-of-Thought Reasoning
NeurIPS 2025 Poster
4
Non-Asymptotic Length Generalization
ICML 2025 Poster
4
A Coefficient Makes SVRG Effective
ICLR 2025 Poster
4
Reasoning with Latent Thoughts: On the Power of Looped Transformers
ICLR 2025 Poster
3
Adam Exploits $\ell_\infty$-geometry of Loss Landscape via Coordinate-wise Adaptivity
ICLR 2025 Spotlight
4
PENCIL: Long Thoughts with Short Memory
ICML 2025 Poster
4
Weak-to-Strong Generalization Even in Random Feature Networks, Provably
ICML 2025 Poster
3
Structured Preconditioners in Adaptive Optimization: A Unified Analysis
ICML 2025 Poster
2024: 7 papers
4
Fast Equilibrium of SGD in Generic Situations
ICLR 2024 Poster
6
Chain of Thought Empowers Transformers to Solve Inherently Serial Problems
ICLR 2024 Poster
4
Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training
ICLR 2024 Poster
4
Simplicity Bias of SGD via Sharpness Minimization
ICLR 2024 Rejected
4
A Coefficient Makes SVRG Effective
ICLR 2024 Rejected
3
Dichotomy of Early and Late Phase Implicit Biases Can Provably Induce Grokking
ICLR 2024 Poster
4
The Marginal Value of Momentum for Small Learning Rate SGD
ICLR 2024 Poster