Wei Hu
~Wei_Hu1
11
Total papers
5.5
Average submissions per year
Average rating
Acceptance: 8/11
Venue distribution
ICLR
8
NeurIPS
3
Published papers (11)
2025 (7 papers)
4
What Happens During the Loss Plateau? Understanding Abrupt Learning in Transformers
NeurIPS 2025 Poster
4
HERTA: A High-Efficiency and Rigorous Training Algorithm for Unfolded Graph Neural Networks
ICLR 2025 Withdrawn
4
Linear Projections of Teacher Embeddings for Few-Class Distillation
ICLR 2025 Rejected
5
Let Me Grok for You: Accelerating Grokking via Embedding Transfer from a Weaker Model
ICLR 2025 Poster
4
Benign Overfitting in Single-Head Attention
ICLR 2025 Rejected
4
Swing-by Dynamics in Concept Learning and Compositional Generalization
ICLR 2025 Poster
4
Benign Overfitting in Single-Head Attention
NeurIPS 2025 Poster
2024 (4 papers)
4
How Do Transformers Learn In-Context Beyond Simple Functions? A Case Study on Learning with Representations
ICLR 2024 Poster
5
Abrupt Learning in Transformers: A Case Study on Matrix Completion
NeurIPS 2024 Poster
3
Benign Overfitting and Grokking in ReLU Networks for XOR Cluster Data
ICLR 2024 Poster
3
Dichotomy of Early and Late Phase Implicit Biases Can Provably Induce Grokking
ICLR 2024 Poster