Impact Index

84.51/100

Top 1%

Site-wide rank #628

Papers published: 25

Average score: 5.9

Average annual output: 12.5 papers/year

Yuanzhi Li

Assistant Professor @ Carnegie Mellon University · United States · OpenReview

7.3

Understanding the Evolution of the Neural Tangent Kernel at the Edge of Stability

NeurIPS 2025 Poster

Third author

7.3

Physics of Language Models: Part 3.3, Knowledge Capacity Scaling Laws

ICLR 2025 Spotlight

Second author

7.0

Mixture of Parrots: Experts improve memorization more than reasoning

ICLR 2025 Poster

6.8

Physics of Language Models: Part 2.2, How to Learn From Mistakes on Grade-School Math Problems

ICLR 2025 Poster

Third author

6.5

Physics of Language Models: Part 3.2, Knowledge Manipulation

ICLR 2025 Poster

Second author

6.3

Interpretability of Language Models for Learning Hierarchical Structures

ICLR 2025 Rejected

Second author

6.1

On the Clean Generalization and Robust Overfitting in Adversarial Training from Two Theoretical Views: Representation Complexity and Training Dynamics

ICML 2025 Poster

Second author

6.0

Adversarial Training Can Provably Improve Robustness: Theoretical Analysis of Feature Learning Process Under Structured Data

ICLR 2025 Poster

Second author

6.0

Physics of Language Models: Part 2.1, Grade-School Math and the Hidden Reasoning Process

ICLR 2025 Poster

Third author

5.4

Understand Clean Generalization and Robust Overfitting in Adversarial Training from Two Theoretical Views: Representation Complexity and Training Dynamics

ICLR 2025 Rejected

Second author

4.3

Beyond Parameter Count: Implicit Bias in Soft Mixture of Experts

ICLR 2025 Rejected

7.5

Role of Locality and Weight Sharing in Image-Based Tasks: A Sample Complexity Separation between CNNs, LCNs, and FCNs

ICLR 2024 Spotlight

Corresponding author

7.5

VisualWebBench: How Far Have Multimodal LLMs Evolved in Web Page Understanding and Grounding?

COLM 2024 Poster

6.8

SmartPlay : A Benchmark for LLMs as Intelligent Agents

ICLR 2024 Poster

Corresponding author

6.5

Understanding Transferable Representation Learning and Zero-shot Transfer in CLIP

ICLR 2024 Poster

Third author

6.0

Why Clean Generalization and Robust Overfitting Both Happen in Adversarial Training

ICLR 2024 Rejected

Second author

4.0

Positional Description Matters for Transformers Arithmetic

ICLR 2024 Rejected

3.8

How does overparametrization affect features?

ICLR 2024 Rejected

Third author

3.0

TinyStories: How Small Can Language Models Be and Still Speak Coherent English

Collaborators (20)

Knowledge Storage and Extraction in Language Models (Part A)

AgentKit: Structured LLM Reasoning with Dynamic Graphs

Textbooks Are All You Need

Simple mechanisms for representing, indexing and manipulating concepts

Knowledge Manipulation in Language Models (Part B)

How Language Models Learn Context-Free Grammars
