Impact Index

96.88/100

Top 0.2%

Site-wide rank #100

Papers published: 42

Average rating: 5.8

Average annual output: 14.0 papers/year

Percy Liang

Associate Professor @ Stanford University · United States · OpenReview

Research areas

foundation models

Relative Scaling Laws for LLMs

ICLR 2026 · Desk Rejected

Pre-training under infinite compute

Reinforcement Learning for Machine Learning Engineering Agents

ICLR 2026 · Poster

WorldGym: World Model as An Environment for Policy Evaluation

ICLR 2026 · Poster

Cosmos Policy: Fine-Tuning Video Models for Visuomotor Control and Planning

ICLR 2026 · Poster

Comparing AI Agents to Cybersecurity Professionals in Real-World Penetration Testing

ICLR 2026 · Poster

Fantastic Pretraining Optimizers and Where to Find Them

ICLR 2026 · Poster

Holistic Agent Leaderboard: The Missing Infrastructure for AI Agent Evaluation

ICLR 2026 · Poster

MLE-Smith: Scaling MLE Tasks with Automated Multi-agent Pipeline

ICLR 2026 · Poster

SpecEval: Evaluating Model Adherence to Behavior Specifications

ICLR 2026 · Rejected

Curating High Quality Pretraining Data for Language Models via Compression Ratios

ICLR 2026 · Rejected

Scheduling data improves fine-tuning data efficiency

ICLR 2026 · Withdrawn

UQ: Assessing Language Models on Unsolved Questions

ICLR 2026 · Rejected

AHELM: A Holistic Evaluation of Audio-Language Models

ICLR 2026 · Withdrawn

RoboReward: A Dataset and Benchmark for Vision-Language Reward Models in Robotics

ICLR 2026 · Withdrawn

Improving Dynamic Object Interactions in Text-to-Video Generation with AI Feedback

ICLR 2026 · Withdrawn

Cybench: A Framework for Evaluating Cybersecurity Capabilities and Risks of Language Models

Eliciting Language Model Behaviors with Investigator Agents

ICML 2025 · Poster

AIR-BENCH 2024: A Safety Benchmark based on Regulation and Policies Specified Risk Categories

ICLR 2025 · Spotlight

Blackbox Model Provenance via Palimpsestic Membership Inference

NeurIPS 2025 · Spotlight

Reliable and Efficient Amortized Model-based Evaluation

ICML 2025 · Poster

Language Models May Verbatim Complete Text They Were Not Explicitly Trained On

ICML 2025 · Spotlight

Audits Under Resource, Data, and Access Constraints: Scaling Laws For Less Discriminatory Alternatives

NeurIPS 2025 · Poster

Model Equality Testing: Which Model is this API Serving?

ICLR 2025 · Poster

Reliable and Efficient Amortized Model-based Evaluation

ICLR 2025 · Rejected

On the Entropy Calibration of Language Models

NeurIPS 2025 · Poster

Independence Tests for Language Models

ICML 2025 · Spotlight

BioDiscoveryAgent: An AI Agent for Designing Genetic Perturbation Experiments

ICLR 2025 · Poster

AutoBencher: Towards Declarative Benchmark Construction

ICLR 2025 · Poster

Auditing Prompt Caching in Language Model APIs

ICML 2025 · Poster

Instruction Following without Instruction Tuning

ICLR 2025 · Rejected

Understanding Warmup-Stable-Decay Learning Rates: A River Valley Loss Landscape View

ICLR 2025 · Poster

Independence Tests for Language Models

ICLR 2025 · Rejected

VideoAgent: Self-Improving Video Generation

ICLR 2025 · Rejected

On the Entropy Calibration of Language Models

ICLR 2025 · Withdrawn

Collaborators (20)

Tatsunori Hashimoto

David Leo Wright Hall

Rohith Kuditipudi