Percy Liang
~Percy_Liang1
Total papers: 26
Submissions per year: 13.0
Average rating
Accepted: 19/26
Venue distribution
ICLR: 17
ICML: 5
NeurIPS: 3
COLM: 1
Papers (26)
2025 (19 papers)
4
Model Equality Testing: Which Model is this API Serving?
ICLR 2025 Poster
4
On the Entropy Calibration of Language Models
NeurIPS 2025 Poster
4
Reliable and Efficient Amortized Model-based Evaluation
ICML 2025 Poster
6
Reliable and Efficient Amortized Model-based Evaluation
ICLR 2025 Rejected
4
On the Entropy Calibration of Language Models
ICLR 2025 Withdrawn
5
Independence Tests for Language Models
ICML 2025 Spotlight
4
Auditing Prompt Caching in Language Model APIs
ICML 2025 Poster
4
Independence Tests for Language Models
ICLR 2025 Rejected
4
Instruction Following without Instruction Tuning
ICLR 2025 Rejected
5
Understanding Warmup-Stable-Decay Learning Rates: A River Valley Loss Landscape View
ICLR 2025 Poster
3
Eliciting Language Model Behaviors with Investigator Agents
ICML 2025 Poster
4
Audits Under Resource, Data, and Access Constraints: Scaling Laws For Less Discriminatory Alternatives
NeurIPS 2025 Poster
4
VideoAgent: Self-Improving Video Generation
ICLR 2025 Rejected
4
AutoBencher: Towards Declarative Benchmark Construction
ICLR 2025 Poster
4
Blackbox Model Provenance via Palimpsestic Membership Inference
NeurIPS 2025 Spotlight
4
Language Models May Verbatim Complete Text They Were Not Explicitly Trained On
ICML 2025 Spotlight
5
BioDiscoveryAgent: An AI Agent for Designing Genetic Perturbation Experiments
ICLR 2025 Poster
4
AIR-BENCH 2024: A Safety Benchmark based on Regulation and Policies Specified Risk Categories
ICLR 2025 Spotlight
3
Cybench: A Framework for Evaluating Cybersecurity Capabilities and Risks of Language Models
ICLR 2025 Oral
2024 (7 papers)
4
Llamas Know What GPTs Don't Show: Surrogate Models for Selective Classification
ICLR 2024 Withdrawn
4
Length-Controlled AlpacaEval: A Simple Debiasing of Automatic Evaluators
COLM 2024 Poster
4
Benchmarking Large Language Models as AI Research Agents
ICLR 2024 Rejected
4
On the Learnability of Watermarks for Language Models
ICLR 2024 Poster
4
Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training
ICLR 2024 Poster
3
Benchmarking and Improving Generator-Validator Consistency of Language Models
ICLR 2024 Poster
4
Large Language Models as Analogical Reasoners
ICLR 2024 Poster