Impact Index

30.97/100

Top 23.4%

Site-wide rank: #15,053

Papers published: 5

Average rating: 6.1

Average output: 2.5 papers/year

Long Phan

Research Engineer @ Center for AI Safety · OpenReview

Utility Engineering: Analyzing and Controlling Emergent Value Systems in AIs

NeurIPS 2025 Spotlight

Tamper-Resistant Safeguards for Open-Weight LLMs

ICLR 2025 Poster

Improving Alignment and Robustness with Circuit Breakers

NeurIPS 2024 Poster

Enhancing Neural Network Transparency through Representation Analysis

ICLR 2024 Rejected

Robustness Evaluation of Proxy Models against Adversarial Optimization

ICLR 2024 Rejected

Collaborators (20)

Matt Fredrikson