Impact Index

44.35/100

Top 11.4%

Site-wide rank #7,339

Papers published: 14

Average rating: 4.7

Annual output: 4.7 papers/year

Arthur Conmy

Researcher @ Google DeepMind · United Kingdom · OpenReview

Thought Anchors: Which LLM Reasoning Steps Matter?

ICLR 2026 · Rejected

Chain-of-Thought Reasoning In The Wild Is Not Always Faithful

ICLR 2026 · Rejected

Base Models Know How to Reason, Thinking Models Learn When

ICLR 2026 · Withdrawn

Eliciting Secret Knowledge from Language Models

ICLR 2026 · Rejected

Fluid Reasoning Representations

ICLR 2026 · Rejected

SAEBench: A Comprehensive Benchmark for Sparse Autoencoders in Language Model Interpretability

ICML 2025 · Poster

Applying Sparse Autoencoders to Unlearn Knowledge in Language Models

ICLR 2025 · Rejected

Interpreting Attention Layer Outputs with Sparse Autoencoders

ICLR 2025 · Rejected

Scaling Sparse Feature Circuits For Studying In-Context Learning

ICLR 2025 · Rejected

Jumping Ahead: Improving Reconstruction Fidelity with JumpReLU Sparse Autoencoders

ICLR 2025 · Rejected

Scaling Sparse Feature Circuits For Studying In-Context Learning

ICML 2025 · Poster

Collaborators (20)

Senthooran Rajamanoharan

Dmitrii Kharlapenko

Iván Arcuschin

Robert Krzyzanowski

Callum Stuart McDougall

Joseph Isaac Bloom