Samuel Marks
~Samuel_Marks1
Total papers: 9
Submissions per year: 4.5
Average rating
Accepted: 7/9
Venue distribution
ICLR: 4
NeurIPS: 3
COLM: 1
ICML: 1
Papers (9)
2025 (5 papers)
4
Sparse Feature Circuits: Discovering and Editing Interpretable Causal Graphs in Language Models
ICLR 2025, Oral
3
Erasing Conceptual Knowledge from Language Models
ICLR 2025, Rejected
4
Erasing Conceptual Knowledge from Language Models
NeurIPS 2025, Poster
4
NNsight and NDIF: Democratizing Access to Open-Weight Foundation Model Internals
ICLR 2025, Poster
4
SAEBench: A Comprehensive Benchmark for Sparse Autoencoders in Language Model Interpretability
ICML 2025, Poster
2024 (4 papers)
4
The Geometry of Truth: Emergent Linear Structure in Large Language Model Representations of True/False Datasets
ICLR 2024, Rejected
4
The Geometry of Truth: Emergent Linear Structure in Large Language Model Representations of True/False Datasets
COLM 2024, Poster
5
Connecting the Dots: LLMs can Infer and Verbalize Latent Structure from Disparate Training Data
NeurIPS 2024, Poster
3
Measuring Progress in Dictionary Learning for Language Model Interpretability with Board Game Models
NeurIPS 2024, Poster