Alex Tamkin
~Alex_Tamkin1
Total papers: 6
Submissions per year: 3.0
Average rating
Acceptance: 4/6
Venue distribution
ICLR: 3
COLM: 2
NeurIPS: 1
Published papers (6)
2025: 2 papers
2024: 4 papers
4
Codebook Features: Sparse and Discrete Interpretability for Neural Networks
ICLR 2024 (Rejected)
4
Eliciting Human Preferences with Language Models
ICLR 2024 (Rejected)
4
Towards Measuring the Representation of Subjective Global Opinions in Language Models
COLM 2024 (Poster)
5
Many-shot Jailbreaking
NeurIPS 2024 (Poster)