Paper Hub
Jacob Dunefsky
~Jacob_Dunefsky1
Total papers: 3
Submissions per year: 1.5
Average rating: 5.6
Accepted: 2 / 3
Venue distribution
NeurIPS: 1
ICLR: 1
COLM: 1
Papers (3)
2025
1 paper
6.5
4
One-shot Optimized Steering Vectors Mediate Safety-relevant Behaviors in LLMs
COLM 2025
Poster
2024
2 papers
6.5
4
Transcoders find interpretable LLM feature circuits
NeurIPS 2024
Poster
3.7
3
Observable Propagation: Uncovering Feature Vectors in Transformers
ICLR 2024
Withdrawn
Co-authors (3)
Arman Cohan: 2 papers
Neel Nanda: 1 paper
Philippe Chlenski: 1 paper