Paper Hub
Felix Hofstätter
~Felix_Hofstätter1
Total papers: 5
Submissions per year: 2.5
Average rating: 5.6
Accepted: 3 / 5
Venue distribution
NeurIPS: 2
ICLR: 2
ICML: 1
Published papers (5)
2025 (3 papers)
6.6
4
The Elicitation Game: Evaluating Capability Elicitation Techniques
ICML 2025
Poster
5.0
3
AI Sandbagging: Language Models can Strategically Underperform on Evaluations
ICLR 2025
Poster
7.0
3
Noise Injection Reveals Hidden Capabilities of Sandbagging Language Models
NeurIPS 2025
Poster
2024 (2 papers)
5.5
4
AI Sandbagging: Language Models can Strategically Underperform on Evaluations
NeurIPS 2024
Rejected
3.7
3
Tall Tales at Different Scales: Evaluating Scaling Trends For Deception in Language Models
ICLR 2024
Rejected
Collaborators (18)
Francis Rhys Ward: 4 papers
Teun van der Weij: 4 papers
Oliver Jaffe: 3 papers
Samuel F. Brown: 3 papers
Henning Bartsch: 1 paper
Jayden Teoh: 1 paper
Rada Djoneva: 1 paper
Cameron Tice: 1 paper