Teun van der Weij
~Teun_van_der_Weij2
Total papers: 5
Submissions per year: 2.5
Average rating
Accepted: 4/5
Venue distribution
NeurIPS: 3
ICLR: 1
ICML: 1
Published papers (5)
2025 (4 papers)
3
AI Sandbagging: Language Models can Strategically Underperform on Evaluations
ICLR 2025 Poster
4
CTRL-ALT-DECEIT: Sabotage Evaluations for Automated AI R&D
NeurIPS 2025 Spotlight
4
The Elicitation Game: Evaluating Capability Elicitation Techniques
ICML 2025 Poster
3
Noise Injection Reveals Hidden Capabilities of Sandbagging Language Models
NeurIPS 2025 Poster