Paper Hub
Mikita Balesni
~Mikita_Balesni1
Total papers: 3
Submissions per year: 1.5
Average rating: 4.3
Accepted: 1 / 3
Venue distribution: ICLR (3)
Papers (3)
2025 (2 papers)
3.5
4
The Two-Hop Curse: LLMs trained on A→B, B→C fail to learn A→C
ICLR 2025
Rejected
3.0
3
Honesty to Subterfuge: In-Context Reinforcement Learning Can Make Honest Models Reward Hack
ICLR 2025
Rejected
2024 (1 paper)
6.5
4
The Reversal Curse: LLMs trained on “A is B” fail to learn “B is A”
ICLR 2024
Poster
Collaborators (10)
Owain Evans: 2 papers
Tomasz Korbak: 2 papers
Asa Cooper Stickland: 1 paper
Lukas Berglund: 1 paper
Maximilian Kaufmann: 1 paper
Meg Tong: 1 paper
Christoph Sträter: 1 paper
Joe Needham: 1 paper