Paper Hub
Bilal Piot
~Bilal_Piot1
6
Total papers
3.0
Submissions per year
6.2
Average rating
Acceptance
5 / 6
Venue distribution
ICLR
4
NeurIPS
2
Publications (6 papers)
2025
3 papers
7.3
4
Learning from negative feedback, or positive feedback or both
ICLR 2025
Spotlight
7.0
4
Building Math Agents with Multi-Turn Iterative Preference Learning
ICLR 2025
Poster
6.5
4
RRM: Robust Reward Model Training Mitigates Reward Hacking
ICLR 2025
Poster
2024
3 papers
5.3
4
Multi-turn Reinforcement Learning with Preference Human Feedback
NeurIPS 2024
Poster
7.3
3
Unlocking the Power of Representations in Long-term Novelty-based Exploration
ICLR 2024
Spotlight
4.0
4
Direct Language Model Alignment from Online AI Feedback
NeurIPS 2024
Rejected
Co-authors (20)
Rishabh Joshi
3 papers
Tianqi Liu
3 papers
Daniele Calandriello
3 papers
Jiaming Shen
2 papers
Mohammad Saleh
2 papers
Wei Xiong
2 papers
Zhen Qin
2 papers
Aviv Rosenberg
2 papers