Alekh Agarwal
~Alekh_Agarwal2
7
Total papers
3.5
Average submissions per year
Average rating
Accepted: 6/7
Venue distribution
ICML
3
NeurIPS
2
COLM
1
ICLR
1
Papers (7)
2025: 4 papers
4
Design Considerations in Offline Preference-based RL
ICML 2025 Poster
4
Theoretical guarantees on the best-of-n alignment policy
ICML 2025 Poster
4
Catoni Contextual Bandits are Robust to Heavy-tailed Rewards
ICML 2025 Spotlight
7
Rewarding Progress: Scaling Automated Process Verifiers for LLM Reasoning
ICLR 2025 Spotlight
2024: 3 papers
5
Small steps no more: Global convergence of stochastic gradient bandits for arbitrary learning rates
NeurIPS 2024 Poster
4
Helping or Herding? Reward Model Ensembles Mitigate but do not Eliminate Reward Hacking
COLM 2024 Poster
3
Robust Preference Optimization through Reward Model Distillation
NeurIPS 2024 Rejected