Impact Index

43.48/100

Top 12%

Site-wide rank: #7,703

Papers published: 6

Average rating: 6.5

Average annual output: 3.0 papers/year

Rishabh Joshi

Researcher @ Google · OpenReview

Learning from negative feedback, or positive feedback or both

ICLR 2025 Spotlight

Reward-Guided Prompt Evolving in Reinforcement Learning for LLMs

ICML 2025 Poster

Building Math Agents with Multi-Turn Iterative Preference Learning

ICLR 2025 Poster

RRM: Robust Reward Model Training Mitigates Reward Hacking

ICLR 2025 Poster

Evolving Alignment via Asymmetric Self-Play

ICLR 2025 Rejected

Statistical Rejection Sampling Improves Preference Optimization

ICLR 2024 Poster

Collaborators (20)