Liang Qiu
~Liang_Qiu2
Total papers: 4
Papers per year: 4.0
Average rating:
Acceptance: 4/4
Venue distribution
NeurIPS: 2
COLM: 1
ICML: 1
Published papers (4)
2025: 4 papers
5
Self-Rewarding PPO: Aligning Large Language Models with Demonstrations Only
COLM 2025, Poster
4
Think-RM: Enabling Long-Horizon Reasoning in Generative Reward Models
NeurIPS 2025, Poster
3
Ask a Strong LLM Judge when Your Reward Model is Uncertain
NeurIPS 2025, Poster
4
Discriminative Finetuning of Generative Large Language Models without Reward Models and Human Preference Data
ICML 2025, Poster