Chengqi Lyu
~Chengqi_Lyu1
7
Total papers
3.5
Submissions per year
Average rating
Acceptance: 5/7
Venue distribution
ICLR
3
NeurIPS
3
COLM
1
Published papers (7)
2025: 4 papers
4
Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning
COLM 2025 Poster
6
Training Language Models to Critique with Multi-Agent Feedback
ICLR 2025 Rejected
5
Mask-DPO: Generalizable Fine-grained Factuality Alignment of LLMs
ICLR 2025 Poster
4
Pre-Trained Policy Discriminators are General Reward Models
NeurIPS 2025 Poster