Tengyang Xie
~Tengyang_Xie1
Total papers: 8
Average submissions per year: 4.0
Average rating
Accepted: 8/8
Venue distribution
ICLR: 4
ICML: 2
NeurIPS: 2
Published papers (8)
2025 (6 papers)
4
Exploratory Preference Optimization: Harnessing Implicit Q*-Approximation for Sample-Efficient RLHF
ICLR 2025 Poster
4
Reinforce LLM Reasoning through Multi-Agent Reflection
ICML 2025 Poster
5
Correcting the Mythos of KL-Regularization: Direct Alignment without Overoptimization via Chi-Squared Preference Optimization
ICLR 2025 Spotlight
4
Do We Need to Verify Step by Step? Rethinking Process Supervision from a Theoretical Perspective
ICML 2025 Poster
5
Outcome-Based Online Reinforcement Learning: Algorithms and Fundamental Limits
NeurIPS 2025 Poster
4
Trajectory Bellman Residual Minimization: A Simple Value-Based Method for LLM Reasoning
NeurIPS 2025 Poster