Wen Sun
~Wen_Sun1
28
Total papers
14.0
Submissions per year
Average rating
Accepted: 23/28
Venue distribution
ICLR
17
NeurIPS
8
ICML
3
Papers (28)
2025: 17 papers
4
Avoiding exp(R) scaling in RLHF through Preference-based Exploration
NeurIPS 2025 Poster
4
Efficient Imitation under Misspecification
ICLR 2025 Poster
4
Computationally Efficient RL under Linear Bellman Completeness for Deterministic Dynamics
ICLR 2025 Oral
4
Model-based RL as a Minimalist Approach to Horizon-Free and Second-Order Bounds
ICLR 2025 Poster
5
A Reductions Approach to Risk-Sensitive Reinforcement Learning with Optimized Certainty Equivalents
ICML 2025 Poster
3
Convergence of Consistency Model with Multistep Sampling under General Data Assumptions
ICML 2025 Poster
4
Convergence of Consistency Model with Multistep Sampling under General Data Assumptions
ICLR 2025 Rejected
5
Diffusing States and Matching Scores: A New Framework for Imitation Learning
ICLR 2025 Poster
4
Accelerating RL for LLM Reasoning with Optimal Advantage Regression
NeurIPS 2025 Poster
5
Correcting the Mythos of KL-Regularization: Direct Alignment without Overoptimization via Chi-Squared Preference Optimization
ICLR 2025 Spotlight
4
On Orchestrating Personalized LLMs
ICLR 2025 Rejected
4
On Speeding Up Language Model Evaluation
ICLR 2025 Poster
5
Value-Guided Search for Efficient Chain-of-Thought Reasoning
NeurIPS 2025 Poster
4
Regressing the Relative Future: Efficient Policy Optimization for Multi-turn RLHF
ICLR 2025 Poster
3
$Q\sharp$: Provably Optimal Distributional RL for LLM Post-Training
ICML 2025 Rejected
4
$Q\sharp$: Provably Optimal Distributional RL for LLM Post-Training
NeurIPS 2025 Poster
5
Scaling Offline RL via Efficient and Expressive Shortcut Models
NeurIPS 2025 Poster
2024: 11 papers
4
Making RL with Preference-based Feedback Efficient via Randomization
ICLR 2024 Poster
4
Provable Reward-Agnostic Preference-Based Reinforcement Learning
ICLR 2024 Spotlight
4
Offline Data Enhanced On-Policy Policy Gradient with Provable Guarantees
ICLR 2024 Poster
4
Efficient and Sharp Off-Policy Evaluation in Robust Markov Decision Processes
NeurIPS 2024 Poster
4
The Importance of Online Data: Understanding Preference Fine-tuning via Coverage
NeurIPS 2024 Poster
4
Learning to Generate Better than your Large Language Models
ICLR 2024 Rejected
4
Provable Offline Preference-Based Reinforcement Learning
ICLR 2024 Spotlight
4
Adversarial Imitation Learning via Boosting
ICLR 2024 Poster
4
JoinGym: An Efficient Query Optimization Environment for Reinforcement Learning
ICLR 2024 Rejected
4
Provably Efficient CVaR RL in Low-rank MDPs
ICLR 2024 Poster
4
REBEL: Reinforcement Learning via Regressing Relative Rewards
NeurIPS 2024 Poster