Han Zhong
~Han_Zhong1
11
Total papers
5.5
Average submissions per year
Average rating
Accepted: 8/11
Venue distribution
ICLR
5
ICML
3
NeurIPS
3
Papers (11)
2025 (5 papers)
4
DPO Meets PPO: Reinforced Token Optimization for RLHF
ICML 2025 Spotlight
3
BRiTE: Bootstrapping Reinforced Thinking Process to Enhance Language Model Reasoning
ICML 2025 Poster
4
Less is More: Improving LLM Alignment via Preference Data Selection
NeurIPS 2025 Spotlight
4
The Sample Complexity of Online Strategic Decision Making with Information Asymmetry and Knowledge Transportability
ICML 2025 Poster
-
Self-Exploring Language Models: Active Preference Elicitation for Online Alignment
ICLR 2025 Withdrawn
2024 (6 papers)
4
Provably Efficient Exploration in Quantum Reinforcement Learning with Logarithmic Worst-Case Regret
ICLR 2024 Rejected
4
Sample-efficient Learning of Infinite-horizon Average-reward MDPs with General Function Approximation
ICLR 2024 Poster
3
Active Probabilistic Clustering
ICLR 2024 Withdrawn
4
Rethinking Model-based, Policy-based, and Value-based Reinforcement Learning via the Lens of Representation Complexity
NeurIPS 2024 Poster
4
Towards Robust Offline Reinforcement Learning under Diverse Data Corruption
ICLR 2024 Spotlight
4
Distributionally Robust Reinforcement Learning with Interactive Data Collection: Fundamental Hardness and Near-Optimal Algorithms
NeurIPS 2024 Poster