Impact index

68.85/100

Top 2.9%

Site-wide rank #1,835

Papers published: 11

Average rating: 5.8

Annual output: 5.5 papers/year

Han Zhong

PhD student @ Peking University · OpenReview

Less is More: Improving LLM Alignment via Preference Data Selection

NeurIPS 2025 Spotlight

DPO Meets PPO: Reinforced Token Optimization for RLHF

ICML 2025 Spotlight

BRiTE: Bootstrapping Reinforced Thinking Process to Enhance Language Model Reasoning

ICML 2025 Poster

The Sample Complexity of Online Strategic Decision Making with Information Asymmetry and Knowledge Transportability

ICML 2025 Poster

Self-Exploring Language Models: Active Preference Elicitation for Online Alignment

ICLR 2025 Withdrawn

Towards Robust Offline Reinforcement Learning under Diverse Data Corruption

ICLR 2024 Spotlight

Sample-efficient Learning of Infinite-horizon Average-reward MDPs with General Function Approximation

ICLR 2024 Poster

Provably Efficient Exploration in Quantum Reinforcement Learning with Logarithmic Worst-Case Regret

ICLR 2024 Rejected

Distributionally Robust Reinforcement Learning with Interactive Data Collection: Fundamental Hardness and Near-Optimal Algorithms

NeurIPS 2024 Poster

Rethinking Model-based, Policy-based, and Value-based Reinforcement Learning via the Lens of Representation Complexity

NeurIPS 2024 Poster

Active Probabilistic Clustering

ICLR 2024 Withdrawn

Collaborators (20)