PaperHub

Simon Shaolei Du

~Simon_Shaolei_Du1

Total papers: 32
Submissions per year: 16.0
Average rating: 6.2
Acceptance: 26/32
Venue distribution
ICLR: 14
NeurIPS: 13
COLM: 3
ICML: 2

Published papers (32)

2025 (18 papers)

6.4
5

Understanding the Gain from Data Filtering in Multimodal Contrastive Learning

NeurIPS 2025 Poster
7.1
5

A Minimalist Example of Edge-of-Stability and Progressive Sharpening

NeurIPS 2025 Poster
7.0
4

Extragradient Preference Optimization (EGPO): Beyond Last-Iterate Convergence for Nash Learning from Human Feedback

COLM 2025 Poster
6.0
4

The Crucial Role of Samplers in Online Direct Preference Optimization

ICLR 2025 Poster
6.4
4

Highlighting What Matters: Promptable Embeddings for Attribute-Focused Image Retrieval

NeurIPS 2025 Poster
4.5
4

On Erroneous Agreements of CLIP Image Embeddings

ICLR 2025 Rejected
6.8
4

Transformers are Efficient Compilers, Provably

COLM 2025 Poster
6.8
4

Deployment Efficient Reward-Free Exploration with Linear Function Approximation

NeurIPS 2025 Poster
7.8
4

Cross-environment Cooperation Enables Zero-shot Multi-agent Coordination

ICML 2025 Oral
6.1
4

Minimax Optimal Regret Bound for Reinforcement Learning with Trajectory Feedback

ICML 2025 Poster
5.5
6

Minimax Optimal Regret Bound for Reinforcement Learning with Trajectory Feedback

ICLR 2025 Rejected
6.5
4

LoRe: Personalizing LLMs via Low-Rank Reward Modeling

COLM 2025 Poster
4.8
4

SHARP: Accelerating Language Model Inference by SHaring Adjacent layers with Recovery Parameters

ICLR 2025 Rejected
4.8
4

Deployment Efficient Reward-Free Exploration with Linear Function Approximation

ICLR 2025 Rejected
7.8
4

Sharp Gap-Dependent Variance-Aware Regret Bounds for Tabular MDPs

NeurIPS 2025 Poster
3.8
4

Transformers are Efficient Compilers, Provably

ICLR 2025 Rejected
5.3
4

Multi-Agent Reinforcement Learning from Human Feedback: Data Coverage and Algorithmic Techniques

ICLR 2025 Rejected
7.3
4

Reinforcement Learning for Reasoning in Large Language Models with One Training Example

NeurIPS 2025 Poster

2024 (14 papers)