Simon Shaolei Du
~Simon_Shaolei_Du1
Total papers: 32
Submissions per year: 16.0
Average rating
Acceptance: 26/32
Venue distribution
ICLR: 14
NeurIPS: 13
COLM: 3
ICML: 2
Published papers (32)
2025: 18 papers
5
Understanding the Gain from Data Filtering in Multimodal Contrastive Learning
NeurIPS 2025 Poster
5
A Minimalist Example of Edge-of-Stability and Progressive Sharpening
NeurIPS 2025 Poster
4
Extragradient Preference Optimization (EGPO): Beyond Last-Iterate Convergence for Nash Learning from Human Feedback
COLM 2025 Poster
4
The Crucial Role of Samplers in Online Direct Preference Optimization
ICLR 2025 Poster
4
Highlighting What Matters: Promptable Embeddings for Attribute-Focused Image Retrieval
NeurIPS 2025 Poster
4
On Erroneous Agreements of CLIP Image Embeddings
ICLR 2025 Rejected
4
Transformers are Efficient Compilers, Provably
COLM 2025 Poster
4
Deployment Efficient Reward-Free Exploration with Linear Function Approximation
NeurIPS 2025 Poster
4
Cross-environment Cooperation Enables Zero-shot Multi-agent Coordination
ICML 2025 Oral
4
Minimax Optimal Regret Bound for Reinforcement Learning with Trajectory Feedback
ICML 2025 Poster
6
Minimax Optimal Regret Bound for Reinforcement Learning with Trajectory Feedback
ICLR 2025 Rejected
4
LoRe: Personalizing LLMs via Low-Rank Reward Modeling
COLM 2025 Poster
4
SHARP: Accelerating Language Model Inference by SHaring Adjacent layers with Recovery Parameters
ICLR 2025 Rejected
4
Deployment Efficient Reward-Free Exploration with Linear Function Approximation
ICLR 2025 Rejected
4
Sharp Gap-Dependent Variance-Aware Regret Bounds for Tabular MDPs
NeurIPS 2025 Poster
4
Transformers are Efficient Compilers, Provably
ICLR 2025 Rejected
4
Multi-Agent Reinforcement Learning from Human Feedback: Data Coverage and Algorithmic Techniques
ICLR 2025 Rejected
4
Reinforcement Learning for Reasoning in Large Language Models with One Training Example
NeurIPS 2025 Poster
2024: 14 papers
4
Understanding the Gains from Repeated Self-Distillation
NeurIPS 2024 Poster
4
Toward Global Convergence of Gradient EM for Over-Parameterized Gaussian Mixture Models
NeurIPS 2024 Poster
4
How Over-Parameterization Slows Down Gradient Descent in Matrix Sensing: The Curses of Symmetry and Initialization
ICLR 2024 Spotlight
5
Learning Optimal Tax Design in Nonatomic Congestion Games
NeurIPS 2024 Poster
4
Learning to Cooperate with Humans using Generative Agents
NeurIPS 2024 Poster
4
Distributional Successor Features Enable Zero-Shot Policy Optimization
NeurIPS 2024 Poster
4
Horizon-Free Regret for Linear Markov Decision Processes
ICLR 2024 Poster
4
Unleashing the Power of Pre-trained Language Models for Offline Reinforcement Learning
ICLR 2024 Poster
3
Dichotomy of Early and Late Phase Implicit Biases Can Provably Induce Grokking
ICLR 2024 Poster
4
JoMA: Demystifying Multilayer Transformers via Joint Dynamics of MLP and Attention
ICLR 2024 Poster
4
A Black-box Approach for Non-stationary Multi-agent Reinforcement Learning
ICLR 2024 Poster
4
Free from Bellman Completeness: Trajectory Stitching via Model-based Return-conditioned Supervised Learning
ICLR 2024 Poster
3
CLIPLoss and Norm-Based Data Selection Methods for Multimodal Contrastive Learning
NeurIPS 2024 Spotlight
4
Decoding-Time Language Model Alignment with Multiple Objectives
NeurIPS 2024 Poster