Sergey Levine
~Sergey_Levine1
Total papers: 78
Average submissions per year: 39.0
Average rating
Accepted: 50/78
Venue distribution
ICLR: 46
NeurIPS: 19
ICML: 11
COLM: 2
Published papers (78)
2025 (47 papers)
4
Self-Challenging Language Model Agents
NeurIPS 2025 Poster
4
Zero-Shot Goal Dialogue via Reinforcement Learning on Imagined Conversations
ICLR 2025 Rejected
4
Annotation Bootstrapping: Reinforcing Visual Pre-Training using Unlabelled Images
ICLR 2025 Rejected
5
A Stable Whitening Optimizer for Efficient Neural Network Training
NeurIPS 2025 Poster
4
Q-SFT: Q-Learning for Language Models via Supervised Fine-Tuning
ICLR 2025 Poster
3
Scaling Test-Time Compute Without Verification or RL is Suboptimal
ICML 2025 Spotlight
4
Behavioral Exploration: Learning to Explore via In-Context Adaptation
ICML 2025 Poster
4
Cliqueformer: Model-Based Optimization With Structured Transformers
ICLR 2025 Rejected
4
Reinforcement Learning with Action Chunking
NeurIPS 2025 Poster
4
Real-Time Execution of Action Chunking Flow Policies
NeurIPS 2025 Poster
4
Planning without Search: Refining Frontier LLMs with Offline Goal-Conditioned RL
NeurIPS 2025 Poster
4
One Step Diffusion via Shortcut Models
ICLR 2025 Oral
4
Unsupervised-to-Online Reinforcement Learning
ICLR 2025 Rejected
4
Flow Q-Learning
ICML 2025 Poster
4
Interactive Dialogue Agents via Reinforcement Learning with Hindsight Regenerations
ICLR 2025 Rejected
4
ViVa: Video-Trained Value Functions for Guiding Online RL from Diverse Data
ICLR 2025 Rejected
4
Efficient Online Reinforcement Learning Fine-Tuning Need Not Retain Offline Data
ICLR 2025 Poster
4
Language Guided Skill Discovery
ICLR 2025 Poster
4
Offline Goal-conditioned Reinforcement Learning with Quasimetric Representations
NeurIPS 2025 Poster
5
Digi-Q: Learning VLM Q-Value Functions for Training Device-Control Agents
ICLR 2025 Poster
4
OGBench: Benchmarking Offline Goal-Conditioned RL
ICLR 2025 Poster
3
Vision-Language Models Provide Promptable Representations for Reinforcement Learning
ICLR 2025 Withdrawn
5
Leveraging Skills from Unlabeled Prior Data for Efficient Online Exploration
ICLR 2025 Rejected
4
Prioritized Generative Replay
ICLR 2025 Oral
4
Leveraging Skills from Unlabeled Prior Data for Efficient Online Exploration
ICML 2025 Poster
4
Consistently Simulating Human Personas with Multi-Turn Reinforcement Learning
NeurIPS 2025 Poster
3
Successor Representations Enable Emergent Compositional Instruction Following
ICLR 2025 Rejected
4
Temporal Representation Alignment: Successor Features Enable Emergent Compositionality in Robot Instruction Following
NeurIPS 2025 Poster
4
What Do Learning Dynamics Reveal About Generalization in LLM Mathematical Reasoning?
ICML 2025 Poster
4
Value-Based Deep RL Scales Predictably
ICML 2025 Poster
4
Pre-Memorization Train Accuracy Reliably Predicts Generalization in LLM Reasoning
ICLR 2025 Rejected
4
Compute-Optimal Scaling for Value-Based Deep RL
NeurIPS 2025 Poster
4
Adding Conditional Control to Diffusion Models with Reinforcement Learning
ICLR 2025 Poster
4
Defining Deception in Decision Making
ICLR 2025 Rejected
4
Horizon Reduction Makes RL Scalable
NeurIPS 2025 Spotlight
3
Proposer-Agent-Evaluator (PAE): Autonomous Skill Discovery For Foundation Model Internet Agents
ICML 2025 Poster
5
Proposer-Agent-Evaluator (PAE): Autonomous Skill Discovery For Foundation Model Internet Agents
ICLR 2025 Rejected
4
Fine-Tuning Discrete Diffusion Models via Reward Optimization with Applications to DNA and Protein Design
ICLR 2025 Poster
3
Reward-Guided Iterative Refinement in Diffusion Models at Test-Time with Applications to Protein and DNA Design
ICML 2025 Poster
4
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training
ICML 2025 Poster
4
LMRL Gym: Benchmarks for Multi-Turn Reinforcement Learning with Language Models
ICML 2025 Poster
4
LMRL Gym: Benchmarks for Multi-Turn Reinforcement Learning with Language Models
ICLR 2025 Rejected
4
Fine-Tuning of Continuous-Time Diffusion Models as Entropy-Regularized Control
ICLR 2025 Rejected
4
Derivative-Free Guidance in Continuous and Discrete Diffusion Models with Soft Value-based Decoding
NeurIPS 2025 Poster
5
Derivative-Free Guidance in Continuous and Discrete Diffusion Models with Soft Value-Based Decoding
ICLR 2025 Rejected
3
Knowledge Insulating Vision-Language-Action Models: Train Fast, Run Fast, Generalize Better
NeurIPS 2025 Spotlight
4
Hi Robot: Open-Ended Instruction Following with Hierarchical Vision-Language-Action Models
ICML 2025 Poster
2024 (31 papers)
3
Zero-Shot Goal-Directed Dialogue via RL on Imagined Conversations
ICLR 2024 Rejected
3
Offline RL with Observation Histories: Analyzing and Improving Sample Complexity
ICLR 2024 Poster
5
Learning to Assist Humans without Inferring Rewards
NeurIPS 2024 Poster
4
Conservative World Models
ICLR 2024 Rejected
3
Contrastive Representations Make Planning Easy
ICLR 2024 Rejected
4
Is Value Learning Really the Main Bottleneck in Offline RL?
NeurIPS 2024 Poster
4
METRA: Scalable Unsupervised RL with Metric-Aware Abstraction
ICLR 2024 Oral
4
Advantage-Conditioned Diffusion: Offline RL via Generalization
ICLR 2024 Rejected
4
Inference via Interpolation: Contrastive Representations Provably Enable Planning and Inference
NeurIPS 2024 Poster
4
Predicting Emergent Capabilities by Finetuning
COLM 2024 Poster
4
Deep Neural Networks Tend To Extrapolate Predictably
ICLR 2024 Poster
3
Project and Probe: Sample-Efficient Adaptation by Interpolating Orthogonal Features
ICLR 2024 Spotlight
4
Vision-Language Models Provide Promptable Representations for Reinforcement Learning
ICLR 2024 Rejected
3
Confidence-Based Model Selection: When to Take Shortcuts in Spurious Settings
ICLR 2024 Rejected
3
Autonomous Evaluation and Refinement of Digital Agents
COLM 2024 Poster
4
RLIF: Interactive Imitation Learning as Reinforcement Learning
ICLR 2024 Poster
4
Offline RL for Online RL: Decoupled Policy Learning for Mitigating Exploration Bias
ICLR 2024 Rejected
4
V-Former: Offline RL with Temporally-Extended Actions
ICLR 2024 Rejected
4
Training Diffusion Models with Reinforcement Learning
ICLR 2024 Poster
4
DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning
NeurIPS 2024 Poster
3
Adapt On-the-Go: Behavior Modulation for Single-Life Robot Deployment
ICLR 2024 Rejected
4
Latent Conservative Objective Models for Offline Data-Driven Crystal Structure Prediction
ICLR 2024 Rejected
4
Stabilizing Contrastive RL: Techniques for Robotic Goal Reaching from Offline Data
ICLR 2024 Spotlight
4
Zero-Shot Robotic Manipulation with Pre-Trained Image-Editing Diffusion Models
ICLR 2024 Poster
4
Bridging Model-Based Optimization and Generative Modeling via Conservative Fine-Tuning of Diffusion Models
NeurIPS 2024 Poster
3
Designing Cell-Type-Specific Promoter Sequences Using Conservative Model-Based Optimization
NeurIPS 2024 Poster
4
The False Promise of Imitating Proprietary Language Models
ICLR 2024 Spotlight
4
LMRL Gym: Benchmarks for Multi-Turn Reinforcement Learning with Language Models
ICLR 2024 Rejected
4
D5RL: Diverse Datasets for Data-Driven Deep Reinforcement Learning
ICLR 2024 Rejected
4
Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning
NeurIPS 2024 Poster
4
AutoRT: Embodied Foundation Models for Large Scale Orchestration of Robotic Agents
ICLR 2024 Rejected