PaperHub

Sergey Levine (~Sergey_Levine1)

Total papers: 78
Avg. submissions per year: 39.0
Average rating: 5.9
Acceptance: 50/78 (64.1%)

Venue distribution: ICLR 46, NeurIPS 19, ICML 11, COLM 2

Published papers (78)

Each line below: title | venue and decision | average review score (number of reviews).
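The header figures are plain aggregates of the per-paper records below. As a minimal sketch (assuming a simple tuple-per-record layout; PaperHub's actual data model and code are not shown on this page), they can be recomputed like so:

```python
# Minimal sketch, not PaperHub's actual code: recompute the summary
# figures above from the per-paper list below.
# Each record: (title, venue, year, decision, avg_rating, n_reviews).
from collections import Counter

papers = [
    ("Self-Challenging Language Model Agents", "NeurIPS", 2025, "Poster", 7.3, 4),
    ("Zero-Shot Goal Dialogue via Reinforcement Learning on Imagined Conversations",
     "ICLR", 2025, "Rejected", 3.5, 4),
    # ... the remaining 76 records, transcribed from the list below ...
]

total = len(papers)
years = {p[2] for p in papers}
per_year = total / len(years)                    # 78 / 2 years = 39.0
avg_rating = sum(p[4] for p in papers) / total   # mean of the per-paper review averages
accepted = sum(1 for p in papers if p[3] not in ("Rejected", "Withdrawn"))
venues = Counter(p[1] for p in papers)           # e.g. ICLR 46, NeurIPS 19, ...

print(f"{total} papers, {per_year:.1f}/year, avg rating {avg_rating:.1f}, "
      f"accepted {accepted}/{total} ({accepted / total:.0%})")
print("venue distribution:", dict(venues))
```

With all 78 records filled in, this should reproduce the header: 78 papers over two years (39.0/year), 50/78 accepted (about 64%), and the venue counts above.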

2025 (47 papers)

Self-Challenging Language Model Agents | NeurIPS 2025, Poster | avg 7.3 (4 reviews)
Zero-Shot Goal Dialogue via Reinforcement Learning on Imagined Conversations | ICLR 2025, Rejected | avg 3.5 (4 reviews)
Annotation Bootstrapping: Reinforcing Visual Pre-Training using Unlabelled Images | ICLR 2025, Rejected | avg 4.3 (4 reviews)
A Stable Whitening Optimizer for Efficient Neural Network Training | NeurIPS 2025, Poster | avg 6.4 (5 reviews)
Q-SFT: Q-Learning for Language Models via Supervised Fine-Tuning | ICLR 2025, Poster | avg 7.0 (4 reviews)
Scaling Test-Time Compute Without Verification or RL is Suboptimal | ICML 2025, Spotlight | avg 6.3 (3 reviews)
Behavioral Exploration: Learning to Explore via In-Context Adaptation | ICML 2025, Poster | avg 4.9 (4 reviews)
Cliqueformer: Model-Based Optimization With Structured Transformers | ICLR 2025, Rejected | avg 5.5 (4 reviews)
Reinforcement Learning with Action Chunking | NeurIPS 2025, Poster | avg 6.4 (4 reviews)
Real-Time Execution of Action Chunking Flow Policies | NeurIPS 2025, Poster | avg 6.8 (4 reviews)
Planning without Search: Refining Frontier LLMs with Offline Goal-Conditioned RL | NeurIPS 2025, Poster | avg 6.0 (4 reviews)
One Step Diffusion via Shortcut Models | ICLR 2025, Oral | avg 8.0 (4 reviews)
Unsupervised-to-Online Reinforcement Learning | ICLR 2025, Rejected | avg 4.3 (4 reviews)
Flow Q-Learning | ICML 2025, Poster | avg 6.1 (4 reviews)
Interactive Dialogue Agents via Reinforcement Learning with Hindsight Regenerations | ICLR 2025, Rejected | avg 4.8 (4 reviews)
ViVa: Video-Trained Value Functions for Guiding Online RL from Diverse Data | ICLR 2025, Rejected | avg 4.3 (4 reviews)
Efficient Online Reinforcement Learning Fine-Tuning Need Not Retain Offline Data | ICLR 2025, Poster | avg 6.5 (4 reviews)
Language Guided Skill Discovery | ICLR 2025, Poster | avg 7.0 (4 reviews)
Offline Goal-conditioned Reinforcement Learning with Quasimetric Representations | NeurIPS 2025, Poster | avg 6.4 (4 reviews)
Digi-Q: Learning VLM Q-Value Functions for Training Device-Control Agents | ICLR 2025, Poster | avg 4.8 (5 reviews)
OGBench: Benchmarking Offline Goal-Conditioned RL | ICLR 2025, Poster | avg 7.0 (4 reviews)
Vision-Language Models Provide Promptable Representations for Reinforcement Learning | ICLR 2025, Withdrawn | avg 4.7 (3 reviews)
Leveraging Skills from Unlabeled Prior Data for Efficient Online Exploration | ICLR 2025, Rejected | avg 5.8 (5 reviews)
Prioritized Generative Replay | ICLR 2025, Oral | avg 7.5 (4 reviews)
Leveraging Skills from Unlabeled Prior Data for Efficient Online Exploration | ICML 2025, Poster | avg 6.1 (4 reviews)
Consistently Simulating Human Personas with Multi-Turn Reinforcement Learning | NeurIPS 2025, Poster | avg 6.8 (4 reviews)
Successor Representations Enable Emergent Compositional Instruction Following | ICLR 2025, Rejected | avg 4.7 (3 reviews)
Temporal Representation Alignment: Successor Features Enable Emergent Compositionality in Robot Instruction Following | NeurIPS 2025, Poster | avg 7.8 (4 reviews)
What Do Learning Dynamics Reveal About Generalization in LLM Mathematical Reasoning? | ICML 2025, Poster | avg 6.6 (4 reviews)
Value-Based Deep RL Scales Predictably | ICML 2025, Poster | avg 5.5 (4 reviews)
Pre-Memorization Train Accuracy Reliably Predicts Generalization in LLM Reasoning | ICLR 2025, Rejected | avg 4.3 (4 reviews)
Compute-Optimal Scaling for Value-Based Deep RL | NeurIPS 2025, Poster | avg 7.3 (4 reviews)
Adding Conditional Control to Diffusion Models with Reinforcement Learning | ICLR 2025, Poster | avg 6.5 (4 reviews)
Defining Deception in Decision Making | ICLR 2025, Rejected | avg 4.0 (4 reviews)
Horizon Reduction Makes RL Scalable | NeurIPS 2025, Spotlight | avg 8.7 (4 reviews)
Proposer-Agent-Evaluator (PAE): Autonomous Skill Discovery For Foundation Model Internet Agents | ICML 2025, Poster | avg 4.8 (3 reviews)
Proposer-Agent-Evaluator (PAE): Autonomous Skill Discovery For Foundation Model Internet Agents | ICLR 2025, Rejected | avg 5.8 (5 reviews)
Fine-Tuning Discrete Diffusion Models via Reward Optimization with Applications to DNA and Protein Design | ICLR 2025, Poster | avg 6.0 (4 reviews)
Reward-Guided Iterative Refinement in Diffusion Models at Test-Time with Applications to Protein and DNA Design | ICML 2025, Poster | avg 7.0 (3 reviews)
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training | ICML 2025, Poster | avg 4.9 (4 reviews)
LMRL Gym: Benchmarks for Multi-Turn Reinforcement Learning with Language Models | ICML 2025, Poster | avg 5.5 (4 reviews)
LMRL Gym: Benchmarks for Multi-Turn Reinforcement Learning with Language Models | ICLR 2025, Rejected | avg 4.3 (4 reviews)
Fine-Tuning of Continuous-Time Diffusion Models as Entropy-Regularized Control | ICLR 2025, Rejected | avg 5.8 (4 reviews)
Derivative-Free Guidance in Continuous and Discrete Diffusion Models with Soft Value-based Decoding | NeurIPS 2025, Poster | avg 7.8 (4 reviews)
Derivative-Free Guidance in Continuous and Discrete Diffusion Models with Soft Value-Based Decoding | ICLR 2025, Rejected | avg 3.8 (5 reviews)
Knowledge Insulating Vision-Language-Action Models: Train Fast, Run Fast, Generalize Better | NeurIPS 2025, Spotlight | avg 8.8 (3 reviews)
Hi Robot: Open-Ended Instruction Following with Hierarchical Vision-Language-Action Models | ICML 2025, Poster | avg 6.1 (4 reviews)

2024 (31 papers)

Zero-Shot Goal-Directed Dialogue via RL on Imagined Conversations | ICLR 2024, Rejected | avg 5.3 (3 reviews)
Offline RL with Observation Histories: Analyzing and Improving Sample Complexity | ICLR 2024, Poster | avg 5.0 (3 reviews)
Learning to Assist Humans without Inferring Rewards | NeurIPS 2024, Poster | avg 5.4 (5 reviews)
Conservative World Models | ICLR 2024, Rejected | avg 4.8 (4 reviews)
Contrastive Representations Make Planning Easy | ICLR 2024, Rejected | avg 4.3 (3 reviews)
Is Value Learning Really the Main Bottleneck in Offline RL? | NeurIPS 2024, Poster | avg 7.0 (4 reviews)
METRA: Scalable Unsupervised RL with Metric-Aware Abstraction | ICLR 2024, Oral | avg 7.5 (4 reviews)
Advantage-Conditioned Diffusion: Offline RL via Generalization | ICLR 2024, Rejected | avg 4.3 (4 reviews)
Inference via Interpolation: Contrastive Representations Provably Enable Planning and Inference | NeurIPS 2024, Poster | avg 6.3 (4 reviews)
Predicting Emergent Capabilities by Finetuning | COLM 2024, Poster | avg 6.0 (4 reviews)
Deep Neural Networks Tend To Extrapolate Predictably | ICLR 2024, Poster | avg 7.0 (4 reviews)
Project and Probe: Sample-Efficient Adaptation by Interpolating Orthogonal Features | ICLR 2024, Spotlight | avg 7.0 (3 reviews)
Vision-Language Models Provide Promptable Representations for Reinforcement Learning | ICLR 2024, Rejected | avg 5.5 (4 reviews)
Confidence-Based Model Selection: When to Take Shortcuts in Spurious Settings | ICLR 2024, Rejected | avg 5.3 (3 reviews)
Autonomous Evaluation and Refinement of Digital Agents | COLM 2024, Poster | avg 6.7 (3 reviews)
RLIF: Interactive Imitation Learning as Reinforcement Learning | ICLR 2024, Poster | avg 6.5 (4 reviews)
Offline RL for Online RL: Decoupled Policy Learning for Mitigating Exploration Bias | ICLR 2024, Rejected | avg 6.0 (4 reviews)
V-Former: Offline RL with Temporally-Extended Actions | ICLR 2024, Rejected | avg 4.3 (4 reviews)
Training Diffusion Models with Reinforcement Learning | ICLR 2024, Poster | avg 6.3 (4 reviews)
DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning | NeurIPS 2024, Poster | avg 6.0 (4 reviews)
Adapt On-the-Go: Behavior Modulation for Single-Life Robot Deployment | ICLR 2024, Rejected | avg 5.7 (3 reviews)
Latent Conservative Objective Models for Offline Data-Driven Crystal Structure Prediction | ICLR 2024, Rejected | avg 3.8 (4 reviews)
Stabilizing Contrastive RL: Techniques for Robotic Goal Reaching from Offline Data | ICLR 2024, Spotlight | avg 7.3 (4 reviews)
Zero-Shot Robotic Manipulation with Pre-Trained Image-Editing Diffusion Models | ICLR 2024, Poster | avg 6.3 (4 reviews)
Bridging Model-Based Optimization and Generative Modeling via Conservative Fine-Tuning of Diffusion Models | NeurIPS 2024, Poster | avg 6.3 (4 reviews)
Designing Cell-Type-Specific Promoter Sequences Using Conservative Model-Based Optimization | NeurIPS 2024, Poster | avg 7.0 (3 reviews)
The False Promise of Imitating Proprietary Language Models | ICLR 2024, Spotlight | avg 7.0 (4 reviews)
LMRL Gym: Benchmarks for Multi-Turn Reinforcement Learning with Language Models | ICLR 2024, Rejected | avg 5.5 (4 reviews)
D5RL: Diverse Datasets for Data-Driven Deep Reinforcement Learning | ICLR 2024, Rejected | avg 4.8 (4 reviews)
Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning | NeurIPS 2024, Poster | avg 5.8 (4 reviews)
AutoRT: Embodied Foundation Models for Large Scale Orchestration of Robotic Agents | ICLR 2024, Rejected | avg 5.3 (4 reviews)