Impact Index

97.44/100

Top 0.1%

Site-wide rank #75

Papers published: 39

Average rating: 6.0

Average output: 13.0 papers/year

Aaron Courville

Assistant Professor @ University of Montreal · OpenReview

Research areas

deep learning · probabilistic graphical models · neural networks · inference methods

The Markovian Thinker: Architecture-Agnostic Linear Scaling of Reasoning

ICLR 2026 Poster

Generative Adversarial Post-Training Mitigates Reward Hacking in Live Human-AI Music Interaction

ICLR 2026 Poster

The Intricate Dance of Prompt Complexity, Quality, Diversity and Consistency in T2I Models

ICLR 2026 Poster

Simplicial Embeddings Improve Sample Efficiency in Actor–Critic Agents

ICLR 2026 Poster

Towards Sustainable Investment Policies Informed by Opponent Shaping

ICLR 2026 Poster

A Comedy of Estimators: On KL Regularization in RL Training of LLMs

ICLR 2026 Rejected

Asymmetric Proximal Policy Optimization: mini-critics boost LLM reasoning

ICLR 2026 Poster

Shape of Thought: When Distribution Can Matter More than Correctness in Reasoning Tasks

ICLR 2026 Desk Rejected

Learning Robust Social Strategies with Large Language Models

ICLR 2026 Rejected

Stable Gradients for Stable Learning at Scale in Deep Reinforcement Learning

NeurIPS 2025 Spotlight

Advantage Alignment Algorithms

Don't flatten, tokenize! Unlocking the key to SoftMoE's efficacy in deep RL

ICLR 2025 Spotlight

Mixture-of-Recursions: Learning Dynamic Recursive Depths for Adaptive Token-Level Computation

NeurIPS 2025 Poster

Measure gradients, not activations! Enhancing neuronal activity in deep reinforcement learning

NeurIPS 2025 Poster

Compositional Discrete Latent Code for High Fidelity, Productive Diffusion Models

NeurIPS 2025 Poster

Forgetting Transformer: Softmax Attention with a Forget Gate

ICLR 2025 Poster

BiXSE: Improving Dense Retrieval via Probabilistic Graded Relevance Distillation

COLM 2025 Poster

The Courage to Stop: Overcoming Sunk Cost Fallacy in Deep Reinforcement Learning

ICML 2025 Poster

Mitigating Plasticity Loss in Continual Reinforcement Learning by Reducing Churn

ICML 2025 Poster

VinePPO: Refining Credit Assignment in RL Training of LLMs

ICML 2025 Poster

Neuroplastic Expansion in Deep Reinforcement Learning

ICLR 2025 Poster

Scaling Stick-Breaking Attention: An Efficient Implementation and In-depth Study

ICLR 2025 Poster

Adaptive Computation Pruning for the Forgetting Transformer

COLM 2025 Poster

The Impact of On-Policy Parallelized Data Collection on Deep Reinforcement Learning Networks

ICML 2025 Poster

Sample, Predict, then Proceed: Self-Verification Sampling for Tool Use of LLMs

NeurIPS 2025 Rejected

Asynchronous RLHF: Faster and More Efficient Off-Policy RL for Language Models

ICLR 2025 Poster

Not All LLM Reasoners Are Created Equal

ICLR 2025 Rejected

FLAM: Frame-Wise Language-Audio Modeling

ICML 2025 Poster

VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment

ICLR 2025 Rejected

Bias Analysis in Unconditional Image Generative Models

ICLR 2025 Rejected

Training Universal Text Encoders with Pair Relevance Classification Loss

ICLR 2025 Rejected

Collaborators (20)

Johan Obando-Ceron

Pablo Samuel Castro

Milad Aghajohari

Juan Agustin Duque

Alessandro Sordoni