Aaron Courville
~Aaron_Courville3
Total papers: 30
Average submissions per year: 15.0
Average rating
Acceptance: 23/30
Venue distribution
ICLR: 15
NeurIPS: 6
ICML: 5
COLM: 4
Published papers (30)
2025: 22 papers
5
Compositional Discrete Latent Code for High Fidelity, Productive Diffusion Models
NeurIPS 2025 (Poster)
5
Don't flatten, tokenize! Unlocking the key to SoftMoE's efficacy in deep RL
ICLR 2025 (Spotlight)
4
Bias Analysis in Unconditional Image Generative Models
ICLR 2025 (Rejected)
4
Neuroplastic Expansion in Deep Reinforcement Learning
ICLR 2025 (Poster)
3
The Impact of On-Policy Parallelized Data Collection on Deep Reinforcement Learning Networks
ICML 2025 (Poster)
4
Scaling Stick-Breaking Attention: An Efficient Implementation and In-depth Study
ICLR 2025 (Poster)
4
Sample, Predict, then Proceed: Self-Verification Sampling for Tool Use of LLMs
NeurIPS 2025 (Rejected)
3
Adaptive Computation Pruning for the Forgetting Transformer
COLM 2025 (Poster)
4
The Courage to Stop: Overcoming Sunk Cost Fallacy in Deep Reinforcement Learning
ICML 2025 (Poster)
4
Mitigating Plasticity Loss in Continual Reinforcement Learning by Reducing Churn
ICML 2025 (Poster)
4
BiXSE: Improving Dense Retrieval via Probabilistic Graded Relevance Distillation
COLM 2025 (Poster)
4
Training Universal Text Encoders with Pair Relevance Classification Loss
ICLR 2025 (Rejected)
4
Not All LLM Reasoners Are Created Equal
ICLR 2025 (Rejected)
4
Forgetting Transformer: Softmax Attention with a Forget Gate
ICLR 2025 (Poster)
4
Measure gradients, not activations! Enhancing neuronal activity in deep reinforcement learning
NeurIPS 2025 (Poster)
4
FLAM: Frame-Wise Language-Audio Modeling
ICML 2025 (Poster)
4
VinePPO: Refining Credit Assignment in RL Training of LLMs
ICML 2025 (Poster)
4
Asynchronous RLHF: Faster and More Efficient Off-Policy RL for Language Models
ICLR 2025 (Poster)
4
VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment
ICLR 2025 (Rejected)
4
Stable Gradients for Stable Learning at Scale in Deep Reinforcement Learning
NeurIPS 2025 (Spotlight)
4
Advantage Alignment Algorithms
ICLR 2025 (Oral)
4
Mixture-of-Recursions: Learning Dynamic Recursive Depths for Adaptive Token-Level Computation
NeurIPS 2025 (Poster)
2024: 8 papers
6
Meta-Value Learning: a General Framework for Learning with Learning Awareness
ICLR 2024 (Rejected)
4
V-STaR: Training Verifiers for Self-Taught Reasoners
COLM 2024 (Poster)
3
Scattered Mixture-of-Experts Implementation
COLM 2024 (Poster)
4
GenRL: Multimodal-foundation world models for generalization in embodied agents
NeurIPS 2024 (Poster)
4
Diffusion Generative Flow Samplers: Improving learning signals through partial trajectory optimization
ICLR 2024 (Poster)
4
The Curse of Diversity in Ensemble-Based Exploration
ICLR 2024 (Poster)
5
LOQA: Learning with Opponent Q-Learning Awareness
ICLR 2024 (Poster)
4
Best Response Shaping
ICLR 2024 (Rejected)