Yang Yu
~Yang_Yu5
Total papers: 38
Average submissions per year: 19.0
Average rating
Acceptance: 26/38
Venue distribution
ICLR: 24
NeurIPS: 9
ICML: 5
Published papers (38)
2025 (24 papers)
4
Uncertainty-Sensitive Privileged Learning
NeurIPS 2025 Poster
5
Hindsight Preference Learning for Offline Preference-based Reinforcement Learning
ICLR 2025 Rejected
6
On the Optimization Landscape of Low Rank Adaptation Methods for Large Language Models
ICLR 2025 Poster
4
Focus-Then-Reuse: Fast Adaptation in Visual Perturbation Environments
NeurIPS 2025 Poster
4
Safe Multi-task Pretraining with Constraint Prioritized Decision Transformer
ICLR 2025 Rejected
4
LLM-Assisted Semantically Diverse Teammate Generation for Efficient Multi-agent Coordination
ICML 2025 Poster
3
Behavior-Regularized Diffusion Policy Optimization for Offline Reinforcement Learning
ICML 2025 Poster
4
Efficient Multi-agent Offline Coordination via Diffusion-based Trajectory Stitching
ICLR 2025 Poster
3
Learning Generalizable Environment Models via Discovering Superposed Causal Relationships
ICLR 2025 Rejected
4
Adaptable Safe Policy Learning from Multi-task Data with Constraint Prioritized Decision Transformer
NeurIPS 2025 Poster
4
Improving Reward Model Generalization from Adversarial Process Enhanced Preferences
ICML 2025 Poster
3
Boosting Offline Multi-Objective Reinforcement Learning via Preference Conditioned Diffusion Models
ICLR 2025 Withdrawn
4
LLMOPT: Learning to Define and Solve General Optimization Problems from Scratch
ICLR 2025 Poster
3
Controlling Large Language Model with Latent Action
ICML 2025 Poster
4
Semantic Temporal Abstraction via Vision-Language Model Guidance for Efficient Reinforcement Learning
ICLR 2025 Poster
4
Diffusion-Guided Safe Policy Optimization From Cost-Label-Free Offline Dataset
ICLR 2025 Withdrawn
4
Q-Adapter: Customizing Pre-trained LLMs to New Preferences with Forgetting Mitigation
ICLR 2025 Poster
5
Haland: Human-AI Coordination via Policy Generation from Language-guided Diffusion
ICLR 2025 Rejected
4
Multi-Agent Imitation by Learning and Sampling from Factorized Soft Q-Function
NeurIPS 2025 Poster
4
Any-step Dynamics Model Improves Future Predictions for Online and Offline Reinforcement Learning
ICLR 2025 Poster
5
Learning View-invariant World Models for Visual Robotic Manipulation
ICLR 2025 Poster
3
Learning to Reuse Policies in State Evolvable Environments
ICML 2025 Poster
-
Whale-X: Learning Scalable Embodied World Models with Enhanced Generalizability
ICLR 2025 Withdrawn
4
SOO-Bench: Benchmarks for Evaluating the Stability of Offline Black-Box Optimization
ICLR 2025 Poster
2024 (14 papers)
4
Efficient Recurrent Off-Policy RL Requires a Context-Encoder-Specific Learning Rate
NeurIPS 2024 Poster
4
Reward-Consistent Dynamics Models are Strongly Generalizable for Offline Reinforcement Learning
ICLR 2024 Spotlight
4
Provably and Practically Efficient Adversarial Imitation Learning with General Function Approximation
NeurIPS 2024 Poster
4
Offline Imitation Learning without Auxiliary High-quality Behavior Data
ICLR 2024 Rejected
4
Effective Offline Environment Reconstruction when the Dataset is Collected from Diversified Behavior Policies
ICLR 2024 Rejected
3
Multi-Agent Domain Calibration with a Handful of Offline Data
NeurIPS 2024 Poster
4
Policy Learning from Tutorial Books via Understanding, Rehearsing and Introspecting
NeurIPS 2024 Oral
3
Flow to Better: Offline Preference-based Reinforcement Learning via Preferred Trajectory Generation
ICLR 2024 Poster
4
KALM: Knowledgeable Agents by Offline Reinforcement Learning from Large Language Model Rollouts
NeurIPS 2024 Poster
5
Language Model Self-improvement by Reinforcement Learning Contemplation
ICLR 2024 Poster
4
Learn to Achieve Out-of-the-Box Imitation Ability from Only One Demonstration
ICLR 2024 Rejected
4
Efficient Human-AI Coordination via Preparatory Language-based Convention
ICLR 2024 Rejected
4
Policy Rehearsing: Training Generalizable Policies for Reinforcement Learning
ICLR 2024 Poster
3
One by One, Continual Coordinating with Humans via Hyper-Teammate Identification
ICLR 2024 Withdrawn