Yaodong Yang
~Yaodong_Yang1
38
Total papers
19.0
Average submissions per year
Average rating
Accepted: 23/38
Venue distribution
ICLR
24
NeurIPS
11
ICML
3
Publications (38 papers)
2025: 20 papers
3
Random Feature Models with Learnable Activation Functions
ICLR 2025 Rejected
4
Mitigating Reward Over-Optimization in RLHF via Behavior-Supported Regularization
ICLR 2025 Poster
3
Mixed Hierarchical Oracle and Multi-Agent Benchmark in Two-player Zero-sum Games
ICLR 2025 Withdrawn
4
Emerging Safety Attack and Defense in Federated Instruction Tuning of Large Language Models
ICLR 2025 Poster
4
SAE-V: Interpreting Multimodal Models for Enhanced Alignment
ICML 2025 Poster
4
Risk-aware Direct Preference Optimization under Nested Risk Measure
ICML 2025 Rejected
3
Computing Ex Ante Equilibrium in Heterogeneous Zero-Sum Team Games
ICLR 2025 Withdrawn
3
Risk-aware Direct Preference Optimization under Nested Risk Measure
NeurIPS 2025 Poster
4
Iterative Training of Language Models with Opponent Modeling for Red Teaming Data Generation
ICLR 2025 Rejected
4
SafeVLA: Towards Safety Alignment of Vision-Language-Action Model via Constrained Learning
NeurIPS 2025 Spotlight
4
In-Context Editing: Learning Knowledge from Self-Induced Distributions
ICLR 2025 Poster
4
Amulet: ReAlignment During Test Time for Personalized Preference Adaptation of LLMs
ICLR 2025 Poster
3
Social World Model-Augmented Mechanism Design Policy Learning
NeurIPS 2025 Poster
4
Generative RLHF-V: Learning Principles from Multi-modal Human Preference
NeurIPS 2025 Poster
4
STAR: Efficient Preference-based Reinforcement Learning via Dual Regularization
NeurIPS 2025 Poster
4
DexFlyWheel: A Scalable and Self-improving Data Generation Framework for Dexterous Manipulation
NeurIPS 2025 Spotlight
4
Magnetic Preference Optimization: Achieving Last-iterate Convergence for Language Model Alignment
ICLR 2025 Poster
4
Falcon: Fast Visuomotor Policies via Partial Denoising
ICML 2025 Poster
4
Safe RLHF-V: Safe Reinforcement Learning from Multi-modal Human Feedback
NeurIPS 2025 Poster
4
Empirical Study on Robustness and Resilience in Cooperative Multi-Agent Reinforcement Learning
NeurIPS 2025 Poster
2024: 18 papers
4
Reason to Behave: Achieving Human-Like Task Execution for Physics-Based Characters
ICLR 2024 Withdrawn
4
Boosting Multi-Agent Reinforcement Learning via Transition-Informed Representations
ICLR 2024 Rejected
3
Measuring Value Understanding in Language Models through Discriminator-Critique Gap
ICLR 2024 Withdrawn
4
Planning with Theory of Mind for Few-Shot Adaptation in Sequential Social Dilemmas
ICLR 2024 Rejected
4
Open-Ended Learning in General-Sum Games: The Role of Diversity in Correlated Equilibrium
ICLR 2024 Rejected
4
BATTLE: Towards Behavior-oriented Adversarial Attacks against Deep Reinforcement Learning
ICLR 2024 Rejected
4
SafeDreamer: Safe Reinforcement Learning with World Models
ICLR 2024 Poster
3
Scalable Constrained Policy Optimization for Safe Multi-agent Reinforcement Learning
NeurIPS 2024 Poster
5
Masked Pretraining for Multi-Agent Decision Making
ICLR 2024 Withdrawn
4
MultiReAct: Multimodal Tools Augmented Reasoning-Acting Traces for Embodied Agent Planning
ICLR 2024 Rejected
4
Red Teaming Game: A Game-Theoretic Framework for Red Teaming Language Models
ICLR 2024 Withdrawn
4
Maximum Entropy Heterogeneous-Agent Reinforcement Learning
ICLR 2024 Spotlight
3
Panacea: Pareto Alignment via Preference Adaptation for LLMs
NeurIPS 2024 Poster
3
Byzantine Robust Cooperative Multi-Agent Reinforcement Learning as a Bayesian Game
ICLR 2024 Poster
4
Safe RLHF: Safe Reinforcement Learning from Human Feedback
ICLR 2024 Spotlight
4
Heterogeneous Value Alignment Evaluation for Large Language Models
ICLR 2024 Withdrawn
4
Aligner: Efficient Alignment by Learning to Correct
NeurIPS 2024 Oral
3
CivRealm: A Learning and Reasoning Odyssey in Civilization for Decision-Making Agents
ICLR 2024 Spotlight