Bo An
~Bo_An2
49
Total papers
24.5
Average submissions per year
Average rating
Accepted: 25/49
Venue distribution
ICLR
32
NeurIPS
12
ICML
5
Papers (49)
2025 (33 papers)
4
Policy Optimization under Imperfect Human Interactions with Agent-Gated Shared Autonomy
ICLR 2025 (Poster)
4
Group-in-Group Policy Optimization for LLM Agent Training
NeurIPS 2025 (Poster)
4
Establishing Linear Surrogate Regret Bounds for Convex Smooth Losses via Convolutional Fenchel–Young Losses
NeurIPS 2025 (Spotlight)
6
Improving Ordinal Conformal Prediction by Stepwise Adaptive Posterior Alignment
ICLR 2025 (Withdrawn)
4
OPHR: Mastering Volatility Trading with Multi-Agent Deep Reinforcement Learning
NeurIPS 2025 (Poster)
4
Let's Revise Step-by-Step: A Unified Local Search Framework for Code Generation with LLMs
NeurIPS 2025 (Poster)
3
Dual-Head Knowledge Distillation: Enhancing Logits Utilization with an Auxiliary Head
ICLR 2025 (Withdrawn)
3
A Closer Look at Backdoor Attacks on CLIP
ICML 2025 (Poster)
4
A Closer Look at Backdoor Attacks on CLIP
ICLR 2025 (Rejected)
4
AgentStudio: A Toolkit for Building General Virtual Agents
ICLR 2025 (Poster)
4
MF-LLM: Simulating Population Decision Dynamics via a Mean-Field Large Language Model Framework
NeurIPS 2025 (Poster)
4
Incentivizing LLMs to Self-Verify Their Answers
NeurIPS 2025 (Poster)
4
Solving Urban Network Security Games: Learning Platform, Benchmark, and Challenge for AI Research
ICLR 2025 (Rejected)
3
Guiding VLM Agents with Process Rewards at Inference Time for GUI Navigation
ICLR 2025 (Rejected)
4
Policy Regularization on Globally Accessible States in Cross-Dynamics Reinforcement Learning
ICML 2025 (Spotlight)
4
Conformal Prediction for Deep Classifier via Truncating
ICLR 2025 (Rejected)
4
ASOR: Anchor State Oriented Regularization for Policy Optimization under Dynamics Shift
ICLR 2025 (Rejected)
4
Representation Surgery in Model Merging with Probabilistic Modeling
ICML 2025 (Poster)
3
Efficient LLM Alignment via Hierarchical Coarse-to-Fine Refinement
ICLR 2025 (Withdrawn)
3
Computing Ex Ante Equilibrium in Heterogeneous Zero-Sum Team Games
ICLR 2025 (Withdrawn)
4
Q*: Improving Multi-step Reasoning for LLMs with Deliberative Planning
ICLR 2025 (Withdrawn)
3
MEMO: Memory-Guided and Emotion-Aware Talking Video Generation
ICLR 2025 (Withdrawn)
4
Towards Efficient Online Tuning of VLM Agents via Counterfactual Soft Reinforcement Learning
ICML 2025 (Poster)
4
In-Context Learning for Games
ICLR 2025 (Rejected)
4
Offline Equilibrium Finding in Extensive-form Games: Datasets, Methods, and Analysis
ICLR 2025 (Rejected)
4
Deciphering the Extremes: A Novel Approach for Pathological Long-tailed Recognition in Scientific Discovery
NeurIPS 2025 (Spotlight)
4
Efficient Last-Iterate Convergence in Solving Extensive-Form Games
NeurIPS 2025 (Poster)
4
Q-Adapter: Customizing Pre-trained LLMs to New Preferences with Forgetting Mitigation
ICLR 2025 (Poster)
4
Improving Reward Models with Proximal Policy Exploration for Preference-Based Reinforcement Learning
NeurIPS 2025 (Poster)
4
Outward Odyssey: Improving Reward Models with Proximal Policy Exploration for Preference-Based Reinforcement Learning
ICLR 2025 (Rejected)
4
Empirical Study on Robustness and Resilience in Cooperative Multi-Agent Reinforcement Learning
NeurIPS 2025 (Poster)
4
Cradle: Empowering Foundation Agents towards General Computer Control
ICLR 2025 (Rejected)
3
Cradle: Empowering Foundation Agents towards General Computer Control
ICML 2025 (Poster)
2024 (16 papers)
4
DAG-Based Column Generation for Adversarial Team Games
ICLR 2024 (Rejected)
3
Synapse: Trajectory-as-Exemplar Prompting with Memory for Computer Control
ICLR 2024 (Poster)
4
Removing Length Bias in RLHF is not Enough
NeurIPS 2024 (Rejected)
4
Solving Homogeneous and Heterogeneous Cooperative Tasks with Greedy Sequential Execution
ICLR 2024 (Spotlight)
7
S$^2$AC: Energy-Based Reinforcement Learning with Stein Soft Actor Critic
ICLR 2024 (Poster)
5
Consistent Multi-Class Classification from Multiple Unlabeled Datasets
ICLR 2024 (Spotlight)
3
On the Vulnerability of Adversarially Trained Models Against Two-faced Attacks
ICLR 2024 (Poster)
4
AVOID: Alleviating VAE's Overestimation in Unsupervised OOD Detection
ICLR 2024 (Rejected)
4
MaNo: Exploiting Matrix Norm for Unsupervised Accuracy Estimation Under Distribution Shifts
NeurIPS 2024 (Poster)
4
True Knowledge Comes from Practice: Aligning Large Language Models with Embodied Environments via Reinforcement Learning
ICLR 2024 (Poster)
4
Gradient norm as a powerful proxy to out-of-distribution error estimation
ICLR 2024 (Rejected)
4
Learning Scalable Causal Discovery Policies with Adversarial Reinforcement Learning
ICLR 2024 (Rejected)
5
Unified Mirror Descent: Towards a Big Unification of Decision Making
ICLR 2024 (Rejected)
3
AdaRec: Adaptive Sequential Recommendation for Reinforcing Long-term User Engagement
ICLR 2024 (Rejected)
3
Towards Complete Expressiveness Capacity of Mixed Multi-Agent Q Value Function
ICLR 2024 (Rejected)
4
Keqing: Knowledge-based Question Answering is A Nature Chain-of-Thought mentor of LLMs
ICLR 2024 (Rejected)