PaperHub

Bo An

~Bo_An2

49
Total papers
24.5
Submissions per year
5.8
Average rating
Acceptance: 25/49
Venue distribution
ICLR
32
NeurIPS
12
ICML
5

Papers (49)

2025 (33 papers)

7.0
4

Policy Optimization under Imperfect Human Interactions with Agent-Gated Shared Autonomy

ICLR 2025 Poster
6.8
4

Group-in-Group Policy Optimization for LLM Agent Training

NeurIPS 2025 Poster
8.2
4

Establishing Linear Surrogate Regret Bounds for Convex Smooth Losses via Convolutional Fenchel–Young Losses

NeurIPS 2025 Spotlight
4.3
6

Improving Ordinal Conformal Prediction by Stepwise Adaptive Posterior Alignment

ICLR 2025 Withdrawn
6.8
4

OPHR: Mastering Volatility Trading with Multi-Agent Deep Reinforcement Learning

NeurIPS 2025 Poster
6.4
4

Let's Revise Step-by-Step: A Unified Local Search Framework for Code Generation with LLMs

NeurIPS 2025 Poster
4.7
3

Dual-Head Knowledge Distillation: Enhancing Logits Utilization with an Auxiliary Head

ICLR 2025 Withdrawn
6.3
3

A Closer Look at Backdoor Attacks on CLIP

ICML 2025 Poster
5.3
4

A Closer Look at Backdoor Attacks on CLIP

ICLR 2025 Rejected
6.5
4

AgentStudio: A Toolkit for Building General Virtual Agents

ICLR 2025 Poster
7.8
4

MF-LLM: Simulating Population Decision Dynamics via a Mean-Field Large Language Model Framework

NeurIPS 2025 Poster
6.4
4

Incentivizing LLMs to Self-Verify Their Answers

NeurIPS 2025 Poster
4.8
4

Solving Urban Network Security Games: Learning Platform, Benchmark, and Challenge for AI Research

ICLR 2025 Rejected
4.3
3

Guiding VLM Agents with Process Rewards at Inference Time for GUI Navigation

ICLR 2025 Rejected
7.8
4

Policy Regularization on Globally Accessible States in Cross-Dynamics Reinforcement Learning

ICML 2025 Spotlight
4.8
4

Conformal Prediction for Deep Classifier via Truncating

ICLR 2025 Rejected
4.8
4

ASOR: Anchor State Oriented Regularization for Policy Optimization under Dynamics Shift

ICLR 2025 Rejected
7.2
4

Representation Surgery in Model Merging with Probabilistic Modeling

ICML 2025 Poster
4.3
3

Efficient LLM Alignment via Hierarchical Coarse-to-Fine Refinement

ICLR 2025 Withdrawn
4.0
3

Computing Ex Ante Equilibrium in Heterogeneous Zero-Sum Team Games

ICLR 2025 Withdrawn
4.0
4

Q*: Improving Multi-step Reasoning for LLMs with Deliberative Planning

ICLR 2025 Withdrawn
4.7
3

MEMO: Memory-Guided and Emotion-Aware Talking Video Generation

ICLR 2025 Withdrawn
7.2
4

Towards Efficient Online Tuning of VLM Agents via Counterfactual Soft Reinforcement Learning

ICML 2025 Poster
4.5
4

In-Context Learning for Games

ICLR 2025 Rejected
4.3
4

Offline Equilibrium Finding in Extensive-form Games: Datasets, Methods, and Analysis

ICLR 2025 Rejected
8.2
4

Deciphering the Extremes: A Novel Approach for Pathological Long-tailed Recognition in Scientific Discovery

NeurIPS 2025 Spotlight
6.8
4

Efficient Last-Iterate Convergence in Solving Extensive-Form Games

NeurIPS 2025 Poster
6.0
4

Q-Adapter: Customizing Pre-trained LLMs to New Preferences with Forgetting Mitigation

ICLR 2025 Poster
6.4
4

Improving Reward Models with Proximal Policy Exploration for Preference-Based Reinforcement Learning

NeurIPS 2025 Poster
5.5
4

Outward Odyssey: Improving Reward Models with Proximal Policy Exploration for Preference-Based Reinforcement Learning

ICLR 2025 Rejected
6.4
4

Empirical Study on Robustness and Resilience in Cooperative Multi-Agent Reinforcement Learning

NeurIPS 2025 Poster
6.5
4

Cradle: Empowering Foundation Agents towards General Computer Control

ICLR 2025 Rejected
4.8
3

Cradle: Empowering Foundation Agents towards General Computer Control

ICML 2025 Poster

2024 (16 papers)

6.3
4

DAG-Based Column Generation for Adversarial Team Games

ICLR 2024 Rejected
7.3
3

Synapse: Trajectory-as-Exemplar Prompting with Memory for Computer Control

ICLR 2024 Poster
4.5
4

Removing Length Bias in RLHF is not Enough

NeurIPS 2024 Rejected
7.0
4

Solving Homogeneous and Heterogeneous Cooperative Tasks with Greedy Sequential Execution

ICLR 2024 Spotlight
5.7
7

S$^2$AC: Energy-Based Reinforcement Learning with Stein Soft Actor Critic

ICLR 2024 Poster
6.8
5

Consistent Multi-Class Classification from Multiple Unlabeled Datasets

ICLR 2024 Spotlight
7.3
3

On the Vulnerability of Adversarially Trained Models Against Two-faced Attacks

ICLR 2024 Poster
6.0
4

AVOID: Alleviating VAE's Overestimation in Unsupervised OOD Detection

ICLR 2024 Rejected
6.3
4

MaNo: Exploiting Matrix Norm for Unsupervised Accuracy Estimation Under Distribution Shifts

NeurIPS 2024 Poster
6.0
4

True Knowledge Comes from Practice: Aligning Large Language Models with Embodied Environments via Reinforcement Learning

ICLR 2024 Poster
5.5
4

Gradient norm as a powerful proxy to out-of-distribution error estimation

ICLR 2024 Rejected
5.3
4

Learning Scalable Causal Discovery Policies with Adversarial Reinforcement Learning

ICLR 2024 Rejected
4.2
5

Unified Mirror Descent: Towards a Big Unification of Decision Making

ICLR 2024 Rejected
5.0
3

AdaRec: Adaptive Sequential Recommendation for Reinforcing Long-term User Engagement

ICLR 2024 Rejected
5.3
3

Towards Complete Expressiveness Capacity of Mixed Multi-Agent Q Value Function

ICLR 2024 Rejected
4.5
4

Keqing: Knowledge-based Question Answering is A Nature Chain-of-Thought mentor of LLMs

ICLR 2024 Rejected