Impact Index

97.28/100

Top 0.1%

Site-wide rank: #81

Papers published: 73

Average rating: 5.4

Average annual output: 24.3 papers/year

Bo An

Full Professor @ Nanyang Technological University · Singapore · OpenReview

Research Interests

multi-agent systems · reinforcement learning

SimpleTIR: End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning

ICLR 2026 · Poster

Hierarchy-of-Groups Policy Optimization for Long-Horizon Agentic Tasks

ICLR 2026 · Poster

SMAN-Bench: A Cross-System Benchmark for Mobile Agents under Single- and Multi-path, Ambiguous, and Noisy Tasks

ICLR 2026 · Poster

Generative Auto-Bidding in Large-Scale Auctions via Diffusion Completer-Aligner

ICLR 2026 · Rejected

MobileIPL: Enhancing Mobile Agents Thinking Process via Iterative Preference Learning

ICLR 2026 · Poster

Vulnerable Agent Identification in Large-Scale Multi-Agent Reinforcement Learning

ICLR 2026 · Rejected

Enhancing Extreme Weather Forecasting via Dynamically Weighted MSE

ICLR 2026 · Rejected

Controlling Video Generation with Vision Language Models

ICLR 2026 · Rejected

PAC Reasoning: Controlling the Performance Loss for Efficient Reasoning

ICLR 2026 · Rejected

OSCAR: Orthogonal Stochastic Control for Alignment-Respecting Diversity in Flow Matching

ICLR 2026 · Desk Rejected

Uncertainty-Aware Tree Search for Efficient LLM Reasoning

ICLR 2026 · Rejected

Offline Equilibrium Finding in Extensive-form Games: Datasets, Methods, and Analysis

ICLR 2026 · Rejected

DHEvo: Data-Algorithm Based Heuristic Evolution for Generalizable MILP Solving

ICLR 2026 · Withdrawn

Model-Heterogeneous Federated Prompt Learning

ICLR 2026 · Withdrawn

Solving Puzzles? Jailbreaking Multimodal Large Language Models!

ICLR 2026 · Withdrawn

FM-IRL: Flow-Matching for Reward Modeling and Policy Regularization in Reinforcement Learning

ICLR 2026 · Rejected

AgentOrchestra: Orchestrating Hierarchical Multi-Agent Intelligence with the Tool-Environment-Agent (TEA) Protocol

ICLR 2026 · Withdrawn

Fusing LLMs with Scientific Literature for Heuristic Discovery

ICLR 2026 · Rejected

MASLab: A Unified and Comprehensive Codebase for LLM-based Multi-Agent Systems

ICLR 2026 · Withdrawn

Anatomy of a Hybrid Mind: Deconstructing Hybrid Reasoning in Large Language Models

ICLR 2026 · Withdrawn

Adversarial Test Case Generation via Reinforcement Learning Extends Scaling Laws

ICLR 2026 · Rejected

REAR: Scalable Test-time Preference Realignment through Reward Decomposition

ICLR 2026 · Rejected

Value Shaping: Bias Reduction in Bellman Error for Deep Reinforcement Learning

ICLR 2026 · Withdrawn

StarDojo: Benchmarking Open-Ended Behaviors of Agentic Multimodal LLMs in Production–Living Simulations with Stardew Valley

ICLR 2026 · Desk Rejected

Establishing Linear Surrogate Regret Bounds for Convex Smooth Losses via Convolutional Fenchel–Young Losses

NeurIPS 2025 · Spotlight

Deciphering the Extremes: A Novel Approach for Pathological Long-tailed Recognition in Scientific Discovery

NeurIPS 2025 · Spotlight

MF-LLM: Simulating Population Decision Dynamics via a Mean-Field Large Language Model Framework

NeurIPS 2025 · Poster

Policy Regularization on Globally Accessible States in Cross-Dynamics Reinforcement Learning

ICML 2025 · Spotlight

Representation Surgery in Model Merging with Probabilistic Modeling

ICML 2025 · Poster

Towards Efficient Online Tuning of VLM Agents via Counterfactual Soft Reinforcement Learning

ICML 2025 · Poster

Policy Optimization under Imperfect Human Interactions with Agent-Gated Shared Autonomy

ICLR 2025 · Poster

Group-in-Group Policy Optimization for LLM Agent Training

NeurIPS 2025 · Poster

OPHR: Mastering Volatility Trading with Multi-Agent Deep Reinforcement Learning

NeurIPS 2025 · Poster

Efficient Last-Iterate Convergence in Solving Extensive-Form Games

NeurIPS 2025 · Poster

AgentStudio: A Toolkit for Building General Virtual Agents

ICLR 2025 · Poster

Cradle: Empowering Foundation Agents towards General Computer Control

ICLR 2025 · Rejected

Let's Revise Step-by-Step: A Unified Local Search Framework for Code Generation with LLMs

NeurIPS 2025 · Poster

Incentivizing LLMs to Self-Verify Their Answers

NeurIPS 2025 · Poster

Improving Reward Models with Proximal Policy Exploration for Preference-Based Reinforcement Learning

NeurIPS 2025 · Poster

Empirical Study on Robustness and Resilience in Cooperative Multi-Agent Reinforcement Learning

NeurIPS 2025 · Poster

A Closer Look at Backdoor Attacks on CLIP

ICML 2025 · Poster

Q-Adapter: Customizing Pre-trained LLMs to New Preferences with Forgetting Mitigation

ICLR 2025 · Poster

Outward Odyssey: Improving Reward Models with Proximal Policy Exploration for Preference-Based Reinforcement Learning

ICLR 2025 · Rejected

A Closer Look at Backdoor Attacks on CLIP

ICLR 2025 · Rejected

Conformal Prediction for Deep Classifier via Truncating

ICLR 2025 · Rejected

Solving Urban Network Security Games: Learning Platform, Benchmark, and Challenge for AI Research

ICLR 2025 · Rejected

ASOR: Anchor State Oriented Regularization for Policy Optimization under Dynamics Shift

ICLR 2025 · Rejected

Cradle: Empowering Foundation Agents towards General Computer Control

ICML 2025 · Poster

Dual-Head Knowledge Distillation: Enhancing Logits Utilization with an Auxiliary Head

ICLR 2025 · Withdrawn

MEMO: Memory-Guided and Emotion-Aware Talking Video Generation

ICLR 2025 · Withdrawn

In-Context Learning for Games

ICLR 2025 · Rejected

Improving Ordinal Conformal Prediction by Stepwise Adaptive Posterior Alignment

ICLR 2025 · Withdrawn

Guiding VLM Agents with Process Rewards at Inference Time for GUI Navigation

ICLR 2025 · Rejected

Efficient LLM Alignment via Hierarchical Coarse-to-Fine Refinement

ICLR 2025 · Withdrawn

Offline Equilibrium Finding in Extensive-form Games: Datasets, Methods, and Analysis

ICLR 2025 · Rejected

Q*: Improving Multi-step Reasoning for LLMs with Deliberative Planning

ICLR 2025 · Withdrawn

Computing Ex Ante Equilibrium in Heterogeneous Zero-Sum Team Games

ICLR 2025 · Withdrawn

Collaborators (20)