Alec Koppel

Senior Scientist @ Johns Hopkins University Applied Physics Laboratory · United States · OpenReview

Research Interests

multi-armed bandits · reinforcement learning · Markov Decision Processes · kernel methods · continuous optimization · online learning · supervised learning · stochastic optimization · nonlinear programming

4.7

No One Size Fits All: QueryBandits for Hallucination Mitigation

ICLR 2026 · Rejected

Third author

7.8

Learning in Stackelberg Mean Field Games: A Non-Asymptotic Analysis

NeurIPS 2025 · Poster

Corresponding author

6.5

GenARM: Reward Guided Generation with Autoregressive Reward Model for Test-Time Alignment

ICLR 2025 · Poster

Third author

6.3

Collab: Controlled Decoding using Mixture of Agents for LLM Alignment

ICLR 2025 · Poster

5.8

SAIL: Self-improving Efficient Online Alignment of Large Language Models

Collaborators (20)

Udari Madhushani Sehwag

2 papers