Zhaoran Wang
~Zhaoran_Wang1
17
Total papers
8.5
Average submissions per year
Average rating
Accepted: 8/17
Venue distribution
ICLR
12
ICML
4
NeurIPS
1
Papers (17)
2025 (12 papers)
4
Are Transformers Able to Reason by Connecting Separated Knowledge in Training Data?
ICLR 2025 Poster
3
An Instrumental Value for Data Production and its Application to Data Pricing
ICML 2025 Poster
-
Hindsight Planner: A Closed-loop few-shot planner for Embodied Instruction Following
ICLR 2025 Withdrawn
4
Provably Efficient and Practical Self-Play for Better LLM Alignment
ICLR 2025 Rejected
4
Human-Instruction-Free LLM Self-Alignment with Limited Samples
ICLR 2025 Rejected
4
The Sample Complexity of Online Strategic Decision Making with Information Asymmetry and Knowledge Transportability
ICML 2025 Poster
4
Progressive LLM Alignments Using Two-Player Games
ICLR 2025 Rejected
4
How Can LLM Guide RL? A Value-Based Approach
ICLR 2025 Withdrawn
-
Self-Exploring Language Models: Active Preference Elicitation for Online Alignment
ICLR 2025 Withdrawn
5
Reward-Augmented Data Enhances Direct Preference Alignment of LLMs
ICLR 2025 Rejected
4
Reward-Augmented Data Enhances Direct Preference Alignment of LLMs
ICML 2025 Poster
3
BRiTE: Bootstrapping Reinforced Thinking Process to Enhance Language Model Reasoning
ICML 2025 Poster
2024 (5 papers)
5
Sample-Efficient Multi-Agent RL: An Optimization Perspective
ICLR 2024 Poster
3
What and How does In-Context Learning Learn? Bayesian Model Averaging, Parameterization, and Generalization
ICLR 2024 Rejected
4
Reason for Future, Act for Now: A Principled Architecture for Autonomous LLM Agents
ICLR 2024 Rejected
4
Let Models Speak Ciphers: Multiagent Debate through Embeddings
ICLR 2024 Poster
3
Provably Mitigating Overoptimization in RLHF: Your SFT Loss is Implicitly an Adversarial Regularizer
NeurIPS 2024 Poster