Banghua Zhu
~Banghua_Zhu1
Total papers: 13
Average submissions per year: 6.5
Average rating
Accepted: 7/13
Venue distribution
ICLR: 10
COLM: 2
ICML: 1
Papers (13)
2025: 5 papers
4
Taming Overconfidence in LLMs: Reward Calibration in RLHF
ICLR 2025 Poster
4
Watermarking using Semantic-aware Speculative Sampling: from Theory to Practice
ICLR 2025 Rejected
4
From Crowdsourced Data to High-quality Benchmarks: Arena-Hard and Benchbuilder Pipeline
ICML 2025 Poster
3
Bench-O-Matic: Automating Benchmark Curation from Crowdsourced Data
ICLR 2025 Rejected
4
How to Evaluate Reward Models for RLHF
ICLR 2025 Poster
2024: 8 papers
-
Data Refinement: Mitigating Reward Over-Optimization in Reinforcement Learning with Human Feedback
ICLR 2024 Withdrawn
4
Starling-7B: Improving Helpfulness and Harmlessness with RLAIF
COLM 2024 Poster
4
Fine-Tuning Language Models with Advantage-Induced Policy Alignment
ICLR 2024 Rejected
4
Towards the Fundamental Limits of Knowledge Transfer over Finite Domains
ICLR 2024 Poster
4
Pairwise Proximal Policy Optimization: Language Model Alignment with Comparative RL
COLM 2024 Poster
4
Pairwise Proximal Policy Optimization: Harnessing Relative Feedback for LLM Alignment
ICLR 2024 Rejected
4
The Effective Horizon Explains Deep RL Performance in Stochastic Environments
ICLR 2024 Spotlight
4
QFT: Quantized Full-parameter Tuning of LLMs with Affordable Resources
ICLR 2024 Withdrawn