Dieuwke Hupkes
~Dieuwke_Hupkes1
4
Total papers
4.0
Submissions per year
Average rating
Accepted: 1/4
Venue distribution
ICLR
3
COLM
1
Published papers (4)
2025: 4 papers
5
Judging the Judges: Evaluating Alignment and Vulnerabilities in LLMs-as-Judges
ICLR 2025 (Withdrawn)
6
Quantifying Variance in Evaluation Benchmarks
ICLR 2025 (Rejected)
4
Correlating and Predicting Human Evaluations of Language Models from Natural Language Processing Benchmarks
ICLR 2025 (Rejected)
3
MLGym: A New Framework and Benchmark for Advancing AI Research Agents
COLM 2025 (Poster)