Songjun Tu
~Songjun_Tu1
4
Total papers
4.0
Submissions per year
Average rating
Accepted: 4/4
Venue distribution
NeurIPS
2
COLM
1
ICLR
1
Published papers (4)
2025: 4 papers
4
Learning When to Think: Shaping Adaptive Reasoning in R1-Style Models via Multi-Stage RL
NeurIPS 2025 Poster
4
Enhancing LLM Reasoning with Iterative DPO: A Comprehensive Empirical Investigation
COLM 2025 Poster
4
AlphaDecay: Module-wise Weight Decay for Heavy-Tailed Balancing in LLMs
NeurIPS 2025 Poster
3
Unsupervised Zero-Shot Reinforcement Learning via Dual-Value Forward-Backward Representation
ICLR 2025 Poster