Jiantao Jiao (~Jiantao_Jiao1)

Total papers: 19
Submissions per year (avg.): 9.5
Average rating:
Accepted: 12/19

Conference distribution: ICLR 9, NeurIPS 6, ICML 2, COLM 2
Publications (19)

2025 (10 papers)
4 | Thinking LLMs: General Instruction Following with Thought Generation | ICML 2025, Poster
4 | Active-Dormant Attention Heads: Mechanistically Demystifying Extreme-Token Phenomena in LLMs | ICLR 2025, Rejected
4 | Reasoning by Superposition: A Theoretical Perspective on Chain of Continuous Thought | NeurIPS 2025, Poster
3 | Token Assorted: Mixing Latent and Text Tokens for Improved Language Model Reasoning | ICML 2025, Poster
4 | Generalization or Hallucination? Understanding Out-of-Context Reasoning in Transformers | NeurIPS 2025, Poster
3 | Information-Driven Design of Imaging Systems | NeurIPS 2025, Poster
3 | EmbedLLM: Learning Compact Representations of Large Language Models | ICLR 2025, Spotlight
4 | Meta-Rewarding Language Models: Self-Improving Alignment with LLM-as-a-Meta-Judge | ICLR 2025, Rejected
4 | How to Evaluate Reward Models for RLHF | ICLR 2025, Poster
4 | Watermarking using Semantic-aware Speculative Sampling: from Theory to Practice | ICLR 2025, Rejected
2024 (9 papers)
4 | An Analysis of Tokenization: Transformers under Markov Data | NeurIPS 2024, Spotlight
- | Data Refinement: Mitigating Reward Over-Optimization in Reinforcement Learning with Human Feedback | ICLR 2024, Withdrawn
5 | Toxicity Detection for Free | NeurIPS 2024, Spotlight
4 | Towards a Theoretical Understanding of the 'Reversal Curse' via Training Dynamics | NeurIPS 2024, Poster
4 | Pairwise Proximal Policy Optimization: Language Model Alignment with Comparative RL | COLM 2024, Poster
4 | Pairwise Proximal Policy Optimization: Harnessing Relative Feedback for LLM Alignment | ICLR 2024, Rejected
3 | End-to-end Story Plot Generator | ICLR 2024, Rejected
4 | Fine-Tuning Language Models with Advantage-Induced Policy Alignment | ICLR 2024, Rejected
4 | Starling-7B: Improving Helpfulness and Harmlessness with RLAIF | COLM 2024, Poster
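The summary figures at the top (19 papers, 9.5 submissions per year, 12 of 19 accepted, and the venue breakdown) follow directly from the per-paper records above. A minimal Python sketch that tallies them is below; the Paper dataclass and PAPERS list are an illustrative transcription of this page's entries rather than any site's API, and the unlabeled leading number on each row is omitted because its meaning is not stated here.

```python
# Minimal sketch: re-derive the profile's summary statistics from the
# per-paper records listed above (venue and decision only).
from collections import Counter
from dataclasses import dataclass


@dataclass
class Paper:
    year: int
    venue: str      # ICLR / NeurIPS / ICML / COLM
    decision: str   # Poster / Spotlight / Rejected / Withdrawn


# Transcribed from the 2025 (10 entries) and 2024 (9 entries) lists above.
PAPERS = [
    Paper(2025, "ICML", "Poster"), Paper(2025, "ICLR", "Rejected"),
    Paper(2025, "NeurIPS", "Poster"), Paper(2025, "ICML", "Poster"),
    Paper(2025, "NeurIPS", "Poster"), Paper(2025, "NeurIPS", "Poster"),
    Paper(2025, "ICLR", "Spotlight"), Paper(2025, "ICLR", "Rejected"),
    Paper(2025, "ICLR", "Poster"), Paper(2025, "ICLR", "Rejected"),
    Paper(2024, "NeurIPS", "Spotlight"), Paper(2024, "ICLR", "Withdrawn"),
    Paper(2024, "NeurIPS", "Spotlight"), Paper(2024, "NeurIPS", "Poster"),
    Paper(2024, "COLM", "Poster"), Paper(2024, "ICLR", "Rejected"),
    Paper(2024, "ICLR", "Rejected"), Paper(2024, "ICLR", "Rejected"),
    Paper(2024, "COLM", "Poster"),
]

total = len(PAPERS)                                    # 19
per_year = total / len({p.year for p in PAPERS})       # 19 / 2 = 9.5
accepted = sum(p.decision in {"Poster", "Spotlight"} for p in PAPERS)  # 12
venues = Counter(p.venue for p in PAPERS)              # ICLR 9, NeurIPS 6, ICML 2, COLM 2

print(f"Total papers: {total}")
print(f"Submissions per year (avg.): {per_year}")
print(f"Accepted: {accepted}/{total}")
print(f"Conference distribution: {dict(venues)}")
```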