Impact Index

98.55/100

Top 0.1%

Site-wide rank #37

Papers published: 50

Average rating: 5.9

Average annual output: 16.7 papers/year

Yuandong Tian

Research Scientist @ Meta AI (FAIR) · OpenReview

Research Areas

Reinforcement Learning · Representation Learning · Contrastive Learning · Optimization

LLM Pretraining with Continuous Concepts

ICLR 2026 Poster

MobileLLM-R1: Exploring the Limits of Sub-Billion Language Model Reasoners with Open Training Recipes

ICLR 2026 Poster

Emergence of Superposition: Unveiling the Training Dynamics of Chain of Continuous Thought

ICLR 2026 Poster

Deep Think with Confidence

ICLR 2026 Poster

STEM: Scaling Transformers with Embedding Modules

ICLR 2026 Poster

$\mathbf{Li_2}$: A Framework on Dynamics of Feature Emergence and Delayed Generalization

ICLR 2026 Poster

Inpainting-Guided Policy Optimization for Diffusion Large Language Models

ICLR 2026 Poster

SPG: Sandwiched Policy Gradient for Masked Diffusion Language Models

ICLR 2026 Poster

Why RL Updates Look Sparse: An Implicit Compass Drives Optimization Bias

ICLR 2026 Rejected

Positional Encoding via Token-Aware Phase Attention

ICLR 2026 Rejected

Towards General-Purpose Model-Free Reinforcement Learning

ICLR 2025 Spotlight

Reasoning by Superposition: A Theoretical Perspective on Chain of Continuous Thought

NeurIPS 2025 Poster

ParetoQ: Improving Scaling Laws in Extremely Low-bit LLM Quantization

NeurIPS 2025 Poster

MagicPIG: LSH Sampling for Efficient LLM Generation

ICLR 2025 Spotlight

Agent-as-a-Judge: Evaluate Agents with Agents

ICML 2025 Poster

Composing Global Solutions to Reasoning Tasks via Algebraic Objects in Neural Nets

NeurIPS 2025 Poster

Sail into the Headwind: Alignment via Robust Rewards and Dynamic Labels against Reward Hacking

ICLR 2025 Poster

Param$\Delta$ for Direct Mixing: Post-Train Large Language Model At Zero Cost

ICLR 2025 Poster

AdvPrefix: An Objective for Nuanced LLM Jailbreaks

NeurIPS 2025 Poster

GSM-$\infty$: How Do your LLMs Behave over Infinitely Increasing Reasoning Complexity and Context Length?

ICML 2025 Poster

Token Assorted: Mixing Latent and Text Tokens for Improved Language Model Reasoning

ICML 2025 Poster

Training Large Language Models to Reason in a Continuous Latent Space

COLM 2025 Poster

AdvPrompter: Fast Adaptive Adversarial Prompting for LLMs

ICML 2025 Poster

From Low Rank Gradient Subspace Stabilization to Low-Rank Weights: Observations, Theories, and Applications

ICML 2025 Poster

SpinQuant: LLM Quantization with Learned Rotations

ICLR 2025 Poster

Training Large Language Model to Reason in a Continuous Latent Space

ICLR 2025 Rejected

Composing Global Optimizers to Reasoning Tasks via Algebraic Objects in Neural Nets

ICLR 2025 Rejected

Agent-as-a-Judge: Evaluating Agents with Agents

ICLR 2025 Rejected

R-Sparse: Rank-Aware Activation Sparsity for Efficient LLM Inference

ICLR 2025 Poster

Tensor-GaLore: Memory-Efficient Training via Gradient Tensor Decomposition

ICLR 2025 Rejected

Meta-Rewarding Language Models: Self-Improving Alignment with LLM-as-a-Meta-Judge

ICLR 2025 Rejected

AdvPrompter: Fast Adaptive Adversarial Prompting for LLMs

ICLR 2025 Rejected

Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients

ICLR 2025 Withdrawn

From GaLore to WeLore: How Low-Rank Weights Non-uniformly Emerge from Low-Rank Gradients

ICLR 2025 Rejected

Towards Full Delegation: Designing Ideal Agentic Behaviors for Travel Planning

ICLR 2025 Rejected

The Perfect Blend: Redefining RLHF with Mixture of Judges

ICLR 2025 Withdrawn

Dualformer: Controllable Fast and Slow Thinking by Learning with Randomized Reasoning Traces

ICLR 2025 Poster

SHARP: Accelerating Language Model Inference by SHaring Adjacent layers with Recovery Parameters

ICLR 2025 Rejected

Collaborators (20)

Changsheng Zhao