Impact Index

99.06/100

Top 0.1%

Site-wide rank: #20

Papers published: 75

Average rating: 5.7

Average annual output: 25.0 papers/year

Quanquan Gu

Research Scientist @ ByteDance Seed · United States · OpenReview

Research Areas

large language models · deep generative models · AI for Science · reinforcement learning · optimization · deep learning · high-dimensional statistics · active learning · online learning

Scalable Spatio-Temporal SE(3) Diffusion for Long-Horizon Protein Dynamics

Breaking the Total Variance Barrier: Sharp Sample Complexity for Linear Heteroscedastic Bandits with Fixed Action Set

Towards a Sharp Analysis of Offline Policy Learning for $f$-Divergence-Regularized Contextual Bandits

SWINGARENA: Adversarial Programming Arena for Long-context GitHub Issue Solving

Understanding SGD with Exponential Moving Average: A Case Study in Linear Regression

Variance-Dependent Regret Lower Bounds for Contextual Bandits

Protein Autoregressive Modeling via Multiscale Structure Generation

Best-of-Majority: Minimax-Optimal Strategy for Pass@k Inference Scaling

On the Design of KL-Regularized Policy Gradient Algorithms for LLM Reasoning

On the Limits of Test-Time Compute: Sequential Reward Filtering for Better Inference

Transformers Trained via Gradient Descent Can Provably Learn a Class of Teacher Models

Causal Attention with Lookahead Keys

RSPO: Regularized Self-Play Alignment of Large Language Models

Towards Simple and Provable Parameter-Free Adaptive Gradient Methods

Group Representational Position Encoding

Retrieval as Reasoning: Learning to Select and Generate with LLMs

MARS: Unleashing the Power of Variance Reduction for Training Large Models

Simultaneous Modeling of Protein Conformation and Dynamics via Autoregression

Elucidating the Design Space of Multimodal Protein Language Models

Logarithmic Regret for Online KL-Regularized Reinforcement Learning

Tensor Product Attention Is All You Need

An All-Atom Generative Model for Designing Protein Complexes

Designing Cyclic Peptides via Harmonic SDE with Atom-Bond Modeling

Convergence of Score-Based Discrete Diffusion Models: A Discrete-Time Analysis

Variance-Aware Feel-Good Thompson Sampling for Contextual Bandits

Sharp Analysis for KL-Regularized Contextual Bandits and RLHF

Mitigating Object Hallucination in Large Vision-Language Models via Image-Grounded Guidance

Unified Convergence Analysis for Score-Based Diffusion Models with Deterministic Samplers

CryoFM: A Flow-based Foundation Model for Cryo-EM Densities

ProteinBench: A Holistic Evaluation of Protein Foundation Models

DPLM-2: A Multimodal Diffusion Protein Language Model

Energy-Weighted Flow Matching for Offline Reinforcement Learning

Nearly Optimal Algorithms for Contextual Dueling Bandits from Adversarial Feedback

Global Convergence and Rich Feature Learning in $L$-Layer Infinite-Width Neural Networks under $\mu$ Parametrization

Beyond Bradley-Terry Models: A General Preference Model for Language Model Alignment

Beyond-Expert Performance with Limited Demonstrations: Efficient Imitation Learning with Double Exploration

Self-Play Preference Optimization for Language Model Alignment

Ranking with Multiple Oracles: From Weak to Strong Stochastic Transitivity

CoPS: Empowering LLM Agents with Provable Cross-Task Experience Sharing

Multi-Step Preference Optimization via Two-Player Markov Games

Accelerated Preference Optimization for Large Language Model Alignment

General Preference Modeling with Preference Representations for Aligning Language Models

LLaVA-Critic: Learning to Evaluate Multimodal Models

Imbalance-Regularized LoRA: A Plug-and-Play Method for Improving Fine-Tuning of Foundation Models

ProteinWeaver: A Divide-and-Assembly Approach for Protein Backbone Design

Decomposed Direct Preference Optimization for Structure-Based Drug Design

Relative-Translation Invariant Wasserstein Distance

Enhancing Multi-Step Reasoning Abilities of Language Models through Direct Q-Function Optimization