Impact Index

96.76/100

Top 0.2%

Site-wide rank #104

Papers published: 51

Average rating: 5.4

Average annual output: 17.0 papers/year

Zhouchen Lin

Professor @ Peking University · China · OpenReview

Research Areas

Training of Deep Neural Networks · Neural Network Architecture Design · Convex and Nonconvex Optimization

Online Pseudo-Zeroth-Order Training of Neuromorphic Spiking Neural Networks

ICLR 2026 · Poster

DistDF: Time-series Forecasting Needs Joint-distribution Wasserstein Alignment

ICLR 2026 · Poster

Quadratic Direct Forecast for Training Multi-Step Time-Series Forecast Models

ICLR 2026 · Poster

LogiConBench: Benchmarking Logical Consistencies of LLMs

ICLR 2026 · Poster

DNT: a Deeply Normalized Transformer that can be trained by Momentum SGD

ICLR 2026 · Poster

ProofRM: A Scalable Pipeline to Train a Generalized Math Proof Reward Model

ICLR 2026 · Rejected

Rethinking the Flow-based Gradual Domain Adaption: A Semi-Dual Transport Perspective

ICLR 2026 · Rejected

Transformers with Endogenous In-Context Learning: Bias Characterization and Mitigation

ICLR 2026 · Poster

Partial Identification via Optimal Transport under Complex Constraints on Treatments and Potential Outcome Measures

ICLR 2026 · Rejected

VACT: A Video Automatic Causal Testing System and a Benchmark

ICLR 2026 · Rejected

Enhancing Logical Reasoning of Large Language Models via Phased Fine-Tuning

ICLR 2026 · Rejected

Governing Equation Discovery from Data Based on Differential Invariants

ICLR 2026 · Rejected

Personal Tokens Matter: Towards Token-Aware Training for Personalized LLMs

ICLR 2026 · Rejected

Conda: Column-Normalized Adam for Training Large Language Models Faster

ICLR 2026 · Withdrawn

GL-Fusion: Rethinking the Combination of Graph Neural Network and Large Language model

ICLR 2026 · Rejected

Simple Convergence Proof of Adam From a Sign-like Descent Perspective

ICLR 2026 · Rejected

Projective Equivariant Networks via Second-order Fundamental Differential Invariants

NeurIPS 2025 · Spotlight

Time-o1: Time-Series Forecasting Needs Transformed Label Alignment

NeurIPS 2025 · Poster

On the $O(\frac{\sqrt{d}}{K^{1/4}})$ Convergence Rate of AdamW Measured by $\ell_1$ Norm

NeurIPS 2025 · Poster

Stepsize anything: A unified learning rate schedule for budgeted-iteration training

NeurIPS 2025 · Poster

AdaMSS: Adaptive Multi-Subspace Approach for Parameter-Efficient Fine-Tuning

NeurIPS 2025 · Poster

Active Treatment Effect Estimation via Limited Samples

ICML 2025 · Poster

Pyramidal Flow Matching for Efficient Video Generative Modeling

ICLR 2025 · Poster

PaZO: Preconditioned Accelerated Zeroth-Order Optimization for Fine-Tuning LLMs

NeurIPS 2025 · Poster

Improving Model Representation and Reducing KV Cache via Skip Connections with First Value Heads

NeurIPS 2025 · Poster

Inverse Methods for Missing Data Imputation

NeurIPS 2025 · Poster

Unbiased Recommender Learning from Implicit Feedback via Weakly Supervised Learning

ICML 2025 · Poster

Number Cookbook: Number Understanding of Language Models and How to Improve It

ICLR 2025 · Poster

TC-MoE: Augmenting Mixture of Experts with Ternary Expert Choice

ICLR 2025 · Poster

Language Ranker: A Lightweight Ranking framework for LLM Decoding

NeurIPS 2025 · Poster

Explicit Discovery of Nonlinear Symmetries from Dynamic Data

ICML 2025 · Poster

SEPARATE: A Simple Low-rank Projection for Gradient Compression in Modern Large-scale Model Training Process

ICLR 2025 · Poster

Tool Decoding: A Plug-and-Play Approach to Enhancing Language Models for Tool Usage

ICLR 2025 · Withdrawn

PseuZO: Pseudo-Zeroth-Order Algorithm for Training Deep Neural Networks

NeurIPS 2025 · Poster

Affine Steerable Equivariant Layer for Canonicalization of Neural Networks

ICLR 2025 · Poster

Low-Dimension-to-High-Dimension Generalization and Its Implications for Length Generalization

ICML 2025 · Poster

MLAE: Masked LoRA Experts for Visual Parameter-Efficient Fine-Tuning

ICLR 2025 · Rejected

GL-Fusion: Rethinking the Combination of Graph Neural Network and Large Language model

ICLR 2025 · Rejected

Low-Dimension-to-High-Dimension Generalization and Its Implications for Length Generalization

ICLR 2025 · Rejected

Finding Second-order Stationary Points for Generalized-Smooth Nonconvex Minimax Optimization via Gradient-based Algorithm

ICLR 2025 · Rejected

Provable Faster Zeroth-order Method for Bilevel Optimization with Optimal Dependency on Error and Dimension

ICLR 2025 · Rejected

Incorporating Arbitrary Matrix Group Equivariance into KANs

ICML 2025 · Poster

EKAN: Equivariant Kolmogorov-Arnold Networks

ICLR 2025 · Withdrawn

Variance-Reduced Normalized Zeroth Order Method for Generalized-Smooth Non-Convex Optimization

ICLR 2025 · Rejected

Collaborators (20)