Impact Index

99.46/100

Top 0.1%

Site-wide rank: #8

Papers published: 124

Average score: 5.5

Average annual output: 41.3 papers/year

Dacheng Tao

Full Professor @ Nanyang Technological University · Singapore · OpenReview

Research interests

deep learning theory · deep learning · trustworthy artificial intelligence · multitask learning · transfer learning · learning theory · nonnegative matrix factorization · low-rank and sparse decomposition · metric learning · computer vision · machine learning · pattern recognition · image processing

6.0

CoFact: Conformal Factuality Guarantees for Language Models under Covariate Shift

ICLR 2026 Poster

Corresponding author

5.5

MaskPro: Linear-Space Probabilistic Learning for Strict (N:M)-Sparsity on LLMs

ICLR 2026 Poster

Corresponding author

5.5

Epistemic-Aware Vision–Language Foundation Model for Fetal Ultrasound Interpretation

ICLR 2026 Withdrawn

5.3

FACTS: A Future-Aided Causal Teacher-Student Framework for Multimodal Time Series Forecasting

ICLR 2026 Rejected

Corresponding author

5.2

Memory Efficient Fine-Tuning of LLMs via Forward-Only Hessian-Free Coordinate Descent

ICLR 2026 Rejected

Corresponding author

5.0

AQER: A Scalable and Efficient Data Loader for Digital Quantum Computers

ICLR 2026 Poster

Corresponding author

5.0

Theoretical Guarantees for Iterative Alignment of Self-Rewarding Language Models

ICLR 2026 Rejected

Corresponding author

5.0

Better, Faster: Harnessing Self-Improvement in Large Reasoning Models

ICLR 2026 Rejected

Corresponding author

5.0

Incentivizing LLM Reasoning via Reinforcement Learning with Functional Monte Carlo Tree Search

ICLR 2026 Poster

5.0

Unsupervised Reinforcement Learning with Verifiable Rewards via First Repeat Criterion

ICLR 2026 Rejected

Corresponding author

5.0

NBSP: A Neuron-Level Framework for Balancing Stability and Plasticity in Deep Reinforcement Learning

ICLR 2026 Rejected

Corresponding author

5.0

VeriWeb: Verifiable Long-Chain Web Benchmark for Agentic Information-Seeking

ICLR 2026 Rejected

Corresponding author

4.5

GeometryZero: Advancing LLM Geometry Solving via Group Contrastive Policy Optimization

ICLR 2026 Withdrawn

4.5

The State of Reinforcement Finetuning for Transformer-based Agents

ICLR 2026 Poster

Corresponding author

4.5

Towards a Theoretical Understanding of In-context Learning: Stability and Non-I.I.D Generalisation

ICLR 2026 Poster

Corresponding author

4.5

Time Is All It Takes: Spike-Retiming Attacks on Event-Driven Spiking Neural Networks

ICLR 2026 Poster

4.5

A Simple "Motivation" Can Enhance Reinforcement Finetuning of Large Reasoning Models

ICLR 2026 Poster

Corresponding author

4.5

MedReason-Dx: Benchmarking Step-by-Step Reasoning of Language Models in Medical Diagnosis

ICLR 2026 Rejected

Corresponding author

4.4

Revisiting LLM Reasoning via Information Bottleneck

ICLR 2026 Rejected

Corresponding author

4.0

TeZO: Empowering the Low-Rankness on the Temporal Dimension in the Zeroth-Order Optimization for Fine-tuning LLMs

ICLR 2026 Rejected

Corresponding author

4.0

Rethinking the Role of Dynamic Sparse Training for Scalable Deep Reinforcement Learning

ICLR 2026 Rejected

Corresponding author

4.0

EvoSyn: Generalizable Evolutionary Data Synthesis for Verifiable Learning

ICLR 2026 Withdrawn

Corresponding author

4.0

SimReg: Achieving Higher Convergence and Generalization in the LLM Pretraining via Embedding Similarity Regularization

ICLR 2026 Rejected

4.0

SeWA: Selective Weight Average via Probabilistic Masking

ICLR 2026 Rejected

Corresponding author

3.5

A Bias–Variance Tradeoff Perspective for Improving Test-Time Scaling

ICLR 2026 Rejected

Third author

3.5

Q-learning Penalized Transformer for Safe Offline Reinforcement Learning

ICLR 2026 Rejected

Corresponding author

3.5

Minutes to Converage: Dataset Distillation for Rapid SNN Training on Event Streams

ICLR 2026 Withdrawn

3.5

Intra-Trajectory Consistency for Reward Modeling

ICLR 2026 Rejected

Corresponding author

3.3

SEA-SpeechBench: A Large-Scale Multitask Benchmark for Speech Understanding Across Southeast Asia

ICLR 2026 Withdrawn

3.0

Auditing Test Data Contamination with Error Rate Control for Reliable LLM Evaluation

ICLR 2026 Withdrawn

Corresponding author

2.5

What Makes Large Language Models Undistillable?

ICLR 2026 Withdrawn

Corresponding author

2.5

MAPO: MIXED ADVANTAGE POLICY OPTIMIZATION

ICLR 2026 Withdrawn

Corresponding author

8.3

Network Sparsity Unlocks the Scaling Potential of Deep Reinforcement Learning

ICML 2025 Oral

Corresponding author

8.3

Retrieval-Augmented Perception: High-resolution Image Perception Meets Visual RAG

ICML 2025 Oral

Corresponding author

8.2

Adaptive Defense against Harmful Fine-Tuning for Large Language Models via Bayesian Data Scheduler

NeurIPS 2025 Spotlight

Corresponding author

8.0

Open-Vocabulary Customization from CLIP via Data-Free Knowledge Distillation

ICLR 2025 Oral

Corresponding author

8.0

Dacheng Tao

Convergent Differential Privacy Analysis for General Federated Learning

OptMerge: Unifying Multimodal LLM Capabilities and Modalities via Model Merging

EchoBench: Benchmarking Sycophancy in Medical Large Vision-Language Models

Combatting Dimensional Collapse in LLM Pre-Training Data via Submodular File Selection

Detoxifying Large Language Models via Autoregressive Reward Guided Representation Editing

Mulberry: Empowering MLLM with o1-like Reasoning and Reflection via Collective Monte Carlo Tree Search

SPAZER: Spatial-Semantic Progressive Reasoning Agent for Zero-shot 3D Visual Grounding

Energy-based Backdoor Defense Against Federated Graph Learning

D$^2$GS: Dense Depth Regularization for LiDAR-free Urban Scene Reconstruction

Effective Policy Learning for Multi-Agent Online Coordination Beyond Submodular Objectives

Value-Guided Decision Transformer: A Unified Reinforcement Learning Framework for Online and Offline Settings

Analytic Energy-Guided Policy Optimization for Offline Reinforcement Learning

The Energy Loss Phenomenon in RLHF: A New Perspective on Mitigating Reward Hacking

Decision Mixer: Integrating Long-term and Local Dependencies via Dynamic Token Selection for Decision-Making

Learn from Downstream and Be Yourself in Multimodal Large Language Models Fine-Tuning

SEGA: Shaping Semantic Geometry for Robust Hashing under Noisy Supervision

Self-Verification Provably Prevents Model Collapse in Recursive Synthetic Training

AiDE-Q: Synthetic Labeled Datasets Can Enhance Learning Models for Quantum Property Estimation

FreDF: Learning to Forecast in the Frequency Domain

On Geometry-Enhanced Parameter-Efficient Fine-Tuning for 3D Scene Segmentation

Safety Reasoning with Guidelines

T2V-OptJail: Discrete Prompt Optimization for Text-to-Video Jailbreak Attacks

VORTA: Efficient Video Diffusion via Routing Sparse Attention

AdaVideoRAG: Omni-Contextual Adaptive Retrieval-Augmented Efficient Long Video Understanding

Ada-R1: Hybrid-CoT via Bi-Level Adaptive Reasoning Optimization

SeRL: Self-play Reinforcement Learning for Large Language Models with Limited Data

Panacea: Mitigating Harmful Fine-tuning for Large Language Models via Post-fine-tuning Perturbation

Tackling Continual Offline RL through Selective Weights Activation on Aligned Spaces

Near-Optimal Online Learning for Multi-Agent Submodular Coordination: Tight Approximation and Communication Efficiency

Dynamic Neural Fortresses: An Adaptive Shield for Model Extraction Defense

A Theoretical Perspective: How to Prevent Model Collapse in Self-consuming Training Loops

Be Confident: Uncovering Overfitting in MLLM Multi-Task Tuning

A Statistical Approach for Controlled Training Data Detection

Consistent Paths Lead to Truth: Self-Rewarding Reinforcement Learning for LLM Reasoning

Merging on the Fly Without Retraining: A Sequential Approach to Scalable Continual Model Merging

R1-ShareVL: Incentivizing Reasoning Capabilities of Multimodal Large Language Models via Share-GRPO

ICLShield: Exploring and Mitigating In-Context Learning Backdoor Attacks

Learning system dynamics without forgetting

Multinoulli Extension: A Lossless Yet Effective Probabilistic Framework for Subset Selection over Partition Constraints

MD-LSM: An Efficient Tool for Real-time Monitoring Linear Separability of Hidden-layer Outputs of Deep Networks

NoVo: Norm Voting off Hallucinations with Attention Heads in Large Language Models