Impact Index

95.79/100

Top 0.2%

Site-wide rank: #150

Papers published: 68

Average rating: 4.9

Average output: 22.7 papers/year

Di Wang

Assistant Professor @ KAUST · Saudi Arabia · OpenReview

Research areas

interpretability · fairness · learning theory · differential privacy

Controlling Repetition in Protein Language Models

ICLR 2026 Poster

Dual-Kernel Adapter: Expanding Spatial Horizons for Data-Constrained Medical Image Analysis

ICLR 2026 Poster

Predicting LLM Output Length via Entropy-Guided Representations

ICLR 2026 Poster

Understanding and Improving Continuous LLM Adversarial Training via In-context Learning Theory

ICLR 2026 Poster

The Price of Amortized inference in Sparse Autoencoders

ICLR 2026 Poster

When Modalities Conflict: How Unimodal Reasoning Uncertainty Governs Preference Dynamics in MLLMs

ICLR 2026 Rejected

Dissecting Representation Misalignment in Contrastive Learning via Influence Function

ICLR 2026 Poster

Evaluating Data Influence in Meta Learning

ICLR 2026 Poster

Mechanistic Analysis of Demonstration Conflicts in In-Context Learning

ICLR 2026 Rejected

Understanding Private Learning From Feature Perspective

ICLR 2026 Rejected

Benign Overfitting in Adversarial Training for Vision Transformers

ICLR 2026 Rejected

Untargeted Jailbreak Attack

ICLR 2026 Withdrawn

MSRS: Adaptive Multi-Subspace Representation Steering for Attribute Alignment in Large Language Models

ICLR 2026 Rejected

MONICA: Real-Time Monitoring and Calibration of Chain-of-Thought Sycophancy in Large Reasoning Models

ICLR 2026 Rejected

Robust Learning of Diffusion Models with Extremely Noisy Conditions

ICLR 2026 Rejected

D-LEAF: Localizing and Correcting Hallucinations in Multimodal LLMs via Layer-to-head Attention Diagnostics

ICLR 2026 Withdrawn

Efficient and Stable Grouped RL Training for Large Language Models

ICLR 2026 Withdrawn

Dynamic Target Attack

ICLR 2026 Rejected

Curriculum-RLAIF: Curriculum Alignment with Reinforcement Learning from AI Feedback

ICLR 2026 Withdrawn

Flattery in Motion: Benchmarking and Analyzing Sycophancy in Video-LLMs

ICLR 2026 Withdrawn

CACE-Net: Cascade Coupling Effect for Link Prediction in Multi-layer Networks

ICLR 2026 Withdrawn

Investigating CoT Monitorability in Large Reasoning Models

ICLR 2026 Withdrawn

Attributing Data for Sharpness-Aware Minimization

ICLR 2026 Withdrawn

Pre-trained Encoder Inference: Revealing Upstream Encoders In Downstream Machine Learning Services

ICLR 2026 Withdrawn

TAD-Net: Reinforced Anomaly Generation and Wavelet-enhanced Prediction for Temporal Anomaly Detection

ICLR 2026 Withdrawn

Concept-Based Dictionary Learning for Inference-Time Safety in Vision–Language–Action Models

ICLR 2026 Withdrawn

Flexible Feature Distillation for Large Language Models

ICLR 2026 Rejected

CoLa: A Choice Leakage Attack Framework To Expose Privacy Risks In Subset Training

ICLR 2026 Withdrawn

Goal-oriented Backdoor Attack against Vision-Language-Action Models via Physical Objects

ICLR 2026 Rejected

Model-agnostic Adversarial Attack and Defense for Vision-Language-Action Models

ICLR 2026 Rejected

Backdooring CLIP through Concept Confusion

ICLR 2026 Withdrawn

Compositional Architecture of Regret in Large Language Models

ICLR 2026 Withdrawn

Locate-then-edit for Multi-hop Factual Recall under Knowledge Editing

ICML 2025 Poster

Second-Order Convergence in Private Stochastic Non-Convex Optimization

NeurIPS 2025 Poster

Editable Concept Bottleneck Models

ICML 2025 Poster

Short-length Adversarial Training Helps LLMs Defend Long-length Jailbreak Attacks: Theoretical and Empirical Evidence

NeurIPS 2025 Poster

EAP-GP: Mitigating Saturation Effect in Gradient-based Automated Circuit Identification

NeurIPS 2025 Poster

Private Training Large-scale Models with Efficient DP-SGD

NeurIPS 2025 Poster

Private Stochastic Optimization for Achieving Second-Order Stationary Points

ICLR 2025 Rejected

Scalable Zeroth-Order Fine-Tuning for Extremely Large Language Models with Limited GPU Memory

COLM 2025 Poster

Locate-then-edit for Multi-hop Factual Recall under Knowledge Editing

ICLR 2025 Rejected

Towards User-level Private Reinforcement Learning with Human Feedback

COLM 2025 Poster

Editable Concept Bottleneck Models

ICLR 2025 Rejected

Dissecting Misalignment of Multimodal Large Language Models via Influence Function

ICLR 2025 Rejected

FlashDP: Memory-Efficient and High-Throughput DP-SGD Training for Large Language Models

ICLR 2025 Withdrawn

Private Stochastic Convex Optimization with Tsybakov Noise Condition and Large Lipschitz Constant

ICLR 2025 Withdrawn

Low-cost Enhancer for Text Attributed Graph Learning via Graph Alignment

ICLR 2025 Withdrawn

Representation Confusion: Towards Representation Backdoor on CLIP via Concept Activation

ICLR 2025 Rejected

XTraffic: A Dataset Where Traffic Meets Incidents with Explainability and More

ICLR 2025 Withdrawn

ZO-Offloading: Fine-Tuning LLMs with 100 Billion Parameters on a Single GPU

ICLR 2025 Withdrawn

Understanding Reasoning in Chain-of-Thought from the Hopfieldian View

ICLR 2025 Withdrawn

What Makes Your Model a Low-empathy or Warmth Person: Exploring the Origins of Personality in LLMs

ICLR 2025 Withdrawn

Collaborators (20)

PhD Advisee: 30 papers