Daniel Soudry

Associate Professor @ Technion - Israel Institute of Technology, Technion · Israel · OpenReview

Research Interests

Implicit bias and generalization in neural networks · Deep learning theory · Optimization in neural networks · Grid cells · Statistical neuroscience · Sparse optimization · Non-negative matrix factorization · Hardware neural networks · Memristors · Low-precision deep learning · Bayesian neural networks · Theoretical neuroscience · Ion channels · Single neurons

PLUMAGE: probabilistic low-rank unbiased min variance gradient estimation framework for efficient large model training

FP4 All the Way: Fully Quantized Training of Large Language Models

Temperature is All You Need for Generalization in Langevin Dynamics and other Markov Processes

Scaling FP8 training to trillion-token LLMs

Alias-Free ViT: Fractional Shift Invariance via Linear Attention

Optimal Rates in Continual Linear Regression via Increasing Regularization

When Diffusion Models Memorize: Inductive Biases in Probability Flow of Minimum-Norm Shallow Neural Nets

Tensor-Parallelism with Partially Synchronized Activations

Are Greedy Task Orderings Better Than Random in Continual Linear Regression?

The Inductive Bias of Minimum-Norm Shallow Diffusion Models That Perfectly Fit the Data

De-biasing Diffusion: Data-Free FP8 Quantization of Text-to-Image Models with Billions of Parameters

Foldable SuperNets: Scalable Merging of Transformers with Different Initializations and Tasks