Christian Schroeder de Witt
~Christian_Schroeder_de_Witt1
14
Total papers
7.0
Submissions per year
Average rating
Accepted: 7/14
Venue distribution
ICLR
9
NeurIPS
3
ICML
1
COLM
1
Papers (14)
2025: 9 papers
4
SAGE: Scalable Ground Truth Evaluations for Large Sparse Autoencoders
ICLR 2025 (Withdrawn)
4
Mixture of Experts Made Intrinsically Interpretable
ICML 2025 (Poster)
4
Mitigating Goal Misgeneralization via Minimax Regret
ICLR 2025 (Rejected)
4
Efficient Dictionary Learning with Switch Sparse Autoencoders
ICLR 2025 (Poster)
4
Hidden in Plain Text: Emergence & Mitigation of Steganographic Collusion in LLMs
ICLR 2025 (Rejected)
4
Toward Robust Real-World Audio Deepfake Detection: Closing the Explainability Gap
ICLR 2025 (Rejected)
4
MAD-Sherlock: Multi-Agent Debates for Out-of-Context Misinformation Detection
ICLR 2025 (Rejected)
4
Fundamental Limitations in Pointwise Defences of LLM Finetuning APIs
NeurIPS 2025 (Poster)
3
MALT: Improving Reasoning with Multi-Agent LLM Training
COLM 2025 (Poster)
2024: 5 papers
3
Computing Low-Entropy Couplings for Large-Support Distributions
ICLR 2024 (Rejected)
3
Bayesian Exploration Networks
ICLR 2024 (Withdrawn)
5
Unelicitable Backdoors via Cryptographic Transformer Circuits
NeurIPS 2024 (Poster)
4
Secret Collusion among AI Agents: Multi-Agent Deception via Steganography
NeurIPS 2024 (Poster)
3
Illusory Attacks: Information-theoretic detectability matters in adversarial attacks
ICLR 2024 (Spotlight)