Mantas Mazeika
~Mantas_Mazeika3
Total papers: 8
Submissions per year: 4.0
Average rating
Accepted: 2/8
Venue distribution
ICLR: 7
NeurIPS: 1
Papers (8)
2025: 4 papers
4
Utility Engineering: Analyzing and Controlling Emergent Value Systems in AIs
NeurIPS 2025 Spotlight
4
Which Network is Trojaned? Increasing Trojan Evasiveness for Model-Level Detectors
ICLR 2025 Withdrawn
4
Evaluating Model Robustness Against Unforeseen Adversarial Attacks
ICLR 2025 Rejected
6
Tamper-Resistant Safeguards for Open-Weight LLMs
ICLR 2025 Poster
2024: 4 papers
4
How Hard is Trojan Detection in DNNs? Fooling Detectors With Evasive Trojans
ICLR 2024 Rejected
4
Robustness Evaluation of Proxy Models against Adversarial Optimization
ICLR 2024 Rejected
4
Evaluating Robustness to Unforeseen Adversarial Attacks
ICLR 2024 Rejected
3
Enhancing Neural Network Transparency through Representation Analysis
ICLR 2024 Rejected