Dawn Song (~Dawn_Song1)

Total papers: 34
Avg. submissions per year: 17.0
Average score:
Accepted: 20/34
Conference distribution:
ICLR: 22
NeurIPS: 6
COLM: 4
ICML: 2
Papers (34)

2025 (25 papers)
4 | Capturing the Temporal Dependence of Training Data Influence | ICLR 2025, Oral
4 | MultiTrust: Enhancing Safety and Trustworthiness of Large Language Models from Multiple Perspectives | ICLR 2025, Rejected
4 | Data Shapley in One Training Run | ICLR 2025, Oral
5 | Scalable Best-of-N Selection for Large Language Models via Self-Certainty | NeurIPS 2025, Poster
4 | An Undetectable Watermark for Generative Image Models | ICLR 2025, Poster
3 | IDS-Agent: An LLM Agent for Explainable Intrusion Detection in IoT Networks | ICLR 2025, Rejected
4 | KnowData: Knowledge-Enabled Data Generation for Improving Multimodal Models | ICLR 2025, Rejected
4 | Which Network is Trojaned? Increasing Trojan Evasiveness for Model-Level Detectors | ICLR 2025, Withdrawn
4 | Assessing the Knowledge-intensive Reasoning Capability of Large Language Models with Realistic Benchmarks Generated Programmatically at Scale | ICLR 2025, Rejected
3 | AutoScale: Scale-Aware Data Mixing for Pre-Training LLMs | COLM 2025, Poster
4 | AutoScale: Automatic Prediction of Compute-optimal Data Compositions for Training LLMs | ICLR 2025, Rejected
5 | Multimodal Situational Safety | ICLR 2025, Poster
4 | An Illusion of Progress? Assessing the Current State of Web Agents | COLM 2025, Poster
3 | KnowHalu: Multi-Form Knowledge Enhanced Hallucination Detection | ICLR 2025, Rejected
4 | Improving LLM Safety Alignment with Dual-Objective Optimization | ICML 2025, Poster
4 | Why and How LLMs Hallucinate: Connecting the Dots with Subsequence Associations | NeurIPS 2025, Poster
4 | SecCodePLT: A Unified Platform for Evaluating the Security of Code GenAI | ICLR 2025, Rejected
4 | Assessing Judging Bias in Large Reasoning Models: An Empirical Study | COLM 2025, Poster
4 | LeakAgent: RL-based Red-teaming Agent for LLM Privacy Leakage | COLM 2025, Poster
4 | AIR-BENCH 2024: A Safety Benchmark based on Regulation and Policies Specified Risk Categories | ICLR 2025, Spotlight
4 | GuardAgent: Safeguard LLM Agent by a Guard Agent via Knowledge-Enabled Reasoning | ICLR 2025, Rejected
4 | GuardAgent: Safeguard LLM Agents via Knowledge-Enabled Reasoning | ICML 2025, Poster
6 | Tamper-Resistant Safeguards for Open-Weight LLMs | ICLR 2025, Poster
5 | Can Editing LLMs Inject Harm? | ICLR 2025, Rejected
4 | MMDT: Decoding the Trustworthiness and Safety of Multimodal Foundation Models | ICLR 2025, Poster
2024 (9 papers)
5 | GREATS: Online Selection of High-Quality Data for LLM Training in Every Iteration | NeurIPS 2024, Spotlight
5 | AgentPoison: Red-teaming LLM Agents via Poisoning Memory or Knowledge Bases | NeurIPS 2024, Poster
3 | Agent Instructs Large Language Models to be General Zero-Shot Reasoners | ICLR 2024, Rejected
4 | SHINE: Shielding Backdoors in Deep Reinforcement Learning | ICLR 2024, Rejected
3 | Tree-as-a-Prompt: Boosting Black-Box Large Language Models on Few-Shot Classification of Tabular Data | ICLR 2024, Withdrawn
4 | Boosting Alignment for Post-Unlearning Text-to-Image Generative Models | NeurIPS 2024, Poster
5 | Data Free Backdoor Attacks | NeurIPS 2024, Poster
4 | Effective and Efficient Federated Tree Learning on Hybrid Data | ICLR 2024, Poster
3 | Enhancing Neural Network Transparency through Representation Analysis | ICLR 2024, Rejected