PaperHub

Dawn Song

~Dawn_Song1

34
Total papers
17.0
Submissions per year
5.8
Average rating
Accepted: 20/34
Venue distribution
ICLR
22
NeurIPS
6
COLM
4
ICML
2

Published papers (34)

2025 (25)

8.0
4

Capturing the Temporal Dependence of Training Data Influence

ICLR 2025 Oral
5.0
4

MultiTrust: Enhancing Safety and Trustworthiness of Large Language Models from Multiple Perspectives

ICLR 2025 Rejected
7.5
4

Data Shapley in One Training Run

ICLR 2025 Oral
6.4
5

Scalable Best-of-N Selection for Large Language Models via Self-Certainty

NeurIPS 2025 Poster
6.5
4

An Undetectable Watermark for Generative Image Models

ICLR 2025 Poster
3.0
3

IDS-Agent: An LLM Agent for Explainable Intrusion Detection in IoT Networks

ICLR 2025 Rejected
5.5
4

KnowData: Knowledge-Enabled Data Generation for Improving Multimodal Models

ICLR 2025 Rejected
3.5
4

Which Network is Trojaned? Increasing Trojan Evasiveness for Model-Level Detectors

ICLR 2025 Withdrawn
5.3
4

Assessing the Knowledge-intensive Reasoning Capability of Large Language Models with Realistic Benchmarks Generated Programmatically at Scale

ICLR 2025 Rejected
5.3
3

AutoScale: Scale-Aware Data Mixing for Pre-Training LLMs

COLM 2025 Poster
5.5
4

AutoScale: Automatic Prediction of Compute-optimal Data Compositions for Training LLMs

ICLR 2025 Rejected
6.4
5

Multimodal Situational Safety

ICLR 2025 Poster
6.8
4

An Illusion of Progress? Assessing the Current State of Web Agents

COLM 2025 Poster
5.7
3

KnowHalu: Multi-Form Knowledge Enhanced Hallucination Detection

ICLR 2025 Rejected
5.5
4

Improving LLM Safety Alignment with Dual-Objective Optimization

ICML 2025 Poster
7.8
4

Why and How LLMs Hallucinate: Connecting the Dots with Subsequence Associations

NeurIPS 2025 Poster
5.0
4

SecCodePLT: A Unified Platform for Evaluating the Security of Code GenAI

ICLR 2025 Rejected
7.5
4

Assessing Judging Bias in Large Reasoning Models: An Empirical Study

COLM 2025 Poster
6.8
4

LeakAgent: RL-based Red-teaming Agent for LLM Privacy Leakage

COLM 2025 Poster
7.5
4

AIR-BENCH 2024: A Safety Benchmark based on Regulation and Policies Specified Risk Categories

ICLR 2025 Spotlight
6.0
4

GuardAgent: Safeguard LLM Agent by a Guard Agent via Knowledge-Enabled Reasoning

ICLR 2025 Rejected
6.1
4

GuardAgent: Safeguard LLM Agents via Knowledge-Enabled Reasoning

ICML 2025 Poster
5.8
6

Tamper-Resistant Safeguards for Open-Weight LLMs

ICLR 2025 Poster
4.4
5

Can Editing LLMs Inject Harm?

ICLR 2025 Rejected
7.0
4

MMDT: Decoding the Trustworthiness and Safety of Multimodal Foundation Models

ICLR 2025 Poster

2024 (9)