PaperHub

William Yang Wang

~William_Yang_Wang2

Total papers: 39
Submissions per year: 19.5
Average rating: 5.5
Accepted: 21/39
Venue distribution
ICLR: 27
COLM: 6
NeurIPS: 4
ICML: 2

Published papers (39)

2025 (23 papers)

3.0
3

DebUnc: Improving Large Language Model Agent Communication Via Uncertainty Metrics

ICLR 2025 Withdrawn
6.0
4

ThoughtTerminator: Benchmarking, Calibrating, and Mitigating Overthinking in Reasoning Models

COLM 2025 Poster
5.6
5

Gödel Agent: A Self-Referential Framework Helps for Recursively Self-Improvement

ICLR 2025 Rejected
4.4
4

MELON: Provable Defense Against Indirect Prompt Injection Attacks in AI Agents

ICML 2025 Poster
4.8
4

VSP: Assessing the dual challenges of perception and reasoning in spatial planning tasks for MLLMs

ICLR 2025 Withdrawn
4.0
4

Pixelated Instructions: Can Multimodal Large Language Models Follow Printed Instructions in Images?

ICLR 2025 Rejected
5.0
5

Understanding the Interplay between Parametric and Contextual Knowledge for Large Language Models

ICLR 2025 Rejected
4.3
3

Compact Multimodal Context Representations Using Visual Tokens

ICLR 2025 Rejected
4.8
4

TC-Bench: Benchmarking Temporal Compositionality in Conditional Video Generation

ICLR 2025 Rejected
4.0
4

SWE-Search: Enhancing Software Agents with Monte Carlo Tree Search and Iterative Refinement

ICLR 2025 Poster
3.8
4

Detecting Training Data of Large Language Models via Expectation Maximization

ICLR 2025 Rejected
5.3
3

Discovering Factor Level Preferences to Improve Human-Model Alignment

ICLR 2025 Rejected
5.4
5

Weak-to-Strong Jailbreaking on Large Language Models

ICLR 2025 Rejected
5.0
4

Generalization v.s. Memorization: Tracing Language Models’ Capabilities Back to Pretraining Data

ICLR 2025 Poster
6.0
3

Speculative Knowledge Distillation: Bridging the Teacher-Student Gap Through Interleaved Sampling

ICLR 2025 Poster
6.0
5

T2V-Turbo-v2: Enhancing Video Model Post-Training through Data, Reward, and Conditional Guidance Design

ICLR 2025 Poster
6.3
3

Weak-to-Strong Jailbreaking on Large Language Models

ICML 2025 Poster
5.8
4

COrAL: Order-Agnostic Language Modeling for Efficient Iterative Refinement

ICLR 2025 Rejected
7.3
4

MuSLR: Multimodal Symbolic Logical Reasoning

NeurIPS 2025 Poster
6.0
5

MMWorld: Towards Multi-discipline Multi-faceted World Model Evaluation in Videos

ICLR 2025 Poster
4.4
5

Can Editing LLMs Inject Harm?

ICLR 2025 Rejected
6.0
4

MMSci: A Dataset for Graduate-Level Multi-Discipline Multimodal Scientific Understanding

ICLR 2025 Rejected
5.7
3

MLGym: A New Framework and Benchmark for Advancing AI Research Agents

COLM 2025 Poster

2024 (16 papers)

5.8
4

Benchmarks as Microscopes: A Call for Model Metrology

COLM 2024 Poster
6.0
4

ToolDec: Syntax Error-Free and Generalizable Tool Use for LLMs via Finite-State Decoding

ICLR 2024 Rejected
6.3
4

FASTopic: Pretrained Transformer is a Fast, Adaptive, Stable, and Transferable Topic Model

NeurIPS 2024 Poster
6.3
4

Neuroformer: Multimodal and Multitask Generative Pretraining for Brain Data

ICLR 2024 Poster
7.0
4

Guiding Instruction-based Image Editing via Multimodal Large Language Models

ICLR 2024 Spotlight
5.8
4

Language Control Diffusion: Efficiently Scaling through Space, Time, and Tasks

ICLR 2024 Poster
7.5
4

Commonsense-T2I Challenge: Can Text-to-Image Generation Models Understand Commonsense?

COLM 2024 Poster
4.8
4

Shadow Alignment: The Ease of Subverting Safely-Aligned Language Models

ICLR 2024 Withdrawn
6.7
3

DNA-GPT: Divergent N-Gram Analysis for Training-Free Detection of GPT-Generated Text

ICLR 2024 Poster
7.0
4

Guiding Language Model Reasoning with Planning Tokens

COLM 2024 Poster
6.3
4

Who Evaluates the Evaluations? Objectively Scoring Text-to-Image Prompt Coherence Metrics with T2IScoreScore (TS2)

NeurIPS 2024 Spotlight
4.7
3

Multimodal Procedural Planning via Dual Text-Image Prompting

ICLR 2024 Withdrawn
5.6
5

T2V-Turbo: Breaking the Quality Bottleneck of Video Consistency Model with Mixed Reward Feedback

NeurIPS 2024 Poster
6.3
3

Ferret-v2: An Improved Baseline for Referring and Grounding with Large Language Models

COLM 2024 Poster
5.5
4

Mastering Robot Manipulation with Multimodal Prompts through Pretraining and Multi-task Fine-tuning

ICLR 2024 Rejected
4.3
4

Visual Chain of Thought: Bridging Logical Gaps with Multimodal Infillings

ICLR 2024 Withdrawn