Impact Index

94.94/100

Top 0.3%

Site-wide rank #173

Papers published: 37

Average rating: 5.4

Average annual output: 12.3 papers/year

Yu-Xiong Wang

PhD student @ School of Computer Science, Carnegie Mellon University · United States · OpenReview

Research areas

few-shot learning · meta-learning · transfer learning · human motion prediction

BEAT: Visual Backdoor Attacks on VLM-based Embodied Agents via Contrastive Trigger Learning

ICLR 2026 · Poster

Capturing Visual Environment Structure Correlates with Control Performance

ICLR 2026 · Poster

Latent Wasserstein Adversarial Imitation Learning

ICLR 2026 · Poster

Unleashing Guidance Without Classifiers for Human-Object Interaction Animation

ICLR 2026 · Poster

Dress&Dance: Dress up and Dance as You Like It

ICLR 2026 · Rejected

LongVTG-R1: Reinforcement Learning for Robust Long-Video Temporal Grounding

ICLR 2026 · Rejected

Dissecting Demystifying Region-Based Representations in MLLMs

ICLR 2026 · Withdrawn

One Token per Highly Selective Frame: Towards Extreme Compression for Long Video Understanding

NeurIPS 2025 · Poster

3DGS-Drag: Dragging Gaussians for Intuitive Point-Based 3D Editing

ICLR 2025 · Poster

Aligning Generative Denoising with Discriminative Objectives Unleashes Diffusion for Visual Perception

ICLR 2025 · Poster

MR. Video: MapReduce as an Effective Principle for Long Video Understanding

NeurIPS 2025 · Poster

Swiss Army Knife: Synergizing Biases in Knowledge from Vision Foundation Models for Multi-Task Learning

ICLR 2025 · Poster

Self-Guided Hierarchical Exploration for Generalist Foundation Model Web Agents

NeurIPS 2025 · Poster

Diff-2-in-1: Bridging Generation and Dense Perception with Diffusion Models

ICLR 2025 · Poster

Proposer-Agent-Evaluator (PAE): Autonomous Skill Discovery For Foundation Model Internet Agents

ICLR 2025 · Rejected

RTDiff: Reverse Trajectory Synthesis via Diffusion for Offline Reinforcement Learning

ICLR 2025 · Poster

ReferEverything: Towards segmenting everything we can speak of in videos

ICLR 2025 · Rejected

Virtual Fitting Room: Generating Arbitrarily Long Videos of Virtual Try-On from a Single Image

NeurIPS 2025 · Poster

Emerging Pixel Grounding in Large Multimodal Models Without Grounding Supervision

ICLR 2025 · Rejected

Latent Wasserstein Adversarial Imitation Learning

ICLR 2025 · Rejected

LayeredGS: Efficient Dynamic Scene Rendering and Point Tracking with Multi-Layer Deformable Gaussian Splatting

ICLR 2025 · Withdrawn

Proposer-Agent-Evaluator (PAE): Autonomous Skill Discovery For Foundation Model Internet Agents

ICML 2025 · Poster

Video Diffusion Models Learn the Structure of the Dynamic World

ICLR 2025 · Withdrawn

Collaborators (20)