Jianwei Yang
~Jianwei_Yang1
22
Total papers
11.0
Average submissions per year
Average rating
Accepted: 14/22
Venue distribution
ICLR
12
NeurIPS
7
ICML
2
COLM
1
Papers (22)
2025: 12 papers
5
Evaluating Graphical Perception of Large Multimodal Models
ICLR 2025 Withdrawn
4
OmniParser for Pure Vision Based GUI Agent
ICLR 2025 Rejected
4
Matryoshka Multimodal Models
ICLR 2025 Poster
4
GUI-Actor: Coordinate-Free Visual Grounding for GUI Agents
NeurIPS 2025 Poster
6
Latent Action Pretraining from Videos
ICLR 2025 Poster
4
Elevating Visual Perception in Multimodal LLMs with Visual Embedding Distillation
NeurIPS 2025 Poster
4
Struct2D: A Perception-Guided Framework for Spatial Reasoning in MLLMs
NeurIPS 2025 Poster
4
MindJourney: Test-Time Scaling with World Models for Spatial Reasoning
NeurIPS 2025 Poster
4
Simplifying DINO via Coding Rate Regularization
ICML 2025 Poster
4
ReFocus: Visual Editing as a Chain of Thought for Structured Image Understanding
ICML 2025 Poster
4
TraceVLA: Visual Trace Prompting Enhances Spatial-Temporal Awareness for Generalist Robotic Policies
ICLR 2025 Poster
5
TemporalBench: Towards Fine-grained Temporal Understanding for Multimodal Video Models
ICLR 2025 Withdrawn
2024: 10 papers
4
Towards Flexible Visual Relationship Segmentation
NeurIPS 2024 Poster
3
DeepStack: Deeply Stacking Visual Tokens is Surprisingly Simple and Effective for LMMs
NeurIPS 2024 Poster
3
MedJourney: Counterfactual Medical Image Generation by Instruction-Learning from Multimodal Patient Journeys
ICLR 2024 Rejected
-
Knowledge-Augmented Large Vision-and-Language Assistant
ICLR 2024 Withdrawn
4
Efficient Modulation for Vision Networks
ICLR 2024 Poster
4
IMProv: Inpainting-based Multimodal Prompting for Computer Vision Tasks
ICLR 2024 Rejected
4
Interfacing Foundation Models' Embeddings
NeurIPS 2024 Poster
6
List Items One by One: A New Data Source and Learning Paradigm for Multimodal LLMs
COLM 2024 Poster
4
LLaVA-Plus: Learning to Use Tools for Creating Multimodal Agents
ICLR 2024 Rejected
3
Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection
ICLR 2024 Rejected