Yinfei Yang
~Yinfei_Yang1
Total papers: 19
Submissions per year: 9.5
Average rating
Acceptance: 14/19
Venue distribution
ICLR: 14
NeurIPS: 2
COLM: 2
ICML: 1
Papers (19)
2025 (12 papers)
4
MIA-Bench: Towards Better Instruction Following Evaluation of Multimodal LLMs
ICLR 2025 Poster
4
Improve Vision Language Model Chain-of-thought Reasoning
ICLR 2025 Withdrawn
4
UniGen: Enhanced Training & Test-Time Strategies for Unified Multimodal Understanding and Generation
NeurIPS 2025 Poster
4
SlowFast-LLaVA-1.5: A Family of Token-Efficient Video Large Language Models for Long-Form Video Understanding
COLM 2025 Poster
3
Ferret-UI 2: Mastering Universal User Interface Understanding Across Platforms
ICLR 2025 Poster
4
CAR-Flow: Condition-Aware Reparameterization Aligns Source and Target for Better Flow Matching
NeurIPS 2025 Spotlight
3
Understanding Alignment in Multimodal LLMs: A Comprehensive Study
ICLR 2025 Rejected
4
Contrastive Localized Language-Image Pre-Training
ICML 2025 Poster
4
Contrastive Localized Language-Image Pre-Training
ICLR 2025 Rejected
4
MMEgo: Towards Building Egocentric Multimodal LLMs for Video QA
ICLR 2025 Poster
4
Revisit Large-Scale Image-Caption Data in Pre-training Multimodal Foundation Models
ICLR 2025 Poster
4
MM1.5: Methods, Analysis & Insights from Multimodal LLM Fine-tuning
ICLR 2025 Poster
2024 (7 papers)
4
Data Curation for Large Scale Detection Pretraining
ICLR 2024 Withdrawn
4
Guiding Instruction-based Image Editing via Multimodal Large Language Models
ICLR 2024 Spotlight
4
Compressing LLMs: The Truth is Rarely Pure and Never Simple
ICLR 2024 Poster
3
Ferret: Refer and Ground Anything Anywhere at Any Granularity
ICLR 2024 Spotlight
3
From Scarcity to Efficiency: Improving CLIP Training via Visual-enriched Captions
ICLR 2024 Withdrawn
4
MOFI: Learning Image Representations from Noisy Entity Annotated Images
ICLR 2024 Poster
3
Ferret-v2: An Improved Baseline for Referring and Grounding with Large Language Models
COLM 2024 Poster