Impact Index

56.79/100

Top 5.8%

Site-wide rank #3,726

Papers published: 29

Average rating: 4.9

Average output: 9.7 papers/year

Ruiyi Zhang

Machine Learning Researcher @ Apple AIML · United States · OpenReview

Research Areas

Vision-Language · Natural Language Processing · Reinforcement Learning · Machine Learning

Bayesian Data Reweighting Improves Retrieval in Knowledge-Based VQA

ICLR 2026 · Rejected

MusiXQA: Advancing Visual Music Understanding in Multimodal LLMs

ICLR 2026 · Rejected

Towards Visual Text Grounding of Multimodal Large Language Model

ICLR 2026 · Rejected

Spinning Straw into Gold: Relabeling LLM Agent Trajectories in Hindsight for Successful Demonstrations

ICLR 2026 · Poster

VisR-Bench: An Empirical Study on Visual Retrieval-Augmented Generation for Multilingual Long Document Understanding

ICLR 2026 · Rejected

CURV: Enhancing Chart Understanding through Visual Grounded Reasoning

ICLR 2026 · Withdrawn

GUI-AIMA: Aligning Intrinsic Multi-Modal Attention with a Context Anchor for GUI Grounding

ICLR 2026 · Rejected

CaughtCheating: Is Your MLLM a Good Cheating Detective? Exploring the Boundary of Visual Perception and Reasoning

ICLR 2026 · Withdrawn

MDocAgent: A Multi-Modal Multi-Agent Framework for Document Question Answering

ICLR 2026 · Rejected

Reasoning-Based Personalized Generation for Users with Sparse Data

ICLR 2026 · Rejected

A Personalized Conversational Benchmark: Towards Simulating Personalized Conversations

ICLR 2026 · Withdrawn

DynaSaur: Large Language Agents Beyond Predefined Actions

COLM 2025 · Poster

SV-RAG: LoRA-Contextualizing Adaptation of MLLMs for Long Document Understanding

ICLR 2025 · Poster

VipAct: Visual-Perception Enhancement via Specialized VLM Agent Collaboration and Tool-use

ICLR 2025 · Rejected

VaQuitA: Enhancing Alignment in LLM-Assisted Zero-Shot Video Understanding

ICLR 2025 · Withdrawn

ADOPD-Instruct: A Large-Scale Multimodal Dataset for Document Editing

ICLR 2025 · Rejected

OIDA-QA: A Multimodal Benchmark for Analyzing the Opioid Industry Document Archive

ICLR 2025 · Rejected

Taipan: Efficient and Expressive State Space Language Models with Selective Attention

ICLR 2025 · Rejected

LLaVA-Read: Enhancing Reading Ability of Multimodal Large Language Models

ICLR 2025 · Rejected

Enhancing Diffusion Posterior Sampling for Inverse Problems by Integrating Crafted Measurements

ICLR 2025 · Withdrawn

Collaborators (20)

Franck Dernoncourt