PaperHub

Peng Gao

~Peng_Gao3

27
Total papers
13.5
Average submissions per year
5.5
Average rating
Accepted: 13/27
Venue distribution
ICLR
23
NeurIPS
3
ICML
1

Published papers (27)

2025 (16)

7.2
5

Lumina-T2X: Scalable Flow-based Large Diffusion Transformer for Flexible Resolution Generation

ICLR 2025 Spotlight
5.3
4

LongHalQA: Long-Context Hallucination Evaluation for MultiModal Large Language Models

ICLR 2025 Rejected
5.7
3

Draw-and-Understand: Leveraging Visual Prompts to Enable MLLMs to Comprehend What You Want

ICLR 2025 Poster
3.0
3

EfficientQAT: Efficient Quantization-Aware Training for Large Language Models

ICLR 2025 Rejected
5.0
4

VEnhancer: Generative Space-Time Enhancement for Video Generation

ICLR 2025 Rejected
5.0
4

AR-1-to-3: Single Image to Consistent 3D Object Generation via Next-View Prediction

ICLR 2025 Withdrawn
4.6
5

Customize Your Visual Autoregressive Recipe with Set Autoregressive Modeling

ICLR 2025 Rejected
6.0
5

Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraining

ICLR 2025 Rejected
5.3
4

I-Max: Maximize the Resolution Potential of Pre-trained Rectified Flow Transformers with Projected Flow

ICLR 2025 Rejected
7.8
4

EnerVerse: Envisioning Embodied Future Space for Robotics Manipulation

NeurIPS 2025 Poster
6.0
4

PixWizard: Versatile Image-to-Image Visual Assistant with Open-Language Instructions

ICLR 2025 Poster
4.0
4

Exploring the Design Space of Autoregressive Models for Efficient and Scalable Image Generation

ICLR 2025 Withdrawn
6.5
4

MMSearch: Unveiling the Potential of Large Models as Multi-modal Search Engines

ICLR 2025 Poster
4.0
4

TerDiT: Ternary Diffusion Models with Transformers

ICLR 2025 Withdrawn
6.5
4

MAVIS: Mathematical Visual Instruction Tuning with an Automatic Data Engine

ICLR 2025 Poster
5.5
4

MME-CoT: Benchmarking Chain-of-Thought in Large Multimodal Models for Reasoning Quality, Robustness, and Efficiency

ICML 2025 Poster

2024 (11)