Peng Gao
~Peng_Gao3
27
Total papers
13.5
Submissions per year
Average rating
Accepted: 13/27
Venue distribution
ICLR
23
NeurIPS
3
ICML
1
Papers (27)
2025 (16 papers)
5
Lumina-T2X: Scalable Flow-based Large Diffusion Transformer for Flexible Resolution Generation
ICLR 2025 Spotlight
4
LongHalQA: Long-Context Hallucination Evaluation for MultiModal Large Language Models
ICLR 2025 Rejected
3
Draw-and-Understand: Leveraging Visual Prompts to Enable MLLMs to Comprehend What You Want
ICLR 2025 Poster
3
EfficientQAT: Efficient Quantization-Aware Training for Large Language Models
ICLR 2025 Rejected
4
VEnhancer: Generative Space-Time Enhancement for Video Generation
ICLR 2025 Rejected
4
AR-1-to-3: Single Image to Consistent 3D Object Generation via Next-View Prediction
ICLR 2025 Withdrawn
5
Customize Your Visual Autoregressive Recipe with Set Autoregressive Modeling
ICLR 2025 Rejected
5
Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraining
ICLR 2025 Rejected
4
I-Max: Maximize the Resolution Potential of Pre-trained Rectified Flow Transformers with Projected Flow
ICLR 2025 Rejected
4
EnerVerse: Envisioning Embodied Future Space for Robotics Manipulation
NeurIPS 2025 Poster
4
PixWizard: Versatile Image-to-Image Visual Assistant with Open-Language Instructions
ICLR 2025 Poster
4
Exploring the Design Space of Autoregressive Models for Efficient and Scalable Image Generation
ICLR 2025 Withdrawn
4
MMSearch: Unveiling the Potential of Large Models as Multi-modal Search Engines
ICLR 2025 Poster
4
TerDiT: Ternary Diffusion Models with Transformers
ICLR 2025 Withdrawn
4
MAVIS: Mathematical Visual Instruction Tuning with an Automatic Data Engine
ICLR 2025 Poster
4
MME-CoT: Benchmarking Chain-of-Thought in Large Multimodal Models for Reasoning Quality, Robustness, and Efficiency
ICML 2025 Poster
2024 (11 papers)
4
Masked AutoDecoder is Effective Multi-Task Vision Generalist
ICLR 2024 Withdrawn
4
Improving Compositional Text-to-image Generation with Large Vision-Language Models
ICLR 2024 Rejected
3
Phased Consistency Models
NeurIPS 2024 Poster
4
Instruct2Act: Mapping Multi-modality Instructions to Robotic Arm Actions with Large Language Model
ICLR 2024 Rejected
3
PointMLLM: Aligning multi-modality with LLM for point cloud understanding, generation and editing
ICLR 2024 Withdrawn
4
BESA: Pruning Large Language Models with Blockwise Parameter-Efficient Sparsity Allocation
ICLR 2024 Poster
3
LLaMA-Adapter: Efficient Fine-tuning of Large Language Models with Zero-initialized Attention
ICLR 2024 Poster
5
OmniQuant: Omnidirectionally Calibrated Quantization for Large Language Models
ICLR 2024 Spotlight
3
Personalize Segment Anything Model with One Shot
ICLR 2024 Poster
3
Point-Bind & Point-LLM: Aligning Point Cloud with Multi-modality for 3D Understanding, Generation, and Instruction Following
ICLR 2024 Rejected
5
Lumina-Next: Making Lumina-T2X Stronger and Faster with Next-DiT
NeurIPS 2024 Poster