PaperHub

Ping Luo

~Ping_Luo2

Total papers: 49
Submissions per year: 24.5
Average rating: 5.6
Accepted: 26/49
Venue distribution
ICLR: 35
NeurIPS: 12
ICML: 2

Papers (49)

2025 (23)

4.5
4

HRVMamba: High-Resolution Visual State Space Model for Dense Prediction

ICLR 2025 Withdrawn
5.0
3

An Empirical Study of Multiple Masking in Masked Autoencoder

ICLR 2025 Withdrawn
7.2
4

BOOD: Boundary-based Out-Of-Distribution Data Generation

ICML 2025 Poster
5.5
4

BOOD: Boundary-based Out-Of-Distribution Data Generation

ICLR 2025 Rejected
8.2
4

TREND: Unsupervised 3D Representation Learning via Temporal Forecasting for LiDAR Perception

NeurIPS 2025 Spotlight
4.8
4

Diffree: Text-Guided Shape Free Object Inpainting with Diffusion Model

ICLR 2025 Rejected
3.5
4

DriveE2E: Benchmarking Closed-Loop End-to-End Autonomous Driving Based-on Real-World Traffic Scenarios

ICLR 2025 Withdrawn
5.4
5

CompGS: Unleashing 2D Compositionality for Compositional Text-to-3D via Dynamically Optimizing 3D Gaussians

ICLR 2025 Withdrawn
3.0
3

PrefixQuant: Static Quantization Beats Dynamic through Prefixed Outliers in LLMs

ICLR 2025 Rejected
3.5
4

MatchMask: Mask-Centric Generative Data Augmentation for Label-Scarce Semantic Segmentation

ICLR 2025 Withdrawn
7.5
4

Dynamic Multimodal Evaluation with Flexible Complexity by Vision-Language Bootstrapping

ICLR 2025 Oral
4.2
5

Task-Oriented Diffusion Inversion for High-Fidelity Text-based Editing

ICLR 2025 Withdrawn
6.0
4

SAMRefiner: Taming Segment Anything Model for Universal Mask Refinement

ICLR 2025 Poster
3.0
3

EfficientQAT: Efficient Quantization-Aware Training for Large Language Models

ICLR 2025 Rejected
6.5
4

IDA-VLM: Towards Movie Understanding via ID-Aware Large Vision-Language Model

ICLR 2025 Poster
6.8
4

OWMM-Agent: Open World Mobile Manipulation With Multi-modal Agentic Data Synthesis

NeurIPS 2025 Poster
6.6
4

Towards World Simulator: Crafting Physical Commonsense-Based Benchmark for Video Generation

ICML 2025 Poster
6.8
5

FUDOKI: Discrete Flow-based Unified Understanding and Generation via Kinetic-Optimal Velocities

NeurIPS 2025 Spotlight
6.4
3

WorldWeaver: Generating Long-Horizon Video Worlds via Rich Perception

NeurIPS 2025 Poster
6.0
4

LLaMA Decoder As Vision Transformer

ICLR 2025 Rejected
5.2
5

Towards World Simulator: Crafting Physical Commonsense-Based Benchmark for Video Generation

ICLR 2025 Rejected
6.0
5

MMIU: Multimodal Multi-image Understanding for Evaluating Large Vision-Language Models

ICLR 2025 Poster
8.2
4

OWL: Optimized Workforce Learning for General Multi-Agent Assistance in Real-World Task Automation

NeurIPS 2025 Poster

2024 (26)

6.3
6

Scalable and Effective Arithmetic Tree Generation for Adder and Multiplier Designs

NeurIPS 2024 Spotlight
4.8
4

Conditional MAE: An Empirical Study of Multiple Masking in Masked Autoencoder

ICLR 2024 Rejected
6.3
4

MoLE: Enhancing Human-centric Text-to-image Diffusion via Mixture of Low-rank Experts

NeurIPS 2024 Poster
5.5
6

Rethinking the Noise Schedule of Diffusion-Based Generative Models

ICLR 2024 Rejected
5.3
4

MoLE: Human-centric Text-to-image Diffusion with Mixture of Low-rank Experts

ICLR 2024 Rejected
6.2
5

PROGRAM: PROtotype GRAph Model based Pseudo-Label Learning for Test-Time Adaptation

ICLR 2024 Poster
4.8
4

Multi-Level Contrastive Learning for Dense Prediction Task

ICLR 2024 Withdrawn
6.0
3

VDT: General-purpose Video Diffusion Transformers via Mask Modeling

ICLR 2024 Poster
6.3
3

Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies

NeurIPS 2024 Poster
5.0
5

Language-driven Open-Vocabulary Keypoint Detection for Animal Body and Face

ICLR 2024 Withdrawn
5.7
3

SearchLVLMs: A Plug-and-Play Framework for Augmenting Large Vision-Language Models by Searching Up-to-Date Internet Knowledge

NeurIPS 2024 Poster
6.3
4

Closed-Loop Visuomotor Control with Generative Expectation for Robotic Manipulation

NeurIPS 2024 Poster
6.0
3

Large Language Models as Automated Aligners for benchmarking Vision-Language Models

ICLR 2024 Poster
4.3
4

Advancing Vision Transformers with Group-Mix Attention

ICLR 2024 Withdrawn
4.5
4

StyleAdapter: A Unified Stylized Image Generation Model without Test-Time Fine-Tuning

ICLR 2024 Withdrawn
3.8
4

Large Language Models as Decision Makers for Autonomous Driving

ICLR 2024 Rejected
7.0
4

PixArt-$\alpha$: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis

ICLR 2024 Spotlight
4.3
3

SPOT: Scalable 3D Pre-training via Occupancy Prediction for Autonomous Driving

ICLR 2024 Rejected
5.5
4

GPT4RoI: Instruction Tuning Large Language Model on Region-of-Interest

ICLR 2024 Rejected
7.0
4

BESA: Pruning Large Language Models with Blockwise Parameter-Efficient Sparsity Allocation

ICLR 2024 Poster
6.4
5

OmniQuant: Omnidirectionally Calibrated Quantization for Large Language Models

ICLR 2024 Spotlight
5.3
4

Tree-Planner: Efficient Close-loop Task Planning with Large Language Models

ICLR 2024 Poster
4.3
4

Convolution on Your 12× Wide Feature: A ConvNet with Nested Design

ICLR 2024 Withdrawn
5.3
4

Rethinking Human Evaluation Protocol for Text-to-Video Models: Enhancing Reliability, Reproducibility, and Practicality

NeurIPS 2024 Poster
5.7
3

VisionLLM v2: An End-to-End Generalist Multimodal Large Language Model for Hundreds of Vision-Language Tasks

NeurIPS 2024 Poster
7.0
4

InternVid: A Large-scale Video-Text Dataset for Multimodal Understanding and Generation

ICLR 2024 Spotlight