Impact Index

98.61/100

Top 0.1%

Site-wide rank #34

Papers published: 54

Average rating: 5.9

Average output: 18.0 papers/year

Ion Stoica

Full Professor @ University of California, Berkeley · United States · OpenReview

Research Areas

AI · big data · peer-to-peer networks · systems · networking

EditBench: Evaluating LLM Abilities to Perform Real-World Instructed Code Edits

Computer Agent Arena: Toward Human-Centric Evaluation and Analysis of Computer-Use Agents

ICLR 2026 Poster

SonicMoE: Accelerating MoE with IO and Tile-aware Optimizations

ICLR 2026 Poster

GEPA: Reflective Prompt Evolution Can Outperform Reinforcement Learning

lmgame-Bench: How Good are LLMs at Playing Games?

ICLR 2026 Poster

DeepScholarBench: A Live Benchmark and Automated Evaluation for Generative Research Synthesis

ICLR 2026 Rejected

LLMSELECTOR: Towards Model Selection Optimization for Compound AI Systems

ICLR 2026 Rejected

vAttention: Verified Sparse Attention via Sampling

ICLR 2026 Poster

SLA: Beyond Sparsity in Diffusion Transformers via Fine-Tunable Sparse–Linear Attention

ICLR 2026 Poster

SPECS: Faster Test-Time Scaling through Speculative Drafts and Dynamic Switching

ICLR 2026 Rejected

DeepScaleR: Effective RL Scaling of Reasoning Models via Iterative Context Lengthening

ICLR 2026 Rejected

SciPro Arena: a Case Study of AI Agent Capabilities in Scientific Analysis Tasks

ICLR 2026 Rejected

BARE: Leveraging Base Language Models for Few-Shot Synthetic Data Generation

ICLR 2026 Rejected

Shepherd: Pattern-Guided Trajectory Selection for Coding Agents on SWE-Bench

ICLR 2026 Rejected

R2E-Gym: Procedural Environment Generation and Hybrid Verifiers for Scaling Open-Weights SWE Agents

COLM 2025 Poster

HashAttention: Semantic Sparsity for Faster Inference

ICML 2025 Poster

Twilight: Adaptive Attention Sparsity with Hierarchical Top-$p$ Pruning

NeurIPS 2025 Spotlight

Radial Attention: $\mathcal O(n \log n)$ Sparse Attention for Long Video Generation

NeurIPS 2025 Poster

Sparse VideoGen2: Accelerate Video Generation with Sparse Attention via Semantic-Aware Permutation

NeurIPS 2025 Spotlight

Faster Video Diffusion with Trainable Sparse Attention

NeurIPS 2025 Poster

Efficiently Scaling LLM Reasoning Programs with Certaindex

NeurIPS 2025 Poster

OR-Bench: An Over-Refusal Benchmark for Large Language Models

ICML 2025 Poster

Fast Video Generation with Sliding Tile Attention

ICML 2025 Poster

GameArena: Evaluating LLM Reasoning through Live Computer Games

ICLR 2025 Poster

JudgeBench: A Benchmark for Evaluating LLM-Based Judges

ICLR 2025 Poster

MPC-Minimized Secure LLM Inference

ICLR 2025 Rejected

RouteLLM: Learning to Route LLMs from Preference Data

ICLR 2025 Poster

How to Evaluate Reward Models for RLHF

ICLR 2025 Poster

LiveCodeBench: Holistic and Contamination Free Evaluation of Large Language Models for Code

ICLR 2025 Poster

Exploring and Mitigating Adversarial Manipulation of Voting-Based Leaderboards

From Crowdsourced Data to High-quality Benchmarks: Arena-Hard and Benchbuilder Pipeline

ICML 2025 Poster

Sparse Video-Gen: Accelerating Video Diffusion Transformers with Spatial-Temporal Sparsity

ICML 2025 Poster

Bench-O-Matic: Automating Benchmark Curation from Crowdsourced Data

ICLR 2025 Rejected

Prompt-to-Leaderboard: Prompt-Adaptive LLM Evaluations

ICML 2025 Poster

Efficient-vDiT: Efficient Video Diffusion Transformers With Attention Tile

ICLR 2025 Rejected

A Statistical Framework for Ranking LLM-based Chatbots

ICLR 2025 Poster

Copilot Arena: A Platform for Code LLM Evaluation in the Wild

ICML 2025 Poster

OR-Bench: An Over-Refusal Benchmark for Large Language Models

ICLR 2025 Rejected

Post-Training Sparse Attention with Double Sparsity

ICLR 2025 Rejected

The Berkeley Function Calling Leaderboard (BFCL): From Tool Use to Agentic Evaluation of Large Language Models

ICML 2025 Poster

Test-Time RAG: Enhancing Long Context Understanding in LLMs with Retrieval-Augmented Mechanisms

ICLR 2025 Rejected

Collaborators (20)

Joseph E. Gonzalez