PaperHub

Beidi Chen (~Beidi_Chen1)

Total papers: 30
Avg. submissions per year: 15.0
Average score: 6.1
Accepted: 25/30
Venue distribution:
NeurIPS: 13
ICLR: 12
ICML: 3
COLM: 2

Published Papers (30)

2025 (14)

2024 (16)

On the Similarity between Attention and SVM on the Token Separation and Selection Behavior

ICLR 2024, Withdrawn. Avg. score: 6.0 (4 reviews)

Prompt-prompted Adaptive Structured Pruning for Efficient LLM Generation

COLM 2024, Poster. Avg. score: 5.8 (4 reviews)

Compress, Then Prompt: Improving Accuracy-Efficiency Trade-off of LLM Inference with Transferable Prompt

ICLR 2024, Rejected. Avg. score: 5.5 (4 reviews)

On the Surprising Effectiveness of Attention Transfer for Vision Transformers

NeurIPS 2024, Poster. Avg. score: 7.5 (4 reviews)

Efficient Streaming Language Models with Attention Sinks

ICLR 2024, Poster. Avg. score: 6.0 (3 reviews)

Mini-Sequence Transformers: Optimizing Intermediate Memory for Long Sequences Training

NeurIPS 2024, Poster. Avg. score: 5.8 (4 reviews)

SpecExec: Massively Parallel Speculative Decoding For Interactive LLM Inference on Consumer Devices

NeurIPS 2024, Poster. Avg. score: 6.0 (4 reviews)

Nearest Neighbor Speculative Decoding for LLM Generation and Attribution

NeurIPS 2024, Poster. Avg. score: 5.3 (4 reviews)

Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length

NeurIPS 2024, Poster. Avg. score: 5.8 (4 reviews)

JoMA: Demystifying Multilayer Transformers via Joint Dynamics of MLP and Attention

ICLR 2024, Poster. Avg. score: 6.3 (4 reviews)

Learn To be Efficient: Build Structured Sparsity in Large Language Models

NeurIPS 2024, Spotlight. Avg. score: 5.3 (4 reviews)

Sirius: Contextual Sparsity with Correction for Efficient LLMs

NeurIPS 2024, Poster. Avg. score: 6.7 (3 reviews)

TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding

COLM 2024, Poster. Avg. score: 5.7 (3 reviews)

Found in the Middle: How Language Models Use Long Contexts Better via Plug-and-Play Positional Encoding

NeurIPS 2024, Poster. Avg. score: 6.5 (4 reviews)

Sequoia: Scalable and Robust Speculative Decoding

NeurIPS 2024, Spotlight. Avg. score: 5.5 (4 reviews)

S$^{2}$FT: Efficient, Scalable and Generalizable LLM Fine-tuning by Structured Sparsity

NeurIPS 2024, Poster