
ICLR 2024

Attention-Only Transformers and Implementing MLPs with Attention Heads

Submitted: 2023-09-20 · Updated: 2024-03-26
TL;DR

We show that MLP neurons can be implemented by masked, rank-1 attention heads, allowing one to convert an MLP-and-attention transformer into an attention-only transformer.
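To make the claim concrete, here is a minimal numerical sketch of the idea, under stated assumptions: a SiLU activation, a fixed zero-valued "bias" token the head can attend to, and illustrative names (w, u). This is one simple instantiation of an MLP neuron as a masked rank-1 attention head, not necessarily the paper's exact construction.

```python
# Sketch: one masked attention head with head dimension 1 emulating one
# SiLU MLP neuron. The head attends over two positions: the current token
# (score s = w·x, value (w·x)*u) and a bias token (score 0, value 0).
# The softmax over {s, 0} puts weight sigmoid(s) on the current token, so
# the head outputs sigmoid(s) * s * u = SiLU(s) * u — exactly an MLP
# neuron with input weights w and output direction u. Names are
# illustrative, not taken from the paper.
import numpy as np

rng = np.random.default_rng(0)
d_model = 16
w = rng.normal(size=d_model)   # neuron input weights (rank-1 QK circuit)
u = rng.normal(size=d_model)   # neuron output direction (rank-1 OV circuit)

def mlp_neuron(x):
    """Reference MLP neuron: SiLU(w·x) * u."""
    s = w @ x
    return (s / (1.0 + np.exp(-s))) * u   # SiLU(s) = s * sigmoid(s)

def rank1_attention_head(x):
    """Masked attention over {current token, bias token}."""
    s = w @ x                            # attention score of current token
    attn = np.exp(s) / (np.exp(s) + 1)   # softmax over [s, 0] = sigmoid(s)
    v_current = s * u                    # value is linear in x, so (w·x)*u is valid
    v_bias = np.zeros(d_model)           # bias token contributes nothing
    return attn * v_current + (1 - attn) * v_bias

x = rng.normal(size=d_model)
print(np.allclose(mlp_neuron(x), rank1_attention_head(x)))  # True
```

Because the value vector is a linear function of the attended token's residual stream, the nonlinearity comes entirely from the attention softmax; this is what lets a purely attention-based layer reproduce the neuron's activation function.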

Keywords

transformer, neural network, architecture, attention
