Impact Index

62.86/100

Top 4%

Site-wide rank: #2,606

Papers published: 9

Average rating: 5.2

Average annual output: 4.5 papers/year

Jun Suzuki

Full Professor @ Tohoku University · Japan · OpenReview

Research Areas

Data Selection · Efficient Transformers · Interpretability and Explainability in NLP · Language Models · Grammatical Error Correction · Neural Encoder-Decoder Models · Neural Word Embeddings · Model Compression and Efficient Models for NLP · Semi-supervised Learning · Machine Translation · Question Answering · Natural Language Processing / Understanding · Kernel Methods

Optimal Sparsity of Mixture-of-Experts Language Models for Reasoning Tasks

Pre-training LLM without Learning Rate Decay Enhances Supervised Fine-Tuning

ICLR 2026 · Poster

Vertical Attention: Automatic Exploration of Inter-Layer Connections in Transformer-based Language Models

ICLR 2026 · Rejected

Transformer Key-Value Memories Are Nearly as Interpretable as Sparse Autoencoders

NeurIPS 2025 · Poster

Drop-Upcycling: Training Sparse Mixture of Experts with Partial Re-initialization

ICLR 2025 · Poster

Layerwise Importance Analysis of Feed-Forward Networks in Transformer-based Language Models

COLM 2025 · Poster

Efficient Construction of Model Family through Progressive Training Using Model Expansion

COLM 2025 · Poster

Spike No More: Stabilizing the Pre-training of Large Language Models

COLM 2025 · Poster

Spike No More: Stabilizing the Pre-training of Large Language Models

ICLR 2025 · Rejected

Collaborators (20)

Sosuke Kobayashi

Taishi Nakamura

Tatsuki Kuribayashi