Noah A. Smith
~Noah_A._Smith2
Total papers: 25
Average submissions per year: 12.5
Average rating
Accepted: 22/25
Venue distribution
ICLR: 9
NeurIPS: 8
COLM: 7
ICML: 1
Publications (25 papers)
2025 (13 papers)
4
On Linear Representations and Pretraining Data Frequency in Language Models
ICLR 2025 Poster
4
Signal and Noise: A Framework for Reducing Uncertainty in Language Model Evaluation
NeurIPS 2025 Spotlight
4
SuperBPE: Space Travel for Language Models
COLM 2025 Poster
4
Broken Tokens? Your Language Model can Secretly Handle Non-Canonical Tokenizations
NeurIPS 2025 Spotlight
4
Hybrid Preferences: Learning to Route Instances for Human vs. AI Feedback
ICLR 2025 Rejected
4
Establishing Task Scaling Laws via Compute-Efficient Model Ladders
COLM 2025 Poster
5
MUSE: Machine Unlearning Six-Way Evaluation for Language Models
ICLR 2025 Poster
3
Fluid Language Model Benchmarking
COLM 2025 Poster
3
DataDecide: How to Predict Best Pretraining Data with Small Experiments
ICML 2025 Poster
4
FlexOLMo: Open Language Models for Flexible Data Use
NeurIPS 2025 Spotlight
3
Tulu 3: Pushing Frontiers in Open Language Model Post-Training
COLM 2025 Poster
3
OLMoE: Open Mixture-of-Experts Language Models
ICLR 2025 Oral
4
2 OLMo 2 Furious (COLM’s Version)
COLM 2025 Poster
2024 (12 papers)
4
ACID: Abstractive, Content-Based IDs for Document Retrieval with Language Models
ICLR 2024 Withdrawn
4
Data Mixture Inference Attack: BPE Tokenizers Reveal Training Data Compositions
NeurIPS 2024 Poster
4
SILO Language Models: Isolating Legal Risk In a Nonparametric Datastore
ICLR 2024 Spotlight
4
Decoding-Time Language Model Alignment with Multiple Objectives
NeurIPS 2024 Poster
4
Tuning Language Models by Proxy
COLM 2024 Poster
4
Unpacking DPO and PPO: Disentangling Best Practices for Learning from Preference Feedback
NeurIPS 2024 Poster
4
In-Context Pretraining: Language Modeling Beyond Document Boundaries
ICLR 2024 Spotlight
4
Visual Sketchpad: Sketching as a Visual Chain of Thought for Multimodal Language Models
NeurIPS 2024 Poster
4
MAGNET: Improving the Multilingual Fairness of Language Models with Adaptive Gradient-Based Tokenization
NeurIPS 2024 Poster
4
Does Collaborative Human–LM Dialogue Generation Help Information Extraction from Human–Human Dialogues?
COLM 2024 Poster
4
What's In My Big Data?
ICLR 2024 Spotlight
4
Efficiency Pentathlon: A Standardized Benchmark for Efficiency Evaluation
ICLR 2024 Rejected