Maosong Sun
~Maosong_Sun1
Total papers: 37
Average submissions per year: 18.5
Average rating
Accepted: 27/37
Venue distribution
ICLR: 23
COLM: 7
NeurIPS: 6
ICML: 1
Papers (37)
2025 (23 papers)
4
Stuffed Mamba: Oversized States Lead to the Inability to Forget
COLM 2025 Poster
5
A Multi-Power Law for Loss Curve Prediction Across Learning Rate Schedules
ICLR 2025 Poster
4
Optima: Optimizing Effectiveness and Efficiency for LLM-Based Multi-Agent System
ICLR 2025 Rejected
5
Stuffed Mamba: State Collapse and State Capacity of RNN-Based Long-Context Modeling
ICLR 2025 Rejected
3
The Overthinker's DIET: Cutting Token Calories with DIfficulty-AwarE Training
NeurIPS 2025 Poster
4
Sparsing Law: Towards Large Language Models with Greater Activation Sparsity
ICLR 2025 Rejected
4
A*-Thought: Efficient Reasoning via Bidirectional Compression for Low-Resource Settings
NeurIPS 2025 Poster
3
Rational Decision-Making Agent with Learning Internal Utility Judgment
ICLR 2025 Poster
3
BlockFFN: Towards End-Side Acceleration-Friendly Mixture-of-Experts with Chunk-Level Activation Sparsity
COLM 2025 Poster
4
StreamingBench: Assessing the Gap for MLLMs to Achieve Streaming Video Understanding
ICLR 2025 Rejected
4
ParamMute: Suppressing Knowledge-Critical FFNs for Faithful Retrieval-Augmented Generation
NeurIPS 2025 Poster
3
Sparsing Law: Towards Large Language Models with Greater Activation Sparsity
ICML 2025 Poster
4
Selecting Influential Samples for Long Context Alignment via Homologous Models’ Guidance and Contextual Awareness Measurement
ICLR 2025 Withdrawn
4
WorkflowLLM: Enhancing Workflow Orchestration Capability of Large Language Models
ICLR 2025 Poster
5
Internet of Agents: Weaving a Web of Heterogeneous Agents for Collaborative Intelligence
ICLR 2025 Spotlight
4
VisRAG: Vision-based Retrieval-augmented Generation on Multi-modality Documents
ICLR 2025 Poster
5
RAG-DDR: Optimizing Retrieval-Augmented Generation Using Differentiable Data Rewards
ICLR 2025 Poster
4
Scaling Large Language Model-based Multi-Agent Collaboration
ICLR 2025 Poster
5
Improving Zero-Shot Generalization of Instruction Tuning by Data Arrangement
ICLR 2025 Withdrawn
4
Multi-Agent Collaboration via Evolving Orchestration
NeurIPS 2025 Poster
3
AIR: A Systematic Analysis of Annotations, Instructions, and Response Pairs in Preference Dataset
COLM 2025 Poster
4
Proactive Agent: Shifting LLM Agents from Reactive Responses to Active Assistance
ICLR 2025 Poster
4
Advancing LLM Reasoning Generalists with Preference Trees
ICLR 2025 Poster
2024 (14 papers)
3
Two Heads Are Better Than One: Exploiting Both Sequence and Graph Models in AMR-To-Text Generation
ICLR 2024 Rejected
4
Unified View of Grokking, Double Descent and Emergent Abilities: A Comprehensive Study on Algorithm Task
COLM 2024 Poster
5
BurstAttention: An Efficient Distributed Attention Framework for Extremely Long Sequences
ICLR 2024 Rejected
4
Rational Decision-Making Agent with Internalized Utility Judgment
ICLR 2024 Rejected
4
InfLLM: Training-Free Long-Context Extrapolation for LLMs with an Efficient Context Memory
NeurIPS 2024 Poster
3
CA-LoRA: Adapting Existing LoRA for Compressed LLMs to Enable Efficient Multi-Tasking on Personal Devices
COLM 2024 Poster
4
UltraFeedback: Boosting Language Models with High-quality Feedback
ICLR 2024 Rejected
4
Delta-CoMe: Training-Free Delta-Compression with Mixed-Precision for Large Language Models
NeurIPS 2024 Poster
4
Predicting Emergent Abilities with Infinite Resolution Evaluation
ICLR 2024 Poster
4
UniMem: Towards a Unified View of Long-Context Large Language Models
COLM 2024 Poster
4
AgentVerse: Facilitating Multi-Agent Collaboration and Exploring Emergent Behaviors
ICLR 2024 Poster
4
Large Multilingual Models Pivot Zero-Shot Multimodal Learning across Languages
ICLR 2024 Spotlight
4
ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs
ICLR 2024 Spotlight
4
MiniCPM: Unveiling the Potential of Small Language Models with Scalable Training Strategies
COLM 2024 Poster