Xu Tan
~Xu_Tan1
24
Total papers
12.0
Submissions per year
Average rating
Acceptance: 11/24
Venue distribution
ICLR
18
NeurIPS
5
ICML
1
Papers (24)
2025 (10 papers)
3
GETMusic: Generating Music Tracks with a Unified Representation and Diffusion Framework
ICLR 2025 Withdrawn
-
RALL-E: Robust Codec Language Modeling with Chain-of-Thought Prompting for Text-to-Speech Synthesis
ICLR 2025 Desk Rejected
4
Chain-of-Model Learning for Language Model
NeurIPS 2025 Poster
4
Sparse Training: Do All Tokens Matter for Long Sequence Generalization?
ICLR 2025 Withdrawn
4
VALL-E 2: Neural Codec Language Models are Human Parity Zero-Shot Text to Speech Synthesizers
ICLR 2025 Rejected
4
VidMuse: A Simple Video-to-Music Generation Framework with Long-Short-Term Modeling
ICLR 2025 Withdrawn
3
Semantic-Aware Diffusion Model for Sequential Recommendation
ICLR 2025 Withdrawn
4
ALMTokenizer: A Low-bitrate and Semantic-rich Audio Codec Tokenizer for Audio Language Modeling
ICML 2025 Poster
4
MoonCast: High-Quality Zero-Shot Podcast Generation
NeurIPS 2025 Poster
4
MuPT: A Generative Symbolic Music Pretrained Transformer
ICLR 2025 Poster
2024 (14 papers)
4
GETMusic: Generating Music Tracks with a Unified Representation and Diffusion Framework
ICLR 2024 Rejected
4
UniAudio: An Audio Foundation Model Toward Universal Audio Generation
ICLR 2024 Rejected
4
TaskBench: Benchmarking Large Language Models for Task Automation
ICLR 2024 Rejected
4
NaturalSpeech 2: Latent Diffusion Models are Natural and Zero-Shot Speech and Singing Synthesizers
ICLR 2024 Spotlight
3
ERA-Solver: Error-Robust Adams Solver for Fast Sampling of Diffusion Probabilistic Models
ICLR 2024 Rejected
5
ResiDual: Transformer with Dual Residual Connections
ICLR 2024 Rejected
4
Bridge-TTS: Text-to-Speech Synthesis with Schrodinger Bridge
ICLR 2024 Withdrawn
4
PromptTTS 2: Describing and Generating Voices with Text Prompt
ICLR 2024 Poster
4
MuseCoco: Generating Symbolic Music from Text
ICLR 2024 Rejected
4
Connecting Large Language Models with Evolutionary Algorithms Yields Powerful Prompt Optimizers
ICLR 2024 Poster
4
UniAudio 1.5: Large Language Model-Driven Audio Codec is A Few-Shot Audio Task Learner
NeurIPS 2024 Poster
4
Predictor-Corrector Enhanced Transformers with Exponential Moving Average Coefficient Learning
NeurIPS 2024 Poster
4
GAIA: Zero-shot Talking Avatar Generation
ICLR 2024 Poster
3
D-CPT Law: Domain-specific Continual Pre-Training Scaling Law for Large Language Models
NeurIPS 2024 Poster