Large-Scale Pretrained Model for Self-Supervised Music Audio Representation Learning
Abstract
Self-supervised learning remains under-explored for music audio, largely because designing an appropriate training paradigm is challenging. We therefore propose MAP-MERT, a large-scale pre-trained music audio model for general music understanding. It achieves performance comparable to the state-of-the-art pre-trained model Jukebox while using less than 2% of its parameters.