An efficient temporally-constrained probabilistic model for multiple-instrument music transcription

Benetos, E; Weyde, T; 16th International Society for Music Information Retrieval Conference (ISMIR)

Editors

Wiering, F

Müller, M

Pagination

701 - 707 (7)

Publisher

International Society for Music Information Retrieval

Publisher URL

http://www.ismir.net/

Abstract

In this paper, an efficient, general-purpose model for multiple-instrument polyphonic music transcription is proposed. The model is based on probabilistic latent component analysis and supports the use of sound state spectral templates, which represent the temporal evolution of each note (e.g. attack, sustain, decay). As input, a variable-Q transform (VQT) time-frequency representation is used. Computational efficiency is achieved by supporting the use of pre-extracted and pre-shifted sound state templates. Two variants are presented: one without temporal constraints and one with hidden Markov model-based constraints controlling the appearance of sound states. Experiments are performed on benchmark transcription datasets: MAPS, TRIOS, MIREX multiF0, and Bach10; results on multi-pitch detection and instrument assignment show that the proposed models outperform the state of the art for multiple-instrument transcription and are more than 20 times faster than a previous sound state-based model. We finally show that a VQT representation can lead to improved multi-pitch detection performance compared with constant-Q representations.
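As a rough illustration of the kind of decomposition the abstract refers to, the sketch below estimates per-frame component weights against a fixed dictionary of spectral templates using the standard EM-style update for probabilistic latent component analysis. It is a simplified, single-latent-variable assumption, not the paper's model: it has no instrument, sound state, or pitch-shift variables, no hidden Markov constraints, and no VQT front end. The function name `plca_activations`, the toy templates `W`, and the toy spectrogram `V` are all hypothetical.

```python
def plca_activations(V, W, n_iter=50):
    """Estimate per-frame component weights with fixed templates.

    V: list of F rows, each a list of T non-negative magnitudes.
    W: list of F rows, each a list of P template values (columns sum to 1).
    Returns H: list of P rows, each a list of T weights (columns sum to 1).
    """
    F, T, P = len(V), len(V[0]), len(W[0])
    eps = 1e-12  # guards against division by zero
    # Start from uniform weights in every frame.
    H = [[1.0 / P] * T for _ in range(P)]
    for _ in range(n_iter):
        for t in range(T):
            # Model reconstruction of frame t from the current weights.
            recon = [sum(W[f][p] * H[p][t] for p in range(P)) + eps
                     for f in range(F)]
            # Combined E- and M-step: reweight each component by the share
            # of observed energy that its posterior accounts for.
            new = [H[p][t] * sum(W[f][p] * V[f][t] / recon[f] for f in range(F))
                   for p in range(P)]
            total = sum(new) + eps
            for p in range(P):
                H[p][t] = new[p] / total
    return H


# Toy data (hypothetical): two non-overlapping templates over four bins.
W = [[0.5, 0.0],
     [0.5, 0.0],
     [0.0, 0.5],
     [0.0, 0.5]]
# Three frames: template 0 only, an equal mix, template 1 only.
V = [[1.0, 0.5, 0.0],
     [1.0, 0.5, 0.0],
     [0.0, 0.5, 1.0],
     [0.0, 0.5, 1.0]]
H = plca_activations(V, W)
```

Because the templates stay fixed, only the weights are updated, which is the general reason pre-extracted dictionaries keep this family of models cheap to run. In the toy data, the first and last frames are assigned almost entirely to one template each, and the middle frame is split evenly.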

Authors

Benetos, E; Weyde, T

URI

http://qmro.qmul.ac.uk/xmlui/handle/123456789/9852

Collections

Centre for Digital Music (C4DM) [210]

Licence information

http://ismir2015.uma.es/articles/131_Paper.pdf