dc.contributor.author | Benetos, E | en_US |
dc.contributor.author | Weyde, T | en_US |
dc.contributor.author | 16th International Society for Music Information Retrieval Conference (ISMIR) | en_US |
dc.contributor.editor | Wiering, F | en_US |
dc.contributor.editor | Müller, M | en_US |
dc.date.accessioned | 2015-12-16T13:48:13Z | |
dc.date.available | 2015-06-22 | en_US |
dc.date.issued | 2015-10-26 | en_US |
dc.date.submitted | 2015-10-30T12:12:07.745Z | |
dc.identifier.uri | http://qmro.qmul.ac.uk/xmlui/handle/123456789/9852 | |
dc.description.abstract | In this paper, an efficient, general-purpose model for multiple-instrument polyphonic music transcription is proposed. The model is based on probabilistic latent component analysis and supports the use of sound state spectral templates, which represent the temporal evolution of each note (e.g. attack, sustain, decay). As input, a variable-Q transform (VQT) time-frequency representation is used. Computational efficiency is achieved by supporting the use of pre-extracted and pre-shifted sound state templates. Two variants are presented: without temporal constraints and with hidden Markov model-based constraints controlling the appearance of sound states. Experiments are performed on benchmark transcription datasets: MAPS, TRIOS, MIREX multiF0, and Bach10; results on multi-pitch detection and instrument assignment show that the proposed models outperform the state-of-the-art for multiple-instrument transcription and are more than 20 times faster than a previous sound state-based model. Finally, we show that a VQT representation can lead to improved multi-pitch detection performance compared with constant-Q representations. | en_US |
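The abstract describes a probabilistic latent component analysis (PLCA) model in which spectral templates are pre-extracted and held fixed, so that only the per-frame activations need to be estimated. The following is a minimal, hedged sketch of that general idea in pure Python — the toy data, function name, and two-bin "spectrogram" are illustrative assumptions, not the paper's actual model (which uses sound state templates, a VQT input, and optional HMM constraints).

```python
# Sketch of a PLCA-style decomposition with fixed (pre-extracted) spectral
# templates, where only the per-frame activation distributions are estimated
# via EM. All data below is toy/illustrative, not from the paper.

def plca_activations(V, W, n_iter=50):
    """V: spectrogram as a list of frames (each a list of F magnitudes).
    W: fixed templates, a list of Z spectra (each normalised over F).
    Returns H: per-frame activation distributions over the Z templates."""
    Z = len(W)
    H = [[1.0 / Z] * Z for _ in V]  # uniform initialisation
    for _ in range(n_iter):
        for t, frame in enumerate(V):
            new = [0.0] * Z
            for f, vf in enumerate(frame):
                # E-step: posterior P(z | f, t) via Bayes' rule
                joint = [H[t][z] * W[z][f] for z in range(Z)]
                s = sum(joint) or 1.0
                for z in range(Z):
                    new[z] += vf * joint[z] / s
            # M-step: renormalise the accumulated expected counts
            total = sum(new) or 1.0
            H[t] = [x / total for x in new]
    return H

# Toy example: two fixed templates (low-band vs high-band energy)
W = [[0.8, 0.2], [0.2, 0.8]]
V = [[8.0, 2.0], [2.0, 8.0]]  # frame 0 matches template 0, frame 1 template 1
H = plca_activations(V, W)
```

Because the templates are fixed, each EM iteration only updates the activations, which is the source of the efficiency gain the abstract refers to; here `H[0]` converges towards template 0 and `H[1]` towards template 1.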
dc.format.extent | 701 - 707 (7) | en_US |
dc.language.iso | en | en_US |
dc.publisher | International Society for Music Information Retrieval | en_US |
dc.relation.replaces | http://qmro.qmul.ac.uk/xmlui/handle/123456789/9849 | |
dc.relation.replaces | 123456789/9849 | |
dc.rights | http://ismir2015.uma.es/articles/131_Paper.pdf | |
dc.title | An efficient temporally-constrained probabilistic model for multiple-instrument music transcription | en_US |
dc.type | Conference Proceeding | |
dc.rights.holder | © The Author(s) 2015 | |
pubs.author-url | http://www.eecs.qmul.ac.uk/~emmanouilb/ | en_US |
pubs.notes | Open access CC-BY paper, no embargo | en_US |
pubs.publication-status | Published | en_US |
pubs.publisher-url | http://www.ismir.net/ | en_US |
dcterms.dateAccepted | 2015-06-22 | en_US |
qmul.funder | A Machine Learning Framework for Audio Analysis and Retrieval::Royal Academy of Engineering | en_US |