dc.contributor.author | Ycart, A | en_US |
dc.contributor.author | Benetos, E | en_US |
dc.contributor.author | 18th International Society for Music Information Retrieval Conference (ISMIR 2017) | en_US |
dc.date.accessioned | 2017-07-21T09:49:55Z | |
dc.date.available | 2017-06-23 | en_US |
dc.date.issued | 2017-10-23 | en_US |
dc.date.submitted | 2017-07-16T10:12:00.941Z | |
dc.identifier.uri | http://qmro.qmul.ac.uk/xmlui/handle/123456789/24946 | |
dc.description.abstract | Neural networks, and especially long short-term memory (LSTM) networks, have become increasingly popular for sequence modelling, be it in text, speech, or music. In this paper, we investigate the predictive power of simple LSTM networks for polyphonic MIDI sequences, using an empirical approach. Such systems can then be used as a music language model which, combined with an acoustic model, can improve automatic music transcription (AMT) performance. As a first step, we experiment with synthetic MIDI data and compare the results obtained in various settings throughout the training process. In particular, we compare the use of a fixed sample rate against a musically relevant sample rate. We test this system on both synthetic and real MIDI data, and compare results in terms of note prediction accuracy. We show that prediction accuracy increases with the sample rate, because self-transitions become more frequent. We suggest that for AMT, a musically relevant sample rate is crucial in order to model note transitions, beyond a simple smoothing effect. | en_US |
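The abstract's central observation — that a finer time grid inflates frame-wise prediction accuracy because most transitions become self-transitions — can be illustrated with a minimal sketch. The note-event format, helper names, and timings below are illustrative assumptions, not taken from the paper:

```python
# Hedged sketch: converting MIDI-like note events into binary piano-roll
# frames at two different time steps. The data and helpers are hypothetical.

def piano_roll(notes, step, total):
    """notes: list of (pitch, onset, offset) in seconds.
    Returns one frame per time step; each frame is the set of active pitches."""
    n_frames = int(round(total / step))
    frames = []
    for i in range(n_frames):
        t = i * step
        frames.append(frozenset(p for p, on, off in notes if on <= t < off))
    return frames

def self_transitions(frames):
    # Count consecutive frame pairs whose content is identical.
    return sum(a == b for a, b in zip(frames, frames[1:]))

# A two-note example lasting 2 s: C4 for the first second, E4 for the second.
notes = [(60, 0.0, 1.0), (64, 1.0, 2.0)]

fixed = piano_roll(notes, step=0.05, total=2.0)   # fixed 50 ms sample rate
musical = piano_roll(notes, step=0.5, total=2.0)  # e.g. one beat at 120 BPM

# The finer grid: 38 of 39 transitions are self-transitions, so even a
# trivial "repeat the previous frame" predictor scores highly.
print(self_transitions(fixed), len(fixed) - 1)    # → 38 39
# The musically relevant grid: only 2 of 3 transitions are self-transitions,
# so the model must actually predict note changes.
print(self_transitions(musical), len(musical) - 1)  # → 2 3
```

This is the sense in which a musically relevant sample rate forces the model to learn note transitions rather than benefit from a smoothing effect.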
dc.format.extent | 421 - 427 (7) | en_US |
dc.publisher | ISMIR | en_US |
dc.rights | Licensed under a Creative Commons Attribution 4.0 International License (CC BY 4.0). Attribution: Adrien Ycart and Emmanouil Benetos. “A study on LSTM networks for polyphonic music sequence modelling”, 18th International Society for Music Information Retrieval Conference, Suzhou, China, 2017. | |
dc.title | A study on LSTM networks for polyphonic music sequence modelling | en_US |
dc.type | Conference Proceeding | |
dc.rights.holder | © Adrien Ycart and Emmanouil Benetos. | |
pubs.author-url | http://www.eecs.qmul.ac.uk/~ay304/ | en_US |
pubs.notes | No embargo | en_US |
pubs.notes | Conference proceedings are CC-BY, there is no embargo | en_US |
pubs.publication-status | Accepted | en_US |
pubs.publisher-url | https://ismir2017.smcnus.org/ | en_US |
dcterms.dateAccepted | 2017-06-23 | en_US |
qmul.funder | A Machine Learning Framework for Audio Analysis and Retrieval::Royal Academy of Engineering | en_US |