dc.contributor.author | Ycart, A | en_US |
dc.contributor.author | Benetos, E | en_US |
dc.contributor.author | 18th International Society for Music Information Retrieval Conference (ISMIR 2017) | en_US |
dc.date.accessioned | 2017-07-21T09:49:55Z | |
dc.date.available | 2017-06-23 | en_US |
dc.date.issued | 2017-10-23 | en_US |
dc.date.submitted | 2017-07-16T10:12:00.941Z | |
dc.identifier.uri | http://qmro.qmul.ac.uk/xmlui/handle/123456789/24946 | |
dc.description.abstract | Neural networks, and especially long short-term memory (LSTM) networks, have become increasingly popular for sequence modelling, be it in text, speech, or music. In this paper, we investigate the predictive power of simple LSTM networks for polyphonic MIDI sequences, using an empirical approach. Such systems can then be used as a music language model which, combined with an acoustic model, can improve automatic music transcription (AMT) performance. As a first step, we experiment with synthetic MIDI data and compare the results obtained in various settings throughout the training process. In particular, we compare the use of a fixed sample rate against a musically relevant sample rate. We test this system on both synthetic and real MIDI data, and compare results in terms of note prediction accuracy. We show that prediction accuracy increases with the sample rate, because self-transitions become more frequent. We suggest that for AMT, a musically relevant sample rate is crucial in order to model note transitions, beyond a simple smoothing effect. | en_US |
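The abstract's central observation — that a finer time grid inflates frame-wise prediction accuracy because most transitions become self-transitions — can be illustrated with a minimal sketch. The note-event format, helper names, and timings below are illustrative assumptions, not taken from the paper:

```python
# Hedged sketch: converting MIDI-like note events into binary piano-roll
# frames at two different time steps. The data and helpers are hypothetical.

def piano_roll(notes, step, total):
    """notes: list of (pitch, onset, offset) in seconds.
    Returns one frame per time step; each frame is the set of active pitches."""
    n_frames = int(round(total / step))
    frames = []
    for i in range(n_frames):
        t = i * step
        frames.append(frozenset(p for p, on, off in notes if on <= t < off))
    return frames

def self_transitions(frames):
    # Count consecutive frame pairs whose content is identical.
    return sum(a == b for a, b in zip(frames, frames[1:]))

# A two-note example lasting 2 s: C4 for the first second, E4 for the second.
notes = [(60, 0.0, 1.0), (64, 1.0, 2.0)]

fixed = piano_roll(notes, step=0.05, total=2.0)   # fixed 50 ms sample rate
musical = piano_roll(notes, step=0.5, total=2.0)  # e.g. one beat at 120 BPM

# The finer grid: 38 of 39 transitions are self-transitions, so even a
# trivial "repeat the previous frame" predictor scores highly.
print(self_transitions(fixed), len(fixed) - 1)    # → 38 39
# The musically relevant grid: only 2 of 3 transitions are self-transitions,
# so the model must actually predict note changes.
print(self_transitions(musical), len(musical) - 1)  # → 2 3
```

This is the sense in which a musically relevant sample rate forces the model to learn note transitions rather than benefit from a smoothing effect.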
dc.format.extent | 421 - 427 (7) | en_US |
dc.publisher | ISMIR | en_US |
dc.rights | Licensed under a Creative Commons Attribution 4.0 International License (CC BY 4.0). Attribution: Adrien Ycart and Emmanouil Benetos. “A study on LSTM networks for polyphonic music sequence modelling”, 18th International Society for Music Information Retrieval Conference, Suzhou, China, 2017. | |
dc.title | A study on LSTM networks for polyphonic music sequence modelling | en_US |
dc.type | Conference Proceeding | |
dc.rights.holder | © Adrien Ycart and Emmanouil Benetos. | |
pubs.author-url | http://www.eecs.qmul.ac.uk/~ay304/ | en_US |
pubs.notes | No embargo | en_US |
pubs.notes | Conference proceedings are CC-BY, there is no embargo | en_US |
pubs.publication-status | Accepted | en_US |
pubs.publisher-url | https://ismir2017.smcnus.org/ | en_US |
dcterms.dateAccepted | 2017-06-23 | en_US |
qmul.funder | A Machine Learning Framework for Audio Analysis and Retrieval::Royal Academy of Engineering | en_US |