dc.contributor.author | Phaye, SSR | |
dc.contributor.author | Benetos, E | |
dc.contributor.author | Wang, Y | |
dc.contributor.author | IEEE International Conference on Acoustics, Speech, and Signal Processing | |
dc.date.accessioned | 2019-03-05T10:21:19Z | |
dc.date.available | 2019-02-01 | |
dc.date.available | 2019-03-05T10:21:19Z | |
dc.date.issued | 2019-05-12 | |
dc.identifier.citation | Phaye, S., Benetos, E. and Wang, Y. (2019). SubSpectralNet - Using Sub-Spectrogram based Convolutional Neural Networks for Acoustic Scene Classification. [online] arXiv.org. Available at: https://arxiv.org/abs/1810.12642 [Accessed 5 Mar. 2019]. | en_US |
dc.identifier.uri | https://qmro.qmul.ac.uk/xmlui/handle/123456789/55777 | |
dc.description.abstract | Acoustic Scene Classification (ASC) is one of the core research problems in the field of Computational Sound Scene Analysis. In this work, we present SubSpectralNet, a novel model which captures discriminative features by incorporating frequency band-level differences to model soundscapes. Using mel-spectrograms, we propose the idea of using band-wise crops of the input time-frequency representations and train a convolutional neural network (CNN) on the same. We also propose a modification in the training method for more efficient learning of the CNN models. We first give a motivation for using sub-spectrograms through intuitive and statistical analyses, and finally we develop a sub-spectrogram based CNN architecture for ASC. The system is evaluated on the public ASC development dataset provided for the "Detection and Classification of Acoustic Scenes and Events" (DCASE) 2018 Challenge. Our best model achieves an improvement of +14% in terms of classification accuracy with respect to the DCASE 2018 baseline system. Code and figures are available at https://github.com/ssrp/SubSpectralNet | en_US |
dc.format.extent | ? - ? (5) | |
dc.publisher | IEEE | en_US |
dc.title | SubSpectralNet - Using sub-spectrogram based convolutional neural networks for acoustic scene classification | en_US |
dc.type | Conference Proceeding | en_US |
dc.rights.holder | © 2019 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. | |
pubs.author-url | https://ssrp.github.io/ | en_US |
pubs.notes | No embargo | en_US |
pubs.notes | IEEE conference, allows postprints to be uploaded to institutional repositories. | en_US |
pubs.publication-status | Accepted | en_US |
dcterms.dateAccepted | 2019-02-01 | |
rioxxterms.funder | Default funder | en_US |
rioxxterms.identifier.project | Default project | en_US |
qmul.funder | A Machine Learning Framework for Audio Analysis and Retrieval::Royal Academy of Engineering | en_US |