
dc.contributor.author: Phaye, SSR
dc.contributor.author: Benetos, E
dc.contributor.author: Wang, Y
dc.contributor.author: IEEE International Conference on Acoustics, Speech, and Signal Processing
dc.date.accessioned: 2019-03-05T10:21:19Z
dc.date.available: 2019-02-01
dc.date.available: 2019-03-05T10:21:19Z
dc.date.issued: 2019-05-12
dc.identifier.citation: Phaye, S., Benetos, E. and Wang, Y. (2019). SubSpectralNet - Using Sub-Spectrogram based Convolutional Neural Networks for Acoustic Scene Classification. [online] arXiv.org. Available at: https://arxiv.org/abs/1810.12642 [Accessed 5 Mar. 2019].
dc.identifier.uri: https://qmro.qmul.ac.uk/xmlui/handle/123456789/55777
dc.description.abstract: Acoustic Scene Classification (ASC) is one of the core research problems in the field of Computational Sound Scene Analysis. In this work, we present SubSpectralNet, a novel model which captures discriminative features by incorporating frequency band-level differences to model soundscapes. Using mel-spectrograms, we propose the idea of taking band-wise crops of the input time-frequency representations and training a convolutional neural network (CNN) on them. We also propose a modification to the training method for more efficient learning of the CNN models. We first motivate the use of sub-spectrograms through intuitive and statistical analyses, and then develop a sub-spectrogram based CNN architecture for ASC. The system is evaluated on the public ASC development dataset provided for the "Detection and Classification of Acoustic Scenes and Events" (DCASE) 2018 Challenge. Our best model achieves an improvement of +14% in classification accuracy with respect to the DCASE 2018 baseline system. Code and figures are available at https://github.com/ssrp/SubSpectralNet
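As context for the abstract above, here is a minimal Python sketch of the band-wise cropping idea it describes: a mel-spectrogram is sliced along the frequency axis into overlapping sub-spectrograms, each of which would feed its own small CNN. This is not the authors' implementation (that is linked in the abstract); the mel-band count, sub-band size, and hop used here are illustrative assumptions, not values taken from the paper.

import numpy as np
import librosa

sr = 48000                          # DCASE 2018 scene audio is sampled at 48 kHz
y = np.random.randn(10 * sr)        # stand-in for a 10-second scene recording

# Log mel-spectrogram input; 40 mel bands is an assumption for illustration
mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=40)
log_mel = librosa.power_to_db(mel)  # shape: (40, time_frames)

def sub_spectrograms(spec, band_size=20, hop=10):
    """Yield overlapping band-wise crops along the mel-frequency axis."""
    for lo in range(0, spec.shape[0] - band_size + 1, hop):
        yield spec[lo:lo + band_size, :]   # (band_size, time_frames)

# In a SubSpectralNet-style model, each such crop would go to its own
# sub-classifier CNN, whose band-level outputs are combined for the final
# scene prediction.
crops = list(sub_spectrograms(log_mel))
print(len(crops), crops[0].shape)   # 3 crops, each of shape (20, time_frames)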
dc.format.extent: ? - ? (5)
dc.publisher: IEEE
dc.title: SubSpectralNet - Using sub-spectrogram based convolutional neural networks for acoustic scene classification
dc.type: Conference Proceeding
dc.rights.holder: © 2019 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
pubs.author-url: https://ssrp.github.io/
pubs.notes: No embargo
pubs.notes: IEEE conference; allows postprints to be uploaded to institutional repositories.
pubs.publication-status: Accepted
dcterms.dateAccepted: 2019-02-01
rioxxterms.funder: Default funder
rioxxterms.identifier.project: Default project
qmul.funder: A Machine Learning Framework for Audio Analysis and Retrieval::Royal Academy of Engineering

