Show simple item record

dc.contributor.author: Liang, J
dc.contributor.author: Phan, QH
dc.contributor.author: Benetos, E
dc.contributor.author: IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)
dc.date.accessioned: 2024-01-18T15:20:29Z
dc.date.available: 2023-12-13
dc.date.available: 2024-01-18T15:20:29Z
dc.date.issued: 2024-04-14
dc.identifier.citation: J. Liang, H. Phan and E. Benetos, "Learning from Taxonomy: Multi-Label Few-Shot Classification for Everyday Sound Recognition," ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Seoul, Korea, Republic of, 2024, pp. 771-775, doi: 10.1109/ICASSP48485.2024.10446908.
dc.identifier.uri: https://qmro.qmul.ac.uk/xmlui/handle/123456789/94057
dc.description.abstract: Humans categorise and structure perceived acoustic signals into hierarchies of auditory objects. The semantics of these objects are thus informative in sound classification, especially in few-shot scenarios. However, existing works have only represented audio semantics as binary labels (e.g., whether a recording contains dog barking or not), and thus failed to learn a more generic semantic relationship among labels. In this work, we introduce an ontology-aware framework to train multi-label few-shot audio networks with both relative and absolute relationships in an audio taxonomy. Specifically, we propose label-dependent prototypical networks (LaD-ProtoNet) to learn coarse-to-fine acoustic patterns by exploiting direct connections between parent and child classes of sound events. We also present a label smoothing method that incorporates taxonomic knowledge by taking into account the absolute distance between two labels w.r.t. the taxonomy. For evaluation in a real-world setting, we curate a new dataset, namely FSD-FS, based on the FSD50K dataset and compare the proposed methods and other few-shot classifiers using this dataset. Experiments demonstrate that the proposed method outperforms non-ontology-based methods on the FSD-FS dataset.
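The taxonomy-aware label smoothing described in the abstract can be illustrated with a toy sketch. This is not the paper's implementation: the taxonomy, class names, and inverse-distance weighting below are invented for illustration only. The idea shown is that smoothing mass is spread over non-target labels in proportion to their closeness in the label tree, so a taxonomic sibling (e.g., another animal sound) receives more probability mass than a distant class.

```python
import numpy as np

# Hypothetical toy taxonomy (invented for illustration): child -> parent map.
PARENT = {
    "dog_bark": "animal",
    "cat_meow": "animal",
    "car_horn": "vehicle",
    "animal": "root",
    "vehicle": "root",
}

def tree_distance(a, b):
    """Number of edges between labels a and b via their lowest common ancestor."""
    ancestors_a = {"root"}
    x = a
    while x != "root":
        ancestors_a.add(x)
        x = PARENT[x]
    # Walk b upward until we hit an ancestor of a.
    dist_b, y = 0, b
    while y not in ancestors_a:
        y = PARENT[y]
        dist_b += 1
    # Walk a upward to that common ancestor.
    dist_a, x = 0, a
    while x != y:
        x = PARENT[x]
        dist_a += 1
    return dist_a + dist_b

def taxonomy_smoothed_targets(true_label, labels, alpha=0.1):
    """Distribute alpha smoothing mass over non-target labels,
    weighted inversely by taxonomy distance to the true label."""
    dists = np.array([tree_distance(true_label, l) for l in labels], dtype=float)
    weights = np.where(dists == 0, 0.0, 1.0 / np.maximum(dists, 1e-12))
    if weights.sum() > 0:
        weights /= weights.sum()
    targets = alpha * weights
    targets[labels.index(true_label)] = 1.0 - alpha
    return targets
```

For example, with `true_label="dog_bark"`, the sibling `cat_meow` (tree distance 2) receives more of the smoothing mass than `car_horn` (tree distance 4), whereas ordinary label smoothing would split the mass uniformly.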
dc.format.extent: ? - ? (5)
dc.publisher: IEEE
dc.rights: © 2024 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
dc.subject: Few-shot learning
dc.subject: multi-label classification
dc.subject: audio taxonomy
dc.subject: everyday sound recognition
dc.title: Learning from taxonomy: multi-label few-shot classification for everyday sound recognition
dc.type: Conference Proceeding
dc.identifier.doi: 10.1109/ICASSP48485.2024.10446908
pubs.notes: Not known
pubs.publication-status: Accepted
pubs.publisher-url: https://2024.ieeeicassp.org/
dcterms.dateAccepted: 2023-12-13
rioxxterms.funder: Default funder
rioxxterms.identifier.project: Default project
qmul.funder: AI for everyday sounds::Engineering and Physical Sciences Research Council

