End-to-End Probabilistic Inference for Nonstationary Audio Analysis

Wilkinson, WJ; Andersen, MR; Reiss, JD; Stowell, D; Solin, A

dc.contributor.author	Wilkinson, WJ	en_US
dc.contributor.author	Andersen, MR	en_US
dc.contributor.author	Reiss, JD	en_US
dc.contributor.author	Stowell, D	en_US
dc.contributor.author	Solin, A	en_US
dc.date.accessioned	2019-05-16T09:48:43Z
dc.identifier.uri	https://qmro.qmul.ac.uk/xmlui/handle/123456789/57579
dc.description	Accepted to the Thirty-sixth International Conference on Machine Learning (ICML) 2019	en_US
dc.description.abstract	A typical audio signal processing pipeline includes multiple disjoint analysis stages, including calculation of a time-frequency representation followed by spectrogram-based feature analysis. We show how time-frequency analysis and nonnegative matrix factorisation can be jointly formulated as a spectral mixture Gaussian process model with nonstationary priors over the amplitude variance parameters. Further, we formulate this nonlinear model's state space representation, making it amenable to infinite-horizon Gaussian process regression with approximate inference via expectation propagation, which scales linearly in the number of time steps and quadratically in the state dimensionality. By doing so, we are able to process audio signals with hundreds of thousands of data points. We demonstrate, on various tasks with empirical data, how this inference scheme outperforms more standard techniques that rely on extended Kalman filtering.	en_US
dc.subject	stat.ML	en_US
dc.subject	cs.LG	en_US
dc.subject	cs.SD	en_US
dc.subject	eess.AS	en_US
dc.subject	eess.SP	en_US
dc.title	End-to-End Probabilistic Inference for Nonstationary Audio Analysis	en_US
dc.type	Conference Proceeding
dc.rights.holder	© The Author(s) 2019
pubs.author-url	http://arxiv.org/abs/1901.11436v5	en_US
pubs.notes	Not known	en_US
rioxxterms.funder	Default funder	en_US
rioxxterms.identifier.project	Default project	en_US
qmul.funder	Structured machine listening for soundscapes with multiple birds::EPSRC	en_US

Files in this item

Name: Stowell End-to-End 2019 Accept ...
Size: 740.2Kb
Format: application/
Description: Accepted version

This item appears in the following Collection(s)

Electronic Engineering and Computer Science [3424]
