• Login
    JavaScript is disabled for your browser. Some features of this site may not work without it.
    Deep Generative Variational Autoencoding for Replay Spoof Detection in Automatic Speaker Verification 
    •   QMRO Home
    • School of Electronic Engineering and Computer Science
    • Electronic Engineering and Computer Science
    • Deep Generative Variational Autoencoding for Replay Spoof Detection in Automatic Speaker Verification
    •   QMRO Home
    • School of Electronic Engineering and Computer Science
    • Electronic Engineering and Computer Science
    • Deep Generative Variational Autoencoding for Replay Spoof Detection in Automatic Speaker Verification
    ‌
    ‌

    Browse

    All of QMROCommunities & CollectionsBy Issue DateAuthorsTitlesSubjectsThis CollectionBy Issue DateAuthorsTitlesSubjects
    ‌
    ‌

    Administrators only

    Login
    ‌
    ‌

    Statistics

    Most Popular ItemsStatistics by CountryMost Popular Authors

    Deep Generative Variational Autoencoding for Replay Spoof Detection in Automatic Speaker Verification

    View/Open
    Accepted version (1.560Mb)
    Publisher
    Elsevier
    DOI
    10.1016/j.csl.2020.101092
    Journal
    Computer Speech and Language
    ISSN
    0885-2308
    Metadata
    Show full item record
    Abstract
    Automatic speaker verification (ASV) systems are highly vulnerable to presentation attacks, also called spoofing attacks. Replay is among the simplest attacks to mount - yet difficult to detect reliably. The generalization failure of spoofing countermeasures (CMs) has driven the community to study various alternative deep learning CMs. The majority of them are supervised approaches that learn a human-spoof discriminator. In this paper, we advocate a different, deep generative approach that leverages from powerful unsupervised manifold learning in classification. The potential benefits include the possibility to sample new data, and to obtain insights to the latent features of genuine and spoofed speech. To this end, we propose to use variational autoencoders (VAEs) as an alternative backend for replay attack detection, via three alternative models that differ in their class-conditioning. The first one, similar to the use of Gaussian mixture models (GMMs) in spoof detection, is to train independently two VAEs - one for each class. The second one is to train a single conditional model (C-VAE) by injecting a one-hot class label vector to the encoder and decoder networks. Our final proposal integrates an auxiliary classifier to guide the learning of the latent space. Our experimental results using constant-Q cepstral coefficient (CQCC) features on the ASVspoof 2017 and 2019 physical access subtask datasets indicate that the C-VAE offers substantial improvement in comparison to training two separate VAEs for each class. On the 2019 dataset, the C-VAE outperforms the VAE and the baseline GMM by an absolute 9-10% in both equal error rate (EER) and tandem detection cost function (t-DCF) metrics. Finally, we propose VAE residuals --- the absolute difference of the original input and the reconstruction as features for spoofing detection. The proposed frontend approach augmented with a convolutional neural network classifier demonstrated substantial improvement over the VAE backend use case.
    Authors
    Chettri, B; Kinnunen, T; Benetos, E
    URI
    https://qmro.qmul.ac.uk/xmlui/handle/123456789/63242
    Collections
    • Electronic Engineering and Computer Science [1878]
    Licence information
    This is a pre-copyedited, author-produced version of an article accepted for publication in Computer Speech and Language following peer review.
    Copyright statements
    © Elsevier 2020
    Twitter iconFollow QMUL on Twitter
    Twitter iconFollow QM Research
    Online on twitter
    Facebook iconLike us on Facebook
    • Site Map
    • Privacy and cookies
    • Disclaimer
    • Accessibility
    • Contacts
    • Intranet
    • Current students

    Modern Slavery Statement

    Queen Mary University of London
    Mile End Road
    London E1 4NS
    Tel: +44 (0)20 7882 5555

    © Queen Mary University of London.