    Ensemble Models for Spoofing Detection in Automatic Speaker Verification 

    View/Open
    Accepted version (150.0Kb)
    Pagination
    1018 - 1022
    Publisher
    International Speech Communication Association (ISCA)
    Publisher URL
    https://www.interspeech2019.org/
    Abstract
    Detecting spoofing attempts of automatic speaker verification (ASV) systems is challenging, especially when using only one modelling approach. For robustness, we use both deep neural networks and traditional machine learning models and combine them as ensemble models through logistic regression. They are trained to detect logical access (LA) and physical access (PA) attacks on the dataset released as part of the ASV Spoofing and Countermeasures Challenge 2019. We propose dataset partitions that ensure different attack types are present during training and validation to improve system robustness. Our ensemble model outperforms all our single models and the baselines from the challenge for both attack types. We investigate why some models on the PA dataset strongly outperform others and find that spoofed recordings in the dataset tend to have longer silences at the end than genuine ones. By removing them, the PA task becomes much more challenging, with the tandem detection cost function (t-DCF) of our best single model rising from 0.1672 to 0.5018 and equal error rate (EER) increasing from 5.98% to 19.8% on the development set.
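The fusion step and the EER metric mentioned in the abstract can be illustrated with a minimal sketch. It assumes each single model emits one score per utterance (higher meaning more likely genuine) and combines them with a logistic regression fitted by plain gradient descent. The function names, hyperparameters, and toy scores below are hypothetical and are not taken from the paper; the authors' features, models, training setup, and the t-DCF metric are not reproduced here.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fuse(w, b, x):
    # Fused score: weighted sum of the single-model scores plus a bias.
    return b + sum(wi * xi for wi, xi in zip(w, x))

def fit_fusion(scores, labels, lr=0.1, epochs=2000):
    """Fit logistic-regression fusion weights by batch gradient descent.

    scores: one score vector per utterance, one entry per single model.
    labels: 1 for genuine speech, 0 for spoofed speech.
    """
    n, n_models = len(scores), len(scores[0])
    w, b = [0.0] * n_models, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_models, 0.0
        for x, y in zip(scores, labels):
            err = sigmoid(fuse(w, b, x)) - y  # gradient of the log loss w.r.t. the logit
            for j in range(n_models):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def equal_error_rate(genuine, spoofed):
    """Approximate the EER by sweeping thresholds over the observed scores."""
    best_gap, eer = None, None
    for t in sorted(set(genuine) | set(spoofed)):
        frr = sum(s < t for s in genuine) / len(genuine)   # genuine rejected
        far = sum(s >= t for s in spoofed) / len(spoofed)  # spoofed accepted
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer

# Made-up scores from two hypothetical single models with complementary errors.
genuine = [[2.0, 0.5], [0.5, 2.0], [1.5, 1.5]]
spoofed = [[-1.0, 0.8], [0.8, -1.0], [-1.5, -1.5]]

w, b = fit_fusion(genuine + spoofed, [1] * len(genuine) + [0] * len(spoofed))

eer_model_1 = equal_error_rate([x[0] for x in genuine], [x[0] for x in spoofed])
eer_fused = equal_error_rate([fuse(w, b, x) for x in genuine],
                             [fuse(w, b, x) for x in spoofed])
```

On this toy data each single model confuses one spoofed utterance with a genuine one, while the fused score separates the two classes. This only illustrates why combining complementary models can help; it says nothing about the size of the gains reported in the paper.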
    Authors
    Chettri, B; Stoller, D; Morfi, V; Martinez Ramirez, M; Benetos, E; Sturm, B
    Conference
    20th Annual Conference of the International Speech Communication Association (INTERSPEECH 2019)
    URI
    https://qmro.qmul.ac.uk/xmlui/handle/123456789/58459
    Collections
    • Centre for Digital Music (C4DM) [210]
    Copyright statements
    © The Author(s) 2019