Leveraging synthetic data for improving chamber ensemble separation
Abstract
In this work, we tackle the challenging problem of separating mixtures of monophonic instruments in chamber music from monaural recordings. This task differs from the Music Demixing Challenge, where the goal is to separate vocal, drum, and bass stems from mastered stereo tracks. Here, we separate the instruments in a permutation-invariant fashion, so that our model can separate any two monophonic instruments, including mixtures of the same instrument. This is particularly difficult due to label ambiguity and high spectral overlap. We present a pre-training strategy and a data augmentation pipeline built on the multi-mic renders from the synthetic chamber ensemble dataset EnsembleSet, and we evaluate their impact on real-world chamber ensemble recordings from the URMP dataset. Our synthetic-data augmentation pipeline yields up to a +5.14 dB cross-dataset performance improvement for time-domain separation models tested on real data. Combined with this pipeline, our fine-tuning strategy improves chamber ensemble separation by up to +10.62 dB relative to our baseline. We report a strong negative correlation between pitch overlap and separation performance, with an average drop of 5 dB for examples with pitch overlap. Finally, we show that pre-training our model on string, wind, and brass ensembles helps separate vocal harmony mixtures from the Bach Chorales and Barbershop Quartet datasets, with up to a +17.92 dB SI-SDR improvement for two-source vocal harmony mixtures.
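As a concrete illustration of the permutation-invariant SI-SDR evaluation referred to in the abstract, the sketch below scores a set of estimates against references under the best source permutation. This is a minimal NumPy rendering of the standard definitions; the function names and implementation details are our own assumptions, not code from the paper.

```python
import itertools
import numpy as np

def si_sdr(est, ref, eps=1e-8):
    """Scale-invariant SDR (dB) between an estimate and a reference signal."""
    est = est - est.mean()          # standard zero-mean convention
    ref = ref - ref.mean()
    alpha = np.dot(est, ref) / (np.dot(ref, ref) + eps)
    proj = alpha * ref              # projection of the estimate onto the reference
    noise = est - proj
    return 10 * np.log10(np.dot(proj, proj) / (np.dot(noise, noise) + eps) + eps)

def pit_si_sdr(ests, refs):
    """Best mean SI-SDR over all estimate-to-reference permutations."""
    best_score, best_perm = -np.inf, None
    for perm in itertools.permutations(range(len(refs))):
        score = np.mean([si_sdr(ests[i], refs[p]) for i, p in enumerate(perm)])
        if score > best_score:
            best_score, best_perm = score, perm
    return best_score, best_perm
```

For two sources the permutation search is trivial (two orderings), which is why training and evaluation can remain label-agnostic even for mixtures of the same instrument.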