Computational Pronunciation Analysis in Sung Utterances

Demirel, E; Ahlback, S; Dixon, S

Publisher

arXiv

DOI

10.48550/arxiv.2106.10977

Journal

arXiv

Abstract

Recent automatic lyrics transcription (ALT) approaches focus on building stronger acoustic models or in-domain language models, while the pronunciation aspect is seldom touched upon. This paper applies a novel computational analysis to the pronunciation variation in sung utterances and proposes a new pronunciation model adapted for singing. The singing-adapted model is tested on multiple public datasets via word recognition experiments. It outperforms the standard speech dictionary in all settings, reporting the best results on ALT for a cappella recordings using n-gram language models. For reproducibility, we share the sentence-level annotations used in testing, providing a new benchmark evaluation set for ALT.

URI

https://qmro.qmul.ac.uk/xmlui/handle/123456789/97929

Collections

Electronic Engineering and Computer Science