Towards automatic extraction of harmony information from music signals
In this thesis we address the subject of automatic extraction of harmony information from audio recordings. We focus on chord symbol recognition and methods for evaluating algorithms designed to perform that task. We present a novel six-dimensional model for equal tempered pitch space based on concepts from neo-Riemannian music theory. This model is employed as the basis of a harmonic change detection function which we use to improve the performance of a chord recognition algorithm. We develop a machine readable text syntax for chord symbols and present a hand labelled chord transcription collection of 180 Beatles songs annotated using this syntax. This collection has been made publicly available and is already widely used for evaluation purposes in the research community. We also introduce methods for comparing chord symbols which we subsequently use for analysing the statistics of the transcription collection. To ensure that researchers are able to use our transcriptions with confidence, we demonstrate a novel alignment algorithm based on simple audio fingerprints that allows local copies of the Beatles audio files to be accurately aligned to our transcriptions automatically. Evaluation methods for chord symbol recall and segmentation measures are discussed in detail and we use our chord comparison techniques as the basis for a novel dictionary-based chord symbol recall calculation. At the end of the thesis, we evaluate the performance of fifteen chord recognition algorithms (three of our own and twelve entrants to the 2009 MIREX chord detection evaluation) on the Beatles collection. Results are presented for several different evaluation measures using a range of evaluation parameters. The algorithms are compared with each other in terms of performance but we also pay special attention to analysing and discussing the benefits and drawbacks of the different evaluation methods that are used.
- Theses