Linking Music Metadata
Abstract
The internet has facilitated music metadata production and distribution on an unprecedented
scale. A contributing factor to this data deluge is a change in the
authorship of this data from the expert few to the untrained crowd. The resulting
unordered flood of imperfect annotations provides challenges and opportunities in
identifying accurate metadata and linking it to the music audio in order to provide
a richer listening experience. We advocate novel adaptations of Dynamic Programming
for music metadata synchronisation, ranking and comparison. This thesis
introduces Windowed Time Warping, Greedy, Constrained On-Line Time Warping
for synchronisation and the Concurrence Factor for automatically ranking metadata.
We begin by examining the availability of various music metadata on the web.
We then review Dynamic Programming methods for aligning and comparing two
source sequences whilst presenting novel, specialised adaptations for efficient, real-time
synchronisation of music and metadata that make improvements in speed and
accuracy over existing algorithms. The Concurrence Factor, which measures the
degree to which an annotation of a song agrees with its peers, is proposed in order to
utilise the wisdom of the crowds to establish a ranking system. This attribute uses
a combination of the standard Dynamic Programming methods Levenshtein Edit
Distance, Dynamic Time Warping, and Longest Common Subsequence to compare
annotations.
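As a rough illustration of the three standard Dynamic Programming measures named above, the following Python sketch implements textbook versions of Levenshtein Edit Distance, Longest Common Subsequence and Dynamic Time Warping. The function names are chosen here for illustration only. The final function, peer_agreement, is a hypothetical stand-in that averages a normalised edit-distance similarity over an annotation's peers; it is not the definition of the Concurrence Factor, which is given in the thesis itself.

```python
from typing import List, Sequence


def levenshtein(a: Sequence, b: Sequence) -> int:
    """Minimum number of insertions, deletions and substitutions turning a into b."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (x != y)))  # substitution or match
        prev = curr
    return prev[-1]


def lcs_length(a: Sequence, b: Sequence) -> int:
    """Length of the longest common subsequence of a and b."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, 1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def dtw_cost(a: Sequence[float], b: Sequence[float]) -> float:
    """Cost of the cheapest warping path, using absolute difference as local cost."""
    inf = float("inf")
    prev = [0.0] + [inf] * len(b)
    for x in a:
        curr = [inf]
        for j, y in enumerate(b, 1):
            curr.append(abs(x - y) + min(prev[j], curr[j - 1], prev[j - 1]))
        prev = curr
    return prev[-1]


def peer_agreement(annotations: List[str]) -> List[float]:
    """Hypothetical score: mean normalised edit similarity of each annotation to its peers.

    This is an assumption made for illustration, not the thesis's Concurrence Factor.
    """
    scores = []
    for i, a in enumerate(annotations):
        sims = [1 - levenshtein(a, b) / max(len(a), len(b), 1)
                for j, b in enumerate(annotations) if j != i]
        scores.append(sum(sims) / len(sims) if sims else 0.0)
    return scores
```

Under this simplified scoring, an annotation that matches most of its peers scores higher than an outlier, which is the intuition behind ranking by agreement rather than by popularity or human ratings.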
We present a synchronisation application for applying the aforementioned methods
as well as a tablature-parsing application for mining and analysing guitar tablatures
from the web. We evaluate the Concurrence Factor as a ranking system on a large-scale
collection of guitar tablatures and lyrics to show a correlation with accuracy
that is superior to existing methods currently used in internet search engines, which
are based on popularity and human ratings.
Authors
Macrae, Robert