Computational Modelling and Quantitative Analysis of Dynamics in Performed Music
Musical dynamics, meaning loudness and changes in loudness, form one of the key aspects of expressive music performance. Surprisingly, this important research area has received little attention. One reason is that, while the concept of dynamics is related to signal amplitude, a low-level feature, deriving perceived loudness from the signal is far from straightforward. This thesis advances the state of the art in the analysis of perceived loudness by modelling dynamic variations in expressive music performance and by studying the relation between dynamics in piano recordings and markings in the score. In particular, we show that dynamic changes: a) depend on the evolution of the performance and the local context of the piece; b) correspond to important score markings and music structures; and c) can reflect wide divergences in performers' expressive strategies within and across pieces. In a preparatory stage, dynamic changes are obtained by linking existing music audio and score databases. All studies in this thesis are based on loudness levels extracted from 2000 recordings of 44 Mazurkas by Frederic Chopin. We propose a new method for efficiently aligning and annotating the data in a score-beat time representation, based on dynamic time warping applied to chroma features. Using the score-aligned recordings, we examine the relationship between loudness values and dynamic level categories. The research can be broadly categorised into two parts. The first investigates how dynamic markings map to performed loudness levels. Empirical results show that different dynamic markings do not correspond to fixed loudness thresholds. Rather, the important factors are the relative loudness of neighbouring markings, the inter-relations of nearby markings and other score information, the structural location of the markings, and the creative license exercised by the performer in inserting further interpretive dynamic shaping.
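The alignment step mentioned above, dynamic time warping applied to chroma features, can be illustrated with a minimal sketch. This is a textbook DTW, not the thesis's own method: the function names `cosine_distance` and `dtw_align`, the choice of cosine distance, and the toy one-hot "chroma" vectors are all assumptions made for illustration.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return 1.0 if norm == 0 else 1.0 - dot / norm


def dtw_align(seq_a, seq_b, dist=cosine_distance):
    """Textbook dynamic time warping between two feature sequences.

    Returns (total_cost, path), where path is a list of index pairs
    (i, j) matching seq_a[i] to seq_b[j].
    """
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of the first i and first j frames.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])

    # Backtrack from the end to recover the warping path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        ]
        _, i, j = min(steps)
    path.reverse()
    return cost[n][m], path


def one_hot(pitch_class):
    """Toy 12-bin chroma vector with all energy in one pitch class."""
    return [1.0 if k == pitch_class else 0.0 for k in range(12)]


# Hypothetical data: three score events (C, E, G) and a slower performance
# in which the first and last events each span two audio frames.
score = [one_hot(0), one_hot(4), one_hot(7)]
audio = [one_hot(0), one_hot(0), one_hot(4), one_hot(7), one_hot(7)]
total_cost, path = dtw_align(score, audio)
print(total_cost, path)
```

Each pair in `path` links a score event to an audio frame, which is the kind of mapping needed before loudness values can be read off at score-beat positions. Real chroma frames would come from an audio feature extractor rather than one-hot vectors.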
The second part seeks to determine how changes in loudness levels map to score features using statistical change-point techniques. The results show that significant dynamic score markings do indeed correspond to change points. Furthermore, evidence suggests that change points at score positions without dynamic markings highlight structurally salient events or events based on temporal changes. In a separate bidirectional study, we investigate the relationship between dynamic markings in the score and performed loudness using machine learning techniques. The techniques are applied to the prediction of loudness levels corresponding to dynamic markings, and to the classification of dynamic markings given loudness values. The results show that loudness values and markings can be predicted relatively well when models are trained across recordings of the same piece, but that prediction fails dismally when models are trained across a pianist's recordings of other pieces. The findings demonstrate that score features may trump individual style when modelling loudness choices. Analysis of the results reveals that form (such as the return of the theme) and structure (such as repetitions) influence the predictability of loudness and markings. This research is a first step towards automatic audio-to-score transcription of dynamic markings. The insights gained will serve as a tool for expression synthesis and musicological studies.
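As a rough illustration of what a statistical change-point technique does with a loudness curve, the sketch below finds the single split that best separates a sequence into two segments with different means. The abstract does not say which change-point method the thesis uses, so this least-squares mean-shift search, the function name `best_mean_shift`, and the loudness numbers are assumptions chosen only to show the idea.

```python
def best_mean_shift(values, min_size=2):
    """Find the single split that best explains a shift in mean.

    Returns (k, gain): values[:k] and values[k:] are the two segments,
    and gain is the drop in squared error relative to using one mean
    for the whole sequence.
    """
    n = len(values)
    if n < 2 * min_size:
        raise ValueError("sequence too short for the requested min_size")

    def sse(segment):
        # Sum of squared deviations from the segment mean.
        mean = sum(segment) / len(segment)
        return sum((x - mean) ** 2 for x in segment)

    best_k, best_cost = None, float("inf")
    for k in range(min_size, n - min_size + 1):
        split_cost = sse(values[:k]) + sse(values[k:])
        if split_cost < best_cost:
            best_k, best_cost = k, split_cost
    return best_k, sse(values) - best_cost


# Hypothetical beat-level loudness values (arbitrary units) with a jump,
# as might occur where the score moves from a soft to a loud marking.
loudness = [40, 41, 39, 40, 60, 61, 59, 60]
k, gain = best_mean_shift(loudness)
print(k, gain)
```

In a score-aligned setting, the index `k` would be a beat position that can then be compared against the locations of dynamic markings. Detecting several change points, or deciding whether a change is statistically significant, needs more machinery than this single-split search.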