Sparse Approximation and Dictionary Learning with Applications to Audio Signals
MetadataShow full item record
Over-complete transforms have recently become the focus of a wide wealth of research in signal processing, machine learning, statistics and related fields. Their great modelling flexibility allows to find sparse representations and approximations of data that in turn prove to be very efficient in a wide range of applications. Sparse models express signals as linear combinations of a few basis functions called atoms taken from a so-called dictionary. Finding the optimal dictionary from a set of training signals of a given class is the objective of dictionary learning and the main focus of this thesis. The experimental evidence presented here focuses on the processing of audio signals, and the role of sparse algorithms in audio applications is accordingly highlighted. The first main contribution of this thesis is the development of a pitch-synchronous transform where the frame-by-frame analysis of audio data is adapted so that each frame analysing periodic signals contains an integer number of periods. This algorithm presents a technique for adapting transform parameters to the audio signal to be analysed, it is shown to improve the sparsity of the representation if compared to a non pitchsynchronous approach and further evaluated in the context of source separation by binary masking. A second main contribution is the development of a novel model and relative algorithm for dictionary learning of convolved signals, where the observed variables are sparsely approximated by the atoms contained in a convolved dictionary. An algorithm is devised to learn the impulse response applied to the dictionary and experimental results on synthetic data show the superior approximation performance of the proposed method compared to a state-of-the-art dictionary learning algorithm. Finally, a third main contribution is the development of methods for learning dictionaries that are both well adapted to a training set of data and mutually incoherent. Two novel algorithms namely the incoherent k-svd and the iterative projections and rotations (ipr) algorithm are introduced and compared to different techniques published in the literature in a sparse approximation context. The ipr algorithm in particular is shown to outperform the benchmark techniques in learning very incoherent dictionaries while maintaining a good signal-to-noise ratio of the representation.
- Theses