Sparse Approximation and Dictionary Learning with Applications to Audio Signals
Abstract
Over-complete transforms have recently become the focus of a wealth of research in
signal processing, machine learning, statistics and related fields. Their great modelling
flexibility makes it possible to find sparse representations and approximations of data that in turn
prove to be very efficient in a wide range of applications. Sparse models express signals as
linear combinations of a few basis functions, called atoms, taken from a so-called dictionary.
Finding the optimal dictionary from a set of training signals of a given class is the objective
of dictionary learning and the main focus of this thesis. The experimental evidence
presented here focuses on the processing of audio signals, and the role of sparse algorithms
in audio applications is accordingly highlighted.
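As an illustration of the sparse model described above, the following minimal sketch approximates a signal with a few atoms of an over-complete dictionary using plain matching pursuit. The random dictionary, the function name `matching_pursuit` and the choice of greedy algorithm are assumptions made for this example only; the abstract does not state which sparse approximation algorithm the thesis uses.

```python
import numpy as np


def matching_pursuit(signal, dictionary, n_atoms):
    """Greedy sparse approximation with unit-norm atoms (columns of `dictionary`)."""
    residual = signal.astype(float).copy()
    coefficients = np.zeros(dictionary.shape[1])
    for _ in range(n_atoms):
        # Correlate every atom with what is still unexplained.
        correlations = dictionary.T @ residual
        best = int(np.argmax(np.abs(correlations)))
        # Add the best atom's contribution and remove it from the residual.
        coefficients[best] += correlations[best]
        residual -= correlations[best] * dictionary[:, best]
    return coefficients, residual


# Hypothetical over-complete dictionary: 64 atoms for 16-dimensional signals.
rng = np.random.default_rng(0)
dictionary = rng.standard_normal((16, 64))
dictionary /= np.linalg.norm(dictionary, axis=0)  # unit-norm atoms

# Synthetic signal built from two atoms only.
true_coefficients = np.zeros(64)
true_coefficients[[3, 40]] = [1.5, -2.0]
signal = dictionary @ true_coefficients

coefficients, residual = matching_pursuit(signal, dictionary, n_atoms=5)
```

By construction, `dictionary @ coefficients + residual` reproduces `signal`, and at most `n_atoms` entries of `coefficients` are non-zero.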
The first main contribution of this thesis is the development of a pitch-synchronous
transform where the frame-by-frame analysis of audio data is adapted so that each frame
analysing a periodic signal contains an integer number of periods. This algorithm provides
a technique for adapting transform parameters to the audio signal to be analysed. It
is shown to improve the sparsity of the representation compared to a non-pitch-synchronous
approach, and is further evaluated in the context of source separation by binary
masking.
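The effect of pitch-synchronous framing on sparsity can be sketched as follows. This example assumes that the period is already known and is a whole number of samples, and it uses the discrete Fourier transform as the analysis transform; both are simplifications chosen for illustration and are not taken from the thesis. The helper names are hypothetical.

```python
import numpy as np


def pitch_synchronous_length(period, nominal_length):
    """Frame length closest to `nominal_length` holding a whole number of periods."""
    n_periods = max(1, round(nominal_length / period))
    return int(round(n_periods * period))


def significant_bins(frame, threshold=1e-6):
    """Count DFT bins whose magnitude exceeds `threshold` times the peak."""
    spectrum = np.abs(np.fft.rfft(frame))
    return int(np.sum(spectrum > threshold * spectrum.max()))


# Synthetic periodic signal: fundamental plus one harmonic, period of 50 samples.
period = 50
t = np.arange(4096)
signal = np.sin(2 * np.pi * t / period) + 0.5 * np.sin(2 * np.pi * 2 * t / period)

nominal_length = 1024
sync_length = pitch_synchronous_length(period, nominal_length)

# A fixed-length frame cuts a period in half and leaks energy across bins;
# the pitch-synchronous frame concentrates it in one bin per partial.
fixed_count = significant_bins(signal[:nominal_length])
sync_count = significant_bins(signal[:sync_length])
```

With these settings the adapted frame spans 20 periods, so each sinusoidal component falls exactly on a DFT bin, whereas the 1024-sample frame spreads the same energy over many bins.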
A second main contribution is the development of a novel model and associated algorithm
for dictionary learning of convolved signals, where the observed variables are sparsely approximated
by the atoms contained in a convolved dictionary. An algorithm is devised to
learn the impulse response applied to the dictionary, and experimental results on synthetic
data show the superior approximation performance of the proposed method compared to
a state-of-the-art dictionary learning algorithm.
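One way to picture a convolved dictionary is to apply the same impulse response to every atom. The sketch below only builds such a dictionary and checks that a sparse combination of convolved atoms equals the convolved sparse combination. It does not learn the impulse response, and the function name, the random dictionary and the three-tap response are assumptions for illustration rather than the model defined in the thesis.

```python
import numpy as np


def convolve_dictionary(dictionary, impulse_response):
    """Convolve every atom (column) with the same impulse response."""
    columns = [np.convolve(atom, impulse_response) for atom in dictionary.T]
    return np.stack(columns, axis=1)


# Hypothetical dictionary of 12 atoms of length 8 and a short impulse response.
rng = np.random.default_rng(1)
dictionary = rng.standard_normal((8, 12))
impulse_response = np.array([1.0, 0.5, 0.25])

# Full convolution lengthens each atom to 8 + 3 - 1 = 10 samples.
convolved = convolve_dictionary(dictionary, impulse_response)

# Sparse coefficients using two atoms.
coefficients = np.zeros(12)
coefficients[[2, 7]] = [1.0, -0.5]

observed = convolved @ coefficients  # sparse combination of convolved atoms
clean = dictionary @ coefficients    # the same combination before convolution
```

Because convolution is linear, `observed` is identical to `np.convolve(clean, impulse_response)`, so the same sparse coefficients describe both the clean and the convolved signal.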
Finally, a third main contribution is the development of methods for learning dictionaries
that are both well adapted to a training set of data and mutually incoherent. Two
novel algorithms, namely the incoherent K-SVD and the iterative projections and rotations
(IPR) algorithm, are introduced and compared with different techniques published in the
literature in a sparse approximation context. The IPR algorithm in particular is shown
to outperform the benchmark techniques in learning very incoherent dictionaries while
maintaining a good signal-to-noise ratio of the representation.
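Mutual incoherence is commonly measured by the mutual coherence of a dictionary, that is, the largest absolute inner product between two distinct unit-norm atoms. The sketch below computes this standard quantity; it is offered as background, assumes this is the measure meant here, and does not implement the incoherent K-SVD or IPR algorithms. The function name and the toy dictionaries are hypothetical.

```python
import numpy as np


def mutual_coherence(dictionary):
    """Largest absolute inner product between distinct normalised atoms."""
    normalised = dictionary / np.linalg.norm(dictionary, axis=0)
    gram = np.abs(normalised.T @ normalised)
    np.fill_diagonal(gram, 0.0)  # ignore each atom's product with itself
    return float(gram.max())


# An orthonormal basis is perfectly incoherent.
identity = np.eye(4)

# Adding a fifth atom that overlaps equally with all basis vectors
# makes the dictionary over-complete and raises its coherence.
overcomplete = np.hstack([identity, np.full((4, 1), 0.5)])

basis_coherence = mutual_coherence(identity)
overcomplete_coherence = mutual_coherence(overcomplete)
```

Lower values indicate atoms that are closer to mutually orthogonal; an over-complete dictionary cannot reach zero, which is why learning incoherent dictionaries involves a trade-off with approximation quality.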
Authors
Barchiesi, Daniele