Dense Motion Capture of Deformable Surfaces from Monocular Video
Accurate motion capture of deformable objects from monocular video sequences is a challenging computer vision problem with wide applicability to domains ranging from virtual reality and animation to image-guided surgery. Existing dense motion capture methods rely on expensive setups with multiple calibrated cameras, structured light, active markers, or prior scene knowledge learned from a large 3D dataset. In this thesis, we propose an end-to-end pipeline for 3D reconstruction of deformable scenes from a monocular video sequence. Our method relies on a two-step pipeline in which temporally consistent video registration is followed by a dense non-rigid structure from motion approach. We present a data-driven method to reconstruct non-rigid smooth surfaces densely, using only a single video as input, without the need for any prior models or shape templates. We focus on the well-explored low-rank prior for deformable shape reconstruction and propose its convex relaxation to introduce the first variational energy minimisation approach to non-rigid structure from motion. To achieve realistic dense reconstruction of sparsely textured surfaces, we incorporate an edge-preserving spatial smoothness prior into the low-rank factorisation framework and design a single variational energy to address the non-rigid structure from motion problem. We also discuss the importance of long-term 2D trajectories for several vision problems and explain how subspace constraints can be used to exploit the redundancy present in the motion of real scenes for dense video registration. To that end, we adopt a variational optimisation approach to design a robust multi-frame video registration algorithm that combines a robust subspace prior with a total variation spatial regulariser.
Throughout this thesis, we advocate the use of GPU-portable and scalable energy minimisation algorithms to progress towards practical dense non-rigid 3D motion capture from a single video in the presence of occlusions and illumination changes.
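As an illustration of what a convex relaxation of a low-rank prior can look like in practice, the sketch below uses the nuclear norm (the sum of singular values), a standard convex surrogate for matrix rank, and applies its proximal operator, singular value thresholding. This is not taken from the thesis: the function name, the 3F x P stacking of per-frame shapes, the synthetic data, and the threshold value are all assumptions made for the example.

```python
import numpy as np


def singular_value_threshold(S, tau):
    """Proximal operator of tau * (nuclear norm) evaluated at S.

    Soft-thresholds the singular values of S by tau, which is the
    closed-form minimiser of 0.5 * ||X - S||_F^2 + tau * ||X||_*.
    """
    U, sigma, Vt = np.linalg.svd(S, full_matrices=False)
    sigma_shrunk = np.maximum(sigma - tau, 0.0)
    # Scale the columns of U by the shrunk singular values, then recombine.
    return (U * sigma_shrunk) @ Vt


# Synthetic example (assumed setup): F frames, P points, K basis shapes.
rng = np.random.default_rng(0)
F, P, K = 20, 50, 3

# Rows hold the x, y, z coordinates of every frame, giving a 3F x P matrix
# whose rank is at most K when shapes are combinations of K basis shapes.
coeffs = rng.standard_normal((3 * F, K))
basis = rng.standard_normal((K, P))
S_clean = coeffs @ basis

# Small additive noise makes the observed matrix numerically full rank.
S_noisy = S_clean + 0.05 * rng.standard_normal(S_clean.shape)

# Thresholding removes the small, noise-dominated singular values.
S_low_rank = singular_value_threshold(S_noisy, tau=2.0)

rank_before = np.linalg.matrix_rank(S_noisy)
rank_after = np.linalg.matrix_rank(S_low_rank)
print("rank before:", rank_before, "rank after:", rank_after)
```

In a variational formulation, a step of this kind would typically alternate with updates for the data term and the spatial regulariser; the threshold `tau` here is hand-picked for the synthetic noise level and would need tuning on real data.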