Unsupervised Training of Deep Neural Networks for Motion Estimation
MetadataShow full item record
This thesis addresses the problem of motion estimation, that is, the estimation of a eld that describes how pixels move from a reference frame to a target frame, using Deep Neural Networks (DNNs). In contrast to classic methods, we don't solve an optimization problem at test time. We train DNNs once and apply it in one pass during the test which reduces the computational complexity. The major contribution is that in contrast to a supervised method, we train our DNNs in an unsupervised way. By unsupervised, we mean without the need for ground truth motion elds which are expensive to obtain for real scenes. More speci cally, we have trained our networks by designing cost functions inspired by classical optical ow estimation schemes and generative methods in Computer Vision. We rst propose a straightforward CNN method that is trained to optimize the brightness constancy constraint and we embed it in a classical multiscale scheme in order to predict motions that are large in magnitude (GradNet). We show that GradNet generalizes well to an unknown dataset and performed comparably with state-of-the-art unsupervised methods at that time. Second, we propose a convolutional Siamese architecture wherein is embedded a new soft warping scheme applied in a multiscale framework and is trained to optimize a higher-level feature constancy constraint (LikeNet). The architecture of LikeNet allows a trade-o between the computational load and memory and is 98% smaller than other SOA methods in terms of learned parameters. We show that LikeNet performs on par with SOA approaches and the best among uni-directional methods, methods that calculate motion eld in one pass. Third, we propose a novel approach to distill slower LikeNet in a much faster regression neural network without losing much of the accuracy (QLikeNet). The results show that using DNNs is a promising direction for motion estimation, although further improvements are required as classical methods yet perform the best.
- Theses