Ground-truth-based trajectory evaluation in videos
Video tracking involves estimating the state(s) of target(s) over time on the image plane, where the sequence of target states is termed a trajectory. Trajectory evaluation refers to the assessment of a tracker's results, which may be based on quantifying the discrepancy between the estimated states and the corresponding ground-truth states. In this thesis, after presenting a review of the related work, we make the following proposals for ground-truth-based trajectory evaluation in videos.

We propose three overlap-based measures that account for the key aspects of multi-target tracking evaluation, including accuracy, cardinality error and ID changes. The measures quantify tracking performance by combining accuracy and cardinality errors at the frame level, computing the sequence-level tracking accuracy at varying accuracy levels, and measuring ID changes while considering the length of the track in which they occur. An extensive experimental validation of the proposed measures is conducted using four state-of-the-art multi-target trackers on challenging, publicly available real-world datasets. The proposed measures show advantages over the existing measures (they are parameter independent and numerically bounded) and enable a thorough evaluation of trackers while identifying their strengths and weaknesses.

We present a protocol composed of a set of trials that evaluate the robustness of trackers on a range of test scenarios representing several real-world conditions. To compare single-target trackers' performance on these trials, we present a single-score, parameter-independent evaluation measure that quantifies tracking success and failure, and combines them for both summative and formative performance assessment. The protocol is validated on publicly available sequences with a diversity of targets and challenges using eight state-of-the-art single-target trackers.
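To illustrate the general idea of a frame-level overlap-based measure that combines accuracy with a cardinality penalty, the following is a minimal sketch only; it is not the thesis's actual measure, and the greedy matching and normalisation choices are assumptions for illustration.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x, y, w, h)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    iw = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    ih = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = iw * ih
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0

def frame_score(estimates, ground_truth):
    """Combine per-frame accuracy and cardinality error into one bounded score.

    Each estimate is greedily matched to its best-overlapping ground-truth box;
    summed IoU is normalised by max(#estimates, #ground-truth), so both missed
    targets and false estimates pull the score down, keeping it in [0, 1].
    """
    if not estimates and not ground_truth:
        return 1.0  # empty frame tracked correctly
    remaining = list(ground_truth)
    total = 0.0
    for est in estimates:
        if not remaining:
            break  # extra estimates are unmatched (false positives)
        best = max(remaining, key=lambda g: iou(est, g))
        total += iou(est, best)
        remaining.remove(best)
    return total / max(len(estimates), len(ground_truth))
```

A perfect single-target frame scores 1.0, while reporting one spurious extra box halves the score, showing how accuracy and cardinality error interact in a single bounded number.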
Through an extensive experimental analysis, the framework facilitates the selection of trackers for different operational conditions in real-world applications and for different target types. Finally, to quantitatively compare the relative performance of evaluation measures, we propose a methodology based on determining the probabilistic agreement between tracking-result decisions made by measures and those made by humans. We use tracking results on publicly available datasets with different target types and varying challenges, and collect the judgments of 90 skilled, semi-skilled and unskilled human subjects using a web-based performance assessment test. The analysis of the agreements allows us to highlight the variation in performance of the different measures and to identify the most appropriate ones for the evaluation and comparison of trackers.
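One simple way to estimate agreement between a measure's decisions and human judgments is the fraction of trials on which the measure's preferred result matches the majority human choice. This sketch is an illustrative stand-in, not the probabilistic formulation developed in the thesis; the majority-vote aggregation is an assumption.

```python
from collections import Counter

def majority(votes):
    """Most common label among a list of human votes for one trial."""
    return Counter(votes).most_common(1)[0][0]

def agreement(measure_decisions, human_votes_per_trial):
    """Fraction of trials where the measure's decision matches the
    majority human judgment — an empirical agreement estimate."""
    assert len(measure_decisions) == len(human_votes_per_trial)
    matches = sum(
        m == majority(votes)
        for m, votes in zip(measure_decisions, human_votes_per_trial)
    )
    return matches / len(measure_decisions)
```

Computed over many trials and many measures, such an agreement score lets the measures be ranked by how closely their decisions track human judgment, which is the spirit of the comparison described above.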
Author: Nawaz, Tahir Habib
Collection: Theses