Hand Gesture Recognition using a Low-Cost Sensor with Digital Signal Processing
Abstract
Our research concerns a hand gesture recognition framework that uses a low-cost, “off-the-shelf” device: a markerless visual sensor system called the Leap Motion controller (LM). Before deploying the LM, we investigate its accuracy and limitations in measuring finger joint angles. We consider a user who can flex and extend all the fingers of the hand, i.e. users with missing fingers are not considered in this research. In addition, we assume that the user's hand does not shake and can maintain a required position for the duration of the experiments. For the finger joint angle error analysis, we conducted a series of experiments to assess the accuracy of the LM with respect to the elevation, lateral (side-to-side) position, forward-backward position, and rotation of the hand relative to the LM. We used an “artist’s hand” placed above the LM. The artist’s hand is more accurate than a human hand in performing static hand gestures, as it can maintain a fixed position for as long as necessary. Based on the results of the error analysis, we apply Principal Component Analysis (PCA) to the raw LM data to see whether it can compensate for these errors. The reasons for choosing PCA are described in Section 1.1. The experimental results show that PCA is feasible and effective, and that it can be applied so that accurate measurements are obtained. Specifically, PCA reduced the Absolute Errors (AEs) by 37.5%, 28.3%, 33.0%, and 22.4% in the elevation, lateral (side-to-side), forward-backward, and rotation experiments, respectively. Furthermore, we apply machine learning techniques such as Linear Discriminant Analysis (LDA) and Support Vector Machines (SVMs). The reasons for choosing these techniques over others are given in Section 1.1. These techniques help to recognise and classify the performed hand gestures. 
In addition, while classifying hand gestures, these techniques can learn about measurement errors and compensate for them. Experimental results show a significant benefit from applying LDA and SVM, which yield an accuracy above 88.0%, far better than the baseline performance of 67.1%. The baseline performance is the accuracy obtained when we directly observe the test samples and assign each of them to the gesture it supposedly represents. A further explanation of the baseline performance can be found in the fourth paragraph of Section 5.2.3. We also propose and evaluate the use of Multi-dimensional Dynamic Time Warping (MDTW) to simulate the comparison of dynamic hand gestures performed by a patient with reference hand gestures prepared by a physiotherapist. MDTW enables us to determine how similar a query dynamic hand gesture is to a reference one, whilst filtering out unwanted sources of error resulting from positional, rotational or speed differences between the query and reference actions. After aligning a query dynamic hand gesture with a reference one, it produces the minimum-distance value of a warp path. A low minimum-distance value implies that the two gestures being compared are similar, and a high minimum-distance value implies that they differ to a greater extent. When we deliberately compare a specific hand gesture with itself, we obtain a minimum-distance value of 0.0°, indicating a similarity of 100.0%. Furthermore, when we compare two closely similar hand gestures, i.e. gesture 1 and gesture 4 as described in Section 6.1.4, we obtain a minimum-distance value of 35.9°. However, when we compare two quite different gestures, i.e. gesture 2 and gesture 3, we obtain a minimum-distance value of 248.5°. Therefore, based on the minimum-distance values of the warp paths, one can establish whether a user performs hand gestures satisfactorily or whether an adjustment is required. 
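To make the minimum-distance idea concrete, the following is a minimal sketch of multi-dimensional dynamic time warping, not the implementation used in this research. It assumes each gesture is a sequence of frames, each frame a tuple of joint angles in degrees, and it assumes a Euclidean distance between frames. The function name `mdtw_distance` is hypothetical. The sketch only absorbs speed differences; any positional or rotational normalisation would have to happen before this step.

```python
import math


def mdtw_distance(query, reference):
    """Return the minimum cumulative distance of a warp path that aligns
    two multi-dimensional sequences (lists of equal-length tuples)."""
    n, m = len(query), len(reference)
    if n == 0 or m == 0:
        raise ValueError("both sequences must contain at least one frame")

    # cost[i][j] holds the cheapest alignment of the first i query frames
    # with the first j reference frames.
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Assumed local distance: Euclidean distance across all joint angles.
            local = math.dist(query[i - 1], reference[j - 1])
            # Extend the cheapest of the three allowed predecessor cells.
            cost[i][j] = local + min(
                cost[i - 1][j],      # query frame repeated
                cost[i][j - 1],      # reference frame repeated
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


# Illustrative joint-angle sequences (made-up values, two joints per frame).
reference = [(0.0, 0.0), (10.0, 5.0), (20.0, 10.0)]
slower = [(0.0, 0.0), (0.0, 0.0), (10.0, 5.0), (20.0, 10.0), (20.0, 10.0)]
print(mdtw_distance(reference, reference))  # a gesture compared with itself
print(mdtw_distance(slower, reference))     # same gesture performed more slowly
```

In this sketch a gesture compared with itself, or with a slower copy of itself, yields a distance of zero, while sequences whose joint angles differ accumulate a larger value.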
Finally, we propose and implement PCA to investigate whether it can improve the performance of LDA, SVM and MDTW. When implementing PCA, the feature vector, which consists of the retained Principal Components (PCs), should be selected carefully. When we discard the first PC and retain the remainder as the feature vector, we obtain superior results: the performance of LDA, SVM and MDTW improves.
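The PC selection described above can be sketched as follows. This is an illustrative example under stated assumptions, not the procedure used in this research: it assumes NumPy, a samples-by-features matrix of joint-angle measurements, and PCs ranked by the eigenvalues of the sample covariance matrix. The function name `pca_drop_first` and the sample values are hypothetical.

```python
import numpy as np


def pca_drop_first(X, n_drop=1):
    """Project X (samples x features) onto every principal component
    except the first n_drop. Returns (scores, retained_components, mean)."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    centred = X - mean

    # Eigen-decomposition of the sample covariance matrix (symmetric).
    cov = np.cov(centred, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)

    # eigh returns eigenvalues in ascending order; sort PCs by descending variance.
    order = np.argsort(eigvals)[::-1]
    components = eigvecs[:, order]

    # Discard the leading PC(s) and keep the remainder as the feature basis.
    retained = components[:, n_drop:]
    scores = centred @ retained
    return scores, retained, mean


# Made-up data: the first feature carries one dominant source of variation.
X = [[0, 0, 0], [10, 1, 0], [20, 0, 1], [30, 1, 1], [40, 0, 0], [50, 1, 1]]
scores, retained, mean = pca_drop_first(X)
print(scores.shape)  # one column fewer than the input
```

The resulting `scores` would then serve as the input features for a classifier or for sequence alignment. Whether discarding the first PC helps depends on the data, so the choice of retained PCs should be validated experimentally.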
Authors
Walugembe, Hussein