Alba Ribó; Dawid Warchoł; Mariusz Oszust

An Approach to Gesture Recognition with Skeletal Data Using Dynamic Time Warping and Nearest Neighbour Classifier



Author(s)

Alba Ribó ¹,*, Dawid Warchoł ², Mariusz Oszust ²

1. University of Lleida, 25001, Catalonia, Spain

2. Department of Computer and Control Engineering, Rzeszow University of Technology, W. Pola 2, 35-959 Rzeszow, Poland

* Corresponding author.

DOI: https://doi.org/10.5815/ijisa.2016.06.01

Received: 20 Aug. 2015 / Revised: 11 Dec. 2015 / Accepted: 4 Feb. 2016 / Published: 8 Jun. 2016

Index Terms

Gesture Recognition, Nearest Neighbour Classifier, Dynamic Time Warping, Kinect, Skeletal Data, Matlab

Abstract

Gestures are a natural means of communication between humans, so their use would benefit many fields where typical input devices, such as keyboards or joysticks, are cumbersome or impractical (e.g., in a noisy environment). Practical applications using body gestures have recently become more popular with the emergence of new cameras that provide not only colour images of the observed scene but also rich information on the number of people in view and, most interestingly, the 3D positions of their body parts. Such information is presented in the form of skeletal data. In this paper, an approach to gesture recognition based on skeletal data using a nearest neighbour classifier with dynamic time warping is presented. Since similar approaches are widely used in the literature, a few practical improvements that led to better recognition results are proposed. The approach is extensively evaluated on three publicly available gesture datasets and compared with state-of-the-art classifiers. For some gesture datasets, the proposed approach outperformed its competitors in terms of recognition rate and recognition time.
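
As an illustration of the general technique named in the abstract, the following is a minimal sketch of a nearest neighbour classifier that uses dynamic time warping (DTW) as its distance between gesture sequences. It is not the authors' implementation and does not include the practical improvements the paper proposes. The function names (`frame_distance`, `dtw_distance`, `classify_1nn`), the single-joint toy data, and the choice of Euclidean distance between frames are all assumptions made for this example.

```python
import math

def frame_distance(a, b):
    """Euclidean distance between two frames (flat tuples of joint coordinates).
    Assumption: the paper may use a different frame-level distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dtw_distance(seq_a, seq_b):
    """Classic DTW cost between two sequences of frames, O(len_a * len_b)."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning seq_a[:i] with seq_b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # frame of seq_b repeated
                                 cost[i][j - 1],      # frame of seq_a repeated
                                 cost[i - 1][j - 1])  # both sequences advance
    return cost[n][m]

def classify_1nn(query, templates):
    """Return the label of the template with the smallest DTW cost to the query.
    `templates` is a list of (label, sequence) pairs (hypothetical layout)."""
    best_label, _ = min(templates, key=lambda t: dtw_distance(query, t[1]))
    return best_label

# Toy skeletal data: one joint tracked as (x, y, z) per frame.
raise_hand = [(0, 0, 0), (0, 1, 0), (0, 2, 0), (0, 3, 0)]
swipe_right = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0)]
templates = [("raise", raise_hand), ("swipe", swipe_right)]

# The same upward motion performed more slowly (some frames repeated).
query = [(0, 0, 0), (0, 0, 0), (0, 1, 0), (0, 1, 0), (0, 2, 0), (0, 3, 0)]
print(classify_1nn(query, templates))  # raise
```

Because DTW lets a frame in one sequence align with several frames in the other, the slower query matches the "raise" template at zero cost even though the two sequences have different lengths. With real skeletal data each frame would hold the coordinates of many joints, and the quadratic cost of DTW per comparison makes the number of stored templates the main factor in recognition time.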

Cite This Paper

Alba Ribó, Dawid Warchoł, Mariusz Oszust, "An Approach to Gesture Recognition with Skeletal Data Using Dynamic Time Warping and Nearest Neighbour Classifier", International Journal of Intelligent Systems and Applications (IJISA), Vol.8, No.6, pp.1-8, 2016. DOI:10.5815/ijisa.2016.06.01


International Journal of Intelligent Systems and Applications (IJISA)