Journal Details
Format: Journal
eISSN: 1684-4769
First Published: 16 Apr 2016
Publication timeframe: 2 times per year
Languages: English
Access type: Open Access

Mining Automatically Estimated Poses from Video Recordings of Top Athletes

Published Online: 31 Dec 2018
Volume & Issue: Volume 17 (2018) - Issue 2 (December 2018)
Page range: 94 - 112
Abstract

Human pose detection systems based on state-of-the-art DNNs are about to be extended, adapted and re-trained to fit the application domain of specific sports. Therefore, plenty of noisy pose data will soon be available from videos recorded on a regular and frequent basis. This work is among the first to develop mining algorithms for the expected abundance of noisy, annotation-free pose data from video recordings in individual sports. Using swimming as an example of a sport with dominant cyclic motion, we show how to determine time-continuous cycle speeds and temporally striking poses, and how to measure cycle stability over time, all without supervision. Compared to manual annotations, the average error in cycle length estimation across all strokes is 0.43 frames at 50 fps. Additionally, we use long jump as an example of a sport with a rigid phase-based motion to present a technique that automatically partitions the temporally estimated pose sequences into their respective phases with a mAP of 0.89. This enables the extraction of performance-relevant, pose-based metrics currently used by national professional sports associations. Experimental results demonstrate the effectiveness of our mining algorithms, which can also be applied to other cycle-based or phase-based types of sport.
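To make the idea of unsupervised cycle length estimation concrete, the following is a minimal sketch and not the method of this article: it assumes a single noisy, periodic 1D pose signal (for example a joint angle per frame) and uses plain autocorrelation as a stand-in technique. The function names `estimate_cycle_length` and `frames_to_ms`, the synthetic signal, and the lag bounds are all hypothetical choices made for illustration. The helper `frames_to_ms` only converts a frame count into time, so the reported error of 0.43 frames at 50 fps corresponds to roughly 8.6 ms.

```python
import math
import random


def estimate_cycle_length(signal, min_lag, max_lag):
    """Return the lag (in frames) with the highest mean autocovariance.

    Illustrative stand-in only: autocorrelation over a 1D pose signal,
    not the mining algorithm described in the article.
    """
    n = len(signal)
    if not (0 < min_lag <= max_lag < n):
        raise ValueError("need 0 < min_lag <= max_lag < len(signal)")
    mean = sum(signal) / n
    centered = [x - mean for x in signal]
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        overlap = n - lag
        # Average over the overlapping part so that long lags are not
        # penalized merely for having fewer terms in the sum.
        score = sum(centered[i] * centered[i + lag] for i in range(overlap)) / overlap
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def frames_to_ms(frames, fps):
    """Convert a duration in frames to milliseconds at a given frame rate."""
    return 1000.0 * frames / fps


# Hypothetical synthetic data: a 60-frame cycle with mild Gaussian noise.
random.seed(0)
TRUE_CYCLE = 60
noisy_signal = [
    math.sin(2 * math.pi * t / TRUE_CYCLE) + random.gauss(0.0, 0.05)
    for t in range(600)
]

# The search range is an assumption; it must exclude multiples of the cycle.
estimated = estimate_cycle_length(noisy_signal, min_lag=20, max_lag=100)
print("estimated cycle length (frames):", estimated)
print("0.43 frames at 50 fps in ms:", frames_to_ms(0.43, 50))
```

The search range matters in this sketch: if `max_lag` reaches twice the true cycle length, the autocorrelation peak at the second cycle can compete with the first, so the bounds should reflect plausible cycle durations for the stroke being analyzed.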


