Skeleton-based Human Action/Interaction Classification in Sparse Image Sequences

eISSN: 2080-2145
Language: English

Publication frequency: 4 issues per year
Journal subjects: Computer Science, Artificial Intelligence, Engineering, Electrical Engineering, Measurement and Control Engineering, Mechanical Engineering, Fundamentals of Mechanical Engineering


Published online: 04 March 2024

Page range: 1–14

Received: 14 Feb 2023

Accepted: 16 June 2023

DOI: https://doi.org/10.14313/jamris/3-2023/18

Keywords: Action classification, Skeleton features, 2-person interactions, Mixture of experts, Video analysis

© 2023 Włodzimierz Kasprzak et al., published by Sciendo

This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.