A Review of Object Detection in Traffic Scenes Based on Deep Learning

eISSN: 2444-8656
Language: English

Publication timeframe: Volume Open
Journal Subjects: Life Sciences, other, Mathematics, Applied Mathematics, General Mathematics, Physics

Published Online: Feb 26, 2024

Received: Jan 02, 2024

Accepted: Jan 11, 2024

DOI: https://doi.org/10.2478/amns-2024-0322

Keywords: Object detection, Deep learning, Traffic scenarios, Autonomous driving systems

© 2024 Ruixin Zhao et al., published by Sciendo

This work is licensed under the Creative Commons Attribution 4.0 International License.
