Predicting Vehicle Pose in Six Degrees of Freedom from Single Image in Real-World Traffic Environments Using Deep Pretrained Convolutional Networks and Modified Centernet

Language: English

Publication frequency: once per year
Journal subjects: Engineering, Introductions and Overviews, Engineering, other


Suresh Kolekar

Shilpa Gite

Biswajeet Pradhan

Abdulla Alamri

Article category: Original Research Article

Published: 06 Aug 2024

Received: 18 Apr 2024

DOI: https://doi.org/10.2478/ijssis-2024-0025

Keywords: Vehicle pose prediction, Autonomous driving, 3D traffic scene understanding, ResNet50, ResNext50, DenseNet201, Inception-ResNetV2

© 2024 Suresh Kolekar et al., published by Sciendo

This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
