Predicting Vehicle Pose in Six Degrees of Freedom from Single Image in Real-World Traffic Environments Using Deep Pretrained Convolutional Networks and Modified Centernet
Centre for Advanced Modelling and Geospatial Information Systems (CAMGIS), School of Civil and Environmental Engineering, Faculty of Engineering and I.T., University of Technology Sydney, Ultimo, Australia
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.