Combination of Resnet and Spatial Pyramid Pooling for Musical Instrument Identification

Language: English

Publication schedule: 4 times per year
Journal subjects: Computer Science, Information Technology

Christine Dewi

Rung-Ching Chen

Published online: 10 Apr 2022

Pages: 104-116

Received: 16 Nov 2021

Accepted: 25 Feb 2022

DOI: https://doi.org/10.2478/cait-2022-0007

Keywords: Resnet 50, Resnet 50 SPP, spatial pyramid pooling, musical instruments, similar object

© 2022 Christine Dewi et al., published by Sciendo

This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
