Feature Map Augmentation to Improve Scale Invariance in Convolutional Neural Networks

Language: English

Publication frequency: 4 times per year
Journal subjects: Computer Science, Databases and Data Mining, Artificial Intelligence

Dinesh Kumar

Dharmendra Sharma

Published online: 28 Nov 2022

Pages: 51–74

Received: 21 Feb 2022

Accepted: 19 Oct 2022

DOI: https://doi.org/10.2478/jaiscr-2023-0004

Keywords: Convolutional Neural Network, Feature Map Augmentation, Global Features, Scale-Invariant, Vision System

© 2023 Dinesh Kumar et al., published by Sciendo

This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 3.0 License.
