Adaptive Separation Fusion: A Novel Downsampling Approach in CNNs

Language: English

Publication timeframe: 4 times per year
Journal Subjects: Computer Sciences, Artificial Intelligence, Databases and Data Mining

Xia Ji

Jinglong Chang

Yapeng Ji

Published Online: Feb 05, 2025

Page range: 197 - 210

Received: Oct 13, 2024

Accepted: Jan 14, 2025

DOI: https://doi.org/10.2478/jaiscr-2025-0010

Keywords: adaptive, object detection, image classification, deep learning, downsampling

© 2025 Xia Ji et al., published by Sciendo

This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.