Open Access

Adaptive Separation Fusion: A Novel Downsampling Approach in CNNs

Feb 05, 2025


Almost all computer vision tasks rely on convolutional neural networks and transformers, both of which require extensive computation. As input images grow ever larger, feeding them into a network directly becomes impractical, so images are typically downsampled to a manageable size before subsequent processing. The downsampling process, however, inevitably discards fine-grained information, degrading network performance. Existing methods, such as strided convolution and various pooling techniques, struggle to address this issue effectively. To overcome this limitation, we propose a generalized downsampling module, Adaptive Separation Fusion Downsampling (ASFD). ASFD adaptively captures intra- and inter-region attentional relationships and, through fusion, preserves feature representations that would otherwise be lost during downsampling. We validate ASFD on representative computer vision tasks, including object detection and image classification. Specifically, we incorporate ASFD into the YOLOv7 object detection model and several classification models. Experiments demonstrate that the modified YOLOv7 architecture surpasses state-of-the-art models in object detection, particularly excelling on small objects. Our method also outperforms commonly used downsampling techniques in classification tasks. Furthermore, ASFD functions as a plug-and-play module compatible with various network architectures.
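To make the fusion idea concrete, the sketch below illustrates the general principle of fusing multiple downsampled views of a feature map rather than committing to a single one. This is an illustrative toy in NumPy, not the authors' ASFD implementation: the three views (strided sampling, max-pooling, average-pooling) and the softmax-weighted blend are assumptions standing in for the paper's learned adaptive separation and fusion.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - x.max())
    return e / e.sum()

def fused_downsample(x, logits=None):
    """Downsample a (H, W) feature map by 2x while fusing three views:
    strided sampling, 2x2 max-pooling, and 2x2 average-pooling.

    `logits` (length 3) are fusion weights before softmax; in a trained
    module they would be learned per feature. Here they default to a
    uniform blend. This is a hypothetical sketch, not the paper's ASFD.
    """
    H, W = x.shape
    assert H % 2 == 0 and W % 2 == 0, "expects even spatial dims"
    # Group pixels into non-overlapping 2x2 regions: (H/2, W/2, 4).
    blocks = (x.reshape(H // 2, 2, W // 2, 2)
               .transpose(0, 2, 1, 3)
               .reshape(H // 2, W // 2, 4))
    strided = blocks[..., 0]       # top-left pixel of each region
    maxed   = blocks.max(axis=-1)  # max over each region
    avged   = blocks.mean(axis=-1) # mean over each region
    w = softmax(np.zeros(3) if logits is None else np.asarray(logits, float))
    # Weighted fusion preserves information any single view would drop.
    return w[0] * strided + w[1] * maxed + w[2] * avged
```

With uniform weights the output blends all three views; biasing the logits toward one view (e.g. `logits=[10, 0, 0]`) recovers plain strided downsampling, which is how a learned fusion can interpolate between existing downsampling techniques.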

Language:
English
Publication timeframe:
4 times per year
Journal Subjects:
Computer Sciences, Artificial Intelligence, Databases and Data Mining