Open Access

Change detection in synthetic aperture radar images using spatial fuzzy clustering based on the similarity matrix

Jul 19, 2025


Introduction

Change detection refers to the task of identifying differences in multi-temporal remote sensing images of the same geographic region, with significant applications in urban planning, disaster evaluation, ecological monitoring, and other areas [1]. In the past, high-resolution optical imagery was the dominant source of data for change detection. However, optical imagery is sensitive to environmental factors, such as cloud cover, solar illumination, and seasonality, all of which degrade the quality of detection outcomes [2]. Synthetic aperture radar (SAR) imagery, which can be acquired in changing weather and through cloud cover and fog, is a suitable alternative. SAR's independence from solar illumination and atmospheric conditions makes it a major instrument for monitoring environmental change and urban sprawl, particularly in challenging situations [3].

The Sentinel-1 mission of the European Space Agency's Copernicus Program provides a consistent data source for SAR-based change detection. With its twin satellites, Sentinel-1A and Sentinel-1B, offering revisit times of 12 days as well as high-resolution C-band radar imagery, the data have been widely exploited for applications ranging from flood monitoring to urban sprawl monitoring [4]. Despite these benefits, conventional SAR change detection techniques, such as image differencing and thresholding, have notable limitations: they are sensitive to speckle noise and cannot process complex patterns in SAR data, which can have a major impact on detection accuracy [5].

Deep learning models represent a paradigm shift in change detection using SAR imagery. Convolutional neural networks (CNNs) are capable of learning spatial and contextual information from high-dimensional SAR data, thus enabling improved change detection between pre- and post-event images [6]. Recent developments in locally constrained networks and dual-channel CNNs have considerably improved feature extraction efficiency and noise reduction. However, these methods are still limited in capturing sophisticated contextual cues, especially in dense urban environments [7]. Urban change detection for built structures, such as buildings and roads, is a relatively new field, and hence, there is a need for targeted research in this area [8].

This research applies deep learning methods to address challenges in SAR-based urban change detection. Based on specially created SAR data developed in collaboration with the research community, we aim to improve the usability of SAR images in urban environments. The proposed work addresses common problems, such as speckle noise removal and the development of robust classifiers, thus offering a framework for invariant change detection in urban built-up environments.

Literature Review

This section reviews state-of-the-art methods for change detection using SAR images. Zhong et al. [9] presented TransUNet++SAR, which combines a transformer with UNet++ for change detection. In that work, the feature maps of each single-time SAR image are extracted in each layer. Experiments on the Beijing change detection dataset show an Intersection over Union (IoU) increase of 9.79% and an F1-score increase of 4.38%. In the study by Samadi [10], a deep belief network (DBN) is employed, and its training comprises unsupervised feature learning followed by supervised fine-tuning; the experiments indicate high accuracy. A difference image is created by applying the subtraction or ratio operator, pixel by pixel, to the intensities of the two multi-temporal images [11].

Traditional approaches to classifying high-resolution satellite imagery tend to fall short because of the intricate spectral, textural, and spatial nature of the imagery. Adegun et al. [12] addressed this shortcoming by comparing deep learning approaches, namely CNNs and vision transformers (ViTs). They compared various CNN architectures, such as ResNet, DenseNet, EfficientNet, VGG, and InceptionV3, on three datasets: EuroSAT, UCMerced-LandUse, and NWPU-RESISC45.

The work in [13] introduces the Iterated Dilated Convolutional Neural Network (ID-CNN), a deep learning technique for removing speckle noise from SAR images. It involves convolutional layers and a division residual layer and is trained with loss functions. The method requires clean images for training, and its effectiveness depends on the training data. Future research may explore unsupervised learning or other advanced methods, such as Generative Adversarial Networks (GANs). The study in [14] indicates that incorporating images created by Graph Signal Processing (GSP) lowers false alarms in a SAR change detection algorithm based on Bayes' theorem, which emphasizes the importance of good-quality reference images. The technique is straightforward but has not been tested on other SAR systems, and further studies may investigate alternative methods to enhance its efficiency. The work in [15] applies a deep learning algorithm, Faster Region-based Convolutional Neural Network (R-CNN), to ship detection in SAR images, with feature extraction through VGG pretrained on ImageNet. The model detects ships accurately but requires labeled training data and can be resource intensive. Possible improvements include optimizing efficiency, validating on other datasets, and extending detection to other marine objects.

Several surveys cover the field. The article in [16] traces the evolution of SAR image change detection technology from the 1980s to the present, including imaging techniques, detection techniques, and precision. It touches on the increasing importance of deep learning without going into technical detail. Future studies could combine more data sources, make deep learning models easier to interpret, and increase scalability across applications. The review in [17] covers supervised, unsupervised, and semi-supervised deep learning techniques on different datasets, including SAR, multispectral, hyperspectral, and Very High Resolution (VHR) imagery. It identifies the accuracy benefits of deep learning over conventional change detection techniques, together with open problems and prospects, and is intended to direct research toward more efficient deep learning-based change detection algorithms. The article in [18] discusses five primary aspects of change detection using deep learning: its effects on information representation, method choice, performance, limitations, and future directions. It contrasts Deep Learning for Change Detection (DLCD) with conventional techniques from different perspectives and offers a taxonomy of DLCD methods, noting their benefits and drawbacks, among which are training data and computational requirements. The paper in [19] provides a meta-analysis and review of deep learning techniques for automatic change detection in remotely sensed data. It covers the basics, categorizes techniques as supervised, unsupervised, and transfer learning-based, and highlights avenues for future research in the area.

In this study, Section I presents the introduction. Section II discusses the state-of-the-art methods for change detection. Section III presents the proposed methodology for change detection using spatial fuzzy clustering based on the similarity matrix. Section IV presents the experimental results along with discussion, and Section V presents the conclusions.

Methodology
Introduction to CNN

CNNs efficiently learn spatial hierarchies for SAR change detection, image processing, and feature extraction. Combining CNN layers with average pooling and a squeeze operation supports both feature extraction and dimensionality reduction. Spatial patterns are detected by the convolutional layers, whereas average pooling lowers the feature map size while retaining the principal information and keeping computational complexity in check, as shown in Figure 1. The squeeze operation further compresses the spatial dimensions, focusing the output on prominent features. The method offers a good trade-off between accuracy and efficiency and is therefore suited to high-dimensional SAR change detection.

Figure 1:

Illustration of CNN model architecture. CNN, convolutional neural network.

Model architecture

The SAR_CNN model for change detection in SAR images is a CNN that uses several convolutional and pooling layers to extract features sequentially and refine them iteratively. The network operates on SAR images taken at two time instances, and the aim is to identify changes between them.

Input layer

The network receives two SAR images at two separate times as input. The two images are stacked on top of each other to create a 2-channel input tensor, where one channel corresponds to each image. The spatial shape of the input is normally denoted as H × W × 2, where H and W are the height and width of the images, respectively, and the depth is 2 because of the two image inputs.

First convolutional layer

A Conv2D layer with 2 input channels and 6 filters (2 × 2 kernel, stride 1, padding 1) extracts features from the SAR images. With these settings, each of the six feature maps has size (H + 1) × (W + 1), so the spatial resolution of the input is essentially preserved.

First pooling layer

Following the convolution, the output is fed through a 2D average pooling (AvgPool2D) layer with a pool size of 2 × 2. This pooling layer downsamples the feature maps by taking the average value of each 2 × 2 block of pixels, lowering the spatial dimensions by a factor of 2. The average pooling operation is written as shown in Eq. (1):

$$y_{i,j} = \frac{1}{4}\sum_{p=0}^{1}\sum_{q=0}^{1} x_{2i+p,\,2j+q} \tag{1}$$

where $y_{i,j}$ is the pooled value, and $x_{2i+p,\,2j+q}$ are the values in the 2 × 2 block of the input feature map. This operation retains the most salient features while reducing the computational load.
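As an illustration of Eq. (1), the following minimal NumPy sketch averages each non-overlapping 2 × 2 block of a single feature map. The function name `avg_pool_2x2` and the handling of odd-sized inputs (a trailing row or column is dropped) are assumptions of this sketch, not details taken from the paper.

```python
import numpy as np

def avg_pool_2x2(x):
    """Average-pool a 2-D feature map with a 2 x 2 window and stride 2 (Eq. 1)."""
    h, w = x.shape
    # Assumption of this sketch: drop a trailing row/column when the size is odd.
    x = x[: h - h % 2, : w - w % 2]
    # Axis layout after reshape is [i, p, j, q], i.e. x[2i + p, 2j + q].
    blocks = x.reshape(h // 2, 2, w // 2, 2)
    # Averaging over p and q gives y[i, j].
    return blocks.mean(axis=(1, 3))

feature_map = np.arange(16, dtype=float).reshape(4, 4)
pooled = avg_pool_2x2(feature_map)
print(pooled)
```

For the 4 × 4 input above, the top-left output value is the mean of 0, 1, 4, and 5, which is 2.5.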

Second convolutional layer

The second convolutional layer performs a similar operation to the first one but with 6 input channels (from the previous pooling layer) and 12 output channels (filters). The kernel size is 2 × 2, and the stride remains 1, but the padding is set to 0. This results in an output of 12 feature maps, each with reduced spatial dimensions due to the absence of padding.

Second pooling layer

A second AvgPool2D layer follows the second convolutional layer. This layer further reduces the spatial resolution of the feature maps by applying the 2 × 2 pooling operation, decreasing the size by half again, as shown in Figure 2.

Figure 2:

Illustration of the average pooling operation with a 2 × 2 filter and a stride of 2.

Third convolutional layer

The third and final convolutional layer applies 1 × 1 convolutions with 12 input channels and 2 output channels (filters). The kernel size of 1 × 1 allows the model to reduce the number of channels while maintaining the spatial dimensions of the feature maps. This layer is primarily used to reduce the depth of the feature maps and prepare the output for classification.

Output layer

The output layer consists of a tensor with two channels, which corresponds to the model's final classification for each pixel—whether it has changed or not between the two time periods. The output is “squeezed” to remove singleton dimensions, producing a final output tensor. In practice, this tensor is used to generate a binary change detection map where each pixel is classified as either a “change” or “no change” based on the output values.
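To make the layer-by-layer dimensions concrete, the sketch below traces the tensor shape through the three convolutions and two pooling layers described above, using the standard output-size formula for convolutions. The 7 × 7 input patch is a hypothetical choice for illustration; the paper does not state the input size. Pooling is assumed to use floor division, as is common for non-overlapping windows.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output length of a convolution along one spatial axis."""
    return (size + 2 * padding - kernel) // stride + 1

def pool_out(size, window=2):
    """Output length of non-overlapping pooling (floor division assumed)."""
    return size // window

def trace_shapes(h, w):
    """Return (stage, channels, height, width) after each stage of the network."""
    shapes = [("input", 2, h, w)]
    h, w = conv_out(h, 2, padding=1), conv_out(w, 2, padding=1)
    shapes.append(("conv1", 6, h, w))      # 2 x 2 kernel, stride 1, padding 1
    h, w = pool_out(h), pool_out(w)
    shapes.append(("pool1", 6, h, w))      # 2 x 2 average pooling
    h, w = conv_out(h, 2), conv_out(w, 2)
    shapes.append(("conv2", 12, h, w))     # 2 x 2 kernel, stride 1, padding 0
    h, w = pool_out(h), pool_out(w)
    shapes.append(("pool2", 12, h, w))     # 2 x 2 average pooling
    h, w = conv_out(h, 1), conv_out(w, 1)
    shapes.append(("conv3", 2, h, w))      # 1 x 1 kernel
    return shapes

# Hypothetical 7 x 7 input patch, chosen only for illustration.
shapes = trace_shapes(7, 7)
for stage in shapes:
    print(stage)
```

Under this assumed patch size, the last stage has shape 2 × 1 × 1, so squeezing the singleton dimensions leaves two values, one per class, for the patch.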

Mathematical representation

The model applies linear transformations through convolution, as represented in Eq. (2):

$$y(i,j) = \sum_{m,n} x(i+m,\,j+n) \cdot w(m,n) \tag{2}$$

where $x(i,j)$ is the input, $w(m,n)$ is the filter, and $y(i,j)$ is the output feature map. Pooling reduces spatial dimensions, enabling hierarchical feature extraction. Training optimizes filter weights using cross-entropy loss and the Adam optimizer. The final model detects changes in SAR image pairs through convolution, pooling, and classification layers.
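Eq. (2) can be sketched directly as a pair of loops over output positions. The example below is a single-channel illustration without padding, bias, or stride options; the function name `conv2d_valid` and the example filter are assumptions made for this sketch.

```python
import numpy as np

def conv2d_valid(x, w):
    """Apply Eq. (2) at every position where the filter fits inside the input."""
    kh, kw = w.shape
    out_h, out_w = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    y = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Sum over (m, n) of x(i + m, j + n) * w(m, n).
            y[i, j] = np.sum(x[i:i + kh, j:j + kw] * w)
    return y

x = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [7.0, 8.0, 9.0]])
w = np.array([[1.0, 0.0],
              [0.0, -1.0]])   # hypothetical 2 x 2 filter
y = conv2d_valid(x, w)
print(y)
```

With this filter, each output is the top-left value of the window minus its bottom-right value, so every entry of `y` equals -4 for the input shown.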

Proposed methodology for change detection using spatial fuzzy clustering based on the similarity matrix

SAR change detection is challenged by the fact that CNN-based approaches favor feature extraction at the expense of vital preprocessing. The main drawbacks are speckle noise, resolution variation, and geometric differences. To compensate for this, the proposed approach enhances preprocessing through similarity matrix calculation and spatial fuzzy clustering, yielding noise-resistant and spatially uniform data. Unlike typical methods that focus on model architecture design, the proposed modular preprocessing improves adaptability and scalability.

The Ottawa SAR dataset is used for the experimental analysis. In the proposed work, the SAR_CNN network is initialized through transfer learning with pretrained weights and is then trained in a supervised manner on labeled SAR image pairs, using binary cross-entropy loss and the Adam optimizer (learning rate 0.0001) for rapid convergence. The Ottawa SAR dataset, with high-resolution dual-polarization temporal sequences, undergoes preprocessing through radiometric calibration, geometric correction, and speckle noise reduction. The dataset is then divided into 70% for training, 20% for validation, and 10% for testing to support good generalization. Figure 3 presents the flowchart of the proposed method.
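The 70/20/10 partition can be sketched as a shuffled split of sample indices. The paper does not say whether the split is random or how samples are indexed, so the shuffling, the fixed seed, and the function name `split_indices` are assumptions of this example.

```python
import random

def split_indices(n_samples, train=0.7, val=0.2, seed=0):
    """Shuffle sample indices and split them into train/validation/test parts."""
    indices = list(range(n_samples))
    # Assumption: a seeded random shuffle, so the split is reproducible.
    random.Random(seed).shuffle(indices)
    n_train = int(round(train * n_samples))
    n_val = int(round(val * n_samples))
    train_idx = indices[:n_train]
    val_idx = indices[n_train:n_train + n_val]
    test_idx = indices[n_train + n_val:]   # the remainder, about 10%
    return train_idx, val_idx, test_idx

train_idx, val_idx, test_idx = split_indices(1000)
print(len(train_idx), len(val_idx), len(test_idx))
```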

Figure 3:

Flowchart of the proposed method. CNN, convolutional neural network; SFCM, spatial fuzzy clustering membership.

Preprocessing is vital for SAR change detection, ensuring clean and optimized input for deep learning. This pipeline uses similarity calculations, spatial clustering, and noise reduction to enhance data quality for analysis. The preprocessing steps are elaborated below.

Similarity matrix calculation

The preprocessing pipeline plays a critical role in enhancing the accuracy and robustness of change detection models for SAR images by extracting meaningful features and mitigating the effects of noise and acquisition artifacts. The first step in the pipeline, similarity matrix calculation, establishes a pixel-wise similarity metric between two SAR images captured at different times. This process identifies relative intensity differences, which are normalized to account for global variations in brightness. The similarity matrix $S(x,y)$ is computed as shown in Eq. (3):

$$S(x,y) = \begin{cases} \dfrac{\left| I_1(x,y) - I_2(x,y) \right|}{I_1(x,y) + I_2(x,y)}, & \text{if } I_1(x,y) + I_2(x,y) > 0 \\ 0, & \text{otherwise} \end{cases} \tag{3}$$

where $I_1(x,y)$ and $I_2(x,y)$ are the intensities of the two images at pixel $(x,y)$. The absolute intensity difference is normalized by the summed intensity of the two acquisitions, and the value is set to zero where that sum is zero.
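A minimal NumPy sketch of the similarity computation follows. It assumes the normalized-difference reading of Eq. (3), that is, the absolute intensity difference divided by the summed intensity, with zero assigned wherever the sum is not positive. The function name `similarity_matrix` and the sample values are hypothetical.

```python
import numpy as np

def similarity_matrix(img1, img2):
    """Pixel-wise normalized absolute difference between two co-registered images."""
    img1 = np.asarray(img1, dtype=float)
    img2 = np.asarray(img2, dtype=float)
    total = img1 + img2
    diff = np.abs(img1 - img2)
    sim = np.zeros_like(total)
    # Divide only where the summed intensity is positive; other pixels stay 0.
    np.divide(diff, total, out=sim, where=total > 0)
    return sim

# Hypothetical intensities for two acquisitions of the same 2 x 2 area.
img1 = np.array([[10.0, 0.0], [30.0, 50.0]])
img2 = np.array([[10.0, 0.0], [10.0, 150.0]])
S = similarity_matrix(img1, img2)
print(S)
```

Under this form, unchanged pixels give 0, and values approach 1 as one of the two intensities dominates, which keeps the measure independent of overall brightness.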

Space membership

The spatial membership function used in the proposed work assesses the spatial connection of a pixel to its nearby pixels within the context of fuzzy clustering. The function integrates spatial information into the clustering algorithm, making the segmentation or categorization more robust to noise and outliers by accounting for the effect of nearby pixels. For every pixel, the function adds up the membership values of its 8-connected neighbors for a specific cluster. As a result, pixels in homogeneous areas have similar memberships, giving smoother and more meaningful segmentation. The space membership matrix extends conventional fuzzy clustering by taking spatial consistency into account, so that the segmentation is more consistent with real structures in the image instead of being dominated by noise or isolated pixel values, as in Eq. (4):

$$\mathrm{Space}(i,j,v) = \sum_{(dx,dy) \in BN} M(i+dx,\,j+dy,\,v) \tag{4}$$

In Eq. (4), i and j represent the coordinates of the current pixel in the image, while v is the cluster being considered for the space membership calculation. The set BN contains the offsets of the 8-connected neighbors around the pixel (i, j). The membership value M(i + dx, j + dy, v) indicates the degree of membership of the neighboring pixel (i + dx, j + dy) in cluster v, and the sum over all neighbors gives the spatial membership for the pixel (i, j) and cluster v.
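Eq. (4) can be sketched as a sum of shifted copies of the membership array, one per neighbor offset. The paper does not say how pixels on the image border are treated; this sketch assumes that neighbors outside the image contribute zero. The array layout (height, width, clusters) and the function name `space_membership` are also assumptions.

```python
import numpy as np

# Offsets of the 8-connected neighbors (the set BN in Eq. 4).
BN = [(-1, -1), (-1, 0), (-1, 1),
      (0, -1),           (0, 1),
      (1, -1),  (1, 0),  (1, 1)]

def space_membership(M):
    """Sum the memberships of the 8-connected neighbors for every pixel and cluster.

    M has shape (H, W, C); the result has the same shape.
    """
    h, w, _ = M.shape
    # Assumption: zero padding, so out-of-image neighbors add nothing.
    padded = np.pad(M, ((1, 1), (1, 1), (0, 0)), mode="constant")
    space = np.zeros_like(M, dtype=float)
    for dx, dy in BN:
        # padded[1 + dx + i, 1 + dy + j] corresponds to M[i + dx, j + dy].
        space += padded[1 + dx:1 + dx + h, 1 + dy:1 + dy + w, :]
    return space

M = np.ones((3, 3, 2))
space = space_membership(M)
print(space[:, :, 0])
```

With all memberships set to 1, the result simply counts in-image neighbors: 8 at the center, 5 on an edge, and 3 at a corner.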

Spatial fuzzy clustering membership (SFCM)

The SFCM method segments the similarity matrix while ensuring spatial coherence, addressing SAR noise issues common in traditional clustering. Each pixel is assigned graded membership in multiple clusters through a fuzzy membership matrix that is updated iteratively. The center of cluster $j$ and the membership of pixel $i$ in cluster $j$ are given by Eqs. (5) and (6), respectively:

$$v_j = \frac{\sum_i u_{ij}^{m} \cdot I(i)}{\sum_i u_{ij}^{m}} \tag{5}$$

$$u_{ij} = \frac{1}{\sum_k \left( \dfrac{d_{ij}}{d_{ik}} \right)^{\frac{2}{m-1}}} \tag{6}$$

In Eq. (5), $I(i)$ is the value of pixel $i$ in the matrix being clustered. In Eq. (6), $d_{ij}$ is the Euclidean distance between pixel $i$ and cluster center $j$, and $m > 1$ controls the fuzziness of the clustering.

To incorporate spatial information, membership values are refined using neighborhood memberships as shown in Eq. (7):

$$\tilde{u}_{ij} = \alpha \cdot u_{ij} + (1 - \alpha) \cdot \frac{\sum_{k \in N(i)} u_{kj}}{\left| N(i) \right|} \tag{7}$$

where $\alpha$ balances the contributions of intensity-based and spatial clustering, $N(i)$ represents the neighborhood of pixel $i$, and $|N(i)|$ is the neighborhood size. The procedure repeats until the membership updates drop below a predetermined threshold.
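The iteration described by Eqs. (5) to (7) can be sketched as follows. This is an illustrative implementation under stated assumptions rather than the authors' code: cluster centers are initialized evenly across the value range, the neighborhood is the 8-connected one with replicated borders, the distance is the absolute difference of scalar pixel values, and the values of `alpha`, `m`, and `tol` are hypothetical defaults.

```python
import numpy as np

def neighborhood_mean(u):
    """Mean membership over the 8-connected neighbors; u has shape (H, W, C)."""
    h, w, _ = u.shape
    # Assumption: border pixels reuse the nearest in-image values.
    padded = np.pad(u, ((1, 1), (1, 1), (0, 0)), mode="edge")
    total = np.zeros_like(u)
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            if (dx, dy) != (0, 0):
                total += padded[1 + dx:1 + dx + h, 1 + dy:1 + dy + w, :]
    return total / 8.0

def sfcm(image, n_clusters=2, m=2.0, alpha=0.3, tol=1e-4, max_iter=100):
    """Spatial fuzzy clustering of a 2-D array; returns (memberships, centers)."""
    image = np.asarray(image, dtype=float)
    # Assumption: centers start evenly spaced between the min and max values.
    centers = np.linspace(image.min(), image.max(), n_clusters)
    u = np.full(image.shape + (n_clusters,), 1.0 / n_clusters)
    for _ in range(max_iter):
        # Eq. (6): membership from distances to each center (epsilon avoids 0/0).
        d = np.abs(image[..., None] - centers) + 1e-12
        inv = d ** (-2.0 / (m - 1.0))
        u_new = inv / inv.sum(axis=-1, keepdims=True)
        # Eq. (7): blend each membership with the neighborhood average.
        u_new = alpha * u_new + (1.0 - alpha) * neighborhood_mean(u_new)
        # Eq. (5): centers as membership-weighted means of the pixel values.
        um = u_new ** m
        centers = (um * image[..., None]).sum(axis=(0, 1)) / um.sum(axis=(0, 1))
        converged = np.max(np.abs(u_new - u)) < tol
        u = u_new
        if converged:
            break
    return u, centers

# Two flat regions plus one isolated outlier inside the left region.
image = np.full((6, 8), 0.1)
image[:, 4:] = 0.9
image[2, 1] = 0.9
u, centers = sfcm(image)
labels = u.argmax(axis=-1)
print(labels)
```

In this toy input, the isolated high value at position (2, 1) is absorbed into the surrounding cluster because the neighborhood term outweighs its own intensity-based membership when `alpha` is below 0.5.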

The clustered output is converted into a binary image, where changed pixels are assigned 255 (white) and unchanged ones 0 (black), simplifying change visualization. Dataset-specific preprocessing adapts to resolution, noise, and acquisition geometry variations, ensuring robust performance. This modular approach enhances SAR change detection by providing noise-robust, spatially consistent, and dataset-adaptive inputs, improving model accuracy and interpretability.

The binary image conversion stage transforms the clustering output into a binary image that marks change areas (255) and non-change areas (0). This facilitates visualization and is a required intermediate output. The dataset-specific preprocessing provides flexibility across SAR datasets by setting input paths and storing outputs in an orderly manner. This modularity enables smooth execution of the similarity computation, fuzzy clustering, and binary image conversion, which facilitates scalability. Execution starts with calculating the similarity matrix, followed by SFCM to segment changed and stable regions. The pre-classified labels and binary outputs are then stored for further processing.
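The conversion step can be sketched as a simple mapping from cluster labels to pixel values. Which cluster index denotes change is not stated in the paper; here it is passed in as the hypothetical parameter `change_label`, which in practice might be the cluster with the larger center in the similarity matrix.

```python
import numpy as np

def to_binary_map(labels, change_label):
    """Map cluster labels to a uint8 image: 255 for change, 0 for no change."""
    return np.where(labels == change_label, 255, 0).astype(np.uint8)

labels = np.array([[0, 1],
                   [1, 0]])
binary = to_binary_map(labels, change_label=1)
print(binary)
```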

Training and validation

The CNN is trained on SAR image pairs, learning hierarchical features for effective change detection. The best performance was achieved with a batch size of 64 and a learning rate of 0.0003, with the network optimized using cross-entropy loss and the Adam optimizer. The validation and performance evaluation procedure guards against overfitting and checks generalization to unseen data. Validation loss and accuracy are tracked regularly, and regularization and hyperparameter tuning are applied as required.
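For reference, the cross-entropy loss over a batch of two-class outputs can be sketched in NumPy as below. This only illustrates the loss computation; it is not the training loop, and the example logits and targets are hypothetical.

```python
import numpy as np

def cross_entropy(logits, targets):
    """Mean cross-entropy between raw class scores and integer class labels."""
    logits = np.asarray(logits, dtype=float)
    # Subtract the row maximum for numerical stability before exponentiating.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Pick the log-probability of the true class for each sample.
    return float(-log_probs[np.arange(len(targets)), targets].mean())

# Hypothetical scores for two samples: columns are "no change" and "change".
logits = np.array([[2.0, 0.0],
                   [0.0, 0.0]])
targets = np.array([0, 1])
loss = cross_entropy(logits, targets)
print(loss)
```

A confident correct prediction (first row) contributes a small loss, while an undecided prediction (second row) contributes log 2.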

Through the incorporation of strong preprocessing, adaptive dataset management, and model training optimization, the methodology increases the accuracy and reliability of SAR change detection models.

Results and Discussion

The results from the training loop of the deep learning model for change detection in SAR images show the model's learning progress over two epochs. As seen in the training output, the loss decreases consistently with each batch of training data, indicating that the model is minimizing the error and improving its performance. The model achieves high accuracy (98.65% and 98.88%) with reduced loss, indicating the effectiveness of deep learning in SAR change detection. Its performance improves with more training data, which supports its use in applications such as disaster monitoring and land-use change detection. Further gains may be achieved through hyperparameter tuning and longer training. Figure 4 presents the output of the model training.

Figure 4:

Output of model training.

The two SAR images are of the same geographical region but were taken at different times. Figure 5 (SAR1) shows the area with its original features, such as some structures and landforms. Figure 6 (SAR2) depicts the same area after a period of change, which is clearly seen in the previously noted features. The texture and patterns of the structures differ between the two images, with new or altered elements indicating change, particularly in the structures delineated by the sharp black outlines. This visual disparity is the basis for identifying changes between the two images. Pre-classification is carried out before change detection in order to recognize and classify the various regions and features present in the images. It segments the images in terms of texture and intensity, enabling the deep learning model to concentrate on the areas likely to contain change. The pre-classification output identifies areas where large texture and intensity changes occur, as illustrated in Figure 7. These pre-classified areas are the most likely to hold the differences between the two SAR images.

Figure 5:

SAR1 image before change.

Figure 6:

SAR2 image after change.

Figure 7:

Pre-classification of SAR1 & SAR2 using space matrix calculation and SFCM. SFCM, spatial fuzzy clustering membership.

Finally, the change detection result is generated as seen in Figure 8, where the model detects and classifies the areas of change. The model's accuracy for change detection between the two images is about 98.8%. This output allows for the identification of new developments, land-use alteration, or environmental alteration, which are vital for applications such as disaster monitoring, urban planning, and environmental monitoring. The results indicate the effectiveness of deep learning techniques for automatic change detection in SAR imagery.
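The paper does not define how accuracy is computed. One common choice, assumed here, is overall pixel accuracy: the fraction of pixels whose predicted label matches a reference change map. The function name `pixel_accuracy` and the sample maps are hypothetical.

```python
import numpy as np

def pixel_accuracy(pred, truth):
    """Fraction of pixels where the predicted and reference change labels agree."""
    # Treat any positive value (e.g. 255) as "change".
    pred = np.asarray(pred) > 0
    truth = np.asarray(truth) > 0
    return float((pred == truth).mean())

pred = np.array([[255, 0, 0],
                 [0, 255, 0]])
truth = np.array([[255, 0, 0],
                  [0, 0, 0]])
acc = pixel_accuracy(pred, truth)
print(acc)
```

Here five of the six pixels agree, so the accuracy is 5/6. Because unchanged pixels usually dominate a scene, overall accuracy can remain high even when some changed pixels are missed.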

Figure 8:

Final change detection output.

As shown in Table 1, the proposed SAR image change detection method achieves 98.53% accuracy (highlighted in bold), outperforming state-of-the-art techniques such as fuzzy C-means (FCM) (91.29%), PCA-KMeans (PCAKM) (90.45%), Markov random field fuzzy C-means (MRFFCM) (91.27%), neighborhood-based ratio and extreme learning machine (NR-ELM) (88.93%), and DBN (87.22%). This improvement demonstrates the effectiveness of combining the proposed preprocessing with a deep learning model for detecting changes in SAR imagery.

Table 1:

Comparison with state-of-the-art methods

| Methods | Accuracy (%) |
|---|---|
| Proposed method | 98.53 |
| FCM [20] | 91.29 |
| PCAKM [20] | 90.45 |
| MRFFCM [20] | 91.27 |
| NR-ELM [21] | 88.93 |
| DBN [21] | 87.22 |

DBN, deep belief network; FCM, fuzzy C-means; MRFFCM, Markov random field fuzzy C-means; NR-ELM, neighborhood-based ratio and extreme learning machine; PCAKM, principal component analysis and K-means.

The value in bold highlights the performance of the proposed method (DLCD).

Conclusions

The proposed deep learning-based SAR image change detection method shows improved accuracy and robustness in the presence of speckle noise and complex backscatter responses. Through enhanced preprocessing, spatial fuzzy clustering, and iterative optimization, the methodology reaches a change detection accuracy of 98.84% while markedly reducing the loss. The approach surpasses the existing methods compared, demonstrating its viability for use in disaster management, environmental monitoring, and city planning. Future research directions include improving the preprocessing strategies, increasing generalization across varying SAR datasets, and incorporating more sophisticated architectures for further performance gains.

Language:
English
Publication timeframe:
1 times per year
Journal Subjects:
Engineering, Introductions and Overviews, Engineering, other