Introduction

Radar systems detect targets by capturing the electromagnetic waves reflected from those targets. The Radar Cross Section (RCS) [1] quantifies a target’s capacity to reflect radar signals in the direction of radar reception [2]. RCS data finds extensive utility in both military and civilian contexts, serving to evaluate and identify distant targets. In military applications, RCS data is used for tasks such as identifying military ship types and recognizing air-launched decoys. In the civilian domain, RCS data proves valuable for detecting islands or reefs and assessing fog severity, among various other applications.

Traditionally, the prevalent approach for target identification using RCS data involves extracting periodic features, size features, statistical features, and discrete wavelet energy features from RCS sequences. Subsequently, single-classifier algorithms like KNN, correlation matching, support vector machines, and random forests are employed for target identification. Alternatively, fusion algorithms amalgamate multiple single classifiers to create a more robust fusion classifier. Despite leveraging the distinct advantages of different classifiers, these methods often exhibit limited discriminability and lower identification rates. The advent of neural network classifiers rooted in deep learning has significantly enhanced the accuracy of RCS target identification. Neural networks offer advantages such as distributed information storage, parallel computation, integrated storage and processing, rapid processing speed, robust fault tolerance, self-learning, self-organization, and adaptability. The typical paradigm involves constructing an RCS dataset for targets and training a target identification model using deep learning methods, thereby achieving recognition of target objects.

The effective deployment of deep learning for RCS-based target identification typically necessitates an extensive dataset for model learning and training. However, the practical acquisition of RCS data samples poses challenges, leading to a scarcity of samples that impedes the training of deep learning models and consequently results in diminished target identification accuracy. Although dataset expansion is a conventional strategy, its high associated costs render it impractical.

In addressing the constraint of limited RCS data samples [3], our research reveals that applying the concept of “meta-learning” can facilitate model training with small samples [4, 5, 6]. Conventional small-sample learning algorithms often become trapped in local optima, which necessitates a training method that considers the global context.

To overcome the limitations of existing methodologies, Finn et al. [7] proposed the MAML algorithm, which establishes a model initialization adaptable to multiple tasks. When confronted with a new task, only parameter fine-tuning is required to achieve satisfactory training results [8, 9, 10], minimizing the demand for a large sample size and addressing the small-sample problem. However, the shallow structure of MAML’s network model limits recognition accuracy. Additionally, the model suffers from a high computational load and training instability during the training process.

To address these issues, this paper proposes enhancements to the network model of MAML algorithm. Experimental results substantiate that the refined MAML model significantly improves target recognition accuracy for RCS data in scenarios with limited samples.

The basic idea of MAML algorithm

MAML, as a model-agnostic meta-learning algorithm distinct from a deep learning model, functions as a method for training practical mathematical models. The primary objective is to cultivate models capable of transcending dependence on extensive data volumes, thereby facilitating swift adaptation to new tasks. MAML distinguishes itself by exhibiting notable proficiency in the context of novel tasks, owing to its provision of substantial prior knowledge.

The core idea of MAML

As noted above, MAML algorithm functions as a meta-learning framework rather than a deep learning model in its own right; its core lies in how a shared model initialization is learned across tasks so that it can be adapted rapidly to a novel task with little data.

The foundational concept of MAML algorithm involves training a meta-learning model, denoted as M, on tasks of similar nature. Subsequently, through fine-tuning on a limited dataset specific to a new task, a distinct mathematical model, denoted as m, is derived, effectively adapted to the nuances of the novel task. The overall loss function of MAML is articulated in Formula (1). In this expression, $\theta$ signifies the initial parameters of the network model, $\theta_i^\prime$ represents the parameters acquired through learning for the i-th sub-task based on the initial parameters, and $L_{T_i}$ denotes the loss function characterizing the sub-task, evaluated at $\theta_i^\prime$.

$$L(\theta) = \sum_{i=1}^{N} L_{T_i}(\theta_i^\prime) \tag{1}$$

MAML diverges from the conventional approach of traditional pre-training. In the traditional paradigm, the same parameter $\theta$ undergoes updates across various sub-tasks, resulting in an initialization parameter $\theta$ optimized to minimize the cumulative losses across all sub-tasks. However, such an approach does not ensure the attainment of a global optimal solution for each individual sub-task. In contrast, MAML algorithm adapts the initialization parameter $\theta$ based on the parameter $\theta_i^\prime$ associated with each sub-task. At this stage, the model shifts its focus from the losses of individual sub-tasks, prioritizing the maximization of its overall learning capability. With minimal training on new tasks, it exhibits a rapid convergence towards a global optimal solution.
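To make this contrast concrete, the two objectives can be written side by side; this restatement simply combines Formula (1) with the inner-update rule of Formula (3) given later, and introduces no assumptions beyond them:

$$\text{Pre-training: } \min_{\theta} \sum_{i=1}^{N} L_{T_i}(\theta), \qquad \text{MAML: } \min_{\theta} \sum_{i=1}^{N} L_{T_i}\!\left(\theta - \alpha \nabla_\theta L_{T_i}(f_\theta)\right)$$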

The network model of MAML

The fundamental architecture of the deep neural network model employed in MAML algorithm plays a pivotal role in the training process. As depicted in Figure 1, the network structure adheres to a specific configuration, encompassing a total of five layers distinguished by different colors. A notable characteristic of this structure is the composition of the initial four layers, which consist of convolutional layers and batch normalization layers. In contrast, the final layer is exclusively comprised of fully connected layers. This design choice results in a shallow network configuration, characterized by a reduced parameter count, expeditious convergence, and commendable fitting performance.

Figure 1.

The general form of MAML model
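For concreteness, the following is a minimal PyTorch-style sketch of the generic backbone described above: four convolution-plus-batch-normalization blocks followed by a single fully connected layer. The channel counts, kernel sizes, pooling choices, and input size are illustrative assumptions, not the exact values used in the original MAML implementation.

```python
import torch
import torch.nn as nn

class MAMLBackbone(nn.Module):
    """Generic MAML backbone: 4 x (Conv2d + BatchNorm2d + ReLU + MaxPool2d), then one
    fully connected layer. Channel counts, kernel sizes, and the 1 x 180 x 180 input
    size are illustrative assumptions."""

    def __init__(self, in_channels: int = 1, n_way: int = 4, hidden: int = 32):
        super().__init__()

        def block(c_in, c_out):
            return nn.Sequential(
                nn.Conv2d(c_in, c_out, kernel_size=3, padding=1),
                nn.BatchNorm2d(c_out),
                nn.ReLU(inplace=True),
                nn.MaxPool2d(2),
            )

        self.features = nn.Sequential(
            block(in_channels, hidden),
            block(hidden, hidden),
            block(hidden, hidden),
            block(hidden, hidden),
        )
        # Single fully connected output layer; LazyLinear infers the flattened feature
        # size on the first forward pass.
        self.classifier = nn.LazyLinear(n_way)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(torch.flatten(self.features(x), 1))

# Example: a batch of 8 single-channel 180 x 180 inputs (imgc = 1, imgsz = 180, as in Table I).
logits = MAMLBackbone()(torch.randn(8, 1, 180, 180))   # shape: (8, n_way)
```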

However, the inherent nature of shallow networks imposes certain constraints, notably in terms of their limited feature extraction capabilities and a deficiency in establishing correlations between data points. These limitations, intrinsic to shallow networks, consequently impact the recognition accuracy of MAML model. Addressing these constraints is imperative for enhancing the overall performance and efficacy of the algorithm.

Improvement of MAML algorithm and implementation steps
Improvements to the network model

To enhance the precision of target recognition within MAML model, a series of structural refinements have been implemented, as illustrated in Figure 2. The key enhancements are outlined as follows:

Addition of hourglass-shaped convolutional layer in the input layer:

A distinctive hourglass-shaped convolutional layer has been incorporated into the input layer to augment the feature extraction capability specific to target data. This augmentation aims to capture more representative feature parameters, thereby enhancing the model’s ability to discern critical patterns.

Additional convolutional layer in the layer preceding the output layer:

A supplementary convolutional layer has been introduced just before the output layer to fortify the inter-neuronal correlations, facilitating the network’s descent along the global optimum gradient. This augmentation is designed to improve the model’s ability to capture nuanced relationships and intricacies in the data.

Modification of loss function to central loss function:

The original loss function has been revamped to incorporate a central loss function, designed to gauge the proximity between instances belonging to the same class. This modification contributes significantly to elevating target recognition accuracy by emphasizing the inherent similarities within classes.
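As an illustration of this third change, the sketch below implements one standard formulation of such a central (center) loss, i.e., the squared distance between each sample’s feature vector and a learnable center of its class; the exact variant and the weight with which it is combined with the classification loss are not specified in the text, so both are assumptions.

```python
import torch
import torch.nn as nn

class CenterLoss(nn.Module):
    """Center loss: squared distance between each feature vector and its learnable class
    center. This standard formulation is an assumption; the paper does not spell out its
    exact variant."""

    def __init__(self, num_classes: int, feat_dim: int):
        super().__init__()
        self.centers = nn.Parameter(torch.randn(num_classes, feat_dim))

    def forward(self, features: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        batch_centers = self.centers[labels]                     # (batch, feat_dim)
        return 0.5 * ((features - batch_centers) ** 2).sum(dim=1).mean()

# Typical usage (lambda_c is a tunable weight, an assumption):
#   total_loss = cross_entropy(logits, labels) + lambda_c * center_loss(features, labels)
```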

Figure 2.

Improved MAML model
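Building on the generic backbone sketched earlier, the following hedged sketch shows one way the structural changes of Figure 2 might look in code. Reading “hourglass-shaped” as a channel expansion followed by a contraction is an interpretation, and all layer sizes remain illustrative assumptions.

```python
import torch
import torch.nn as nn

class ImprovedMAMLBackbone(nn.Module):
    """Sketch of the improved backbone: an hourglass-shaped input convolution stage, the
    four conv + BN blocks of the original model, and one extra convolution before the
    fully connected output layer. Layer sizes and the hourglass reading are assumptions."""

    def __init__(self, in_channels: int = 1, n_way: int = 4, hidden: int = 32):
        super().__init__()
        # "Hourglass" input stage: widen the channel count, then squeeze it back down.
        self.hourglass = nn.Sequential(
            nn.Conv2d(in_channels, 2 * hidden, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(2 * hidden, in_channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )

        def block(c_in, c_out):
            return nn.Sequential(
                nn.Conv2d(c_in, c_out, kernel_size=3, padding=1),
                nn.BatchNorm2d(c_out),
                nn.ReLU(inplace=True),
                nn.MaxPool2d(2),
            )

        self.features = nn.Sequential(
            block(in_channels, hidden), block(hidden, hidden),
            block(hidden, hidden), block(hidden, hidden),
        )
        # Extra convolution just before the output layer to strengthen inter-neuron correlations.
        self.pre_output = nn.Sequential(
            nn.Conv2d(hidden, hidden, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )
        self.classifier = nn.LazyLinear(n_way)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.pre_output(self.features(self.hourglass(x)))
        return self.classifier(torch.flatten(x, 1))
```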

Implementation of improvement algorithm

The refined MAML algorithm necessitates specific configurations within the training dataset. The pre-training dataset is meticulously organized on a task-by-task basis. To initiate training, the model requires a task distribution, and concurrently, two hyperparameters must be specified.

The enhancement algorithm is implemented through the following specific steps; a consolidated sketch of the full training loop is given after the list:

Randomly initialize the model parameters.

Set the number of epochs for a training round.

Sample multiple tasks to form a batch.

Calculate the loss $L_{T_i}$ on the support set of a task using Formula (2), where $f_\theta$ represents the model, $x^{(j)}$ is an input training sample, and $y^{(j)}$ is the label of that sample:

$$L_{T_i}(f_\theta) = -\sum_{x^{(j)}, y^{(j)} \sim T_i} \left[ y^{(j)} \log f_\theta\big(x^{(j)}\big) + \big(1 - y^{(j)}\big) \log\big(1 - f_\theta(x^{(j)})\big) \right] \tag{2}$$

Calculate the parameter $\theta_i^\prime$ after a gradient update using Formula (3):

$$\theta_i^\prime = \theta - \alpha \nabla_\theta L_{T_i}(f_\theta) \tag{3}$$

Iterate through Steps 4 to 5 until all tasks in the current batch are traversed, completing the first gradient update.

Upon acquiring parameters from the initial gradient update, a subsequent gradient update is executed via a procedure commonly referred to as “gradient by gradient.” The gradients for the complete batch are computed by employing the query set from each task. Subsequently, these gradients are directly employed to modify the original model through Stochastic Gradient Descent (SGD), thereby updating the parameter $\theta$ in accordance with Formula (4):

$$\theta \leftarrow \theta - \beta \nabla_\theta \sum_{T_i \sim p(T)} L_{T_i}\big(f_{\theta_i^\prime}\big) \tag{4}$$

Continue sampling the next batch and iterate through Steps 3 to 7 until all batches are traversed.
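As a compact illustration of Steps 1 to 8, the sketch below performs one outer iteration of the two-level update of Formulas (2) to (4) using torch.func.functional_call (available in recent PyTorch versions). The model is assumed to be an already-built classifier such as the backbone sketched earlier, tasks is a batch of (support_x, support_y, query_x, query_y) tuples, and the learning-rate values are only illustrative; this is a hedged sketch, not the authors’ implementation.

```python
import torch
import torch.nn.functional as F
from torch.func import functional_call

def maml_outer_step(model, tasks, alpha=0.01, beta=1e-3):
    """One meta-update over a batch of tasks (second-order MAML).
    alpha = inner (first) learning rate, beta = outer (meta) learning rate."""
    params = dict(model.named_parameters())
    meta_loss = 0.0
    for support_x, support_y, query_x, query_y in tasks:
        # Formula (2): loss on the support set with the shared initial parameters theta.
        # (Multi-class cross-entropy generalizes the binary form written in Formula (2).)
        support_logits = functional_call(model, params, (support_x,))
        inner_loss = F.cross_entropy(support_logits, support_y)
        # Formula (3): one inner gradient step yields the task-specific parameters theta_i'.
        grads = torch.autograd.grad(inner_loss, list(params.values()), create_graph=True)
        adapted = {name: p - alpha * g for (name, p), g in zip(params.items(), grads)}
        # Query-set loss evaluated at theta_i', accumulated over all tasks of the batch.
        query_logits = functional_call(model, adapted, (query_x,))
        meta_loss = meta_loss + F.cross_entropy(query_logits, query_y)
    # Formula (4): "gradient by gradient" update of theta with the summed query losses.
    meta_grads = torch.autograd.grad(meta_loss, list(params.values()))
    with torch.no_grad():
        for p, g in zip(model.parameters(), meta_grads):
            p -= beta * g
    return float(meta_loss)
```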

Experimental results and analysis

In order to ascertain the efficacy of the improved MAML algorithm, a horizontal comparison was executed among four distinct models: the original MAML model, the improved MAML model denoted as MAML-New, ResNet 18-layers model, and Long Short-Term Memory (LSTM) model. Comprehensive analyses, encompassing both qualitative and quantitative assessments, were conducted on the experimental results to validate the advancements introduced by MAML-New model.

The comprehensive procedural steps of the experimental investigation are outlined as follows:

Preparation of experimental datasets:

This phase involves the meticulous preparation of Radar Cross Section (RCS) data for four distinct models: MAML, MAML-New, ResNet 18-layers [11], and LSTM [12]. This encompasses the curation of both training and testing datasets specifically tailored for MAML model.

Model Training:

Initial preprocessing of the experimental dataset, encompassing critical tasks such as dataset classification, data filling, normalization, and data standardization.

The dataset is partitioned on a per-task basis, serving as input for the respective models.

Subsequently, the models undergo comprehensive training.

Experimental results and comparative analysis:

Execution of experiments to elicit results that reflect the models’ performance.

Rigorous comparative analysis is conducted to assess and contrast the efficacy of MAML, MAML-New, ResNet 18-layers, and LSTM models.

Preparation of experimental data

The raw RCS dataset was generated using the FEKO software simulation method [13]. To facilitate computation, the dataset was stratified into 12 categories, delineated by unique external features of the targets and labeled as category 1 to category 12. Notably, categories 1 to 4 constituted the experimental test dataset, while categories 5 to 12 comprised the pre-training dataset. Figures 3 and 4 provide simplified models representing the 12 different categories. The RCS data for these models were exported using the FEKO software, with each category containing 200 models of varied sizes. As depicted in Figures 5 and 6, the set of incident angles for a single model covers a hemisphere. Drawing an analogy to Earth’s longitude and latitude, the longitude spanned from 0° to 360°, and the latitude ranged from 0° to 90°. Each degree of latitude corresponded to 360 incident angles, so a single model encompasses 90 × 360 incident angles. Each incident angle correlated with a specific RCS value, yielding RCS data of size 90 × 360 for each model. Consequently, the 12 categories contributed a total of 12 × 200 RCS data samples. Figure 7 illustrates the RCS data, in Cartesian coordinates, of a category 3 model with a size of 85 cm.

Figure 3.

Category 1 - 4, the experimental test dataset

Figure 4.

Category 5 - 12, the pre-training dataset

Figure 5.

All incident angles of a model

Figure 6.

Schematic diagram of incidence angle

Figure 7.

RCS data of the 85 cm category 3 model in the Cartesian coordinate system
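To make the dataset layout concrete, a hedged sketch of how the exported RCS arrays could be organized is shown below; the array shapes and the category split follow the description above, but the file naming scheme and the normalization choice are assumptions.

```python
import numpy as np

# Each simulated model yields a 90 x 360 grid of RCS values (latitude x longitude of incidence).
N_LAT, N_LON = 90, 360
N_CATEGORIES, MODELS_PER_CATEGORY = 12, 200

def load_category(category: int) -> np.ndarray:
    """Load the 200 RCS arrays of one category.
    The file naming scheme 'rcs_cat{c}_model{m}.npy' is a hypothetical placeholder."""
    arrays = [np.load(f"rcs_cat{category}_model{m}.npy") for m in range(MODELS_PER_CATEGORY)]
    data = np.stack(arrays)                       # shape: (200, 90, 360)
    assert data.shape == (MODELS_PER_CATEGORY, N_LAT, N_LON)
    return data

def normalize(data: np.ndarray) -> np.ndarray:
    """Simple standardization (zero mean, unit variance); the paper's exact padding and
    normalization steps are not specified, so this is only illustrative."""
    return (data - data.mean()) / (data.std() + 1e-8)

# Categories 1-4 form the test dataset J, categories 5-12 the pre-training dataset I.
test_data = {c: normalize(load_category(c)) for c in range(1, 5)}
pretrain_data = {c: normalize(load_category(c)) for c in range(5, 13)}
```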

Training of MAML model

The training procedures for both MAML and MAML-New model can be delineated as follows:

Prior to the training process, apply preprocessing techniques such as data padding, normalization, and data standardization to the pre-training dataset I and the test dataset J.

Define key parameters: n_way signifies the number of sample categories in each task, k_spt represents the number of support set samples, k_qry denotes the number of query set samples, and task_num stands for the number of tasks in a training batch. Randomly select n_way (n_way < 8) categories from the pre-training dataset I. For each category, randomly choose k_spt + k_qry (k_spt + k_qry ≤ 200) labeled samples, thereby constituting a task Ti with n_way × (k_spt + k_qry) samples. From each category’s k_spt + k_qry samples in the current task, designate k_spt samples as the support set Tis and k_qry samples as the query set Tiq. Each task is tantamount to a data point in training. Randomly extract task_num such tasks to form a batch (a short task-sampling sketch is given after these steps). Concurrently specify the hyperparameters meta_lr and update_lr, which denote the learning rates for the two-stage gradient iterations.

Table I enumerates certain network parameter values pertinent to training the model using MAML method.

Employ the same procedure as delineated in step (2) to partition the test dataset J into tasks, selecting Js and Jq as the support set and query set, respectively, for all tasks in the test dataset.

Following the steps outlined in MAML algorithm’s section III.B, train the meta-learning model Mmeta using the pre-training dataset I.

Fine-tune the trained meta-learning model Mmeta on the support set Js of the test data, thereby obtaining the target recognition model M adapted to the current task.

Input the query set Jq into the well-trained target recognition model M and ultimately obtain a prediction result R.
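The task construction described in step (2) can be summarized in a short sketch; `dataset` below is assumed to be a mapping from category label to an array of that category’s samples (as in the data-preparation sketch above), and the default hyperparameter values follow Table I.

```python
import random
import numpy as np

def sample_task(dataset: dict, n_way: int = 4, k_spt: int = 20, k_qry: int = 30):
    """Build one task T_i: n_way categories, k_spt support and k_qry query samples per category."""
    categories = random.sample(sorted(dataset.keys()), n_way)
    spt_x, spt_y, qry_x, qry_y = [], [], [], []
    for label, cat in enumerate(categories):
        idx = np.random.choice(len(dataset[cat]), k_spt + k_qry, replace=False)
        samples = dataset[cat][idx]
        spt_x.append(samples[:k_spt]);  spt_y += [label] * k_spt
        qry_x.append(samples[k_spt:]);  qry_y += [label] * k_qry
    return (np.concatenate(spt_x), np.array(spt_y),
            np.concatenate(qry_x), np.array(qry_y))

def sample_batch(dataset: dict, task_num: int = 16, **kw):
    """A training batch is task_num independently sampled tasks."""
    return [sample_task(dataset, **kw) for _ in range(task_num)]
```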

Table I. Partial network parameter values for MAML and MAML-New

Parameter | Value | Meaning
epoch | 600 | Number of training epochs
k | 4 | Number of sample categories per task (n_way)
k_spt | 20 | Number of support set samples per category
k_qry | 30 | Number of query set samples per category
imgsz | 180 | Dimension of the input data
imgc | 1 | Number of channels of the input data
task_num (batch_size) | 16 | Number of tasks per training batch
meta_lr | 1e-3 | Learning rate of the second (meta) gradient update
update_lr | 0.01 | Learning rate of the first (inner) gradient update

At this juncture, the training of MAML model concludes.

Experimental results and comparative analysis

Figures 8 through 11 individually delineate the training progression of the ResNet 18-layers, LSTM, MAML, and MAML-New models. Each figure presents the respective prediction accuracy and loss, with blue curves representing accuracy and yellow curves representing loss. The horizontal axis denotes training batches, while the vertical axis spans values from 0 to 1.

Figure 8.

ResNet 18-layers

Figure 9.

LSTM

Figure 10.

MAML

Figure 11.

MAML-New

The observations indicate that the accuracy of ResNet 18-layers and LSTM remains below 0.8, whereas the accuracy of both MAML and MAML-New exceeds 0.8, with MAML-New slightly higher than MAML. Noteworthy is that MAML-New also incurs lower training losses than MAML.

The recognition accuracy for the four models corresponding to the four categories in the test dataset, as illustrated in Figure 3, has been computed and is presented in Table II.

Table II. Comparative experimental results of different models

Model | Category 1 accuracy | Category 2 accuracy | Category 3 accuracy | Category 4 accuracy | Average accuracy
MAML | 82.16% | 72.45% | 81.3% | 85.97% | 80.47%
MAML-New | 86.42% | 79.70% | 87.17% | 89.19% | 85.62%
ResNet 18-layers | 81.7% | 62.1% | 82.4% | 90.1% | 73.45%
LSTM | 81.1% | 68.0% | 80.8% | 80.3% | 77.55%

Table II reveals that the average recognition accuracy of the MAML, ResNet 18-layers, and LSTM models is 80.47%, 73.45%, and 77.55%, respectively, substantially lower than the 85.62% achieved by MAML-New model. The recognition accuracy hierarchy, from highest to lowest, is as follows: MAML-New > MAML > ResNet 18-layers > LSTM. In scenarios with limited samples, MAML model demonstrates superior recognition capability compared to conventional deep neural networks such as the ResNet 18-layers and LSTM models. Furthermore, MAML-New model exhibits an average improvement of approximately 5 percentage points in recognition accuracy over MAML model.

Conclusions

To address the challenge of a small sample size in training a target recognition model based on RCS data, MAML algorithm was employed. Structural modifications to the network included the incorporation of an hourglass-shaped architecture and the addition of convolutional operations at the output layer. Simultaneously, adjustments were applied to the loss function, and experiments were systematically conducted on the RCS dataset. The resulting model effectively recognizes targets using RCS data, with empirical results indicating a notable improvement in recognition performance, particularly in scenarios characterized by a small sample size.
