ChatGPT-Powered IoT Devices Using Data Regularization for Efficient Management Systems

Feb 24, 2025

Introduction

Monitoring and analysing foetal electrocardiogram (fECG) signals are essential for non-invasive heart disease diagnosis, which may also include foetal heart disease monitoring and diagnosis [1]. These fECG data can be obtained invasively by applying electrodes to the foetal scalp during delivery. Researchers have recently become interested in using embedded electronics for non-invasive foetal heart rate monitoring and diagnosis. In the early stages of pregnancy, these devices are fitted with electrodes that can anticipate several foetal heart disorders [2-4]. In particular, to attain the optimal classification of heart rates according to the fECG signals, machine and deep learning algorithms [5-7] are integrated into edge and IoT devices [8-10]. In terms of technology, this approach to implementing sophisticated algorithms on wearable edge devices appears to be intricate and unreliable for real-time health monitoring. Continuous measurement and processing of fECG signals drains the batteries of wearable edge devices, which presents extremely sensitive problems [11-13]. Also, since fECG signals are non-Euclidean time series, accuracy is compromised when processing them utilising conventional Machine Learning (ML) and Deep Learning (DL) models [14].

Contribution of the Research

1. By incorporating the LLM T5, a novel feature extraction technique is developed. To analyse the fECG, spatial and temporal features are creatively extracted through a complexity-aware deployment of the LLM. The proposed model offers a novel approach to feature engineering that can improve diagnostic performance.

2. Quickness and simplicity: to accomplish a high classification ratio, typical fully connected networks are placed on top of the powerful LLM.

Structure of the Paper

The remainder of the document is organised as follows: Section 2 reviews several related studies by other authors. Section 3 presents the hardware design, the minimal-complexity-aware deep learning framework, and the dataset description. Section 4 describes the experimentation, the implementation procedure, and the results. Section 5 concludes the paper with suggestions for further improvement.

Related Works

S. Mirza et al. [16] developed a method for minimising and filtering artefacts in the ECG by fusing adaptive filtering with independent component analysis (ICA). The foetal ECG signal can be successfully extracted from the abdominal ECG using this non-invasive method, and the nonparametric ICA method provides a comparatively high SNR. The results show how useful this approach is for estimating foetal electrocardiograms (fECG). However, a limitation of this system is the complexity it adds.

J. Hao et al. [17] employed the FastICA technique in conjunction with Singular Value Decomposition (SVD) to extract fECG signals. The ST segments and QRS waves in the fECG signals were identified using a modified wavelet modulus maxima approach. The problem of missing waveforms was fixed, and the optimal channel signal's signal-to-noise ratio was 45.028 dB. The sensitivity, positive predictive value, and F1-score for foetal QRS wave identification were 96.90%, 98.23%, and 95.24%, respectively. However, the primary drawback of this paradigm is its increased computational complexity.

M. Anumukonda et al. [18] extracted cardiac sound components utilising a multi-channel micro-electromechanical system (MEMS) microphone-based phonocardiography device. The proposed multi-channel phonocardiography system uses artificial neural networks (ANNs) to identify the cardiac sound components, with synaptic weights based on the inverse-delayed (ID) function method of the neurone. This technique yields better findings in terms of accuracy (90%) and sensitivity (99%). However, its primary problem is the longer time needed to train on the data.

E. Fotiadou et al. [19] employed a DL technique to eliminate residual noise from the multichannel foetal ECG after suppressing the maternal ECG. The authors introduced a deep convolutional encoder-decoder network with symmetric skip-layer connections to learn end-to-end mappings from noise-corrupted foetal ECG data to clean data. The fundamental advantage of this framework is that it requires no prior knowledge of the noise power spectra or pulse locations and can preserve beat-to-beat morphological variations. However, its main drawback is increased resource use.

D. Al-Saadany et al. [20] offered an ML approach for foetal arrhythmia detection. This method used the concepts of the Shannon Energy Envelope (SEE), Peak Energy Envelope (PEE), and Discrete Wavelet Transform (DWT) to detect foetal arrhythmia, and it eliminates noise and artefacts using a range of filtering techniques. This approach yields better results in terms of accuracy (93.21%). The disadvantage of this architecture is its more time-consuming input processing.

K. Meddah et al. [21] demonstrated an FPGA-based system for tracking ECG data and identifying cardiac arrhythmias. The QRS detection method was modelled after the Pan and Tompkins algorithm. One advantage of this system is that it has been tested in real-world conditions. This framework also optimises the memory and resource usage of the hardware implementation. It scores higher in terms of sensitivity (97.3%) and overall accuracy (97.6%). However, when the dataset size increases, the results are negatively impacted by this framework's relatively poor processing speed.

Y. Ching et al. [22] introduced a convolutional neural network (CNN)-based technique for foetal ECG identification from abdominal ECG data, with a detection accuracy of 95.2%. To ensure feasibility in a wearable device, the system is implemented on an FPGA platform, and the prolonged delay caused by buffering new data for convolution can be avoided. This method's main advantage is that it allocates a partial-sum buffer, which enables the dataflow to share feature-map memory and reuse inputs and outputs. However, the computational complexity of this framework is its main drawback.

C. M. Jose et al. [23] implemented an adaptive filtering framework on an FPGA. Once the feature extraction stage is finished, the adaptive filtering method removes low-power noise from the input ECG (foetal and maternal). Using feature extraction, the maternal and foetal components of the ECG signal are cleanly separated, and the results indicate whether the foetus is in a normal or abnormal state. This framework's main advantage is its decreased power consumption, but its main disadvantage is its temporal complexity.

Proposed Methodology
Materials and Methods

The multichannel foetal ECG recordings from five distinct women between 38 and 41 weeks of gestation, included in the Abdominal and Direct Fetal Electrocardiogram Database, are used for this investigation. The recordings were obtained in the Medical University of Silesia's Department of Obstetrics using the KOMPOREL system for foetal ECG acquisition and analysis (ITAM Institute, Zabrze, Poland). Each recording comprises the reference direct foetal ECG recorded from the foetal head and four distinct signals acquired from the mother's abdomen. Foetal heart rates are classified using these datasets, and their impact on the measurement of beat-to-beat foetal heart rate (FHR) variability is estimated. The fECG datasets utilised in the experiment are given in Figure 2, and the recording information of the dataset is shown in Table 1.
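For illustration, one of these recordings could be loaded with the wfdb Python package from PhysioNet; the record name, remote directory, and channel ordering below are assumptions about the public distribution of this database and should be verified against the actual download.

```python
import wfdb

# Hedged sketch: "r01" and the "adfecgdb" PhysioNet directory are assumed
# names for one recording of the Abdominal and Direct Fetal ECG Database.
record = wfdb.rdrecord("r01", pn_dir="adfecgdb")
print(record.fs, record.sig_name)       # sampling rate and channel names

direct = record.p_signal[:, 0]          # assumed: column 0 = direct scalp fECG
abdominal = record.p_signal[:, 1:]      # assumed: columns 1-4 = abdominal leads
print(abdominal.shape)
```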

Figure 1:

Proposed Methodology

Figure 2:

Sample Multi-Channel fECG Datasets Utilised for Training and Testing the Model

Figure 3:

T5 Architecture

Figure 4:

Performance Metrics Compared with Other Models

Table 1: Specification of the FECG Datasets

Sl. No. | Recording Characterization | Specification
1 | Recording period | 38 to 41 weeks of gestation
2 | Signals from maternal abdomen | 4
3 | Type of electrodes | Ag-AgCl electrodes
4 | Bandwidth | 1 Hz to 150 Hz
5 | Filtering type | Digital filtering
6 | Sampling rate | 1 kHz
7 | Resolution | 16 bits
8 | Total number of datasets | 5089

Table 2: Parameters of the T5 Model

Parameter | Description | Value
Model Size | T5 variant used (defines the parameter count) | T5-Small
Input Length | Maximum sequence length for input text | 512 tokens
Output Length | Maximum sequence length for output text | 128 tokens
Vocabulary Size | Size of the token vocabulary (T5 default) | 32,000
Number of Layers | Number of encoder and decoder layers in the model | 6 encoder, 6 decoder
Hidden Size | Size of the hidden representation in the encoder/decoder | 512
Feed-Forward Size | Size of the feed-forward network in each transformer block | 2048
Number of Attention Heads | Number of attention heads in the self-attention mechanism | 8
Dropout Rate | Dropout probability applied to attention weights and feed-forward layers | 0.1
Positional Embeddings | Relative position embeddings used for positional information | Yes
Optimizer | Algorithm used for optimization | Adafactor
Learning Rate | Initial learning rate for training | 0.001
Training Steps | Total number of steps for training | ~10,000 steps
Batch Size | Number of samples processed in one forward/backward pass | 32
Weight Initialization | Method for initializing model weights | Xavier initialization
Pre-trained Tasks | Text-to-text tasks the model has been trained on | Summarization, classification
Fine-tuning Tasks | Downstream tasks for which the model can be fine-tuned | FHR classification, FECG signal processing
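As a minimal sketch, the Table 2 settings map onto a Hugging Face transformers configuration as shown below; the toolchain itself is an assumption, since the paper does not name its implementation library.

```python
from transformers import T5Config, T5ForConditionalGeneration

# Hedged instantiation of the T5-Small configuration listed in Table 2.
config = T5Config(
    vocab_size=32000,
    d_model=512,           # hidden size
    d_ff=2048,             # feed-forward size
    num_layers=6,          # encoder layers
    num_decoder_layers=6,  # decoder layers
    num_heads=8,
    dropout_rate=0.1,
)
model = T5ForConditionalGeneration(config)
print(f"{model.num_parameters():,} parameters")  # roughly 60M for T5-Small
```

Note that the training-loop settings from Table 2 (Adafactor, learning rate 0.001, batch size 32, ~10,000 steps) would be supplied to the trainer rather than to this configuration object.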
Data Preprocessing

As frequently described, clinical applications of the abdominal electrode approach result in numerous artefacts because of maternal and foetal movement. Therefore, data pre-processing techniques are used to remove baseline drift, power-line interference, and pulse artefacts.
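A minimal sketch of this pre-processing stage, assuming SciPy filters, the 1 kHz sampling rate from Table 1, a fourth-order high-pass for baseline drift, and a 50 Hz notch for power-line interference; the exact filters used in this work are not specified.

```python
import numpy as np
from scipy.signal import butter, filtfilt, iirnotch

FS = 1000  # sampling rate in Hz (Table 1)

def preprocess(sig, fs=FS):
    # High-pass at ~1 Hz to suppress baseline drift (assumed cutoff).
    b, a = butter(4, 1.0, btype="highpass", fs=fs)
    sig = filtfilt(b, a, sig)
    # Notch at 50 Hz to suppress power-line interference
    # (use 60 Hz where the mains frequency differs).
    b, a = iirnotch(50.0, Q=30.0, fs=fs)
    return filtfilt(b, a, sig)

clean = preprocess(np.random.randn(5000))  # stand-in for one fECG channel
```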

LLM – T5

An LLM is an advanced type of artificial intelligence (AI) technique developed to understand and generate human-like text. LLMs are trained on vast amounts of textual data from diverse sources, such as books, articles, and websites. These models rely on deep learning techniques, particularly transformer architectures, which enable them to learn patterns, semantics, and context from text. LLMs are capable of performing a wide range of natural language processing (NLP) tasks, such as translation, summarization, question answering, and text generation, among others. Their power lies in their ability to generalize and adapt to various tasks without needing task-specific training for every new application.

LLMs operate by predicting the next word in a sentence based on the context offered by previous words. This is achieved through a mechanism called attention, which helps the model weigh the importance of different parts of the input text when making predictions. Transformers, the backbone of LLMs, consist of encoder and decoder layers that process text in parallel, making them highly efficient compared to previous sequential models like RNNs or LSTMs. Through extensive pretraining on large datasets, LLMs learn linguistic nuances, syntax, and even factual knowledge. Fine-tuning these models on specific tasks further enhances their performance and adaptability.

T5, or Text-to-Text Transfer Transformer, is a specific type of LLM developed by Google Research. It adopts a unified architecture where each NLP task is treated as a text-to-text problem. For example, summarizing a paragraph, translating text, or even answering questions can all be framed as input-output pairs of text. This approach simplifies task formulation and makes it easier to fine-tune the model for various applications. T5 is built on the transformer architecture and comes in different sizes, from small versions for lightweight tasks to larger versions with billions of parameters for more complex tasks.
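As a brief illustration of this text-to-text interface (using the public t5-small checkpoint rather than the fine-tuned model from this work):

```python
from transformers import T5TokenizerFast, T5ForConditionalGeneration

tok = T5TokenizerFast.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# Every task is plain text in, plain text out; the prefix names the task.
batch = tok("summarize: The foetal ECG was recorded from four abdominal "
            "electrodes and one direct scalp electrode.",
            return_tensors="pt")
out = model.generate(**batch, max_new_tokens=32)
print(tok.decode(out[0], skip_special_tokens=True))
```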

One of T5's major strengths is its flexibility in handling diverse NLP tasks using a single model. By converting all tasks into a text-to-text format, T5 eliminates the need for task-specific architectures, making it a versatile choice for developers and researchers. It also leverages pretraining on large datasets using objectives like "span corruption," where parts of the text are masked and the model learns to predict them. This pretraining approach equips T5 with a deep understanding of language structure and context. Moreover, T5 has shown exceptional performance in benchmarks like GLUE, SuperGLUE, and translation tasks, cementing its position as a cutting-edge framework.

T5 is widely used in applications like automated summarization, where it can condense lengthy documents into concise summaries, and translation, providing high-quality multilingual support. It is also employed in question answering systems, chatbots, and even creative writing tasks, generating coherent and contextually relevant text. Furthermore, T5 is scalable, allowing deployment in resource-constrained environments with smaller versions or on powerful servers for more demanding tasks. Its versatility and efficacy make it a go-to choice for both academic research and practical implementations in industries like healthcare, customer service, and education.

Encoder-Decoder Design

Pre-processed fECG input data is fed into the feature-processing block of the encoder architecture in order to extract features from the raw data. The encoder is built using a convolutional network with varying kernel sizes. The primary goal of adding the various levels is to extract global temporal characteristics as well as high-resolution features: high-resolution features are retrieved utilising a small kernel size, while global temporal features are extracted utilising a large kernel size. As a result, the three downsampling stages of the suggested encoder are built using adaptive kernel sizes of 2, 4, and 8, respectively. The encoder output features are given below.

The mathematical output of the encoder stages is given as $O(e) = \sum_{i=1}^{2} F(E_i) \ast C_i$

Here, $F(E_i)$ is the output function from each encoder stage, as determined by the equation, and $C_i$ is the corresponding convolutional layer.
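A minimal Keras sketch of such a three-stage encoder; the filter counts, input window length, and pooling choice are illustrative assumptions, while the kernel sizes 2, 4, and 8 follow the text.

```python
import tensorflow as tf
from tensorflow.keras import layers

def encoder_stage(x, filters, kernel_size):
    # One downsampling stage: convolution followed by a stride-2 reduction.
    x = layers.Conv1D(filters, kernel_size, padding="same", activation="relu")(x)
    return layers.MaxPooling1D(pool_size=2)(x)

inputs = tf.keras.Input(shape=(512, 4))  # assumed window over 4 abdominal channels
s1 = encoder_stage(inputs, 32, kernel_size=2)   # small kernel: high-resolution features
s2 = encoder_stage(s1, 64, kernel_size=4)
s3 = encoder_stage(s2, 128, kernel_size=8)      # large kernel: global temporal features
encoder = tf.keras.Model(inputs, [s1, s2, s3], name="fecg_encoder")
encoder.summary()
```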

Three stages make up the majority of the decoder. In contrast to earlier U-Net variants, the suggested decoder includes the suggested block, which is followed by the up-sampling and skip connection. In particular, the encoder's output serves as the decoder's input. In each decoder stage, the input features are up-sampled by a factor of two, concatenated with the skip connection from the encoder stage at the same level, and fed into the block.

Utilising the suggested block in the decoders has the following benefits: (a) it enables the decoder to fully utilise the encoder's features and the upsampling path; (b) it improves decoding effectiveness by constructing long-range dependencies. After the three stages above, a high-resolution output with global temporal features is obtained. The mathematical formula for each decoder stage is given as $D(e) = \sum_{i=1}^{2} O(E_i) \ast F(u_i)$, where $u_i$ denotes the up-sampling operation at stage $i$.
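Continuing the encoder sketch above (reusing inputs, s1, s2, and s3), a hedged sketch of the decoder stages, each upsampling by two and concatenating the encoder skip connection at the same level; filter counts remain illustrative.

```python
def decoder_stage(x, skip, filters, kernel_size):
    # Upsample by two, fuse with the encoder skip, then convolve.
    x = layers.UpSampling1D(size=2)(x)
    x = layers.Concatenate()([x, skip])
    return layers.Conv1D(filters, kernel_size, padding="same", activation="relu")(x)

d1 = decoder_stage(s3, s2, 64, kernel_size=8)   # 64 -> 128 time steps
d2 = decoder_stage(d1, s1, 32, kernel_size=4)   # 128 -> 256 time steps
d3 = layers.UpSampling1D(size=2)(d2)            # 256 -> 512: input resolution
d3 = layers.Conv1D(4, 2, padding="same")(d3)    # back to 4 output channels
unet = tf.keras.Model(inputs, d3, name="fecg_encoder_decoder")
```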

Experimentation and Implementation

The software implementation was developed using Python's scikit-learn libraries for ARM cores. The entire arrangement is powered by an Intel workstation with an i7 CPU (3.2 GHz), an NVIDIA GPU, and 16 GB of RAM. Keras (TensorFlow) was used as the backend to create the suggested baseline architecture.

Performance Metrics

During the experiment, the proposed architecture and the deep feed-forward training networks that divide the recordings into normal FHR and abnormal FHR are evaluated. To analyse the effectiveness of the recommended design, metrics such as accuracy, sensitivity, specificity, precision, recall, and F1-score are calculated. Table 3 lists the formulas used to compute the metrics with which the recommended framework was evaluated.

Table 3: Mathematical Formulas for the Evaluation Metrics' Computation

Sl. No. | Evaluation Metric | Mathematical Expression
01 | Accuracy | (TP + TN) / (TP + TN + FP + FN)
02 | Sensitivity or recall | TP / (TP + FN) × 100
03 | Specificity | TN / (TN + FP)
04 | Precision | TP / (TP + FP)
05 | F1-Score | 2 × (Precision × Recall) / (Precision + Recall)

Where TP denotes True Positive values, TN True Negative values, FP False Positive values, and FN False Negative values.
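The Table 3 formulas translate directly into a small helper (sensitivity is returned as a ratio here; Table 3 scales it by 100):

```python
def evaluate(tp, tn, fp, fn):
    # Direct translation of the expressions in Table 3.
    accuracy    = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)   # also called recall
    specificity = tn / (tn + fp)
    precision   = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return accuracy, sensitivity, specificity, precision, f1

print(evaluate(tp=95, tn=90, fp=5, fn=10))  # synthetic counts for illustration
```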

Results and Discussion

The identification of foetal heart rate (FHR) is a binary classification task. The settings used in the component-structure-based ablation tests were identical to those in the suggested framework.
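As a hedged sketch, the kind of feed-forward classification head implied here could look as follows in Keras; the input feature shape and layer widths are assumptions rather than the paper's exact configuration.

```python
import tensorflow as tf
from tensorflow.keras import layers

# Assumed feature shape: 64 time steps x 128 channels, matching the final
# stage of the encoder sketch in the methodology section.
feat_in = tf.keras.Input(shape=(64, 128))
x = layers.GlobalAveragePooling1D()(feat_in)
x = layers.Dense(64, activation="relu")(x)
out = layers.Dense(1, activation="sigmoid")(x)   # normal vs. abnormal FHR

head = tf.keras.Model(feat_in, out, name="fhr_head")
head.compile(optimizer="adam", loss="binary_crossentropy",
             metrics=["accuracy", tf.keras.metrics.Recall(name="sensitivity")])
```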

The comparative assessment of the suggested approach in simulation mode (prior to hardware deployment) and hardware mode (after hardware deployment) is displayed in Figure 5(a, b, c). According to the figure, the suggested model showed closely matching performance in identifying the FHR, with a low RMSE of 0.001. As a result, the architecture created for the suggested model can effectively use foetal ECG signals to determine the FHR.

Figure 5:

Comparative Assessment of the Models Before and After Hardware Deployment: a) Average Accuracy, b) Precision and Recall, c) F1-Score

Conclusions

This study introduces a novel, computationally efficient framework for classifying Fetal Heart Rate (FHR) signals using a T5-based architecture with multichannel FECG inputs. The proposed algorithm employs a three-step methodology aimed at optimizing runtime and improving the efficiency of feature extraction and classification from FECG signals. The framework integrates a T5-based deep learning model with feed-forward layers to effectively classify FHR from fetal ECG datasets, demonstrating superior performance relative to other cutting-edge learning techniques in FHR classification.

Extensive comparative and ablation experiments were carried out to evaluate the efficacy of the proposed framework. The integration of the T5 model with advanced classification layers showcases its potential as a high-performance algorithm suitable for deployment in versatile devices. The study emphasizes the importance of resource-efficient optimization techniques, ensuring reduced computational complexity and enhanced energy efficiency and making the framework adaptable to power-sensitive environments.

This research highlights the advantages of employing a T5-based architecture for FECG signal processing and classification. The proposed methodology focuses on achieving a balance between diagnostic accuracy and computational efficiency, positioning it as a feasible solution for resource-constrained applications. Future research directions include further validation with clinical real-time FECG datasets and additional optimization of the model's architecture. These efforts aim to reduce resource utilization further, enabling the framework's deployment in wearable or implantable devices for continuous monitoring and processing of FECG signals.