Research on downlink channel state information prediction technique for 5G system based on deep neural network
Published online: 19 Mar 2025
Received: 11 Nov 2024
Accepted: 09 Feb 2025
DOI: https://doi.org/10.2478/amns-2025-0471
© 2025 Jinhui Chen et al., published by Sciendo
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Growing user demand has driven the creation of the fifth-generation mobile communication system (5G) and also poses great challenges for it. Massive multiple-input multiple-output (MIMO) technology has received widespread attention for its high system capacity and high spectral efficiency. After more than 10 years of research, massive MIMO has been successfully applied in 5G mobile communication systems [1–3]. By further increasing the antenna array size, ultra-large-scale MIMO is expected to be one of the key technologies for improving network performance in the future sixth-generation (6G) mobile communication system. However, massive MIMO can realize its system gain only if the base station obtains accurate downlink channel state information (CSI) [4–6]. Currently, massive MIMO systems usually work in time-division duplex (TDD) mode: by exploiting the reciprocity of the uplink and downlink channels, the downlink CSI can be obtained directly from the uplink CSI estimate, which avoids the huge overhead of downlink CSI acquisition. Compared with TDD, frequency-division duplex (FDD) mode offers a high transmission rate and continuous communication, and is applicable to high-speed mobile scenarios [7–9]. Moreover, a large number of existing communication systems still use FDD. To exploit the inherent advantages of FDD and to reduce the resource waste and performance loss caused by changing communication modes, FDD massive MIMO systems have recently received extensive attention [10–11]. In FDD mode, the uplink and downlink channels do not have strict reciprocity, so the base station can obtain the downlink CSI only through downlink CSI estimation and feedback by the user.
Since the pilot overhead of downlink CSI estimation and the link overhead of feedback are both proportional to the number of antennas at the base station, one of the great challenges facing FDD massive MIMO systems is the huge pilot and feedback overhead of downlink CSI acquisition [12–14]. Currently, FDD systems usually use codebook-based and compressed-sensing-based CSI feedback techniques. However, as the antenna array at the base station grows, codebook design becomes very difficult, so codebook-based CSI feedback is no longer suitable for FDD massive MIMO systems. The high-dimensional CSI matrix also gives compressed-sensing-based CSI feedback high computational complexity and low feedback accuracy [15–17]. In recent years, deep neural networks have been applied in various fields, such as part defect detection and recommendation systems, and have shown excellent performance. Many works have used deep neural networks to address downlink channel state information prediction for wireless communications. Deep neural networks have a powerful ability to learn from and process high-dimensional data, which offers a potential solution to the problem of obtaining high-dimensional downlink CSI in FDD massive MIMO systems [18–22].
To solve the problem that the base station cannot obtain CSI in time because of uplink and downlink delay and signal-processing time, this paper proposes a deep learning-based downlink channel state information prediction method for 5G systems that combines massive MIMO technology with deep learning-based CSI feedback. The channel model of the massive MIMO system is then used to prove the effectiveness of the data-driven CSI feedback scheme. Finally, the superiority of the scheme in channel estimation performance is verified through simulation experiments and CSI amplitude feedback.
Consider a simple single-user MIMO [23–24] (SU-MIMO) system where the base station is equipped with
Let the transmitted signal from the antenna array at the base station be
where
Massive MIMO technology dramatically increases the channel capacity of a wireless communication system by concentrating the energy of radio waves into a smaller spatial area using a massive antenna array. The channel capacity of a communication system is defined as the maximum mutual information of the transmitted and received signals, so the channel capacity of a massive MIMO system can be expressed as:
The mutual information of transmit vector
Since the transmit vector
According to the power-limited maximum entropy theorem, the received vector
Limit the maximum transmit energy of the antenna array to
When the base station does not hold CSI, the transmit power is divided equally among the antennas, and the channel capacity of the MIMO system at this time is:
The above derivation shows that a MIMO system can use space-division multiplexing to improve spectral efficiency even when the base station has no knowledge of the CSI; this is called open-loop MIMO. The channel capacity given above can be achieved with the Bell Labs layered space-time (BLAST) coding scheme. When the CSI is known to the base station, the system is called closed-loop MIMO. Using singular value decomposition (SVD), the original channel can be decomposed into
Thus, according to Shannon’s second theorem, the sum capacity expressed in Eq. (9) can be rewritten as:
Allocating power according to the water-filling (WF) algorithm maximizes the sum capacity in the above equation. The optimal power allocation is expressed as:
From Eq. (14), it can be seen that the principle of the water-filling algorithm is to allocate more transmit power to the subchannels with better channel quality.
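The water-filling principle described above can be illustrated with a minimal NumPy sketch. It is not the paper's code: the function names `water_filling` and `capacity`, the 4 × 4 random Rayleigh channel, the unit noise power and the total power of 10 are all assumptions chosen for illustration. The subchannel gains are taken as the squared singular values of the channel matrix, and the equal-power allocation stands in for the open-loop case.

```python
import numpy as np

def water_filling(gains, total_power, noise_power=1.0):
    """Water-filling over parallel subchannels (hypothetical helper).

    gains: subchannel power gains (squared singular values), all > 0.
    Returns p_i = max(mu - noise_power / gain_i, 0) with sum(p_i) = total_power.
    """
    gains = np.asarray(gains, dtype=float)
    floor = noise_power / gains          # "bottom height" of each subchannel
    order = np.argsort(floor)            # best subchannels first
    floor_sorted = floor[order]
    powers = np.zeros_like(gains)
    # Try the k best subchannels, dropping the weakest until all get power > 0.
    for k in range(len(gains), 0, -1):
        mu = (total_power + floor_sorted[:k].sum()) / k   # water level
        if mu > floor_sorted[k - 1]:
            powers[order[:k]] = mu - floor_sorted[:k]
            break
    return powers

def capacity(gains, powers, noise_power=1.0):
    """Sum capacity in bit/s/Hz of the parallel subchannels."""
    gains = np.asarray(gains, dtype=float)
    return float(np.sum(np.log2(1.0 + gains * powers / noise_power)))

# Assumed example: 4x4 i.i.d. Rayleigh channel, total transmit power 10.
rng = np.random.default_rng(0)
H = (rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))) / np.sqrt(2)
gains = np.linalg.svd(H, compute_uv=False) ** 2
P = 10.0

p_wf = water_filling(gains, P)                 # closed-loop allocation
p_eq = np.full(len(gains), P / len(gains))     # equal power (open-loop case)
c_wf = capacity(gains, p_wf)
c_eq = capacity(gains, p_eq)
print(f"water-filling: {c_wf:.3f} bit/s/Hz, equal power: {c_eq:.3f} bit/s/Hz")
```

In this sketch the water-filling capacity is never below the equal-power capacity, and subchannels whose gain is too small receive no power at all.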
The analytic expression shows that the sum rate of the MIMO broadcast channel (MIMO BC) is a
Thus, the capacity of the MIMO BC can be obtained by optimizing the ordering of the users and the covariance matrices of the transmitted signals, as shown in the following equation:
It can be seen that obtaining accurate downlink CSI is extremely important for massive MIMO systems.
The neuron is the most basic unit in deep learning, and every neural network is built from neurons combined in different ways. The basic structure of a neuron is shown in Figure 1. A complete neuron consists of a linear model and an activation function. For the linear model, it is assumed that the input sample

Basic structure of neurons
For the activation function
ReLU rectifies all negative values in input
Neurons can be combined in different ways to form different network structures. Common neural network structures are described below.
Convolutional neural network. The structure of the convolutional neural network is shown in Fig. 2. The blue part indicates a 4 × 4 single-channel input feature map, and the dashed part is the padding, which is generally zero-padding, so the input size of the convolutional layer is 5 × 5, which is denoted as
Fully connected neural network. Similarly, the structure and properties of the fully connected network are shown in Fig. 3. In the figure, ⊗ represents matrix multiplication, and the yellow part represents the input feature map matrix, denoted as

Convolutional neural network structure

Fully connected layer network structure
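The two layer types described above can be sketched in a few lines of NumPy. This is an illustrative sketch only: the helper names `conv2d` and `dense`, the 3 × 3 averaging kernel and the symmetric one-pixel zero padding are assumptions, and the padding layout is not meant to reproduce the exact 5 × 5 arrangement of Fig. 2.

```python
import numpy as np

def relu(x):
    """ReLU activation: negative values are set to zero."""
    return np.maximum(x, 0.0)

def dense(x, W, b):
    """Fully connected layer: each output is a weighted sum of all inputs."""
    return relu(W @ x + b)

def conv2d(x, kernel, pad=0):
    """Single-channel 2D convolution (cross-correlation form), stride 1."""
    x = np.pad(x, pad)                     # zero padding on every side
    kh, kw = kernel.shape
    out_h = x.shape[0] - kh + 1
    out_w = x.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Multiply the kernel with the local region and sum the products.
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * kernel)
    return out

# Assumed example: 4x4 single-channel feature map, 3x3 averaging kernel.
feature_map = np.arange(16, dtype=float).reshape(4, 4)
kernel = np.ones((3, 3)) / 9.0
conv_out = conv2d(feature_map, kernel, pad=1)   # output stays 4x4

# Flatten the feature map and pass it through a small dense layer.
W = np.full((2, 16), 0.1)
b = np.zeros(2)
fc_out = dense(conv_out.ravel(), W, b)
print(conv_out.shape, fc_out.shape)
```

The convolution shares one small kernel across all positions, while the fully connected layer needs one weight per input-output pair, which is why its parameter count grows much faster with the input size.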
Designing the CSI feedback method for massive MIMO systems with deep learning does not require precise analysis or explicit modeling of channel data features. It can therefore be applied to a variety of channel scenarios, and it greatly reduces the complexity and difficulty of signal processing. The deep learning-based CSI feedback method is shown in Fig. 4. In the offline training phase, CSI matrix data of the massive MIMO system are collected in advance and fed into the network for training, so that the network parameters fit the features of the corresponding channel data. In the online prediction phase, the trained network model is used for CSI reconstruction, which greatly reduces the time complexity and design difficulty of the system. Depending on the focus of the problem being solved, the neural network models may differ considerably.

CSI feedback model based on deep learning
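The offline-training and online-prediction split can be illustrated with a deliberately simplified stand-in. The sketch below replaces the neural network with a linear autoencoder fitted by SVD (equivalent to PCA), which is a swapped-in technique and not the network used in this paper. The data are synthetic vectors drawn from a hidden low-dimensional subspace, and the dimensions (64 and 8) are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
dim, latent, n_train, n_test = 64, 8, 500, 50

# Synthetic stand-in for CSI vectors: samples lie in a hidden 8-dim subspace.
basis = rng.standard_normal((latent, dim))

def sample(n):
    """Draw n synthetic 'CSI' vectors (hypothetical data generator)."""
    return rng.standard_normal((n, latent)) @ basis

train, test = sample(n_train), sample(n_test)

# Offline training: fit the encoder/decoder pair on collected data.
mean = train.mean(axis=0)
_, _, vt = np.linalg.svd(train - mean, full_matrices=False)
encoder = vt[:latent].T          # (dim, latent), would run at the user side
decoder = vt[:latent]            # (latent, dim), would run at the base station

# Online phase: compress at the user, reconstruct at the base station.
codeword = (test - mean) @ encoder           # fed-back vector, 8 values
recon = codeword @ decoder + mean            # reconstructed vector, 64 values

nmse = np.mean(np.sum((test - recon) ** 2, axis=1) / np.sum(test ** 2, axis=1))
print(f"compression ratio {latent}/{dim}, NMSE = {nmse:.2e}")
```

Because the synthetic data are exactly low-rank, the linear model reconstructs them almost perfectly; real channel data are not, which is the motivation for nonlinear encoder and decoder networks.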
The classification of existing deep learning-based CSI feedback methods is shown in Figure 5. Among autoencoder-based CSI feedback methods, the encoder + decoder structure is the most common implementation and achieves higher reconstruction accuracy; a network with only a decoder has lower complexity, but its reconstruction accuracy is also lower and cannot meet practical needs. CSI feedback methods with an adaptive quantizer mainly aim to represent the CSI with fewer quantization bits, in order to reduce the number of parameters that must be fed back. Deep unfolding CSI feedback methods modify certain high-performance compressed sensing reconstruction algorithms, using neural networks to replace all or part of the algorithm parameters in order to improve the reconstruction accuracy.

Deep learning CSI feedback method classification
Assuming that in an FDD massive MIMO system, there is
The fading factor
Define
Next, this paper defines an uplink-to-downlink mapping function and proves its existence. From Eq. (23), the channel function
The inverse mapping of definition Φ
The probability that the inverse mapping
The following mapping relation can be obtained from Eqs. (26) and (27):
The following mapping between the uplink and the downlink can be derived from the above equation:
The effectiveness of the data-driven scheme will be demonstrated mathematically below. Since Φ
In order to facilitate modeling and simulation, this paper only considers a simple FDD massive MIMO system, which uses orthogonal frequency division multiplexing (OFDM) [26–27] modulation mode. Assuming that there are
The channel matrix corresponding to the CSI is the set of information for each subcarrier, then the CSI matrix
CsiNet is the first multilayer neural network designed for the problem of downlink CSI prediction in FDD massive MIMO environments. The CsiNet network is implemented by a convolutional neural network structure and is capable of processing channel state information from multiple antennas and outputting high quality channel estimation results.
LeakyReLU is a variant of the rectified linear unit (ReLU). The mathematical expression of LeakyReLU is shown below:
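In its standard form, LeakyReLU returns x for x ≥ 0 and αx for x < 0, where α is a small positive slope. The sketch below implements that standard definition; the function name `leaky_relu` and the slope value 0.3 are assumptions for illustration, since the slope used in this paper is not stated here.

```python
import numpy as np

def leaky_relu(x, negative_slope=0.3):
    """LeakyReLU: identity for x >= 0, small linear slope for x < 0.

    negative_slope=0.3 is an assumed value, not taken from the paper.
    """
    x = np.asarray(x, dtype=float)
    return np.where(x >= 0, x, negative_slope * x)

print(leaky_relu([-2.0, 0.0, 3.0]))
```

Unlike ReLU, the negative branch keeps a nonzero gradient, so neurons with negative pre-activations can still be updated during training.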
The development of the CsiNet network provides a new solution to the downlink CSI feedback problem in massive MIMO systems. Simulation results show that CsiNet exhibits better prediction performance than compressed sensing techniques at all compression rates.
Inspired by the CsiNet decoder network model, this paper proposes a network model that predicts the downlink CSI from the uplink CSI, called the three-dimensional convolutional neural network model [28] (3D-CsiNet). The overall network structure of 3D-CsiNet is shown in Fig. 6. The design of the convolutional layers, the residual network and the other parts of this network is described below.
Three-dimensional convolution. In convolutional neural networks, 2D convolution is a convolution operation on 2D data: a small matrix called the convolution kernel slides over an image or matrix, the kernel is multiplied element-wise with each local region of the input, and the products are summed to give the output feature map. 3D convolution is the analogous operation on 3D data: the kernel gains an additional (depth) dimension, slides over the 3D data, is multiplied element-wise with each local region of the input, and the products are summed to give the output feature sequence.
Feature extraction. In a convolutional neural network, the receptive field is the region of the input data that a neuron in a particular layer can perceive. The size of the receptive field of each layer depends on the size of the convolution kernel in that layer, and it grows gradually as the number of network layers increases. The size of the receptive field directly affects the recognition and classification ability of the neural network, so choosing it reasonably is very important in network design and tuning.
Residual network. CsiNet solves the vanishing-gradient problem of the convolutional model by adding a residual network.
In this paper, the residual network is used in the 3D-CsiNet network model. The residual network utilizes the error of the loss function to train the parameters of this residual block. The residual block consists of two convolutional layers, where the first convolutional layer performs feature extraction on the input data, the second convolutional layer further processes the features, and then the outputs of these two convolutional layers are summed up and nonlinearly transformed by the activation function.
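The 3D convolution and the two-layer residual block can be sketched with plain NumPy loops. This is a single-channel illustration, not the 3D-CsiNet implementation: the function names, the "same" zero padding, the LeakyReLU slope of 0.3 and the block wiring are assumptions. In particular, the sketch uses the common identity-shortcut form, in which the block input is added to the output of the second convolution before the activation.

```python
import numpy as np

def leaky_relu(x, slope=0.3):
    """LeakyReLU with an assumed negative slope."""
    return np.where(x >= 0, x, slope * x)

def conv3d_same(x, kernel):
    """Single-channel 3D convolution (cross-correlation form), stride 1,
    zero padding so that the output has the same shape as the input.
    Assumes odd kernel sizes."""
    kd, kh, kw = kernel.shape
    xp = np.pad(x, ((kd // 2,) * 2, (kh // 2,) * 2, (kw // 2,) * 2))
    out = np.empty_like(x, dtype=float)
    for d in range(x.shape[0]):
        for i in range(x.shape[1]):
            for j in range(x.shape[2]):
                # Multiply the 3D kernel with the local 3D region and sum.
                out[d, i, j] = np.sum(xp[d:d + kd, i:i + kh, j:j + kw] * kernel)
    return out

def residual_block(x, k1, k2):
    """Two 3D convolutions with an identity shortcut (assumed wiring)."""
    y = leaky_relu(conv3d_same(x, k1))   # first layer: feature extraction
    y = conv3d_same(y, k2)               # second layer: further processing
    return leaky_relu(x + y)             # add the shortcut, then activate

# Assumed example: a 3 x 4 x 5 volume (e.g. time x antenna x subcarrier).
x = np.ones((3, 4, 5))
k = np.ones((3, 3, 3))
y = conv3d_same(x, k)
z = residual_block(x, 0.01 * k, 0.01 * k)
print(y[1, 1, 1], y[0, 0, 0], z.shape)
```

With the shortcut in place, a block whose kernels are all zero simply passes its input through, which is what makes deep stacks of such blocks easier to train.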

Structure of 3D-CsiNet network model
The true value is the result of placing the SRS in every time slot with perfect channel estimation. The observed value is obtained by performing practical channel estimation on the SRS at its actual transmission period and then interpolating over all time slots. The predicted value is the network output when the observations are fed into the network as the training set. In this experiment the uplink SNR is set to 25 dB and the SRS period is 6 slots. The comparison of the predicted and true channel values in the walking scenario is shown in Fig. 7. Both the observed and the predicted values are very close to the true values, which indicates that the channel estimation is accurate and that the proposed MIMO-based 3D-CsiNet model is well suited to channel prediction.

Comparison of channel prediction and true values in the walking scenario
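The way the observed values are built from sparse SRS estimates can be sketched as follows. The interpolation method is not specified in the text, so linear interpolation is an assumption, as are the helper name `interpolate_srs`, the single-path channel, the 10 Hz Doppler shift and the 0.5 ms slot duration. Only the SRS period of 6 slots comes from the experiment description.

```python
import numpy as np

def interpolate_srs(h_srs, srs_slots, n_slots):
    """Linearly interpolate complex channel estimates taken at the SRS slots
    to every slot (real and imaginary parts handled separately)."""
    slots = np.arange(n_slots)
    real = np.interp(slots, srs_slots, h_srs.real)
    imag = np.interp(slots, srs_slots, h_srs.imag)
    return real + 1j * imag

n_slots, period = 25, 6
slots = np.arange(n_slots)
srs_slots = np.arange(0, n_slots, period)      # slots 0, 6, 12, 18, 24

# Assumed slowly varying single-path channel (low Doppler, as when walking).
doppler_hz, slot_s = 10.0, 0.5e-3
h_true = np.exp(2j * np.pi * doppler_hz * slot_s * slots)

# "Observed" channel: estimates exist only at SRS slots, the rest is interpolated.
h_obs = interpolate_srs(h_true[srs_slots], srs_slots, n_slots)
mse = np.mean(np.abs(h_obs - h_true) ** 2)
print(f"interpolation MSE: {mse:.2e}")
```

At low Doppler the channel changes little between SRS transmissions, so the interpolated observations stay close to the true values; the gap widens as the Doppler shift or the SRS period grows.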
The results of the comparison between the predicted and true values of the channel in the cycling scenario are shown in Fig. 8. As can be seen from the figure, as the Doppler shift increases (

Comparison of channel prediction and true values in the cycling scenario
The prediction performance of the 3D-CsiNet-based model is compared with that of the traditional second-order AR model in Fig. 9. The simulation sets the SRS period to 4 time slots, the Doppler shift to 250 Hz, and the prediction time length to 28. The estimation error of the 3D-CsiNet-based prediction algorithm is lower than that of the second-order AR algorithm: at a signal-to-noise ratio of 35 dB, the MSE of the 3D-CsiNet algorithm is 59.70% lower than that of the second-order AR algorithm.
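For reference, a second-order AR baseline of the kind compared above can be sketched as a least-squares fit followed by recursive prediction. This is a generic AR(2) sketch, not the paper's baseline implementation: the function names, the noiseless two-path channel, the second Doppler shift of -100 Hz and the 0.5 ms slot duration are assumptions. The 250 Hz Doppler shift and the prediction length of 28 follow the simulation settings.

```python
import numpy as np

def fit_ar2(x):
    """Least-squares fit of x[n] ~ a1 * x[n-1] + a2 * x[n-2]."""
    A = np.column_stack([x[1:-1], x[:-2]])
    b = x[2:]
    coeffs, *_ = np.linalg.lstsq(A, b, rcond=None)
    return coeffs

def predict_ar2(history, coeffs, steps):
    """Recursive multi-step prediction from the last two known samples."""
    a1, a2 = coeffs
    prev2, prev1 = history[-2], history[-1]
    out = []
    for _ in range(steps):
        nxt = a1 * prev1 + a2 * prev2
        out.append(nxt)
        prev2, prev1 = prev1, nxt
    return np.array(out)

# Assumed noiseless two-path channel sampled once per slot.
slot_s = 0.5e-3
n = np.arange(128)
h = (np.exp(2j * np.pi * 250.0 * slot_s * n)
     + 0.5 * np.exp(2j * np.pi * -100.0 * slot_s * n))

train, future = h[:100], h[100:128]
coeffs = fit_ar2(train)
pred = predict_ar2(train, coeffs, steps=28)
mse = np.mean(np.abs(pred - future) ** 2)
print(f"AR(2) prediction MSE over 28 steps: {mse:.2e}")
```

A sum of two complex sinusoids obeys an exact second-order recursion, so the AR(2) model is essentially perfect in this noiseless toy case. With more paths or with estimation noise the model order becomes a limitation, which is the regime in which a learned predictor can help.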

Comparison of prediction results of the 3D-CsiNet model and the AR model
This paper also analyzes the effect of the SRS transmission period on the channel prediction results, using simulations with SRS periods of 2, 4, 8, 16, and 32 time slots. The channel prediction errors for the different SRS transmission periods are shown in Fig. 10. The figure compares the MSE of the schemes with and without prediction for the five SRS periods at a fixed Doppler shift of 250 Hz and an SNR of 30 dB, with a prediction time length of 24. The scheme with prediction outperforms the scheme without prediction for every SRS period, and its advantage grows as the SRS period increases from 16 time slots onward. The proposed prediction method improves the MSE performance by 54.29% compared with the scheme without prediction.

Channel prediction error for different SRS transmission periods
Two typical bitstream generation methods, quantization and binarization, are first evaluated and compared. The simulation scenario is as follows: the base station is equipped with a 256-antenna uniform linear array (ULA), and the user has a single antenna. In this paper, the number of bits per dimension (BPD) is used to characterize the feedback prediction performance. The NMSE performance of CSI amplitude feedback under different bitstream generation methods is shown in Fig. 11. With the total number of feedback bits fixed, the best feedback performance is achieved when the number of quantization bits is set to 4. Consequently, the number of quantization bits in the quantization layer is set to 4 in this section. To some extent, binarization can be regarded as a special case of quantization in which the number of quantization bits is 1. This special 1-bit quantization (i.e., binarization) is far superior to ordinary 1-bit quantization. Binarization also performs better than 4-bit quantization when the feedback bits are extremely limited, but far worse than 4-bit quantization in other cases.
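The trade-off between the number of quantization bits and the codeword length can be made concrete with a small sketch of ordinary uniform quantization. It does not reproduce the learned quantization or binarization layers evaluated here: the helper names, the codeword length of 128 and the uniformly distributed codeword values in [0, 1) are assumptions, and BPD is computed as total feedback bits divided by the CSI dimension.

```python
import numpy as np

def uniform_quantize(x, bits):
    """Uniform quantizer on [0, 1]: returns level indices and the
    dequantized values (cell midpoints)."""
    levels = 2 ** bits
    idx = np.clip(np.floor(x * levels), 0, levels - 1).astype(int)
    return idx, (idx + 0.5) / levels

def nmse_db(x, x_hat):
    """Normalized mean squared error in dB."""
    return 10 * np.log10(np.sum((x - x_hat) ** 2) / np.sum(x ** 2))

def bits_per_dimension(codeword_len, bits, csi_dim):
    """BPD: total feedback bits divided by the CSI dimension."""
    return codeword_len * bits / csi_dim

rng = np.random.default_rng(2)
codeword = rng.random(128)                  # assumed encoder output in [0, 1)

_, q4 = uniform_quantize(codeword, 4)       # 4-bit quantization
_, q1 = uniform_quantize(codeword, 1)       # ordinary 1-bit quantization
nmse_4bit = nmse_db(codeword, q4)
nmse_1bit = nmse_db(codeword, q1)

print(f"4-bit: {nmse_4bit:.1f} dB at BPD {bits_per_dimension(128, 4, 256)}")
print(f"1-bit: {nmse_1bit:.1f} dB at BPD {bits_per_dimension(128, 1, 256)}")
```

For a fixed total bit budget, using more bits per element means fewer codeword elements can be fed back, so the best setting balances quantization error against codeword length.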

The NMSE performance of the CSI amplitude feedback
This section evaluates the performance of the two proposed CSI phase feedback mechanisms, MDPF-1 and MDPF-2. The most important difference between them is whether statistical or instantaneous CSI amplitude information is introduced into the phase feedback NN. Unlike the compression and feedback of the CSI amplitude, the compression of the CSI phase depends on the CSI amplitude. Therefore, the feedback accuracy of the CSI phase is not evaluated by directly computing the NMSE or MSE between the original and reconstructed CSI phases. Instead, the NMSE between the original complex CSI and the reconstructed complex CSI is computed, with the CSI amplitude assumed to be perfect, because only the feedback accuracy of the CSI phase is being evaluated at this point.
The base station is assumed to be equipped with 272 antennas and the user with a single antenna, and the number of channel paths is set to 5. The NMSE performance of the different CSI phase feedback mechanisms is shown in Fig. 12. Here, raw feedback means that the CSI phase is fed back directly with an autoencoder whose loss function is the plain MSE function. When the BPD is very low, the NMSE of the raw phase feedback method is > 0 dB, which means that very little useful information is fed back and the result is comparable to noise. This is mainly because the method does not know which information is important and therefore tries to feed back all the phase information, which requires a large number of feedback bits. When the BPD is 0.5-0.6, the performance gains of MDPF-1 and MDPF-2 over the raw phase feedback method are 10.41-10.53 dB and 10.03-10.36 dB, respectively. In addition, MDPF-2, which uses instantaneous CSI amplitude information, outperforms MDPF-1, which uses statistical CSI amplitude information, and the performance gain is especially evident when the BPD is low. As the BPD increases, the performance gap between the two slowly narrows, because the feedback bits become sufficient for all the phase information required by both MDPF-1 and MDPF-2. To exploit the correlation in CSI amplitude between neighboring users, the CSI phase and amplitude are fed back separately. This raises the problem of allocating bits between CSI phase and amplitude feedback, and an unoptimized allocation strategy significantly degrades the CSI feedback accuracy. The optimal bit allocation strategy can therefore be found by extensive simulation, similar to exhaustive search.
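The evaluation metric described above, the NMSE between the original complex CSI and a reconstruction that combines the perfect amplitude with the fed-back phase, can be written as a short sketch. The function name `complex_csi_nmse_db`, the random test channel of 272 coefficients and the phase error values are assumptions for illustration.

```python
import numpy as np

def complex_csi_nmse_db(h, phase_hat):
    """NMSE in dB between h and a reconstruction that uses the perfect
    amplitude |h| together with the reconstructed phase phase_hat."""
    h_hat = np.abs(h) * np.exp(1j * phase_hat)
    return 10 * np.log10(np.sum(np.abs(h - h_hat) ** 2) / np.sum(np.abs(h) ** 2))

rng = np.random.default_rng(3)
h = rng.standard_normal(272) + 1j * rng.standard_normal(272)   # assumed CSI
true_phase = np.angle(h)

nmse_small = complex_csi_nmse_db(h, true_phase + 0.1)     # 0.1 rad phase error
nmse_flipped = complex_csi_nmse_db(h, true_phase + np.pi) # phase fully inverted
nmse_random = complex_csi_nmse_db(h, rng.uniform(-np.pi, np.pi, h.size))

print(f"0.1 rad error: {nmse_small:.2f} dB")
print(f"inverted phase: {nmse_flipped:.2f} dB")
print(f"random phase: {nmse_random:.2f} dB")
```

Because the amplitude weights each phase error, errors on strong coefficients dominate the metric. A phase estimate that carries no information gives an NMSE above 0 dB, which is the regime described as comparable to noise.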

NMSE performance of different CSI phase feedback mechanisms
In this paper, we first analyze the massive MIMO system model and channel characteristics in detail and explore the CSI feedback method based on compressed sensing. We then describe the working principle of the deep learning-based CSI feedback method and construct a 3D convolutional neural network model (3D-CsiNet) for the 5G system. The primary conclusions are as follows: Simulation results show that the proposed 3D-CsiNet model has higher prediction accuracy and generalization ability than the traditional AR model. Compared with existing representative CSI feedback algorithms, the proposed 3D-CsiNet model has higher CSI reconstruction accuracy and fewer model parameters, which verifies the advantages of the proposed algorithm. With the total number of feedback bits fixed, the feedback performance is best when the number of quantization bits is set to 4. Binarization outperforms 4-bit quantization when the feedback bits are extremely limited, but is much worse than 4-bit quantization in other cases. At very low BPD, the raw phase feedback method has an NMSE > 0 dB: its feedback carries very little useful information and is comparable to noise. When the BPD is 0.5-0.6, the performance gains of MDPF-1 and MDPF-2 are 10.41-10.53 dB and 10.03-10.36 dB, respectively. The optimal bit allocation strategy can be found through extensive simulation.
This research was sponsored by the Beijing Nova Program (No.20240484645).