In 2008, Professor Arikan first proposed polar codes [1], which was rigorously proved from the mathematical point of view and finally obtained polarization phenomenon of channel. Polar codes can reach the Shannon limit theoretically and construction method can be uniquely determined, so it has low coding complexity and simple implementation. It has been adopted as the control channel coding scheme for the 5th Generation Mobile Networks (5G). The basic decoding algorithms of polar codes mainly include sequential cancellation (SC) [2] decoding algorithm and belief propagation (BP) [3] decoding algorithm. The SC decoding algorithm has low computational complexity, but due to its sequential decoding nature, there is a significant decoding delay problem in the long code decoding process. Compared with SC decoding algorithm, BP decoding algorithm is parallel decoding, with higher throughput rate and lower delay, so it can better meet the communication requirements of 5G.
With the development of neural networks, deep learning techniques have attracted much attention due to their good performance in many tasks, such as speech recognition, games, machine translation, and autonomous driving [4][5]. In communication systems, for the general channel decoding problem, it can be considered as a classification of information, so deep learning techniques [6] are applied to polar codes decoding as a way to improve the decoding performance of polar codes [7]. Compared to the traditional iterative decoding, deep learning techniques use a cascade of multiple layers of nonlinear processing units to extract and transform the features contained in the coding structure and noise features by pre-trained neural networks that pass through each layer only once to calculate their estimates, called one-time decoding. Low latency decoding is achieved and the decoding performance is close to the maximum a posteriori probability (MAP) performance [8]. In addition, the high-speed requirements can be easily met by powerful hardware such as parallel computing and graphics processing units (gpu) using existing deep learning platforms, such as Tensorflow [9].
Although neural network decoding has led to a better decoding performance, the training complexity of decoding is exponentially related to the number of information bits, which makes neural network decoding limited by short block lengths [10]. The literature [11] proposed three types of neural network decoders built on multilayer perceptron (MLP), convolutional neural network (CNN), and recurrent neural network (RNN) with the same parameter size, and compared the performance of the three deep neural networks through simulation and concluded that each type of deep neural network has a saturation length, which is due to their limited learning capacity caused by their limited learning capacity. To solve this problem, the literature [12] proposes to divide the coding graph into sub-blocks and train them individually to approach the maximum a posteriori performance of each sub-block, and then connect these blocks through the remaining traditional confidence propagation decoding stage, the resulting decoding algorithm is non-iterative and essentially enables a high level of parallelization while showing a competitive BER performance. Meanwhile, literature [13] proposed a division-based belief propagation neural network algorithm that replaces the sub-blocks of belief propagation decoder with a neural network and connects them using a BP decoding framework, which improves the decoding performance in long codes case and reduces the computational complexity and latency of the algorithm compared to the traditional BP algorithm.
In this paper, a partitioning method is combined with a multilayer perceptron to optimize polar codes belief propagation decoding algorithm. Mainly, polar codes are divided into several sub-blocks, and the last layers of BP decoding are replaced by trained neural network blocks, and the initial value of right message during BP iteration is set. The simulation results show that the proposed method has better performance in terms of BER and decoding delay compared with traditional BP decoding algorithm. And it can be applied to long codes.
Polar codes can be described by (N, K), where N is code length and K is number of bits of information. The core construction principle is “channel polarization” [14]. By polarizing channel, (N, K) polar codes divides the channel into two parts: K noiseless channels are used to transmit information bits, and remaining N-K completely noisy channels transmit frozen bits. Let
Where
Figure 1 shows the structure of polar codes, and it can be seen that polar codes has a recursive structure, for N=8 polar codes, it includes two N=4 polar codes structures and four N=2 polar codes structures, according to the structure of Figure 1, polar codes can be decoded using BP decoding algorithm.
Structure of polar code
BP decoding algorithm is a widely used message transmission algorithm, and one of the most important features compared with other decoding algorithms is that it can decode in parallel, which can solve the delay problem well, and it mainly realizes decoding by iterating the left and right cycles of the factor graph.
The factor graph of (N,K) polar codes consists of
Where
In the processing element, the updated information of a node of the outgoing processing element is determined by the information of the other three nodes of the incoming processing element. The information update formula as follows:
Among them:
In order to reduce the computational complexity of BP decoding algorithm, the minimum sum (Min-Sum,MS) approximation[19] is used to approximate
After T iterations, the j-th estimated information bit can be obtained by hard decision on the left information log-likelihood ratio
Deep learning is one of the hottest technologies today, which allows systems to learn effective algorithms directly from training data. Deep Neural Networks (DNN), also known as deep feedforward neural networks, is a typical deep learning model. A deep neural network model can be abstracted as a function f that maps input x0 ∈ ℝN0 to output y ∈ ℝNL:
Where
Multi-layer structure of deep neural network
In neural networks, the parameters in input and output mapping usually refer to weights and deviations. The weight reflects the degree of influence between neurons, while the deviation describes whether the neurons are activated. In order to minimize the loss function, the two parameters are iteratively adjusted using back propagation (BP) or random gradient descent (SGD) during the training phase until the DNN converges and stabilizes.
Convolutional Neural Networks (CNN) is also a class of feedforward artificial neural networks, which have been widely used in computer vision tasks such as image classification, recognition, and segmentation in recent years. Recurrent Neural Network (RNN) is a class of neural networks with recursive structure, that is, the current output of a sequence is related to the previous output. Traditional RNN has serious disappearance and explosion gradient problems. Therefore, people usually use its variants, such as Long Short Term Memory (LSTM) and Gated Recurrent Unit (GRU).
Neural network decoder is a classification problem in supervised learning. The current neural network decoder system diagram is shown in the figure 3, which mainly replaces the decoder in traditional receivers.
Block diagram of neural network based decoder system
At the transmitter, the information bit length of the source u is K, which is changed to x after polarization coding, and the code word length is set to N. The K information bits and the other (N-K) frozen bits are respectively arranged on reliable bit channels and unreliable bit channels, and then multiplied by the coding matrix of polar codes
After the modulated sequence passes through the channel, noise interference is added to obtain the noise-added sequence y, where the received sequence y satisfies the following formula:
Gaussian white noise is a common interference in the channel, so n is set here to meet the standard normal distribution of Gaussian white noise. At the receiving end, the sequence y is input into the neural network to complete the classification process and output the estimated information bits
Where
The literature [7] and [16] used neural networks to improve BP decoding algorithm and achieved good performance. However, the complexity of the neural network and the corresponding training difficulty are positively correlated with the difficulty of decoding, and the size of the codebook set is exponentially related to the length of the information bits. When the length of information bits is long, it is difficult to train a suitable neural network. Below, some comparative simulations will be conducted to demonstrate this problem.
Three polar codes with code lengths of 8, 16 and 32 with code rate R=0.5 are selected and trained with the same training set size and training period to compare their decoding performance. The relevant parameters are shown in the table I and II.
Performance of different network structures at N=8
Performance of different network structures at N=16
Performance of different network structures at N=32
Parameters Settings
code length | 8, 16, 32 |
code rate | 0.5 |
batchsize | 512 |
learning rate | 0.001 |
training set size | 106 |
epoch | 103 |
network structure | 32-16-8, 128-64-32, 512-256-128 |
Network Structure
N=8 | 1024 | 11752 | 169846 |
N=16 | 1352 | 13488 | 174992 |
N=32 | 1352 | 13488 | 174992 |
The simulation results show that when the code length is very short, the decoding performance of the three network structures is similar, and their performance is better than the traditional BP decoding algorithm. For the case of N=16, the network structure of 32-16-8 cannot be decoded correctly. However, for the case of N=32, none of the three network structures can achieve the decoding function, requiring a more complex network structure to complete this task. However, the number of parameters for the 512-256-128 network structure is already very large. Therefore, it can be seen that the exponential growth of the complexity of the network structure with the length of information bits will seriously restrict the development of neural network decoder technology.
From the analysis in the previous section, it can be seen that although the emergence of neural network decoders is a significant breakthrough in traditional decoding, they still cannot overcome the problem of dimensional constraints. Therefore, in order to further expand the application of neural network decoders in long codes, this section will propose a partitioning based neural network decoding scheme combining the coding characteristics of polar codes themselves.
According to the structure of polar codes in Figure 1, it is a recursive coding scheme, and long polar codes can be seen as a combination of several short polar codes. The traditional BP decoding algorithm is an iterative decoding algorithm, and its number of iterations is directly related to the performance and accuracy of decoding. The more iterations, the more accurate the result, but its computational complexity is also positively correlated with the number of iterations. The more iterations, the greater the decoding delay. Therefore, inspired by literature [12] [13] and literature [17] [18], this article introduces a neural network decoder to decode the sub blocks, which serves as a substitute for the last layers of BP decoder, while combining BP algorithm to complete the information exchange between the sub blocks. For example, dividing into two decoding blocks is shown in Figure 7:
Structure diagram of the proposed MLP-BP
The interaction between BP component and NND component is shown in Figure 8.
Interaction of BP and DNN blocks
The decoding is divided into two parts. The neural network block part contains multiple parallel MLP decoding blocks, each decoding block corresponding to polar codes sub block, arranged in parallel at the leftmost end of BP decoding. Replacing a part of BP decoding with neural network blocks combines the advantages of BP decoding algorithm and MLP decoding algorithm, maintaining parallel decoding. For the proposed algorithm, the size of NND portion is not fixed, but is determined by the training ability and performance requirements of the network. The larger the NND portion, the less the number of blocks, and the closer the entire system structure is to traditional neural network decoders, resulting in a significant increase in the difficulty of neural network training; On the contrary, the larger BP part, the closer its decoding performance will be to the performance of BP algorithm.
The whole module input is
Since the MLP neural network decodes each sub-block, the information between the sub-blocks is not transmitted to each other. At this time, the BER curves of different sub-blocks
Proposed MLP-BP decoding algorithm
1: |
2: |
3: Initialization: Initialization using (2) |
4. |
5. |
6: Update using equation (3),
|
7. |
8: After reaching NND use the sub-block |
9: |
10: |
11: Using equation (7) yields,
|
12. |
13: Retransmission |
14. |
15: Update using equation (3)
|
16: |
17: |
After the CRC part is added to the recoding, because when the sub block is very small, CRC verification cannot be added. When the sub block is larger, even if CRC verification can be added, it will reduce the code rate and cause performance losses. Therefore, the CRC check portion is added to all codewords.
In the decoding problem of this section, the Gaussian approximation construction method is used, and the randomly generated data is used as the training set. For (8,4) polar code, the information bits are {3,5,6,7}; for {16,8} polar code, the information bits are {7,9,10,11,12,13,14,15}, and these two short code blocks are used as references to divide long codes. For (32,16) code, the information bits are {12,14,15,16,19,21,22,23,24,25,26,27,28,29,30}
From the table 3, it can be seen that (32,16) code can be divided into two 16 length codes, with the first block code rate of 0.25 and the second block code rate of 0.75. After block training, they are spliced.
Polar(32,16) Divided into Two Parts
[0–15] | {11,12,14,15} | 0.25 |
[16–31] | {19,21,22,23,24,25,26,27,28,29,30,31} | 0.75 |
By analyzing the information bits, not for a new code length to retrain a set of MLP decoding block, the (32,16) is divided into four 8-long code words: the first block code rate of 0, the fourth block code rate of 1, both of which do not require network training, using a hard judgment can be, for 2, 3 two intermediate blocks, through Table 4 compared with Table 3, equivalent to two (8,4) polar code neural network decoding block. The relative positions of the information bits are the same, and the two decoding blocks can be shared, realizing the reuse of the network decoding blocks. In general, when the code lengths are the same and the relative positions of the contained information bits are the same, the same decoding block can be used. When the MLP-BP decoder is applied to longer code words, the MLP block will have duplicate parts, which reduces the required neural network blocks and improves the reusability.
Polar(32,16) Divided into Four Parts
[0–7] | None | None | 0 |
[8–15] | {11,12,13,14} | {3,5,6,7} | 0.5 |
[16–23] | {19,21,22,23} | {3,5,6,7} | 0.5 |
[24–31] | {24,25,26,27,28,29,30,31} | {0,1,2,3,4,5,6,7} | 1 |
Parameter Setting
Tensorflow | |
Polar(32,16), (64,128) | |
1~5dB | |
Cross Entropy Loss | |
Adam |
Firstly, the change trend of the loss value with the training period is analyzed. From Figure 9, it can be seen that the loss value decreases and gradually stabilizing at a certain value as the number of training times increases, indicating that the proposed algorithm MLP-BP can effectively converge through training. In addition, Figure 10 also analyzes the performance changes of BER and FER when SNR is 4 dB are analyzed, and it can be seen that as the number of training increases, the performance of the algorithm gradually decreases and is superior to traditional BP decoding algorithms.
Change of MLP-BP training loss value when N=128
Evolution of MLP-BP and BP BER when N=128
Next, we discuss the advantages of the proposed algorithm compared to traditional BP decoding algorithms.
Experiments are first conducted using code word N=32, and the code word is divided into 4 blocks, each of which had 8 bits. The trained model is saved, and after the training of the sub blocks is completed, it is cascaded to the end of BP decoding to replace the last three layers of BP. The performance comparison with BP decoding both using 30 iterations is shown in Figure 11.
BER performance comparison of two decoding methods at N=32
Then, the model is applied to a polar code with code length of N=128, and the code is divided into 16 blocks, each containing 8 code words for decoding, the decoding results are shown in Figure 12:
BER performance comparison of two decoding methods at N=128
For (128, 64) polar code, the MLP-BP decoding model has a performance improvement over BP decoding, with MLP-BP having a 0.2 dB gain over BP for BER 0.01 to 0.2, and MLP-BP having a 0.4 dB gain over BP for BER 0.001.
The decoding delay of the BP decoding algorithm for polar codes of code length N with a BP iteration number of
And the decoding delay required by the MLP-BP decoding algorithm proposed in this paper is determined by the code length N and the number of iterations
For (32,16) polar code, the decoding delay of each algorithm is shown in Table 6, when the number of iterations is 30, the proposed block-based MLP-BP decoding algorithm reduces the time step by 81.1% compared with BP decoding algorithm.
Decoding time delay
Decoding time delay | 380 | 72 |
In this paper, we proposes an MLP-BP decoding algorithm for polar codes based on the idea of partitioning, which combines the advantages of BP algorithm and MLP to maintain parallel decoding. At the same time, a method for setting the right value information during intermediate iterations is set, and the MLP neural network blocks are nested into the final layers of BP decoding. Simulation results show that compared to traditional BP decoding algorithm, the proposed MLP-BP decoding algorithm improves the bit error rate, reduces the number of iterations required, and reduces decoding delay in disguised form under the same performance conditions. In future research, further research should be conducted on the structural optimization of neural networks and the design of network parameters, with the aim of improving the decoding performance of polar codes BP decoding algorithm.