Contribution of Deep Learning Algorithm to Improve Channel Estimation Performance

Lassaad Smirani^{1,2}

1. Introduction

Researchers and engineers must constantly improve the architecture of mobile networks in order to increase data rates and extend services. Long Term Evolution (LTE) networks, marketed as 4G, have been widely deployed around the world. Their successor, 4G+ or "true" 4G (LTE-Advanced, Releases 10-11), has recently been deployed in major cities. This technology offers voice support and triples the theoretical throughput. The latest generation, pre-5G (LTE-B, Releases 12-13) [1] [2] [3], was designed to pave the way to 5G and to support new cellular network users. At the same time, new types of mobile applications have emerged. These benefit from increased throughput by using bandwidth intensively, and they require efficient management of Quality of Service (QoS). To adapt to these new uses of cellular networks, resource allocation procedures must evolve to meet the applications’ expectations.

A particularity of wireless networks is that they are subject to many channel impairments (signal attenuation, frequency-selective fading, multipath propagation, etc.). In the LTE system, the radio resources are divided in both the time domain and the frequency domain. Exploiting the temporal and frequency variations in the quality of these resources is essential in order to provide a broadband connection to users.

In this article, we apply Deep Learning to the LTE-A uplink channel estimation system. The first part of this work is devoted to creating two SC-FDMA databases, one for training and one for testing, based on three types of channel propagation models. We then apply an artificial neural network to estimate the channel for the SC-FDMA link.

Channel estimation using a neural network is a delicate process that must take several parameters into account; these are described in that part. Once training has taken place, the neural network is tested and implemented at the receiver.

The second part of this paper repeats the same experiment using Deep Learning instead of conventional neural networks. The third part is dedicated to the analysis of the results and to the comparison with the MMSE estimator.

2. Fundamentals of LTE-Advanced

To become a true fourth-generation (4G) technology, LTE has been refined to meet the requirements of the IMT-Advanced specifications published by the International Telecommunication Union (ITU).

The necessary enhancements are specified in 3GPP Release 10, also known as LTE-Advanced. IMT-compliant systems will be candidates for future spectrum bands yet to be identified, which is another major reason for bringing LTE up to the level required by IMT-Advanced (Figure 1).

This ensures that the LTE mobile networks deployed today can keep evolving over many years of commercial operation. The LTE-Advanced standard also increases peak data rates to 1 Gb/s in the downlink and 500 Mb/s in the uplink [4].

The key components of LTE-Advanced are carrier aggregation; MIMO extension up to 8 × 8 in the downlink and up to 4 × 4 in the uplink; improvements in uplink access (clustered SC-FDMA and simultaneous transmission of data and control information on PUSCH and PUCCH); and better cell-edge performance (eICIC, enhanced Inter-Cell Interference Coordination, and relaying).

3. The SC-FDMA Model

SC-FDMA is the access technique used for the uplink; its main feature is DFT spreading. The signal on each subcarrier is a linear combination of all M symbols. This access technique is also characterized by a gain in PAPR (Peak-to-Average Power Ratio) of the order of 2 dB compared with OFDMA [5].
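The effect of DFT spreading on PAPR can be illustrated with a minimal sketch. It assumes QPSK symbols, localized mapping onto the first M subcarriers, and small hypothetical sizes (M = 16, N = 64); none of these values come from the simulations reported here, and the function names are illustrative only.

```python
import cmath
import math
import random

def dft(x, inverse=False):
    """Naive O(N^2) DFT (or inverse DFT); adequate for the small sizes used here."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[m] * cmath.exp(sign * 2j * math.pi * k * m / n) for m in range(n))
        out.append(acc / n if inverse else acc)
    return out

def papr_db(signal):
    """Peak-to-average power ratio of a block of time samples, in dB."""
    powers = [abs(s) ** 2 for s in signal]
    return 10 * math.log10(max(powers) / (sum(powers) / len(powers)))

def modulate(symbols, n_fft, dft_spread):
    """Map M symbols onto the first M of n_fft subcarriers, then go to the time domain."""
    freq = dft(symbols) if dft_spread else list(symbols)
    grid = freq + [0j] * (n_fft - len(symbols))
    return dft(grid, inverse=True)

def mean_papr_db(dft_spread, m=16, n_fft=64, trials=40, seed=1):
    """Average PAPR over several random QPSK blocks (same blocks for both schemes)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        qpsk = [complex(rng.choice((-1, 1)), rng.choice((-1, 1))) for _ in range(m)]
        total += papr_db(modulate(qpsk, n_fft, dft_spread))
    return total / trials

papr_ofdma = mean_papr_db(dft_spread=False)
papr_scfdma = mean_papr_db(dft_spread=True)
print(f"mean PAPR without DFT spreading: {papr_ofdma:.2f} dB")
print(f"mean PAPR with DFT spreading:    {papr_scfdma:.2f} dB")
```

The only difference between the two branches is the extra DFT applied before subcarrier mapping, which is what makes each subcarrier carry a linear combination of all M symbols.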

4. Creating Two Databases: Types of Channel Models

We use the model implemented according to R4-070872 3GPP TR 36.803v0.3.0 [6] .

Our aim is to generate three types of channel models to build the example database for mobile radio communication networks (Figure 2).

The standard defines a set of three channel models to simulate propagation conditions over the multipath fading air interface. Multipath fading is modeled as a tapped delay line with a number of taps at fixed positions on a sampling grid.

Figure 1. Orthogonal frequency division multiple access.

Figure 2. Frame constitution of SC-FDMA.

The gain associated with each tap is characterized by a distribution (Ricean with a factor K > 0 or Rayleigh with a factor K = 0).

The gain is also characterized by the maximum Doppler frequency determined from the speed of the mobile.
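The relation between the speed of the mobile and the maximum Doppler frequency is the standard f_d = v · f_c / c. The short sketch below applies it; the 2 GHz carrier and the three speeds are hypothetical values chosen only for illustration.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s

def max_doppler_hz(speed_kmh, carrier_hz):
    """Maximum Doppler shift f_d = v * f_c / c for a mobile moving at speed v."""
    speed_ms = speed_kmh / 3.6
    return speed_ms * carrier_hz / SPEED_OF_LIGHT

# Illustrative pedestrian, urban and vehicular speeds at an assumed 2 GHz carrier.
for speed in (3, 50, 120):
    print(f"{speed:>3} km/h at 2 GHz -> f_d = {max_doppler_hz(speed, 2e9):.1f} Hz")
```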

For each tap, the filtered noise method is used to generate channel coefficients with the specified distribution and power spectral density. The three specific channels are defined in Tables 1-5.

In this part of the work we used Matlab to create two databases, which are essential for training the neural network. First, Matlab randomly generated the information sequences, 100 thousand at a time; we then fed these sequences into an SC-FDMA transmission chain to obtain the signal X(t) (Figure 3).

The output of this chain is then passed through a transmission channel following one of the three models mentioned above, yielding each time the signal Y(t).

Table 1. Parameters of channel models.

Table 2. Pin Inputs and pin outputs of channel models.

Table 3. Extended Pedestrian A model (EPA).

Table 4. Extended Vehicular A model (EVA).

Table 5. Extended Typical Urban model (ETU).

5. Channel Estimation

The most classical channel model assumes that the impulse response is Wide-Sense Stationary (WSS) and that the scatterers are uncorrelated (Uncorrelated Scatterers, US). This WSSUS model was introduced by P.A. Bello in 1963. It characterizes short-term variations well, for displacements of the order of a few tens of wavelengths [7].

Figure 3. A block-diagram of an uplink SC-FDMA transmitter and receiver.

It is also clear that mobile radio channels are subject to multipath propagation. To obtain the statistical law of the coefficient ${\alpha}_{l}\left(t\right)$ associated with path l, it is necessary to introduce the notion of a cluster (a group of micro-paths) associated with the delay of path l. Indeed, the reception area of the mobile terminal often contains nearby scatterers that transform a given path into a group of micro-paths with very small path-length differences, and therefore almost the same delay, but with arbitrary phase differences. Thus, the coefficient of each path l is the superposition of the coefficients of all the micro-paths of its cluster.

When a path l corresponds to a multitude of incoherent micro-paths, the probability density of the corresponding coefficient ${\alpha}_{l}\left(t\right)={\rho}_{l}\left(t\right){\text{e}}^{j{\theta}_{l}\left(t\right)}$ is complex Gaussian, by the central limit theorem. We deduce that:

・ The real and imaginary parts of ${\alpha}_{l}\left(t\right)$ are uncorrelated Gaussian variables with variance ${\sigma}_{{\alpha}_{l}}^{2}$

・ The modulus (envelope) ${\rho}_{l}$ of the coefficients then follows a Rayleigh law given by:

$p\left({\rho}_{l}\right)=\left\{\begin{array}{ll}\frac{{\rho}_{l}}{{\sigma}_{{\alpha}_{l}}^{2}}{\text{e}}^{-\frac{{\rho}_{l}^{2}}{2{\sigma}_{{\alpha}_{l}}^{2}}} & \text{for}\text{\hspace{0.17em}}{\rho}_{l}\ge 0\\ 0 & \text{otherwise}\end{array}\right.$ (1)

where $2{\sigma}_{{\alpha}_{l}}^{2}$ is the mean power gain associated with path l.

The phase ${\theta}_{l}\left(t\right)$ of the coefficients is uniformly distributed between 0 and 2π.
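These properties can be checked numerically. The sketch below draws complex Gaussian coefficients with independent real and imaginary parts of variance σ² and compares the empirical mean power with 2σ² and the empirical mean envelope with the Rayleigh mean σ√(π/2). The value σ = 0.5 and the sample count are arbitrary choices made for this illustration.

```python
import math
import random

def rayleigh_tap(rng, sigma):
    """One complex path coefficient: independent N(0, sigma^2) real and imaginary parts."""
    return complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))

rng = random.Random(0)
sigma = 0.5
taps = [rayleigh_tap(rng, sigma) for _ in range(20000)]

mean_power = sum(abs(a) ** 2 for a in taps) / len(taps)
mean_envelope = sum(abs(a) for a in taps) / len(taps)

print(f"mean power    = {mean_power:.3f}  (theory 2*sigma^2 = {2 * sigma ** 2:.3f})")
print(f"mean envelope = {mean_envelope:.3f}  "
      f"(theory sigma*sqrt(pi/2) = {sigma * math.sqrt(math.pi / 2):.3f})")
```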

6. Channel Equalization

The signal propagates through the air, which is the physical medium between the transmitter and the receiver. On reception of the signal, two operations are carried out. The first is equalization at the input of the receiver, whose role is to remove the effect of the channel impulse response. The second is channel estimation, which predicts the behavior of the channel and its variations over time.

There are many channel estimation techniques. The best known is the Minimum Mean Square Error (MMSE) estimator, known for its efficiency and its high complexity. The ZF method, in contrast, is known for its simplicity but low efficiency. The MMSE estimator uses the second-order statistics of the channel to minimize the mean squared error; estimation based on the MMSE criterion thus improves accuracy by exploiting information about the channel. The algorithm can be implemented in the frequency domain.
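For reference, a commonly used frequency-domain form of the linear MMSE estimator is given below. It is a textbook expression and not necessarily the exact variant simulated in this work. The notation is assumed for this illustration: $\mathbf{C}_{H}$ is the channel covariance matrix across subcarriers, ${\sigma}_{W}^{2}$ the noise variance, $\mathbf{Y}$ the diagonal matrix of reference symbols, and $\hat{\mathbf{H}}_{ZF}$ the vector obtained by dividing each received symbol by its reference symbol.

$\hat{\mathbf{H}}_{MMSE}=\mathbf{C}_{H}{\left(\mathbf{C}_{H}+{\sigma}_{W}^{2}{\left(\mathbf{Y}{\mathbf{Y}}^{H}\right)}^{-1}\right)}^{-1}\hat{\mathbf{H}}_{ZF}$

The need to know $\mathbf{C}_{H}$ and ${\sigma}_{W}^{2}$ is what the use of second-order statistics refers to, and the matrix inversion is the main source of the estimator's complexity.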

Zero Forcing (ZF) detection is the simplest signal detection technique used to extract the transmitted signal. Its disadvantage is that it does not take into account the correlation between the transmitter and the receiver, so it produces the largest error, and it cannot completely remove Inter-Symbol Interference. ZF is, however, less complex than the other mechanisms.

The Zero-Forcing (ZF) estimation technique, as suggested in the LTE standard for SC-FDMA systems, relies on a reference sequence that is localized on all the carrier frequencies of the transmitted signal and consists of particular signals whose statistics are used at reception to estimate the channel [8].

In general, the impulse response of a time-varying mobile radio channel can be represented by an FIR filter, as indicated in (2). However, frequency-domain channel estimation techniques operate in packet mode. This assumes that the channels considered are invariant throughout the transmission of each packet, so these techniques apply only to quasi-stationary channels. Equation (3) gives the corresponding mathematical representation.

$h\left(\tau ,t\right)={\displaystyle {\sum}_{n}{\alpha}_{n}\left(t\right){\text{e}}^{-j2\pi {f}_{c}{\tau}_{n}\left(t\right)}\delta \left(\tau -{\tau}_{n}\left(t\right)\right)}$ (2)

$h\left(\tau \right)={\displaystyle {\sum}_{n}{\alpha}_{n}{\text{e}}^{-j2\pi {f}_{c}{\tau}_{n}}\delta \left(\tau -{\tau}_{n}\right)}$ (3)

The ZF channel estimation technique estimates the frequency response of the channel [9]. Each coefficient estimated in the frequency domain represents the frequency response of the channel on a given subcarrier.

The ZF channel estimation technique goes hand in hand with the insertion of the guard interval, which transforms the linear convolution of the channel with the transmitted signal into a circular convolution. Knowing that the DFT of the circular convolution of two discrete signals is the product of their DFTs, we can write Equation (5), where H_{k} and Y_{k} represent the frequency responses of the channel and of the packet sent, respectively, R_{k} is the received signal, and W_{k} is the channel noise on carrier n_{k} (Table 6).

It can be seen that the symbols Y_{k} do not interfere with one another in this equation. This is the second advantage provided by the guard interval.

${H}_{k}=DFT\left\{{h}_{n}\right\}$ (4)

${R}_{k}={H}_{k}{Y}_{k}+{W}_{k}$ (5)
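The DFT property behind Equation (5) can be verified numerically in the noiseless case. The sketch below uses a hypothetical 3-tap channel and an 8-sample block, both chosen only for illustration, and checks that the DFT of their circular convolution equals the product H_{k}Y_{k}.

```python
import cmath
import math

def dft(x):
    """Naive DFT, sufficient for short illustrative blocks."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * math.pi * k * m / n) for m in range(n))
            for k in range(n)]

def circular_convolve(h, y):
    """Circular convolution of channel taps h with one block y (len(h) <= len(y))."""
    n = len(y)
    return [sum(h[m] * y[(i - m) % n] for m in range(len(h))) for i in range(n)]

# Hypothetical channel taps and transmitted block.
h = [1.0 + 0.0j, 0.5 - 0.2j, 0.1 + 0.3j]
y = [1, -1, 1j, -1j, 1, 1, -1j, -1]

H = dft(h + [0j] * (len(y) - len(h)))   # Equation (4): H_k = DFT{h_n}
Y = dft(y)
R = dft(circular_convolve(h, y))        # noiseless received block, frequency domain

max_error = max(abs(r - hk * yk) for r, hk, yk in zip(R, H, Y))
print(f"max |R_k - H_k Y_k| = {max_error:.2e}")
```

Because each R_{k} depends only on the H_{k} and Y_{k} of the same index, the subcarriers can be processed independently.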

In Equation (5), when the source symbols are known at the moment of channel estimation, they are referred to as reference symbols or pilot symbols.

To obtain an estimate of the channel, it is then enough to divide the signal received on each subcarrier by the reference symbol that modulates it: this is the Zero-Forcing estimation technique (Figure 4).
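The per-subcarrier division can be sketched as follows. The channel gains, the unit-modulus reference symbols and the noise level are hypothetical values used only to exercise Equation (5); the function name zf_estimate is likewise illustrative.

```python
import cmath
import math
import random

def zf_estimate(received, reference):
    """Zero-Forcing estimate: divide each received symbol by its known reference symbol."""
    return [r / y for r, y in zip(received, reference)]

rng = random.Random(3)
n = 12
# Assumed channel gains, reference symbols and noise samples for one block.
H_true = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
Y_ref = [cmath.exp(2j * math.pi * rng.random()) for _ in range(n)]
noise_std = 0.05
W = [complex(rng.gauss(0, noise_std), rng.gauss(0, noise_std)) for _ in range(n)]

R = [h * y + w for h, y, w in zip(H_true, Y_ref, W)]   # Equation (5)
H_zf = zf_estimate(R, Y_ref)

mse = sum(abs(a - b) ** 2 for a, b in zip(H_zf, H_true)) / n
print(f"ZF estimation MSE = {mse:.4f}")
```

The estimation error on each subcarrier is W_{k}/Y_{k}, so the noise is passed through unchanged when the reference symbols have unit modulus and is amplified when they are weak, which is consistent with the low efficiency attributed to ZF.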

There are two ways of inserting reference symbols: in the time domain, where they are treated as a training sequence, or in the frequency domain, where some subcarriers of the signal are dedicated to transmitting them throughout the communication. Both methods can also be combined for joint time and frequency estimation of the transmission channel.

Table 6. Simulation parameters.

Figure 4. Comparison of MMSE Simulator with ZF simulator.

7. Use of Neural Networks for Channel Estimation

The problem addressed in our work is to find methods that give satisfactory estimates of the uplink LTE-A transmission channel. The quality of this estimate plays a very important role and has a direct impact on reception quality as well as on throughput. The concern of successive generations of mobile communication systems has been to increase throughput while maintaining excellent reception quality.

In this paper, we present new channel estimation techniques adapted to the SC-FDMA system, all belonging to the class of techniques called Data Aided (DA), which rely on an added signal. For this category of estimators, the information used is a part of the transmitted signal located in the time-frequency domain, called the training sequence or reference signal.

8. Channel Estimation Based on Preamble Sequences

The preamble sequences are inserted at the beginning of each communication, making it possible to estimate the transmission channel before the information-bearing signal is sent. This technique works well if the channel does not vary after the training period. However, the slightest variation of the channel during the transmission of the information-bearing signal would irreversibly prevent detection at the receiver. To account for this possibility, preamble sequences are inserted regularly throughout the communication. For example, in SC-FDMA systems a preamble sequence is inserted in the middle of each slot. The channel is then re-estimated each time a preamble sequence is detected at the receiver. The disadvantage of this method is the cost in useful throughput: the more training sequences there are, the better the channel is tracked, but this is paid for by a decrease in useful throughput [10].

9. Estimation by Distributed Pilots and Interpolation

This technique allows the frequency response of the channel to be evaluated continuously, on each transmitted packet, using subcarriers dedicated to sending pilot symbols. The pilots are inserted in the frequency domain on some of the subcarriers. Since the pilots are not located on all subcarriers, interpolation is needed to determine the channel coefficients on all the subcarriers of the signal. Indeed, when the spacing between the pilot subcarriers is much smaller than the coherence bandwidth of the channel, which corresponds to the inverse of the delay spread generated by multipath, neighboring subcarriers are affected by the channel in practically the same way. It is therefore possible to estimate the channel response on all subcarriers located between any two pilot subcarriers by simple interpolation.
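A minimal sketch of first-order (linear) interpolation between pilot subcarriers is given below. The two-path test channel, the pilot spacing of 4 and the helper name interpolate_pilots are assumptions made for this illustration; outside the pilot range the nearest pilot value is simply held.

```python
import cmath
import math

def interpolate_pilots(pilot_idx, pilot_vals, n_subcarriers):
    """Piecewise-linear interpolation of complex channel gains between pilot subcarriers."""
    est = []
    j = 0
    for k in range(n_subcarriers):
        if k <= pilot_idx[0]:
            est.append(pilot_vals[0])
        elif k >= pilot_idx[-1]:
            est.append(pilot_vals[-1])
        else:
            # Advance to the pilot pair that brackets subcarrier k.
            while pilot_idx[j + 1] < k:
                j += 1
            k0, k1 = pilot_idx[j], pilot_idx[j + 1]
            t = (k - k0) / (k1 - k0)
            est.append((1 - t) * pilot_vals[j] + t * pilot_vals[j + 1])
    return est

# Hypothetical smooth channel: a direct path plus one weaker delayed path.
n_sc = 25
H_true = [1 + 0.5 * cmath.exp(-2j * math.pi * k / 64) for k in range(n_sc)]
pilot_idx = list(range(0, n_sc, 4))
pilot_vals = [H_true[k] for k in pilot_idx]

H_est = interpolate_pilots(pilot_idx, pilot_vals, n_sc)
max_err = max(abs(a - b) for a, b in zip(H_est, H_true))
print(f"max interpolation error = {max_err:.4f}")
```

The error stays small here because the pilot spacing is well below the scale over which this channel varies; widening the spacing or increasing the delay of the second path makes the linear approximation degrade.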

There are many interpolation techniques: first-order (linear) interpolation, second-order interpolation, oversampling based on the Fourier transform, etc. These techniques will be compared with our proposed intelligent method. Another class of interpolation, which combines the time and frequency domains and relies on symbol correlation in both, is also used in some systems.

10. Using Intelligent Method for Interpolation: Learning Process of a Neural Network

After applying classical interpolation methods, we propose in this paper two intelligent interpolation methods. Neural networks can contribute to this phase of channel estimation, as we showed in a previous paper [1]. We build on that work and introduce the deep learning method.

As we know, an Artificial Neural Network (ANN) is made up of neurons connected to each other; each connection is associated with a weight that dictates the importance of that connection to the neuron, since the weight multiplies the input value. Each neuron has an activation function that defines its output. The activation function is used to introduce non-linearity into the modeling capabilities of the network, and several choices of activation function are available. Training an ANN consists of learning the values of its parameters (weights w_{ij} and biases b_{j}); this is the core of learning, and we can see the learning process in a neural network as an iterative “forward and back” pass through the layers of neurons. The forward pass is a forward propagation of the information and the return pass is a backpropagation of the information [11].

The first phase, forward propagation, occurs when the network is exposed to the training data and these cross the entire ANN so that their predictions can be calculated. That is, the input data pass through the network in such a way that all the neurons apply their transformation to the information they receive from the neurons of the previous layer and send it to the neurons of the next layer. When the data have crossed all the layers and all the neurons have made their calculations, the final layer is reached with a label prediction for those input examples (Figure 5).
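Forward propagation can be sketched in a few lines. The layer layout, the tanh activation on hidden layers, the linear output layer and the numerical weights below are all assumptions made for this illustration, not the configuration used in the experiments.

```python
import math

def forward(x, layers):
    """Propagate input vector x through the layers.

    layers is a list of (weights, biases); weights[j][i] links input i to neuron j.
    Hidden layers use tanh, the last layer is left linear.
    """
    a = x
    for idx, (w, b) in enumerate(layers):
        z = [sum(wji * ai for wji, ai in zip(row, a)) + bj for row, bj in zip(w, b)]
        is_output = idx == len(layers) - 1
        a = z if is_output else [math.tanh(v) for v in z]
    return a

# Hypothetical 2-input network with one hidden layer of 2 neurons and 1 output.
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),   # hidden layer
    ([[2.0, 1.0]], [0.1]),                     # output layer
]
prediction = forward([1.0, 1.0], layers)
print(prediction)
```

Adding more `(weights, biases)` pairs to the list makes the same loop traverse a deeper network without any other change.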

Figure 5. Artificial neural networks.

Next, we will use a loss function to estimate the error and to compare and measure how good/bad our prediction result was in relation to the correct result (remember that we are in a supervised learning environment and we have the label that tells us the expected value). Ideally, we want our cost to be zero, that is, without divergence between estimated and expected value. Therefore, as the model is being trained, the weights of the interconnections of the neurons will gradually be adjusted until good predictions are obtained.

Once the loss has been calculated, this information is propagated backwards; hence the name backpropagation. Starting from the output layer, the loss information propagates to all the neurons in the hidden layer that contribute directly to the output. However, the neurons of the hidden layer receive only a fraction of the total loss signal, based on the relative contribution that each neuron has made to the original output. This process is repeated, layer by layer, until all the neurons in the network have received a loss signal that describes their relative contribution to the total loss.

Visually, we can summarize what we have explained with this visual scheme of the stages:

Now that we have propagated this information back, we can adjust the weights of the connections between neurons. The goal is to bring the loss as close as possible to zero the next time the network is used for a prediction. For this, we use a technique called gradient descent. This technique changes the weights in small increments using the derivative (or gradient) of the loss function, which indicates in which direction “to descend” towards a minimum of the loss; this is generally done on batches of data, over successive passes (epochs) through the whole dataset. To recap, the learning algorithm consists of:

1) Start with values (often random) for the network parameters (w_{ij} weights and b_{j} biases).

2) Take a set of examples of input data and pass them through the network to obtain their prediction.

3) Compare these predictions with the values of the expected labels and calculate the loss with them.

4) Perform the backpropagation in order to propagate this loss to each and every one of the parameters that make up the model of the neural network.

5) Use this propagated information to update the parameters of the neural network with gradient descent, in such a way that the total loss is reduced and a better model is obtained.

In summary, we can consider backpropagation as a method to alter the parameters (weights and biases) of the neural network in the right direction. It starts by calculating the loss term first, and then the parameters of the neural network are adjusted in reverse order with an optimization algorithm taking into account this calculated loss.
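The learning algorithm recapped above can be sketched on a toy regression problem. Everything in this example is an assumption made for illustration: a scalar input, one hidden layer of 6 tanh neurons, a linear output, a mean squared error loss, full-batch gradient descent, and a sine curve as the target. It is not the network trained in this work.

```python
import math
import random

def train(samples, hidden=6, lr=0.05, epochs=300, seed=0):
    """Full-batch gradient descent on a 1-input, one-hidden-layer, 1-output network."""
    rng = random.Random(seed)
    # Start with random weights and zero biases.
    w1 = [rng.uniform(-1, 1) for _ in range(hidden)]
    b1 = [0.0] * hidden
    w2 = [rng.uniform(-1, 1) for _ in range(hidden)]
    b2 = 0.0
    n = len(samples)
    history = []
    for _ in range(epochs):
        g_w1, g_b1, g_w2, g_b2 = [0.0] * hidden, [0.0] * hidden, [0.0] * hidden, 0.0
        loss = 0.0
        for x, target in samples:
            # Forward propagation.
            a = [math.tanh(w1[j] * x + b1[j]) for j in range(hidden)]
            pred = sum(w2[j] * a[j] for j in range(hidden)) + b2
            # Loss: mean squared error over the batch.
            err = pred - target
            loss += err * err / n
            # Backpropagation: chain rule, output layer first.
            d_pred = 2.0 * err / n
            g_b2 += d_pred
            for j in range(hidden):
                g_w2[j] += d_pred * a[j]
                d_z = d_pred * w2[j] * (1.0 - a[j] ** 2)
                g_w1[j] += d_z * x
                g_b1[j] += d_z
        # Gradient descent update of every parameter.
        for j in range(hidden):
            w1[j] -= lr * g_w1[j]
            b1[j] -= lr * g_b1[j]
            w2[j] -= lr * g_w2[j]
        b2 -= lr * g_b2
        history.append(loss)
    return history

samples = [(x / 10.0, math.sin(math.pi * x / 10.0)) for x in range(-10, 11)]
history = train(samples)
print(f"loss: first epoch {history[0]:.4f}, last epoch {history[-1]:.4f}")
```

The recorded loss history makes the effect of the update rule visible: each epoch repeats the same forward, loss, backward and update cycle on the whole batch.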

Three arguments are passed to the method: an optimizer, a loss function, and a list of metrics. In classification problems like our example, accuracy is used as a metric (Figure 6).

11. Deep Learning for Channel Estimation

Deep Learning, also called deep neural learning or deep neural networks, can be introduced as a function within artificial intelligence. When processing data, solving a problem and creating models used in decision-making, Deep Learning tries to mimic the behavior of the human brain. Within Artificial Intelligence, Deep Learning is a subset of machine learning; its most important components are robust networks. These artificial neural networks can learn independently, without any outside control, from a database of examples [12].

12. How Deep Learning Works

Deep learning has evolved hand-in-hand with the digital era, which has brought about an explosion of data in all forms and from every region of the world. This data, known as big data, is drawn from sources like social media, internet search engines, e-commerce platforms, and online cinemas, among others. This huge amount of data is easily accessible and can be shared through fintech applications like cloud computing.

Although they have for some time proven their performance, ANNs are traditionally made up of few hidden layers. In the cases we have studied so far, the ANN is composed of only one hidden layer, for several reasons, chiefly technical limitations in processor computing power. Nowadays, thanks to many technological advances and ever larger databases, the number of hidden layers in neural networks has been able to grow. Deep Learning refers to a particular type of Artificial Intelligence that uses neural networks and particular algorithmic models, such as the convolutional neural network model, to generate intelligent models through learning (Figure 7).

Figure 6. Channel estimation by classic ANN.

Figure 7. Deep learning example.

13. Experimentations

We designed a transmission and reception chain. At the output of the source, a binary message is generated randomly; its counterpart at the receiver is its estimate. The second component of the chain is the channel coder, whose output is a coded binary message; its counterpart at the receiver is likewise its estimate. Errors between these two signals are corrected by channel coding. Next comes the mapping, which produces a vector of complex symbols from a constellation Ω of size K, such as QPSK phase modulation or 64-QAM quadrature amplitude modulation. Then comes the serial/parallel converter: the vector of complex data symbols is parallelized onto the different subcarriers. The frequency-domain representation of the impulse response of the transmission channel is a sum of sinusoids. We study the simulated signal U(t) over a period T_{s} corresponding to N samples. U(t) results from a Dirac pulse δ(t) emitted through a channel with L paths and is defined by:

$U\left(t\right)={\displaystyle {\sum}_{\beta =0}^{L}{h}_{\beta}\delta \left(t-{\tau}_{\beta}\right)}$ (7)

with h_{β} a random variable following a zero-mean Gaussian law.

First, the transition to the frequency domain is performed by applying an FFT to this response, which gives the sum of sinusoids representing the transmission channel. Then, to compare the different interpolation methods, we take some well-chosen values of the frequency response; these values act as pilots. We then perform the interpolation using the different interpolation algorithms in order to obtain an estimate of the sum of sinusoids from the pilot values (Figure 8).
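The first two steps of this procedure, going from the tap model of Equation (7) to the frequency domain and then sampling pilots, can be sketched as follows. The tap delays (expressed in samples), the use of complex Gaussian gains, N = 64 and the pilot spacing of 8 are assumptions made for this illustration.

```python
import cmath
import math
import random

def frequency_response(gains, delays, n):
    """H_k = sum over paths of h_beta * exp(-j 2 pi k tau_beta / N), delays in samples."""
    return [sum(h * cmath.exp(-2j * math.pi * k * tau / n) for h, tau in zip(gains, delays))
            for k in range(n)]

rng = random.Random(7)
N = 64
delays = [0, 2, 5, 9]   # hypothetical tap positions on the sampling grid
gains = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in delays]   # zero-mean Gaussian

H = frequency_response(gains, delays, N)   # the "sum of sinusoids"

# Keep every 8th value of the frequency response as a pilot.
pilot_idx = list(range(0, N, 8))
pilots = [H[k] for k in pilot_idx]
print(f"{len(pilots)} pilots taken out of {N} frequency samples")
```

Each path contributes one complex sinusoid across the subcarrier index k, with a rate of oscillation set by its delay, which is why longer delays call for denser pilots.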

Finally, we calculate the BER between the values of the frequency response estimated by the interpolation algorithms on one side and the exact value on the other side.

14. Results and Conclusions

We used the same simulation parameters for the following four cases:

1) An interpolation with the MMSE technique

2) An interpolation using a classical artificial neural network, multilayer perceptron

3) An interpolation using a hybrid artificial neural network

4) An interpolation using deep learning

The results of the simulations are illustrated in the figures below. We see a slight improvement from one method to the next. Deep Learning, thanks to its hidden layers, gives better results than a neural network with a single hidden layer, but its processing time is longer (Figure 9).

Figure 8. Channel estimation by HANN, MMSE, PMC and DL.

Figure 9. Channel estimation by HANN and DL.

15. Conclusions

In this article, we have improved the results of channel estimation in the SC-FDMA context. Simulation results obtained by our Deep Learning algorithm with its hidden layers were compared to the results obtained by classic estimators and a simple ANN.

The simulation results show that, compared to the conventional LS, MMSE and ANN algorithms, DL performs better and clearly improves the outputs. A complexity study has shown that DL has a low complexity compared to MMSE, but its processing time is longer because of the hidden layers.

References

[1] Smirani, L., Boulahia, J. and Bouallegue, R. (2017) A Semi Blind Channel Estimation Method Based on Hybrid Neural Networks for Uplink LTE-A.

[2] Boulahia, J., Smirani, L. and KSA, M.A. (2015) Experiments of a Neuro Symbolic Hybrid Learning System with Incomplete Data.

[3] Behjati, M., et al. (2020) What Is the Value of Limited Feedback for Next Generation of Cellular Systems? Wireless Personal Communications, 110, 1127-1142.
https://doi.org/10.1007/s11277-019-06777-1

[4] Chen, W., et al. (2019) Uplink Procedures for LTE/LTE-A Communication Systems with Unlicensed Spectrum.

[5] Roh, J.C., Bertrand, P. and Yao, J. (2020) Transmission Scheme for Sc-FDMA with Two DFT-Precoding Stages.

[6] Habib, B. and Farhat, H. (2018) Channel Hardware Simulator Design and Implementation for MIMO Time-Varying 802.15.7 VLC Indoor Signals. 2018 IEEE Middle East and North Africa Communications Conference (MENACOMM), 18-20 April 2018, Jabal Loubnane, Lebanon.
https://doi.org/10.1109/MENACOMM.2018.8370999

[7] Alencar, M.S. and da Rocha Jr., V.C. (2020) Propagation Channels, in Communication Systems. Springer, New York, 207-235.
https://doi.org/10.1007/978-3-030-25462-9_7

[8] Minango, J. and de Almeida, C. (2019) Hyper-Power Zero Forcing Detector for Massive MIMO Systems. Wireless Networks, 25, 4349-4357.
https://doi.org/10.1007/s11276-019-02099-z

[9] Miridakis, N.I., Tsiftsis, T.A. and Wang, H.-M. (2019) Zero Forcing Detection for Short Packet Transmission under Channel Estimation Errors. IEEE Transactions on Vehicular Technology, 68, 7164-7168.
https://doi.org/10.1109/TVT.2019.2913886

[10] Barak, E. (2019) Mud Pulse Telemetry Preamble for Sequence Detection and Channel Estimation.

[11] Liu, H., et al. (2019) Recurrent Neural Network-Based Approach for Sparse Geomagnetic Data Interpolation and Reconstruction. IEEE Access, 7, 33173-33179.
https://doi.org/10.1109/ACCESS.2019.2903599

[12] Soltani, M., et al. (2019) Deep Learning-Based Channel Estimation. IEEE Communications Letters, 23, 652-655.
https://doi.org/10.1109/LCOMM.2019.2898944