Performances prediction in Wireless Sensor Networks: A survey on Deep learning based-approaches

The dynamism and successive changes in the distribution of nodes are among the important characteristics of wireless sensor networks (WSNs). To adapt with those changes, network designers frequently have to configure multiple parameters for each layer on WSN architecture (i.e physical, medium access control, network, and application layer). This tuning has an important impact on the network performances (e.g packet loss, energy efficiency, throughput, network lifetime, etc). However, finding the optimal configuration is the main challenge. Deep learning (DL) based on neural network layers can be used to extract patterns from high-dimensional data provided by sensor nodes. In this paper, we survey the most recent DL approaches which aim to predict WSN performances by finding the pattern on the network parameters (such as transmission power level, MAC protocol type, spectrum availability, congestion points, etc.). Moreover, we classify the studied articles by considering the targeted network layer or cross-layer. This paper can be considered as a starting point for researchers to review the recent DL applications on the optimization of WSNs performances based on multiple network parameters.


INTRODUCTION
The world's attention today is focused on enabling artificial intelligence in a range of areas.Wireless sensor networks show their capabilities in achieving this aim in many fields such as healthcare centers, smart buildings, agriculture environments, and industrial factories [1].However, those applications often require a large and dynamic distribution of nodes [2].Accordingly, they are a subject of a set of constraints such as low packet loss, high network lifetime, high throughput, high energy efficiency, and low delay.For this reason, the network manager always finds himself faced with modifying a set of stack parameters to meet those network requirements.
Traditionally, the configuration of stack parameters in most deployments is based on common patterns from experience and rules of thumb, as a consequence of an insufficient understanding of the combined effect of the stack parameters on the performance [3].Therefore, the process of extracting those patterns needs to be more accurate using an optimal approach that can fill this gap.For an intelligent configuration of WSNs, the workflow to adapt may consist of five components: problem identification, data collection, data analysis, visualization, and evaluation [4].In data analysis, traditional technologies use machine learning (e.g.support vector machine, logistic regression, and random forest, etc) for either classification or regression tasks (e.g.intrusion detection in WSNs with support vector machine (SVM) [5], sensors localization using Kalman filter and ridge regression [6], fault detection in WSNs with random forest [7]).Nevertheless, as the IoT devices become a source of massive and irregular data, such conventional ML approaches are not powerful for efficient feature extraction [8].Indeed, most traditional ML techniques applied in IoT (internet of things) systems usually utilize shallow architectures, which are characterized by limited modeling and representational power.For this reason, a deep approach is highly desirable for deep pattern extraction from the invaluable raw data generated in various IoT applications.The recent development of deep learning architectures [9], frameworks [10], and hardware make it the best choice to deploy much more deep and powerful models.
In this article, we will survey the most up-to-date deep learning approaches for WSN performance optimization.First the stack layer architecture of WSNs networks will be presented by explaining the aim and the impact of stack parameters configuration on WSNs performances.Secondly, we will investigate the application of DL on performance prediction for each layer.At the end, we will survey the cross-layer approaches with the capability of deep learning in achieving accurate predictions of performance trade-offs between different layers.

Protocol stack of WSNs
The most common architecture for wireless sensor networks follows the OSI Model [11].Basically in WSNs, we need five layers: application layer, transport layer, network layer, data link layer, and physical layer.Each of the previous layers has a set of parameters that need to be configured correctly in order to optimize one or multiple network performances as shown in Figure 1.
In the following, we will define the services granted by PL, MAC, and network layers.
The aim of PL is to ensure the following services: signal detection, frequency selection, modulation, data encryption, and carrier frequency generation.The number and physical distribution of sensors in many networks make it hard to guarantee the above services along with saving energy consumption.The PL layers parameters accessible by the network designer cover the transmit power level, modulation scheme, and hop distance.The best values for these parameters are determined by their interactions as well as the network needs.When a wireless communication is received, the signal power to noise power ratio of the channel (i.e., SNR) may be deciphered with a given likelihood of a mistake.The probability of mistake decreases as the amount of energy utilized in transmission grows, thus Figure 1 Stack protocol of WSNs the number of retransmissions decreases.Hence, the expected number of retransmissions and the transmit power level have an optimal trade-off to minimize the energy consumption.Likewise, the hop distance is the third player, with its trade-off with the transmit power level.On the other hand, the modulation/coding scheme controls the probability of transmission success [13].
The network layer is responsible for route maintenance.Regarding [15], there is a set of performance metrics to meet by routing protocols.One of the techniques adopted is multipath routing, which provides data reliability by the creation of multiple data copies being sent from sender to receiver node relying on multiple paths.Another approach is energy-Aware routing, where an optimal routing path is determined concerning energy consumption.Well as minimizing routing control overhead by reducing transmission of unnecessary control packets.

Deep learning
DL constitutes a sub-branch of machine learning (i.e ML), with the ability to make predictions, classifications, or decisions based on high dimensional data, without being explicitly programmed.DL algorithms hierarchically extract patterns from raw data through multiple hidden layers of nonlinear processing units, to make predictions (i.e mapping between images and their class labels, computing future stock prices based on historical values) or take actions (i.e deciding the next optimal chess move given the current status on the board) according to some target objective.In contrast, the traditional ML models rely heavily on features defined by domain experts.

PHYSICAL LAYER SERVICES IMPROVEMENT USING DL
The physical layer has a set of parameters including a Modulation scheme, the distance between nodes, and transmit power level.Many Deep Learning-based approaches are proposed to find the optimal values of these parameters by predicting network performances.
In [16], two PL parameters are studied namely Transmission power level (TP) and the distance between the nodes (DT).The authors propose a deep Neural Network-based model using the data obtained from a Mixed Integer Programming (MIP) model.The data used is from a 25-node grid network topology.The distance between the nodes is varying from 1 to 40 m with 26 different transmission power levels.The experiment was performed 100 times under random configurations of both parameters.The results show that the two physical layer parameters can be used to predict the optimal network lifetime with high accuracy.The limits of this approach reside in two parts: First the amount of the data used for training the DL models, where the authors used 560 values which make the exploitation of other complicated neural networks architectures unlikely such as CNN.The second limitation comes with the capability of the model to predict just one parameter based on two others, but in most of the scenarios it's required to predict two or more parameters simultaneously, for example, if need to predict the distance between the nodes without previous knowledge of the network lifetime.But, predicting them at once will give the user the knowledge about the distance and the network lifetime.On the other hand, if the distance and the transmission power level are already known, this work gives great results in predicting the lifetime, and according to this prediction, the network designer can adjust the input parameters.
The modulation type of received signals is defined by modulation classification.The authors in [17] suggested a CNN-based DL scheme to improve the classification accuracy for complex modulation signals.Since various modulation methods have different constellation diagrams, the proposed approach employs Alex-Net to distinguish constellation diagrams, allowing the modulation method to be pinpointed.Alex-Net is a CNNbased deep learning model that can classify 1.2 million images into 1000 categories and has thousands of neurons and millions of connections [18].The proposed model can reliably distinguish QPSK, 8PSK, 16QAM, and 64QAM signals, according to simulations, and it has comparable accuracy to conventional modulation classification schemes like cumulant-based schemes and Support Vector Machine (SVM) based schemes.The subject of DL-based modulation classification has a lot of potential.However, simply looking at the constellation diagram's graphic pattern can restrict the classification effect.If the image classification is aimed at modulation parameters analysis, the classification efficiency could be improved.
Physical layer functions can be represented as an autoencoder including modulation, error correction coding and signal classification, etc.In [19] the authors train an auto-encoder as a CNN, and they test on a single end-to-end communication system, multipletransmitter/receiver system, and radio transformer networks.The proposed approach decreases the block error rate (BLER) by 1-5 times compared to the traditional modulation methods (e.g, QAM and BPSK) and error-correcting code (Hamming code) in multipletransmitter/receiver systems.Regarding Rayleigh fading communication scenarios (with radio transformer networks), the proposed model decreases the BLER by 1-7 times.

MAC LAYER SERVICES IMPROVEMENT USING DL
Spectrum prediction and MAC identification are the major concerns of wireless sensor networks [13].To address those constraints, several MAC protocols have been proposed.Time Division Multiple Access (TDMA), Carrier Sense Multiple Access/Collision Avoidance (CSMA/CA), and Code Division Multiple Access represent the traditional MAC protocols in sensor wireless networks [8].WSNs are a subject of many environmental changes, therefore an automatic configuration of the mac protocols parameters is necessary.Using Deep learning we can have an intelligent MAC protocol that can make predictions related to the spectrum and MAC identification.

Spectrum prediction
One of the main services of the MAC layer is to grantee the spectrum sharing in a way that minimizes collisions [20].Therefore, the need to predict the wireless medium availability to optimize the network performances.A deep learning-based approach can be built for spectrum prediction.In the literature, some research targets the classification of the mediums to busy or idle based on MAC measurements.Some other approaches focus on channel quality prediction, using idle probabilities or idle duration to select the channel with the highest quality for transmission.In [21], the authors propose a neural networks approach to create a schedule for accessing the medium in MF-TDMA like networks.The NN model learns the correct frequency and time slot to receive packets in a way that minimizes the collision.A CNN-based approach is presented in [22], in the interest of predicting the upcoming spectrum usage of other neighboring networks.The data used comes from different unknown sources with different network technologies.In [23], authors target performance improvements in cognitive networks.They design a neural network model which can predict primary users' future activity using past channel occupancy sensing results.Those predictions are used by secondary users to improve the throughput by minimizing collisions with primary users (PUs) in full-duplex.

MAC identification
In cognitive radios (CR) [24], MAC identification approaches are applied to cultivate the communication and coexistence between the different mac protocols in the network.
The information gathered during spectral sensing is used to detect environmental conditions, other technologies' existence, and spectrum holes.Licensed network users usually have frequency bands that are dedicated for their uses, but those frequencies or spectrum holes are not used at a particular time.Thus, they can be used by a CR user.The aim of MAC protocol identification is to help CR users to determine the timing range information of a spectrum hole.Therefore, according to this information, they will tailor its packet transmission duration, which will improve network performance.
The authors in [25] explore the application of CNN in MAC protocol identification.The MAC protocols studied include carrier sense multiple access with collision avoidance (CSMA/CA), time division multiple access (TDMA), Slotted ALOHA, and Pure ALOHA.The raw data of the protocols are converted into spectrograms, which are then used to feed a CNN model.The results showed high accuracy on the identification of the four MAC protocols compared with an SVM model.

NETWORK LAYER SERVICES IMPROVEMENT USING DL
Routing protocols represent the heart of wireless sensor networks.Many routing protocols have been developed expressly for WSNs, including data-centric, hierarchical, and location-based routing protocols [26].Routing protocols based on deep learning algorithms such as reinforcement learning (RL), and neural networks (NNs) have been extensively proposed recently due to their superior performance for complex networks [27].
In [28], the authors propose a deep reinforcement learning based-approach to enhance network lifetime through energy-efficient scheduling.The adapted method follows 3 phases: clustering, Duty cycling, and routing phase.DRL interacts in the Duty cycling phase to dynamically decide each node's scheduling mode (NSM).Three modes are considered, which are sleep, transmit, and listen mode.When a sensor node is in sleep mode, its radio is turned off which reduces energy consumption.The transmit mode allows the sensor node to transmit data to the sink along with the capability of transmitting data from the neighboring sensor node.During listen mode, the sensor node can receive data only.Each node dynamically adapts these three modes in each period based on the DRL approach.A deep Q-Learning is adopted that comprises the following operations: states, actions, Q-value, payoff, and reward [33].The algorithm learns the environment state information and obtains the best action.

THE JOINT EFFECT OF THE PARAMETERS FROM MULTIPLE LAYERS
Most of the literature studies focus on tuning the parameters of a specific layer on the performance.Nevertheless, there is a joint effect of multiple parameters from different layers on the performance, particularly when multiple performance requirements are to be met simultaneously and there are inherent trade-offs between them.As a result, relying on the parameter tuning for each layer at the time may only be able to provide suboptimal performance trade-offs.
The authors in [29] introduce a multilayer perceptron model [30] to predict the energy consumption along with the packet delivery ratio.Around 48,000 configurations of 7 key parameters at different layers are considered for each stack parameter configuration.The parameters used at the PHY layer are the distance (DT) between nodes and the transmission power level (TP).At the MAC layer, we find the maximum number of transmissions (MT), the retry delay time for new retransmission (RD), and the maximum queue size (QS) of the queue on top of the MAC layer used to buffer packets when they are waiting for re-transmission.The Application layer comes with the packet inter-arrival time (TP) and the packet payload size (D).A summary of the stack parameters, their values range, and the rationales behind the considered values are summarized in table 1 and table 2. Prediction results showed that TP, DT, AT, and OF are the most common and contributed features that significantly optimize the prediction error for both PDR and EC.Therefore, it is necessary to consider both metrics when it comes to predicting their values.In their following works [31], the same parameters are used to predict the delay between sending and receiving a packet.The delay is an important WSN delaysensitive and mission-critical application such as emergency response and healthcare.The results showed the impact of AT, IAT, RD, PS, DT, TP, QS, and MT on minimizing the prediction error.As a result, all features have a contribution in predicting delay.
The trade-off between PDR and signal-to-noise ratio (SNR) is studied in [32].An MLP approach is adapted to predict both metrics.Prediction outcome showed the impact of five features for both PDR and SNR.For SNR the features are: OF, AQS, AT, TP, and DT.And for SNR the features that yield the optimal prediction error are OF, AQS, IAT, TP, and DT.Consequently, four out of five parameters for predicting SNR and PDR are the same.

CONCLUSION
The design of wireless sensor networks implies new limitations compared to traditional networks.The major constraint is the management and configuration of WSNs given their variation over time.In this survey, we have addressed the ability of deep learning framework on performance prediction which will give the possibility of a dynamic and accurate configuration of the WSNs either on each stack layer or cross-layer.In the future, we plan to include more DL-based approaches for WSNs such as quality of services prediction and network security based on packet parameters.Furthermore, we recommend going deep on cross-layer approaches which will give a better trade-off between network performances, and also clear the confusion of the configuration of each layer separately.

MT 5 attempts
are enough in our worst channel condition.RD Two different values are chosen to test the effect of retry delay.TP TinyOS provides 8 output power levels from -25 dBm to 0 dBm DT The maximum distance in the experiment scenario is 35 m.PT The protocol type which nodes are used in the current networks MTY The modulation type of received signal SA The spectrum availability to schedule connections.NSM Node scheduling mode for routing protocols.

Table 1 .
Stack parameter configuration

Table 2 .
Stack parameters description