A crowd sensing data collection framework based on edge computing

With the rapid increase in the use of mobile devices equipped with built-in sensors, mobile crowd sensing (MCS) has emerged as a human-driven sensing paradigm. When a large number of users submit data to cloud servers in parallel, the load on the cloud servers increases and data redundancy arises. To address this problem, this paper introduces edge computing into mobile crowd sensing for collecting perception data, and proposes a crowd sensing network data collection model based on edge computing. The data is observed and sampled at the edge node through the compressed sensing algorithm, and the compressed data is transmitted to the cloud server. The CL_BP algorithm then restores the compressed data in the cloud server. The results show that, compared with the orthogonal matching pursuit (OMP) algorithm, the data collection model based on edge-cloud computing proposed in this paper can better solve the problem of data redundancy.


Introduction
Mobile crowd sensing is a new form of data collection, which combines the concept of crowdsourcing with the perception of intelligent devices. Through their existing intelligent mobile devices, users use the embedded sensors to collect and transmit sensing data, so as to complete large-scale sensing and information sharing [1]. Mobile crowd sensing benefits from a large user group, which makes it a source for us to obtain all kinds of rich information in the environment [2].
Traditional mobile crowd sensing services are implemented as centralized, cloud-based services [3][4][5]. In large-scale MCS deployments, device management and real-time data processing then require a lot of computing resources. Moreover, cloud-based crowd sensing places a heavy load on mobile networks and increases the traffic handled by the cloud servers. In real-time sensing scenarios, the context changes frequently when a large number of devices participate in MCS tasks, which increases the delay of data and information dissemination. Edge computing provides a new way to handle these problems.
Mobile edge computing (MEC) offloads computing to the vicinity of mobile devices by introducing a new middle layer [6], which is responsible for data aggregation, filtering, processing and storage, as shown in Figure 1. In the context of crowd sensing, MEC can preprocess the raw perception data collected by intelligent devices. It can also independently support real-time sensing scenarios and handle users' frequently changing context in time, without contacting the cloud server. Executing mobile crowd sensing tasks on top of edge computing [7] reduces the computational complexity of centralized cloud services, reduces the delay caused by sending notifications and alerts to mobile devices in real-time crowd sensing scenarios, and reduces privacy threats.
A mobile crowd sensing system faces several problems during data collection. On the one hand, mobile users must cover all sensing areas in order to obtain reliable and complete data. In practice, however, the platform cannot afford to recruit that many users because of the large cost involved, and sometimes no users are available in the required sensing area. At the same time, full coverage may lead to data redundancy and consume additional system resources. On the other hand, the platform collects a large amount of data after publishing tasks, and processing this data greatly increases the load on the cloud server. Therefore, reducing the transmission of redundant information and the consumption of system resources has become a major problem in research on crowd sensing systems.
To sum up, this paper designs a data collection model based on edge computing. First, after the edge server preprocesses the data collected by users, the data is observed and sampled at the edge node through the compressed sensing algorithm [8], and the compressed data is transmitted to the cloud server. Second, the CL_BP algorithm restores the compressed data, so as to effectively recover the data in the target area.

System model
The model proposed in this paper consists of three layers: a cloud server, multiple edge servers in the region, and a large number of mobile users. The task requester first issues a perception task to the cloud server. The cloud server decomposes the request according to the required locations and distributes the sub-requests to the corresponding edge servers. Each edge server selects appropriate users to perform the sensing task. The users then send the collected data to the edge server through short-range communication technology. The edge server preprocesses the data, compresses it and forwards it to the cloud server. Finally, after collecting all the sensing data, the cloud server reconstructs the data and displays the aggregated results to the requester. The flow of the whole model is shown in Figure 1.

Data compression based on compressed sensing algorithm
This section compresses the data in the region at the edge layer using compressed sensing (CS), so that the filled, complete data of all users from time 1 to time t can be compressed. For a single user P, the perception data obtained after t sampling times is the sequence $x_1, x_2, \ldots, x_t$. The collected perception data can be expressed as the union of the areas covered by different users at different perception times, $X = X_1^t \cup X_2^t \cup \cdots \cup X_m^t$. When compressed sensing is used for data collection, the sparsity of the collected sensing data must first be analyzed to determine whether the data is compressible; the discrete cosine transform (DCT) is used for this analysis. If the data is compressible, some time points are selected through the row vectors of the observation matrix. At each selected time point, the edge server generates a weighted measurement value by multiplying the collected perception data by the corresponding elements of the observation matrix. The measurement value submitted by a single user can therefore be expressed as (1):
$$y_i = \sum_{j=1}^{t} \varphi_{ij} x_j \tag{1}$$
where $y_i$ is the i-th measurement value, $x_j$ is the perception data collected at time j, and $\varphi_{ij}$ is the collection coefficient.
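The DCT-based sparsity check described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the helper names `dct_matrix` and `coeffs_for_energy`, the 99% energy threshold, and the synthetic `readings` series are all assumptions introduced here.

```python
import numpy as np


def dct_matrix(n: int) -> np.ndarray:
    """Orthonormal DCT-II matrix C, so coeffs = C @ x and x = C.T @ coeffs."""
    k = np.arange(n).reshape(-1, 1)   # frequency index (rows)
    j = np.arange(n).reshape(1, -1)   # time index (columns)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * j + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)        # DC row scaling makes C orthonormal
    return c


def coeffs_for_energy(x: np.ndarray, fraction: float = 0.99) -> int:
    """Smallest number of DCT coefficients that hold `fraction` of the energy."""
    coeffs = dct_matrix(len(x)) @ x
    energy = np.sort(coeffs ** 2)[::-1]            # largest first
    cumulative = np.cumsum(energy) / energy.sum()
    count = int(np.searchsorted(cumulative, fraction)) + 1
    return min(count, len(x))


# Hypothetical smooth readings standing in for one user's time series.
t = np.arange(256)
readings = 80 + 30 * np.sin(2 * np.pi * t / 24) + 10 * np.cos(2 * np.pi * t / 168)

k99 = coeffs_for_energy(readings)
print(f"{k99} of {len(readings)} DCT coefficients hold 99% of the energy")
```

If only a small share of the coefficients carries almost all of the energy, the series can be treated as compressible in the DCT domain; the threshold that counts as "small" is a design choice left open here.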
The perception data $x_1$ collected by the user at time $t_1$ is multiplied by its collection coefficient, and the product is forwarded to time $t_2$. At time $t_2$, the perception data $x_2$ collected by the user is weighted by its collection coefficient and added to the observed value from time $t_1$, and the sum is forwarded to the next time point. This continues until the user has collected all the data within the perception period. After $t^*$ observations, the observation value formed by user P at time $t^*$ is given by (2), where $t^* \ll t$.
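The running weighted sum described above can be sketched in a few lines. This is only an illustration under stated assumptions: the Gaussian observation matrix `phi`, the sizes `t` and `t_star`, and the synthetic data `x` are hypothetical, since the text does not specify how the collection coefficients are generated.

```python
import numpy as np

rng = np.random.default_rng(0)

t = 48        # number of sampling times for one user
t_star = 12   # number of measurements kept, with t_star << t

# Assumption: Gaussian collection coefficients; the source does not say
# how the observation matrix is generated.
phi = rng.normal(0.0, 1.0 / np.sqrt(t_star), size=(t_star, t))

# Hypothetical sensing data x_1..x_t for user P.
x = 50 + 10 * np.sin(2 * np.pi * np.arange(t) / 24)


def accumulate_measurements(phi: np.ndarray, x: np.ndarray) -> np.ndarray:
    """Build every measurement as a running weighted sum over time."""
    y = np.zeros(phi.shape[0])
    for j, x_j in enumerate(x):      # time steps t_1, t_2, ...
        y += phi[:, j] * x_j         # add the weighted reading to the partial sums
    return y


y = accumulate_measurements(phi, x)
print(y.shape)  # (12,)
```

Accumulating step by step gives the same result as the matrix-vector product `phi @ x`, so only `t_star` values per user need to leave the edge node instead of `t`.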

Data reconstruction algorithm based on CL_BP
After the edge node has compressed the data as described in the previous section, the cloud server uses the cloud-side basis pursuit reconstruction algorithm (CL_BP) to restore the original data. In the discrete cosine transform domain, the data X collected by the user is compressible, so the original data X can be reconstructed from the compressed sampling observation value Y. The estimate $\hat{X}$ of the original data X is obtained by solving the optimization problem in (3), where $\hat{X}$ is the estimated value of the original data X. Because X is sparse in the transform domain, the sparse vector s can be determined, and solving (3) is equivalent to solving (4):
$$\hat{s} = \arg\min \|s\|_0 \quad \text{s.t.} \quad Y = \Phi \Psi s \tag{4}$$
where $\hat{s}$ is the estimate of the sparse data s, s is the sparse coefficient vector of the original data X in $\Psi$, $\Psi$ is the sparse transform basis used in this study, and $\Phi$ is the observation matrix. Because (4) is an NP-hard problem, Donoho proposed that when the signal is sparse, the $L_1$ norm can be used instead of the $L_0$ norm, which gives
$$\hat{s} = \arg\min \|s\|_1 \quad \text{s.t.} \quad Y = \Phi \Psi s \tag{5}$$
The data reconstruction therefore uses the CL_BP algorithm to solve (5). Although the amount of calculation is large, the reconstruction of the original data is better. The $L_1$ norm minimization in (5) can be re-expressed as a linear programming problem and solved effectively. The CL_BP algorithm generally adopts the interior point method or the gradient projection method to solve this kind of problem. The gradient projection method is faster, but its result is often biased; the interior point method is slower, but its result is relatively accurate.
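The linear programming form of the $L_1$ problem can be sketched as below. This is a generic basis pursuit illustration, not the paper's CL_BP implementation: it uses SciPy's `linprog` with the HiGHS solver rather than the interior point or gradient projection methods named above, and the function name `basis_pursuit`, the matrix `a` (standing in for the product of the observation matrix and the sparse basis), and the problem sizes are assumptions.

```python
import numpy as np
from scipy.optimize import linprog


def basis_pursuit(a: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Solve min ||s||_1 subject to a @ s = y as a linear program.

    Split s = u - v with u, v >= 0, so that ||s||_1 = sum(u) + sum(v)
    at the optimum.
    """
    n = a.shape[1]
    cost = np.ones(2 * n)                  # minimise sum(u) + sum(v)
    a_eq = np.hstack([a, -a])              # a @ u - a @ v = y
    result = linprog(cost, A_eq=a_eq, b_eq=y, bounds=(0, None), method="highs")
    if not result.success:
        raise RuntimeError(result.message)
    u, v = result.x[:n], result.x[n:]
    return u - v


rng = np.random.default_rng(1)
n, m, k = 50, 30, 3                        # signal length, measurements, non-zeros

s_true = np.zeros(n)
s_true[rng.choice(n, size=k, replace=False)] = rng.normal(0, 5, size=k)

a = rng.normal(size=(m, n))                # hypothetical stand-in for Phi @ Psi
y = a @ s_true

s_hat = basis_pursuit(a, y)
print("max abs error:", np.max(np.abs(s_hat - s_true)))
```

Once the sparse coefficients are recovered, the data itself would be obtained by applying the sparse basis to them; that step is omitted here because the example works directly in the sparse domain.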

Dataset
The data used in this experiment is a real-life air quality data set [9] containing air quality information for Beijing, mainly PM2.5. It records the air quality collected by 36 air quality monitoring stations at the same sensing times. Each monitoring station records once every hour, giving a total of 9504 data points. The comparison algorithm is orthogonal matching pursuit (OMP), which is a greedy algorithm.
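For readers unfamiliar with the baseline, a textbook form of OMP can be sketched as follows. This is a generic version under stated assumptions, not the exact configuration used in the experiments: the function name `omp`, the fixed `sparsity` stopping rule, and the random test problem are introduced here for illustration only.

```python
import numpy as np


def omp(a: np.ndarray, y: np.ndarray, sparsity: int) -> np.ndarray:
    """Greedy OMP for y = a @ s with at most `sparsity` non-zero entries."""
    n = a.shape[1]
    residual = y.astype(float).copy()
    support = []
    coef = np.zeros(0)
    for _ in range(sparsity):
        # Pick the column most correlated with the current residual.
        correlations = np.abs(a.T @ residual)
        if support:
            correlations[support] = 0.0    # never pick a column twice
        support.append(int(np.argmax(correlations)))
        # Least-squares fit on the chosen columns, then update the residual.
        coef, *_ = np.linalg.lstsq(a[:, support], y, rcond=None)
        residual = y - a[:, support] @ coef
        if np.linalg.norm(residual) < 1e-10:
            break
    s = np.zeros(n)
    s[support] = coef
    return s


rng = np.random.default_rng(2)
n, m, k = 50, 30, 3

a = rng.normal(size=(m, n))
a /= np.linalg.norm(a, axis=0)             # unit-norm columns
s_true = np.zeros(n)
s_true[rng.choice(n, size=k, replace=False)] = rng.normal(0, 5, size=k)

s_hat = omp(a, a @ s_true, sparsity=k)
print("max abs error:", np.max(np.abs(s_hat - s_true)))
```

Because each iteration commits to one column and never revisits the choice, an early wrong pick cannot be undone, which is one reason a greedy method can need more measurements than an $L_1$ solver to reach the same error.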

Results and analysis
The proposed method is evaluated using the compression ratio (CR) and the normalized mean absolute error (NMAE), defined in (7) and (8), where N is the length of the original data, M is the length of the compressed data, $x_i$ is the real data and $\hat{x}_i$ is the recovered data.
Figure 2 shows the reconstruction effect of the CL_BP algorithm on the air quality data of one of the 36 stations at different times. The original data and the reconstructed data are basically the same, indicating that the CL_BP algorithm achieves high reconstruction accuracy.
Figures 3 and 4 compare the reconstruction errors of the CL_BP algorithm and the OMP algorithm on the air quality data for compression ratios from 0.1 to 0.5. In both figures, the abscissa is the compression ratio and the ordinate is the reconstruction error. As the compression ratio increases, the reconstruction error of both algorithms generally decreases. This is because a larger compression ratio means more compressed data, that is, more observed values, and therefore higher reconstruction accuracy. However, the reconstruction error of the OMP algorithm remains relatively large until the compression ratio reaches 0.45, whereas the reconstruction error of the proposed CL_BP algorithm stabilizes at close to 0 once the compression ratio exceeds 0.25. The CL_BP algorithm therefore outperforms the OMP algorithm.
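The two evaluation metrics can be sketched as follows. The formulas are assumptions, since the bodies of (7) and (8) are not reproduced in the text: CR is taken as M / N, which matches the statement that a larger compression ratio means more observed values, and NMAE is taken as the sum of absolute errors divided by the sum of absolute true values. The function names and sample numbers are hypothetical.

```python
import numpy as np


def compression_ratio(n_original: int, m_compressed: int) -> float:
    """Assumed CR = M / N: share of the original length that is transmitted."""
    return m_compressed / n_original


def nmae(x_true, x_rec) -> float:
    """Assumed NMAE: sum of absolute errors over sum of absolute true values."""
    x_true = np.asarray(x_true, dtype=float)
    x_rec = np.asarray(x_rec, dtype=float)
    return float(np.sum(np.abs(x_true - x_rec)) / np.sum(np.abs(x_true)))


x_true = [40.0, 60.0, 100.0]   # hypothetical real readings
x_rec = [42.0, 57.0, 100.0]    # hypothetical reconstructed readings

print(compression_ratio(200, 50))  # 0.25
print(nmae(x_true, x_rec))         # 0.025
```

Under this reading, a CR of 0.25 means that a quarter as many values as the original series are sent to the cloud server.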

Conclusion
This paper introduces edge computing into the data collection process of mobile crowd sensing. In the edge server, the compressed sensing algorithm compresses the sensing data, which reduces the communication pressure on the cloud server and reduces the network delay caused by transmitting user data to the cloud server. The CL_BP algorithm then recovers the data in the cloud, reconstructing the original data from fewer measurements with a small error. The layered model based on edge computing proposed in this paper can better solve the problem of data redundancy and reduce the load on cloud servers.