Market trends of Machine-Type Communication (MTC) devices deployments are exponential increases. This is determined by many applications and services discovered such automatic industries monitoring, smart metering, surveillance cameras, environment monitoring and trace devices  . Additionally, according to a broadly used report by Cisco VNI in 2016 up to 2020 the traffic generated from MTC devices expected to reach 49 exabytes  . MTC usually includes a large number of devices deployed randomly in a hostile and highly dynamic environment. Since the data collected coverage and network coverage (communication range) are normally constant at randomly allocated, a high density of the redundancy devices is used to conserve the preferred level of network coverage in order to accomplish enough data collection. Addition, MTC devices are normally the event-driven systems where various devices try to send data at any event of interest occurred  . Since the MTC devices are densely deployed, then the data transmitted from devices to the aggregator (sink) or base station for computational are spatially correlated.
Therefore, the wasted of the computational resources occurred at CA when the data collected from different MTC devices are processing independently. The significant amount of computational resources can be saved by considering the advantages of existence of a spatial correlation.
However, the growth of the data traffic collected by MTC devices increase the pressure on the mobile operators specifically for the delay-sensitive applications, which requires a very short time to be processed. Several approaches are proposed to deal with that challenge such as edge computing, data offloading and data caching    , where the data computation and terminal requests move very close to the data source. Besides, it attempts to reduce the pressure on the mobile operators regarding with the limits of data generated from the terminal devices such as MTC devices, but still encounters the problem of data overloading when the amount of data generated are increases. That amount of data required to be analyzed. Recently, the big data analysis implemented in cloud or enterprises centralized data centers to analyze the data generated from terminal devices. Moreover, data aggregation technique proposed in  where the network congestion originated from the massive data generated by MTC devices reduced. In addition, the data aggregator act as intermediate nodes on the cellular networks can further help minimize the power consumed and transmission delay by MTC devices while transmitting data to the networks  .
Generally, the MTC devices are deployed to perform specific tasks collectively; the data collected from each device are not completely independent rather correlated. Thus, in such case to avoid the resource wasted for the individual device processing at the aggregator, the CA combines the correlated devices together to form a group  . In this paper, we use the centralized aggregator (CA) as a sink or nearby devices computing the data collected from MTC devices in the event-based area. By examining the existence of correlated between the MTC device we propose k-means grouping technique to combine all the spatial correlated devices together. With the limited computational resources at CA, some data will be offloaded to the nearby server allocated to the base station called mobile edge computing server. We investigate the energy consumption at the CA by computing the optimal solution using the brute force method. The main contributions of this paper described on the following:
• We define the correlation model to compute the correlation coefficient between the MTC devices by considering the device coordinate points on the hyperplane.
• We propose device k-means grouping technique to groups MTC devices based on spatial correlation around the data collected coverage.
• We use the differential entropy framework to compute the size of the data in each group.
• Based on the size of the data we introduce the concepts of the data offloading and we define the optimization problem for minimizing the energy consumption on CA for processing the data collected from MTC devices.
• We use the brute force method to find the optimal solution for total energy consumption on the centralized aggregator.
The remaining part of this structured as follows. Related work discussed in Section 2. In Section 3, we discuss the network model that includes the network model together with details of the correlation model and the proposed k-means grouping technique. The Section 4 present the theoretical concepts of computation together with the optimization problem used to minimize the energy consumption and Section 5 discuss the results used to verify our context and lastly we conclude by summarizing the idea presented on whole paper.
2. Related Work
There are many emerging works exist in MTC communication and wireless networks. In  authors proposed the medium access control (MAC) protocol with low latency and energy for hierarchical structured of M2M networks. It permits the effective data originated from the terminal nodes transmitted to a sink node via a cluster head. The cloud based lightweight for mobile core networks investigated by  that improve the challenges MTC occurred from system overload and network congestion .Similarly extended access barring (EAB) in the LTE-A studied in  to deals with the problem of overloaded that originated from a massive number of MTC devices that compete to access networks in a limited burst. The efficiency of data aggregation for M2M network discussed in  with the goal of reducing the total energy expenditure together with coverage probability.
Conventionally, to describe the concept of computational capability, a mobile cloud computing (MCC) have been widely introduced  . However, a cloud allocated far away from terminal devices cannot offer an extensive solution for the low latency services (emergence applications), real-time applications and regular data transmission such as location updates. Therefore, to allow terminal devices communicate directly with cloud will not be feasible and cost-effective because it needs more resources allocation and complex routing protocol design. To deal with these challenges a mobile edge computing technique proposed in  . In MEC, the computation resources allocated near to the terminal devices in wireless cellular networks that helps reduce the delay and energy consumption especially for the delay-sensitive applications  . Then, the energy consumption and computation defined as jointly optimization problem in  . MTC devices use the virtual networks to handle the computing tasks and other tasks are offloaded to the MEC server through wireless cellular networks. A cooperative data aggregation algorithm developed in  for MTC data offloading with help of the multiple trusted mobile relays. The result shows the proposed algorithm prolong the lifetime of MTC devices battery.
Besides the MTC devices have sensing capability, implied the correlation between adopted as in wireless sensor networks. In  authors propose clustered aggregation technique to reduce the number of transmission from the sensor by utilizing the existing of the spatial correlation. Similarly the correlation characteristic of the spatial information is taken as advantages to eliminate the network redundancy  . Furthermore, the compression between the correlation MTC sources based on distributed source coding discussed in  . The three clustering algorithms such as grid dividing alorithm, Weighted Pair Group Method with Arithmetic Mean and K-medoids proposed to balance the tradeoff between compression ratio and decoding delay. In this paper, we study the existence of the correlation when the MTC devices are spatial distributed in the events-based area collects the data and then transmit to the Centralized Aggregator for computation.
3. Network Model and Descriptions
In this section, we describe the details about the network model together with theoretical part of used correlation model and proposed k-means grouping technique.
3.1. Network Model
We consider the network model having the set of MTC devices that consists of sensing ability shown in Figure 1. The MTC devices are randomly distributed on the region to collects the data and then send to nearby Centralized Aggregator (CA) for the computing. Due to the limits of computational resources on the CA can’t save all the data sends from nearby MTC devices, then some required computation data are offloaded to the Base Station that collocated with mobile edge computing server (MEC) for more processing. After the computation to either CA or MEC, the results will be feedback to the corresponding MTC device for future action. From the physical nature of the MTC devices are randomly located on the plane; then the data collected by proximally devices can have higher degree of the correlation. With that case we analysis the presence of MTC devices correlation on the next subsection.
3.2. Description of Correlation Model
The idea of a data correlation in computing is to reduce the size of data collected
Figure 1. Network model for MTC devices data sensing.
by MTC devices by evaluating the existence of data similarity or data dependency. Since the CA wastes the computational resources for processing the individual device that contains the same data independently, therefore we adapt the data correlation model used on the Wireless Sensor Networks to evaluate the presence of correlation between the spatial correlated MTC devices. The correlation model verified by using the covariance function that decreases with Euclidean distance for 0 at and 1 at , where l represents the Euclidean distance between the locations of MTC devices. Additionally, we assume the data collected from the MTC devices denoted as have nature of the multivate Gaussian distribution having mean of and variance of . Hence, the covariance existence between the MTC device and calculated as on  using the pair coordinated points corresponding to the devices and as
where and denotes to variance of devices and respectively. We can further improve the expression as
where denotes a correlation function with given correlation parameter of and represents the existence distance between the device and . Furthermore, we assume, the value of related with data collected from MTC device and as and respectively. To determine the value of covariance function models proposed such as Rational Quadratic, Spherical, Power Exponential and Matern based on the structure of the correlation  . In this paper, we choose the Power Exponential model to express the correlation function existed between the MTC devices that expressed as
where and represents the control parameters for a given correlation and the smoothness at given random region. For simplicity we use and the correlation coefficient between the device and in short represented as becomes
where represents the exponent that control correlation that exist between the devices. Hence, we can determine the correlation matrix of M MTC devices as
The above correlation matrix provides overall existed correlation between the devices with the value between 0 and 1. That’s mean when the value of the devices are located very far not correlated and similarly when the devices are very near to each other have highly data correlation.
3.3. Device Grouping Technique
In utilizing the limited computational resources at CA, we use the grouping technique to groups all the MTC devices based on the coordinate points. Since all MTC devices in each group assume to be very close, then data collected by the individual device may be identical. Hence, the computation resources reduced on the CA by processing together the MTC in one group in either to the CA or MEC.
The proposed grouping technique of MTC devices consist of two part; the first we need to compute the distance or similarities existing between the two devices and then grouping the MTC devices using the clustering technique. In this paper, we compute the distance between the two devices according to the equations on the previous section and using the k-means clustering algorithm to group the MTC devices.
The k-means algorithm frequently used to partition set data automatically into K disjoint clusters or groups as described in  . With the given the MTC devices that are randomly distributed, the aim is to group into different clusters which are near to initialized centroids. We randomly deciding the number of groups as K and then we analyze the k-means technique according to Euclidean distance expressed as
where represents the coordinate point of device and denotes the initialized coordinate point of centroid cluster. Then, the centroid cluster updated iteratively with associated devices as expressed on
where is the set of number of the MTC devices contained on group or cluster. The Algorithm 1 explain in details.
Algorithm 1. K-means grouping technique.
4. Computation and Problem Formulation
In the section, we describe the theoretical details of computational model and we define the energy consumption optimization problem regardless with the data processing decision which either locally at CA or offloaded to the remote at MEC server.
4.1. Computation Model for CA and MEC
The CA collocated near to the MTC devices that receive all the data collected from them. As we assume the MTC devices are randomly distributed, the CA uses the correlation framework to check whether the correlation between the devices existed. After that the devices grouping occurred. Then, the correlation matrix corresponding to the member of each group determined according to the Equation (5). With the data collected by each MTC device follows a multi-variate Gaussian distribution, then we can use the idea of information entropy to find the size of data on each group. Therefore, we adopt the differential entropy as explained in  with assumption that the data collected by MTC devices are quantized with the identical quantization size as . Then, the entropy from the multi-variate Gaussian distribution model expressed as
where represents the correlation matrix of any group with S number MTC devices. Then, the size data in each group based on correlation matrix modeled as entropy corresponding to the data set on each group evaluated as
where represents number of MTC devices presented in the group g. With the above expression, we obtain size of the input data for each group.
4.1.1. Local Execution Analysis
In the local computing, we consider the computation time and energy consumption for processing each group. For the computing process requires the computation capability of CA denoted by and CPU clock frequency represented as  . Then the computation time for processing group i expressed as given by
where represents the size of the input data on the group i that obtained from Equation (9). Furthermore, the energy consumption per each group i given by
where represents the power coefficient at CA and is a constant that determined by architecture of capacity of the chip. The parameter represent frequency exponent with constant value of . The frequently parameter used is approximate equal to 3 as illustrated by  , in this paper we define .
4.1.2. MEC Execution Analysis
As we assume the CA has limits of the computational resources, then some groups are offloaded to MEC server for processing. In this scenario the computation time on MEC obtained by considering the transmission time of uplink, downlink and execution time to the MEC. In this paper, we ignore the transmission time of downlink because the data size remained after processing too small  . Then, the transmission time of uplink evaluated as
where R represents the transmission rate between the CA and MEC which obtained as
where W denotes the channel bandwidth reserved for CA and is the transmission power of CA consumed to offloads to MEC and is the channel gains that normalized by the power of white Gaussian noise. Then, we can evaluate the energy consumption used for transmission as
In addition, we evaluate the computation time at MEC as
where and represents the CPU capability and clock frequency of the MEC server respectively and is the size of input data of the group i. Therefore, the total offloading time for CA, obtained as
4.2. Problem Formulation
From the Equations obtained from previous section we formulate the optimization problem for total energy consumption at CA by considering the energy consumed when groups processed at CA plus the transmission energy when the groups offloaded to the MEC formulated as
where the constraints C1 and C2 represents limits of the completion time for processing the whole group and C3 denotes the decision computing variable for processing on each group either at CA or at MEC. is a maximum completion time to process all the MTC devices based on number of groups.
The optimization problem formulated above is an integer-programming problem, which we can solve using various heuristic approaches such as generic programming, dynamic programming, brute force or exhaustive search method and variable relaxation approaches (linear and semidefinite relaxation programming). Based on analyzing the performance of the proposed grouping technique we present the brute force approach to determine the optimal solution obtained from the finite number of iterations.
5. Performance Analysis and Evaluation
This section describes performance and results for the proposed scheme. To evaluate the performance of proposed scheme we use the parameters referred in  and the devices are randomly distributed at a 150 × 150 cm region. We assume the CA is N900 as smartphone device CPU frequency of 600 × 106 Cycle/sec, CPU capability equal to 650 × 106 Cycle/bytes, computing power of 0.9 W and the value of obtained as
Then, the value of a transmitted power of CA equal 1.012 W, normalized channel gain equal to 17 dB and the transmission bandwidth is 0.185 MHz. Addition, the CPU frequency and capability of MEC equal to 600 × 107 Cycle/sec and 960 × 107 Cycle/sec respectively. The values of quantization level range between minimum and maximum of 1/2 (1 bit) and 1/256 (8 bits) respectively with a degree of the correlation equal to 0.05. We evaluate the performance of the system by comparing the energy consumption when the data processed on the three different implementation scenario in terms of a number of MTC devices and the energy consumption comprises of both energy consumption for computation and for transmission. We simply choose the number of group or cluster equal to 4 and 8 to verify our proposed grouping technique.
5.1. Comparison Based on Computation Decision
The energy consumption succeeded by the proposed grouping technique as the number of MTC devices increases as illustrated in Figure 2(a) and Figure 2(b). With the different number of a group, the energy consumption obtained from optimal BFM outperforms the local computation (CA) at delay constraints while at the remote computing (MEC) performed better without satisfy the delay constraints. From the two figures as number group increases the energy consumption increases, which influenced by increases of the data size in the case of optimal BFM energy consumption almost 18% of the total energy increases between the two groups. Similarly in Figure 3(a) and Figure 3(b) compared the computation time with the number of MTC devices as illustrated the computation time achieved by the proposed technique as the number of MTC devices increases. Shown that, the computation time at the local computing (CA) was much small compared to the other two because the CA is very near to MTC devices. In addition, as the number of a group increases the optimal BFM approaches to remote computing (MEC). This implied that much number of groups are offloaded compared to the groups that processed at the local. We further investigate the tradeoff between energy consumption and computation time as shown in Figure 4(a) and Figure 4(b). From, the curves demonstrated that the data processing at the MEC consumed the small amount of energy compared to the CA but it consumes much time compare to processing at CA. With our applications that required the computing time becomes small, then the
Figure 2. Energy consumption for MTC devices groupped at (a) K = 4 and (b) K = 8.
Figure 3. Computaion time for MTC devices groupped at (a) K = 4 and (b) K = 8.
Figure 4. Trade-off between the energy consumption and computaion time for MTC devices grouped at (a) K = 4 and (b) K = 8.
local computing more favorable compared to offloading to remotely computing (MEC). Moreover as illustrated in a small number of groups the size of data processing is much small compared to a larger number of groups. The optimal BFM solution approaches at CA computing for grouping MTC devices into a small number of groups and at larger number of groups it approaches at MEC computing.
5.2. Impact of Grouping Technique
Figure 5(a) and Figure 5(b) illustrate the benefit of grouping technique based on the number of MTC devices. The result shows that when the number of the group increases the energy consumption and the computation time is increased, which means the size of the data processed increases and then, the optimal BFM solution for grouping technique outperforms compared with to processes un-grouping devices in which the each MTC device considered the individual.
6. Conclusion and Future Work
In this paper, we investigate the problem of data correlated in MTC devices based on the resource-constrained allocated at Centralized Aggregator for computing and processing. We propose k-means grouping technique to group MTC devices corresponding to a spatial correlation on the event-based area. With combining the MTC devices, we reduced the data redundancy caused by similar data processing and saving the computation resources at CA. Then, we use the differential entropy to measure the size of data contents in each group. Through the extensive simulations, we illustrated the benefits of our proposed grouping technique compared with the individual MTC device computation. Our simulation results indicate that the optimal BFM solutions performance is very close between the computation at CA and MEC in terms of energy consumption and the trade-off in computation time. In the future, we will investigate more grouping techniques together with multiple aggregators.
Figure 5. Benefits of grouping technique in different number of groups.
We thank the Editor and the referee for their comments. This work was partially supported by Natural Science Foundation of China (Grant No. 61461136002), Key Program of National Natural Science Foundation of China (Grant No. 61631018), Fundamental Research Funds for the Central Universities, and Huawei Innovation Research Program.
 Zheng, K., Ou, S., Alonso-Zarate, J., Dohler, M., Liu, F. and Zhu, H., (2014) Challenges of Massive Access in Highly Dense LTE-Advanced Networks with Machine-to-Machine Communications. IEEE Wireless Communications, 21, 12-18.
 Khodashenas, P.S., Ruiz, C., Siddiqui, M.S., Betzler, A. and Riera, J.F. (2017) The Role of Edge Computing in Future 5G Mobile Networks: Concept and Challenges. Cloud and Fog Computing in 5G Mobile Networks: Emerging Advances and Applications, 70, 349.
 Rebecchi, F., De Amorim, M.D., Conan, V., Passarella, A., Bruno, R. and Conti, M. (2015) Data Offloading Techniques in Cellular Networks: A Survey. IEEE Communications Surveys and Tutorials, 17, 580-603.
 Li, L., Zhao, G. and Blum, R.S. (2018) A Survey of Caching Techniques in Cellular Networks: Research Issues and Challenges in Content Placement and Delivery Strategies. IEEE Communications Surveys & Tutorials, 20, 1710-1732.
 Dawy, Z., Saad, W., Ghosh, A., Andrews, J.G. and Yaacoub, E., (2017) Toward Massive Machine Type Cellular Communications. IEEE Wireless Communications, 24, 120-128.
 Guo, J., Durrani, S., Zhou, X. and Yanikomeroglu, H. (2017) Massive Machine Type Communication with Data Aggregation and Resource Scheduling. IEEE Transactions on Communications, 65, 4012-4026.
 Dai, R. and Akyildiz, I.F. (2009) A Spatial Correlation Model for Visual Information in Wireless Multimedia Sensor Networks. IEEE Transactions on Multimedia, 11, 1148-1159.
 Park, I., Kim, D. and Har, D. (2015) MAC Achieving Low Latency and Energy Efficiency in Hierarchical M2M Networks with Clustered Nodes. IEEE Sensors Journal, 15, 1657-1661.
 Malak, D., Dhillon, H.S. and Andrews, J.G. (2016) Optimizing Data Aggregation for Uplink Machine-to-Machine Communication Networks. IEEE Transactions on Communications, 64, 1274-1290.
 Sajnani, D.K., Mahesar, A.R., Lakhan, A. and Jamali, I.A. (2018) Latency Aware and Service Delay with Task Scheduling in Mobile Edge Computing. Communications and Network, 10, 127-141.
 Salam, T., Rehman, W. and Tao, X. (2017) Cooperative MTC Data Offloading with Trust Transitivity Framework in 5G Networks. 2017 IEEE Global Communications Conference, Singapore, 4-8 December 2017, 1-7.
 Yoon, S. and Shahabi, C. (2005) Exploiting Spatial Correlation towards an Energy Efficient Clustered Aggregation Technique (CAG) [Wireless Sensor Network Applications]. 2005 IEEE International Conference on Communications, 5, 3307-3313.
 Dogandzic, A. and Qiu, K. (2008) Decentralized Random-Field Estimation for Sensor Networks Using Quantized Spatially Correlated Data and Fusion-Center Feedback. IEEE Transactions on Signal Processing, 56, 6069-6085.
 Wang, W., Zhu, J., Zhang, S. and Zhou, W. (2018) Tradeoff between Compression Ratio and Decoding Delay of Distributed Source Coding for Uplink Transmissions in Machine-Type Communication. International Journal of Distributed Sensor Networks, 14, 1550147718787109.
 Vuran, M.C. and Akyildiz, I.F. (2006) Spatial Correlation-Based Collaborative Medium access Control in Wireless Sensor Networks. IEEE/ACM Transactions on Networking, 14, 316-329.
 Berger, J.O., De Oliveira, V. and Sansó, B. (2001) Objective Bayesian Analysis of Spatially Correlated Data. Journal of the American Statistical Association, 96, 1361-1374.
 You, C., Huang, K., Chae, H. and Kim, B.H. (2017) Energy-Efficient Resource Allocation for Mobile-Edge Computation Offloading. IEEE Transactions on Wireless Communications, 16, 1397-1411.
 Yu, Y., Zhang, J. and Letaief, K.B. (2016) Joint Subcarrier and CPU Time Allocation for Mobile Edge Computing. 2016 IEEE Global Communications Conference, Washington DC, 4-8 December 2016, 1-6.
 Miettinen, A.P. and Nurminen, J.K. (2010) Energy Efficiency of Mobile Clients in Cloud Computing. Proceedings of the 2nd USENIX Conference on Hot Topics in Cloud Computing, HotCloud’10, Boston, 22-25 June 2010, 4.