Received 7 February 2016; accepted 28 March 2016; published 31 March 2016
Recent advances in technology, especially in electronics and communications, have enabled the emergence of wireless sensor networks (WSNs). A WSN is a collection of sensor nodes.
Sensor nodes have sensing capabilities, which make them a suitable solution for sensing and collecting data from different environments.
Sensor nodes are low-cost devices with limited computing power, scarce memory, low bandwidth, and, most importantly, a limited energy source. They are powered by batteries, which in most cases are neither rechargeable nor easy to replace.
A WSN is a collection of hundreds or thousands of wireless sensor nodes, often deployed in remote areas as shown in Figure 1, whose job is to collect data and deliver it wirelessly to a base station.
Although WSNs are a subcategory of ad hoc networks, they differ mainly in the energy-constrained nature of WSN nodes. This has forced routing algorithms for WSNs to prioritize energy consumption over other features in order to prolong network lifetime, giving rise to a class of energy-aware routing protocols.
WSNs have a wide range of applications in different fields, but their most common use is monitoring. A WSN is usually deployed over a region or a structure to monitor one or more phenomena and then report the readings collected by its sensors to a base station (sink node), which can convey the aggregated data to a human operator for further processing.
Common examples of WSN applications include military surveillance, where sensor nodes are used to detect enemy movement; traffic control, where sensor nodes collect data about traffic jams; health care, where they monitor a patient's condition; and many more.
WSNs are also a backbone of the emerging Internet of Things (IoT) technology. WSNs can cooperate with RFID systems to better track the status of things, i.e., their location, temperature, movements, etc. As such, they can augment the awareness of a certain environment and thus act as a further bridge between the physical and digital worlds.
The special nature of WSNs, mainly the limited energy source together with low computation and memory capabilities, makes traditional routing algorithms unsuitable for WSNs.
Figure 1. Wireless sensor network.
Figure 2. Wireless sensor node.
This motivated researchers to design routing algorithms that suit the needs and nature of WSNs. One of the early and well-known routing algorithms for WSNs is directed diffusion.
WSN routing algorithms are classified into data-centric, hierarchical, location-based, and QoS routing. Directed diffusion falls into the data-centric category.
Directed diffusion tackles the limitations of WSNs by introducing a naming scheme, forwarding packets over multiple hops, and enforcing certain paths to avoid flooding.
The major shortcomings of directed diffusion are that it does not ensure the shortest path, does not consider energy levels, and does not avoid critical nodes.
In WSNs, routing is based on local information exchanged among neighboring nodes. Routing decisions are made locally; each node selects the next hop without any knowledge of the other nodes on the path. Although full knowledge of the network yields better routing, it is not feasible in WSNs because of memory limitations and the high traffic needed to collect data about all the nodes in the network.
In our work, we take a middle way between full network knowledge (holistic) and local knowledge. Aware diffusion pursues a semi-holistic approach. Instead of collecting data about the whole network, we collect only the needed data about the potential paths between the source (sensing node) and the destination (sink node). This means that, at the moment of choosing the next hop, the node making the routing decision has information about the potential paths leading from it to the destination.
In this paper, we introduce aware diffusion, which ensures that data is sent through the best path between the source and the sink according to path length and energy level metrics. As a result, fewer and healthier nodes are used to transmit the same data, which in turn leads to lower energy consumption and a longer network lifetime.
The rest of the paper is organized as follows: Section 2 is dedicated to Data-Centric protocols. In Section 3 we will give a general overview of directed diffusion and how it works. Section 4 will briefly touch on some of the attempts that were made to improve on directed diffusion and their shortcomings. A short introduction to machine learning is presented in Section 5. Our proposed protocol is described in Section 6. Section 7 is dedicated to simulation results, and finally, the conclusion is presented in Section 8.
2. Data-Centric and Flat-Architecture Protocols
As mentioned earlier, the huge number of sensor nodes makes it very hard to assign IDs to nodes. Therefore, data-centric protocols treat all nodes equally and focus on the data rather than the nodes. Data is identified by attributes, and data is requested by specifying the attributes of the phenomenon of interest.
Data-centric networks have a flat structure in which all nodes play the same role and collaborate to perform the routing task. This gives data-centric networks the advantage of simplicity, since no topology management is required.
Examples of data-centric protocols are Flooding, Gossiping, SPIN, and Directed Diffusion.
3. Directed Diffusion
Directed diffusion is a data-centric data dissemination protocol that is also application-aware in that data generated by sensor nodes is named by attribute-value pairs. The main idea of directed diffusion is that nodes request data by sending interests (also called queries) for named data. This interest dissemination sets up gradients within the network that are used to direct sensor data toward the recipient, and intermediate nodes along the data paths can combine data from different sources to eliminate redundancy and reduce the number of transmissions  .
Directed diffusion does not rely on globally valid node identifiers, but instead uses attribute-value pairs to describe a sensing task and to steer the routing process. For example, a description for a simple vehicle-tracking application could be:
type = vehicle // detect vehicle location
interval = 20 ms // send data every 20 ms
duration = 10 s // perform task for 10 s
rect = [−100, −100, 200, 200] // from sensors within rectangle
That is, a task description expresses a node’s desire (or interest) to receive data matching the provided attributes. The data sent in response to such interests is also named in the same manner, that is, using attribute-value pairs.
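The attribute-value naming above can be illustrated with a minimal Python sketch. The dictionary keys, the `matches` helper, and the interpretation of `rect` as (x_min, y_min, x_max, y_max) are assumptions made for this example; they are not defined by directed diffusion itself.

```python
# Hypothetical encoding of the vehicle-tracking task as attribute-value pairs.
interest = {
    "type": "vehicle",
    "interval_ms": 20,
    "duration_s": 10,
    # Assumed corner order: (x_min, y_min, x_max, y_max).
    "rect": (-100, -100, 200, 200),
}


def matches(interest, reading):
    """Return True if a sensor reading satisfies the interest's attributes."""
    x_min, y_min, x_max, y_max = interest["rect"]
    x, y = reading["location"]
    return (
        reading["type"] == interest["type"]
        and x_min <= x <= x_max
        and y_min <= y <= y_max
    )


# A reading is named with the same kind of attribute-value pairs.
inside = {"type": "vehicle", "location": (50, 50)}
outside = {"type": "vehicle", "location": (125, 220)}
print(matches(interest, inside), matches(interest, outside))
```

Because both the request and the reply are described by attributes, no node identifier is needed to decide whether a reading answers an interest.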
Once an application has been described using this naming approach, the interest must be diffused through the sensor network. This process is shown in Figure 3. A sink node periodically broadcasts an interest message to its neighbors, which continue to broadcast the message throughout the network. Each node establishes a gradient toward the sink node, where a gradient is a reply link toward the neighbor from which the interest was received. As a consequence, using interests and gradients, paths between event sources and sinks can be established. Once a source begins to transmit data, it can use multiple paths for transmission toward the sink. The sink can then reinforce one particular neighbor based on some data-driven local rule. For example, a sink could reinforce a neighbor from which the sink has received a previously unseen event. Toward this end, the sink resends the original interest message to the neighbor, which in turn reinforces one or more of its neighbors based on its own local rule  .
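As a rough illustration of interest propagation and gradient setup, the following Python sketch floods an interest from the sink over a small topology and records, for every node, the neighbors from which it received the interest. The function name, the adjacency-list representation, and the five-node topology are assumptions of this sketch; timing, suppression, and reinforcement are left out.

```python
from collections import deque


def diffuse_interest(neighbors, sink):
    """Flood an interest from the sink and return each node's gradients.

    gradients[n] is the set of neighbors from which node n received the
    interest, i.e. the reply links it could use to send data back.
    """
    gradients = {node: set() for node in neighbors}
    rebroadcast = {sink}          # nodes that have already broadcast once
    queue = deque([sink])
    while queue:
        sender = queue.popleft()
        for receiver in neighbors[sender]:
            if receiver == sink:
                continue          # the sink does not need a gradient to itself
            gradients[receiver].add(sender)
            if receiver not in rebroadcast:
                rebroadcast.add(receiver)
                queue.append(receiver)
    return gradients


# Hypothetical topology: two disjoint routes from the sink to node "c".
topology = {
    "sink": ["a", "b"],
    "a": ["sink", "c"],
    "b": ["sink", "c"],
    "c": ["a", "b", "source"],
    "source": ["c"],
}
gradients = diffuse_interest(topology, "sink")
print(gradients["source"], gradients["c"])
```

Node "c" ends up with gradients toward both "a" and "b", which is what allows a source to send data over multiple paths before one of them is reinforced.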
4. Directed Diffusion Improvements
Several attempts have been made to improve on directed diffusion since it was first developed. In this section, we do not claim a full review of directed diffusion improvements or modifications; instead, we use a few examples to show the major trends in these modifications.
In  the authors proposed an energy-efficient diagonal-based directed diffusion. In  a passive clustering approach was used. In  clustering was also used to improve the performance of directed diffusion. All of these attempts change the topology to increase scalability and minimize the cost of flooding the interests. These approaches add topology management overhead and strip directed diffusion of its simplicity.
In  and  node locations are introduced into directed diffusion to partition the sensed area or build geographic grids. Node locations are used to minimize interest flooding, direct path enforcement, and reduce redundant data transmission.
As in  -  the work done in  and  moves directed diffusion to another topology paradigm, namely location-based routing. Location-based routing comes with its own overhead: special devices or algorithms are needed to determine each node’s location.
In  an Energy Aware Directed Diffusion protocol (EADD) was proposed. It gives preference to nodes with a higher energy level by assigning them a shorter response time than nodes with a lower energy level, so that nodes with higher energy end up being chosen.
In  EAADD (Energy-Aware Adaptive Directed Diffusion) improves on the work done in  by considering each node’s drainage history. It takes into account the correlation of a node’s available energy between adjacent rounds and then uses an adaptive algorithm to choose a more durable next-hop node.
In  energy is saved by setting a gradient diffusion depth (GRE-DD), so that interest propagation stops when GRE-DD is reached, avoiding total interest flooding. At the same time, only nodes with an energy level higher than a set minimum are chosen as gradients in the gradient setup phase.
As can be seen in  -  the routing decisions are based on local knowledge.
5. Machine Learning
Machine learning is a field of Artificial Intelligence. Machine learning focuses on improving a learned concept. The learning process is achieved by feeding the system with training examples from the environment. The system uses this data to form a general concept.
Figure 3. (a) Interest propagation; (b) Gradients setup; (c) Data delivery.
Machine learning can be categorized as incremental or non-incremental. In non-incremental learning, the concept is learned from an initial set of data and is not modified until a new learning process is initiated. In incremental learning, on the other hand, the learning process continues after the initial set of data by continuously collecting data about the environment and refining the learned concept.
Because transmission is the main source of energy depletion in WSNs, we decided to use non-incremental learning. This avoids the continuous data collection required by incremental learning, which would cause more data collection packets to be sent on a regular basis and, as a result, more energy consumption that outweighs the benefit of improving the learned concept.
In aware diffusion, the learning data is collected every time a new query is initiated, and the same learning data is used for the duration of the query lifetime.
6. Proposed Protocol
This section describes the design of our proposed modification of the directed diffusion routing protocol.
The design of the reinforcement scheme in directed diffusion is targeting minimum delay or a maximum number of packets delivered. The problems with this design are:
We try to solve these problems by collecting data about all the available paths from the sensing node to the sink and then using the collected data to decide the best path, according to our criteria, by means of non-incremental machine learning.
The collected data are total energy on the path, number of nodes on the path (Hop Count), and the lowest energy level of a node on the path to be able to identify the critical nodes.
Critical nodes are nodes which have been used more than others due to their location, which results in energy drainage.
The steps of our proposed routing algorithm are shown in Figure 4 and are described below:
A. Interests Propagation
Interests are flooded through the sensor network. For each active task, the sink will broadcast an interest message shown in Figure 5 to all its neighbors. Each node that receives an interest message will also broadcast it to all its neighbors.
Interest entries in the cache do not contain information about the sink, only about the immediately previous hop. Also, identical interests are aggregated into a single entry.
When a node receives an interest, it checks whether the interest exists in the cache. If no matching entry exists, the node creates an interest entry. This entry has a single gradient (a gradient specifies a direction in which to send events) pointing toward the neighbor from which the interest was received. If an interest entry does exist but has no gradient for the sender of the interest, the node adds a gradient with the specified value.
Finally, if both an entry and a gradient do exist, the node simply updates the attribute fields if they are different.
When an interest’s entry has expired, the interest’s entry is removed from the interests’ cache.
Not all received interests are re-sent. A node may suppress a received interest if it recently re-sent a matching interest.
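The cache handling described above can be sketched in Python as follows. The class name, the field names, the use of a string key to identify identical interests, and the fixed suppression window are all assumptions of this sketch; the protocol description does not prescribe them.

```python
class InterestCache:
    """Minimal sketch of a node's interest cache (names are hypothetical)."""

    def __init__(self, suppress_window=1.0):
        self.entries = {}                    # interest key -> cache entry
        self.suppress_window = suppress_window

    def on_interest(self, key, attrs, from_neighbor, expires_at, now):
        """Process a received interest; return True if it should be re-sent."""
        entry = self.entries.get(key)
        if entry is None:
            # No matching entry: create one with a single gradient.
            entry = {
                "attrs": dict(attrs),
                "gradients": {from_neighbor},
                "expires_at": expires_at,
                "last_resent": None,
            }
            self.entries[key] = entry
        else:
            # Entry exists: add the gradient if it is missing,
            # and update the attribute fields if they differ.
            entry["gradients"].add(from_neighbor)
            if entry["attrs"] != attrs:
                entry["attrs"] = dict(attrs)
            entry["expires_at"] = max(entry["expires_at"], expires_at)
        # Suppress the re-broadcast if a matching interest was re-sent recently.
        last = entry["last_resent"]
        if last is not None and now - last < self.suppress_window:
            return False
        entry["last_resent"] = now
        return True

    def purge(self, now):
        """Remove expired interest entries from the cache."""
        self.entries = {
            k: e for k, e in self.entries.items() if e["expires_at"] > now
        }


cache = InterestCache(suppress_window=1.0)
first = cache.on_interest("vehicle", {"interval_ms": 20}, "n1", 10.0, now=0.0)
second = cache.on_interest("vehicle", {"interval_ms": 20}, "n2", 10.0, now=0.5)
print(first, second, sorted(cache.entries["vehicle"]["gradients"]))
```

Note that the entry stores only neighbor identifiers as gradients, never the sink, which matches the description that a node knows just the immediately previous hop.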
B. Query Response
The query response stage is one of the major changes we introduced to directed diffusion; it does not exist in the original protocol. This stage is what allows us to collect the needed data about each available path and then use these data to enforce the best path according to our criteria.
In our case, as shown in Figure 6, we collect the total energy on the path, the number of nodes on the path (Hop Count), and the lowest energy level of a node on the path, which lets us identify critical nodes. The set of collected data could differ from one application to another, depending on what is important to the application in use.
For example, an application that is concerned about fast delivery could collect data about nodes’ buffer size to avoid congested nodes.
Figure 4. Flow chart of algorithm steps.
Figure 5. Interest packet.
Figure 6. Query response packet.
When a node has, at least, one active interest, the node will switch on its sensors and start sensing for the requested data.
If the sensing node senses data that matches the data requested by the interest, it will generate a Query Response Packet and send a copy of it to all the gradients associated with the interest.
The base station will receive the Query Response Packet through multiple paths. Then the base station can choose the shortest path and reinforce the source sensors to use the chosen path by using enforcing packets.
A forwarding node, on the other hand, could receive the same Query Response Packet flooded by the sensing node from multiple neighbors, but it will forward only one of them.
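The per-path data carried by the Query Response Packet can be accumulated hop by hop. The Python sketch below shows one way this might look; the `QueryResponse` class, its field names, and the `add_hop` helper are hypothetical, and whether the sensing node itself counts as a hop is an assumption of the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueryResponse:
    """Hypothetical packet layout holding the three collected path metrics."""
    source: str
    total_energy: float = 0.0             # sum of residual energy on the path
    hop_count: int = 0                    # number of nodes on the path
    lowest_energy: float = float("inf")   # weakest node seen so far


def add_hop(packet, node_energy):
    """Return a new packet updated with one node's residual energy."""
    return QueryResponse(
        source=packet.source,
        total_energy=packet.total_energy + node_energy,
        hop_count=packet.hop_count + 1,
        lowest_energy=min(packet.lowest_energy, node_energy),
    )


# Assumed residual energies (in joules) of three nodes along one path.
packet = QueryResponse(source="sensing-node")
for energy in (8.0, 5.0, 9.0):
    packet = add_hop(packet, energy)
print(packet)
```

Tracking the minimum alongside the sum matters because a path with high total energy can still contain a single nearly depleted node.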
Choosing the best path is done by calculating the promising factor (PF) for each path from the source (sensing node) to the destination (sink node) using the following formula:
TE: Path Total Energy Ratio.
LE: Path Lowest Energy Level Ratio.
HC: Path Hop Count Ratio.
α, β, γ: the weights given to each factor in Equation (1) (the default value for all weights is 1).
The term TE in Equation (1) gives preference to paths with higher energy, which is an indicator of the health level of a path. The term LE helps us avoid critical nodes, which have been used more than others and whose energy level has dropped significantly. The term HC gives preference to shorter paths over longer ones, which results in faster delivery.
The Path Total Energy Ratio (TE) is calculated using the following formula:
The Path Lowest Energy Level Ratio (LE) is calculated using the following formula:
The Path Hop Count Ratio (HC) is calculated using the following formula:
where Hop Count is the number of nodes on the path.
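To make the role of the three terms and their weights concrete, the following Python sketch ranks candidate paths. It is an illustration only and rests on assumptions that are not taken from Equation (1): the terms are combined as a weighted sum, each ratio is normalized against the best candidate path, and all energies and hop counts are positive. The function names and dictionary keys are hypothetical.

```python
def promising_factors(paths, alpha=1.0, beta=1.0, gamma=1.0):
    """Score candidate paths; higher is better.

    ASSUMPTIONS (not from the paper's equations): weighted-sum combination,
    ratios normalized against the best candidate, strictly positive inputs.
    """
    max_total = max(p["total_energy"] for p in paths)
    max_lowest = max(p["lowest_energy"] for p in paths)
    min_hops = min(p["hop_count"] for p in paths)
    scores = []
    for p in paths:
        te = p["total_energy"] / max_total      # favors healthier paths
        le = p["lowest_energy"] / max_lowest    # penalizes critical nodes
        hc = min_hops / p["hop_count"]          # favors shorter paths
        scores.append(alpha * te + beta * le + gamma * hc)
    return scores


def best_path(paths, **weights):
    """Return the index of the path with the highest score."""
    scores = promising_factors(paths, **weights)
    return max(range(len(paths)), key=scores.__getitem__)


# Two hypothetical candidate paths reported by Query Response Packets.
candidates = [
    {"total_energy": 30.0, "lowest_energy": 2.0, "hop_count": 4},
    {"total_energy": 24.0, "lowest_energy": 7.0, "hop_count": 3},
]
print(best_path(candidates))
```

With the default weights, the second path wins even though its total energy is lower, because it is shorter and avoids the nearly depleted node on the first path. Raising `alpha` relative to the other weights shifts the preference toward total path energy.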
After choosing the best path, the base station sends an enforcing packet to the neighboring node that forwarded the Query Response Packet containing the best path information. In turn, each forwarding node on the path of the enforcing packet forwards it to the node from which it received the Query Response Packet. The forwarding node knows which node to forward the enforcing packet to from its interest table, because the interest table of each node caches the Query Response Packets forwarded in response to the current interest, and each cached Query Response Packet contains the source node address.
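The backward walk of the enforcing packet can be sketched as follows. The `previous_hop` mapping stands in for the cached Query Response Packets in each node's interest table; that mapping, the function name, and the node labels are assumptions made for this example.

```python
def enforce_path(previous_hop, sink, sink_neighbor, source):
    """Walk the enforcing packet from the sink back to the source.

    previous_hop[n] is the neighbor from which node n received the chosen
    Query Response Packet. The result maps each node on the path to the
    enforced neighbor it should use for data packets.
    """
    enforced_next_hop = {}
    node, downstream = sink_neighbor, sink
    while True:
        enforced_next_hop[node] = downstream
        if node == source:
            break
        node, downstream = previous_hop[node], node
    return enforced_next_hop


# Hypothetical chosen path: source -> c -> a -> sink.
previous_hop = {"a": "c", "c": "source"}
route = enforce_path(previous_hop, "sink", "a", "source")

# Data propagation then simply follows the enforced neighbors.
node, hops = "source", ["source"]
while node != "sink":
    node = route[node]
    hops.append(node)
print(hops)
```

Each node needs only its own table entry to take part, so the enforced path is set up without any node holding the full route.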
C. Data Propagation
After the reinforcing phase is done, the source node knows which neighboring node to use to forward the data packets.
Every time it senses data matching the data requested by the interest, it generates a data packet and forwards it toward the base station through the enforced node.
Every node along the path will do the same thing and forward the data packet through the enforced neighbor until the data packet reaches the base station.
This process will continue until the “time to live” field associated with the interest becomes zero. Then this interest will be removed from the table of interests in the source node and no more data packets will be generated in response to this interest.
7. Simulation and Results
To evaluate the performance of aware diffusion, we implemented it using the Castalia simulator. Castalia is a framework that runs on top of OMNeT++ to simulate WSNs, Body Area Networks (BANs), and networks of low-power embedded devices in general. Castalia is an open-source simulator that allows researchers to develop and implement their own protocols.
To compare the performance of aware diffusion with directed diffusion, we have simulated both protocols using the same simulation parameters shown in Table 1.
Four simulation experiments were conducted. The first experiment compared the total energy consumption of the two protocols, to test whether aware diffusion results in less energy consumption.
The second experiment compared the total number of packets delivered to the sink node by the two protocols, to test whether aware diffusion provides more reliable delivery of data packets to the sink node.
The third experiment compared the network lifetime, to test whether aware diffusion extends the network lifetime.
The fourth experiment compared load balancing, to test whether aware diffusion provides a better distribution of energy consumption among network nodes.
A. Experiment One: Energy Consumption
The simulation was performed with different numbers of nodes, ranging from 25 to 256, to show that energy conservation occurs regardless of network size.
Figure 7 shows the simulation results comparing the total energy consumption (in Joules) for both directed diffusion and aware diffusion.
As shown in Figure 7 the total energy consumption was always less in the case of aware diffusion.
Table 1. Simulation parameters.
Figure 7. Comparing total energy consumption.
As shown in Figure 8, the difference in total energy consumption grew as the number of nodes increased, because paths tend to be longer in bigger networks and choosing healthier and shorter paths decreases energy consumption significantly.
B. Experiment Two: Reliable Delivery
Figure 9 shows the simulation results comparing the total number of packets delivered to the sink for both directed diffusion and aware diffusion.
As shown in Figure 9, the total number of packets delivered was always higher in the case of aware diffusion.
As shown in Figure 10, the difference in the total number of packets delivered grew as the number of nodes increased, because paths tend to be longer in bigger networks and choosing healthier and shorter paths increases packet delivery.
C. Experiment Three: Network Life Time
The simulation was performed with a different number of nodes ranging from 25 to 225 to prove that network lifetime will be extended regardless of network size.
We measure the network lifetime from the moment the simulation starts till the point when the sink node stops receiving any new data packets.
Figure 11 shows the simulation results comparing the network lifetime for both directed diffusion and aware diffusion.
As shown in Figure 11, the network lifetime was always higher in the case of aware diffusion.
In Figure 12 we present another important metric which is first node death time. This metric is also used in the literature to compare network lifetime. The first node death time was always higher in the case of aware diffusion.
D. Experiment Four: Load balancing
The simulation was performed with different numbers of nodes, ranging from 25 to 256, to show that energy balance occurs regardless of network size.
Figure 13 shows the simulation results comparing the standard deviation of energy consumption for all nodes for both directed diffusion and aware diffusion.
As shown in Figure 13 the standard deviation of energy consumption was always less in the case of aware diffusion.
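For reference, the load-balancing metric can be computed from per-node energy readings as in the short Python sketch below. The sample values are invented for illustration and are not simulation results, and the use of the population standard deviation is an assumption, since the text does not state which variant was used.

```python
from statistics import pstdev


def energy_balance(consumed_per_node):
    """Population standard deviation of per-node energy consumption.

    A lower value means consumption is spread more evenly across nodes.
    """
    return pstdev(consumed_per_node)


# Invented per-node consumption values (joules) for two hypothetical runs.
uneven = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
even = [4.5, 5.0, 5.0, 5.0, 5.0, 5.0, 5.0, 5.5]
print(energy_balance(uneven), energy_balance(even))
```

Both lists have the same mean, so the metric isolates how unevenly the load is distributed rather than how much energy is consumed in total.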
As shown in Figure 14, the difference in the standard deviation of energy consumption tends to increase as the number of nodes increases, because paths tend to be longer in bigger networks and more energy is consumed.
Figure 8. Difference in total energy consumption.
Figure 9. Comparing total number of packets delivered to sink node.
Figure 10. Difference in total packets delivered.
Figure 11. Comparing network life time.
Figure 12. Comparing network life time.
Figure 13. Comparing standard deviation of power consumption.
Figure 14. Difference in power consumption standard deviation.
8. Conclusion
The performance of aware diffusion was compared with that of directed diffusion through simulation experiments. Four experiments were conducted: the first compared energy consumption, the second compared data delivery, the third compared network lifetime, and the fourth compared load balancing. From the experimental results, we conclude that aware diffusion performed better in all four aspects. Also, as the network size increased, the difference in performance in these aspects increased.