Cluster-Based Communication in WSN: A Survey

Wireless sensor networks (WSNs) play an essential role in industrial applications. A WSN consists of many tiny sensor nodes (SNs) deployed in a remote area. These sensor nodes are battery powered, and it is very difficult to replace them in a harsh environment once deployed. Therefore, to increase network lifetime, sensor node energy must be conserved. For this reason, some protocols group the nodes of the WSN into clusters for efficient communication. The hierarchical approach has two variants, one based on clusters and the other based on grids. In the cluster-based variant, SNs are assembled into clusters, and a node with more residual energy becomes the leader of the cluster, known as the cluster head (CH). In the grid-based variant, the network is divided into virtual grids. This paper discusses hierarchical routing protocols, their design challenges, and the advantages and disadvantages of the protocols.


Introduction
Technological advances have encouraged the design of tiny, low-power electronic SNs. These nodes are placed in remote areas to measure various physical quantities. Together they form networks known as wireless sensor networks (WSNs), which are used in many industrial applications, for example environmental monitoring, disaster management, and health monitoring. The sensor nodes (SNs) in the network interact with one another to hand over information to the base station (BS). A WSN does not use any pre-existing infrastructure but depends on the routing protocol, as shown in Figure 1 [1,2].

Figure 1: Wireless Sensor Network
The routing algorithm determines route selection and is mostly developed in one of two ways: with a clustering algorithm [3, 4, 5, 6] or with a load-balanced tree algorithm [7, 8, 9, 10]. To extend network lifetime, the routing protocol must reduce energy consumption, which makes routing quite difficult in sensor networks. The nodes are battery powered, so their energy consumption must be managed carefully to extend the life of the network. Maintaining routes is also a significant issue, because the network undergoes many topological changes, which may cause high energy consumption.
According to the literature survey, routing approaches fall into four types: network structure based, topology centred, reliable routing, and communication based. Each of them is further categorized into subtypes, as given in Figure 2. The network structure technique is further categorized as flat and hierarchical. In the flat approach, all sensing nodes are similar. It has some benefits; for example, topology maintenance is not needed, and it provides a good link between source and destination. However, it relies on flooding, in which a large amount of energy is consumed because of redundant messages [10]. In the hierarchical approach, the sensing nodes first form clusters, and then a cluster head (CH) is elected to perform routing.
In most hierarchical routing approaches, routing is done in two steps. In one step, the SNs are used for sensing, and in the other step routing is performed. Nodes with less energy are used for sensing, and nodes with more residual energy are used to gather and communicate the information [11].
Hierarchical routing has two further approaches, cluster based and grid based. The benefits of cluster-based strategies are improved scalability, effective data aggregation, and efficient use of channel bandwidth. They also have disadvantages, such as non-uniform clustering, which causes high energy dissipation in nodes and consequently increases energy consumption [13,14]. This paper studies current energy-efficient hierarchical clustering approaches, taking into account parameters such as network lifetime and energy consumption. The advantages and drawbacks of both hierarchical methods are also discussed, and a brief summary is provided to help researchers and practitioners choose the most appropriate technique for the needs of the application.

Hierarchical Routing Protocol
A hierarchical routing protocol groups the nodes of a network into clusters. Each cluster consists of one cluster head and several cluster members (SNs). These members collect information from their own monitoring areas. The cluster head's task is to manage cluster members, receive information from every member, and then forward the data to the sink. The main task in WSNs today is to establish the network topology and an efficient route selection process to extend the lifetime of the network [15,16]. Either single-level or double-level communication can be used. For large-scale networks, double-level communication is used, since a WSN is not efficient in long-haul communication [17]. The function of the CH in double-level communication is to balance load among sensor nodes [18]. Intra-cluster communication takes place within a cluster and inter-cluster communication between clusters, as reflected in Figure 3.

2(A) HIERARCHICAL APPROACHES
(I) Cluster-Based Hierarchical Approaches
The cluster-based approach is used in the hierarchy to save node battery, minimizing energy consumption and extending the life span of the network. Sensor devices are assembled to form clusters. The cluster head (CH) gathers and aggregates the information from the nodes and conveys it to the BS (base station), either directly or through an intermediate CH, as presented in Figure 3. The process reduces energy consumption because the number of packets is reduced. In multi-hop clustering there is more than one CH in the network, whereas in single-hop clustering there is only one CH [19].
The various protocols are:
(i) LEACH: It stands for Low Energy Adaptive Clustering Hierarchy. It was one of the first energy-efficient clustering protocols introduced for WSNs and is still used today as a reference protocol [20]. In this protocol, the CH is chosen from among the sensor nodes, and the role rotates so that every node gets a chance to become CH and every node's energy is used. Each node selects a random number between 0 and 1. The selected number is compared with a threshold value; if it is less than the threshold, the node becomes the cluster head. The selected CH then broadcasts a message to all sensor nodes. Each sensing node picks one of the clusters and sends a join message, depending on the strength of the signal received.
In the steady-state phase, sensor nodes sense data and send it to the selected CH, which aggregates it and conveys the result to the BS. The LEACH protocol does not require any global identification. In [20][21][22], the authors presented improved versions of the LEACH protocol. Although LEACH has the benefit of being a simple protocol, it also has drawbacks: the cluster head is chosen probabilistically from a random number, which indirectly results in high energy consumption.
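The election step described above can be sketched in Python. This is a minimal illustration under stated assumptions: it uses the threshold commonly given for LEACH, T(n) = p / (1 - p (r mod 1/p)) for nodes that have not yet served as CH in the current epoch, where p is the desired fraction of cluster heads and r is the round number. The function names and the `eligible` set are hypothetical and do not come from the surveyed papers.

```python
import random


def leach_threshold(p: float, round_no: int) -> float:
    """Threshold T(n) for a node that has not yet been CH in this epoch.

    Assumes the commonly cited LEACH form p / (1 - p * (r mod 1/p)).
    """
    rounds_per_epoch = round(1 / p)
    return p / (1 - p * (round_no % rounds_per_epoch))


def elect_cluster_heads(node_ids, eligible, p, round_no, rng=random):
    """Return the nodes that elect themselves CH in this round.

    `eligible` holds nodes that have not served as CH in the current epoch
    (hypothetical bookkeeping; all other nodes are treated as threshold 0).
    """
    threshold = leach_threshold(p, round_no)
    heads = []
    for node in node_ids:
        # Each node draws a number in [0, 1) and compares it with T(n).
        if node in eligible and rng.random() < threshold:
            heads.append(node)
    return heads


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so repeated runs agree
    print(elect_cluster_heads(range(20), set(range(20)), p=0.1, round_no=0, rng=rng))
```

The threshold grows as the epoch progresses, so a node that has not yet served is certain to elect itself by the last round of the epoch.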
(ii) LEACH-C: It stands for LEACH-Centralized and is an improved version of the LEACH protocol. Here the base station performs cluster formation, including CH selection and information distribution, whereas in LEACH each node self-configures to form clusters. As in LEACH, operation has two phases. In the set-up phase, each node sends its energy level and location to the BS. The BS then calculates the average node energy. Sensing devices with energy above the average are elected as CHs for the current round. Once the CHs are selected, the BS broadcasts their node IDs to all nodes in the network. Because of the centralized architecture, reselecting CHs incurs more overhead, as the BS has to take every reselection decision, and energy consumption therefore increases.
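As a rough illustration of the centralized selection rule described above, the following Python sketch has the BS compute the average reported energy and keep the nodes above it. The function name, the dictionary layout, and the energy values are assumptions made for the example. The sketch ignores node location, which the BS also receives.

```python
def select_cluster_heads(energy_by_node: dict) -> list:
    """Return node IDs whose reported energy exceeds the network average."""
    if not energy_by_node:
        return []
    average = sum(energy_by_node.values()) / len(energy_by_node)
    # Only nodes strictly above the average qualify for the current round.
    return sorted(node for node, energy in energy_by_node.items() if energy > average)


# Made-up energy reports (in joules) as the BS might receive them.
reports = {"n1": 0.8, "n2": 0.3, "n3": 0.6, "n4": 0.2}
heads = select_cluster_heads(reports)
print(heads)
```

Because every decision is taken at the BS, each round requires all nodes to report first, which is the overhead noted above.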
(iii) CCWM: It stands for Cluster Chain Weighted Metrics [23] and is used to extend the life of the network. A set of CHs is chosen using weighted metrics. Sensor nodes use direct communication to transfer data to the BS. Network overhead increases because the election of CHs is not optimized.
(iv) K-Means Algorithm: It has three phases. First, cluster heads are elected by the LEACH protocol. In the second phase, clusters are formed based on Euclidean distance, with sensor nodes joining their closest CH. In the last phase, once the cluster is formed, each node gets an ID depending on its distance from the centre. This prolongs the overall life of the network but increases energy consumption and adds network overhead.
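The second phase of the algorithm above, in which each node joins its closest CH by Euclidean distance, might look like the following Python sketch. The coordinates are invented for illustration, and `assign_to_nearest_head` is a hypothetical helper name.

```python
import math


def assign_to_nearest_head(nodes: dict, heads: dict) -> dict:
    """Map each head ID to the list of node IDs closest to it.

    `nodes` and `heads` map an ID to an (x, y) position.
    """
    clusters = {head_id: [] for head_id in heads}
    for node_id, position in nodes.items():
        # Euclidean distance decides which CH the node joins.
        nearest = min(heads, key=lambda h: math.dist(position, heads[h]))
        clusters[nearest].append(node_id)
    return clusters


nodes = {"n1": (0, 1), "n2": (9, 9), "n3": (2, 2)}
heads = {"h1": (0, 0), "h2": (10, 10)}
print(assign_to_nearest_head(nodes, heads))
```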
(v) CHEF: It stands for Cluster Head Election mechanism using Fuzzy logic. The authors introduced a fuzzy approach [24]. In every round, candidate cluster heads are chosen depending on a random number. Two parameters are used: local distance and energy level. Local distance is the sum of the distances to all adjacent nodes. Each candidate applies fuzzy if-then rules to compute a value and then advertises it. The candidate with the highest value is selected as CH and advertises itself so that sensor nodes can join it. Compared to previous solutions, the algorithm improves network lifespan but adds network overhead and unnecessary load.
(vi) UCS: It stands for Unequal Clustering Size. The authors in [25] presented this clustering model under two assumptions: the area is circular, and it is divided into two layers. Clusters of similar shape and size are used in the first layer, whereas different clusters are used in the second layer. To save energy, the CH must be located near the centre of its cluster. This model gives good results in homogeneous networks and reduces energy consumption.
(vii) NUDND: It stands for Non-Uniform Deterministic Node Distribution. In this work, the disadvantages of uniform clustering were examined [26]. The algorithm assumes that sensor devices closer to the base station are used more than other nodes in the network.
(viii) EADC: It stands for Energy-Aware Distributed Clustering. To address the hole restoration problem, the algorithm uses clusters of unequal sizes. Here the sensing nodes are non-uniformly deployed. To select the CH, the ratio of the average residual energy of the neighbouring sensing nodes to the residual energy of the sensing node itself is used. A few redundant sensing nodes that consume more energy are neglected in this approach.
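The selection ratio described above can be written out as a short Python sketch. It assumes that a lower ratio (a node richer in energy than its neighbours) marks a stronger CH candidate; that ordering, together with all names and values, is an assumption of this example rather than a detail given in the text.

```python
def energy_ratio(own_energy: float, neighbour_energies: list) -> float:
    """Average residual energy of the neighbours divided by the node's own energy."""
    average = sum(neighbour_energies) / len(neighbour_energies)
    return average / own_energy


def best_candidate(nodes: dict) -> str:
    """Pick the node with the lowest ratio (assumed here to be the best CH candidate).

    `nodes` maps a node ID to a tuple (own_energy, [neighbour energies]).
    """
    return min(nodes, key=lambda node_id: energy_ratio(*nodes[node_id]))


# Made-up residual energies: "a" is richer than its neighbours, "b" is poorer.
candidates = {"a": (2.0, [1.0, 1.0]), "b": (1.0, [2.0, 2.0])}
print(best_candidate(candidates))
```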
(ix) EADUC: It stands for Energy-Aware Distributed Unequal Clustering. It is used primarily to resolve the problem of coverage hole restoration. To solve the hole problem, it combines sensor nodes with different energy resources and clusters of different sizes. It attains energy efficiency through unequal cluster formation.

(II) Grid-Based Approaches
The second clustering technique is grid based. In these methods, the entire region is divided into a virtual grid. These approaches are commonly used on account of their simplicity. The nodes themselves usually elect the CH, which makes the approach suitable for large networks. The purpose is to use limited resources effectively, for example the battery, which is neither replaceable nor rechargeable. With gridding, the node routing table can be reduced by managing route set-up. Grid-based clustering methods are therefore widely used by researchers to achieve energy efficiency and prolong network lifetime [27][28][29][30][31].
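To make the idea of a virtual grid concrete, the following Python sketch maps node coordinates to square cells and groups the nodes by cell. The cell size, the coordinates, and the function names are illustrative assumptions. The surveyed protocols differ in how they size and manage their grids.

```python
from collections import defaultdict


def grid_cell(x: float, y: float, cell_size: float) -> tuple:
    """Return the (column, row) index of the square cell containing (x, y)."""
    return (int(x // cell_size), int(y // cell_size))


def group_by_cell(nodes: dict, cell_size: float) -> dict:
    """Group node IDs by the virtual grid cell they fall into."""
    cells = defaultdict(list)
    for node_id, (x, y) in nodes.items():
        cells[grid_cell(x, y, cell_size)].append(node_id)
    return dict(cells)


# Made-up deployment in a field divided into 10 x 10 cells.
deployment = {"a": (1, 1), "b": (2, 3), "c": (15, 1)}
print(group_by_cell(deployment, cell_size=10))
```

Because a node can derive its cell from its own position alone, no neighbour exchange is needed to decide cluster membership.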
(i) Grid-Based Data Dissemination (GBDD): In [32], the BS divides the network into square grid cells of equal size. The node that first sends information is known as the crossing point (CP), and the grid cells are built from the coordinates of the CP. Each node operates in two radio transmission modes, high power and low power. Under IGBDD (Intelligent Grid-Based Data Dissemination), the network is split into virtual grids. This is an upgraded version of GBDD in which the cluster head is chosen according to the position of the crossing point (CP), and data need not be transferred to the nodes near the CP for CP selection. In IGBDD, overall network lifetime is improved because linear programming is used to define the crossing point.
(ii) Grid-Based Hybrid Network Deployment Scheme (GHND): The entire network is divided into a virtual square grid, in which each cell defines a zone. The network topology is built centrally: the BS initiates grid sizing and cluster head selection. The authors use merge and split methods to distribute nodes properly. Low-density and high-density areas are identified, and candidate zones are selected based on a lower bound (LB) and an upper bound (UB). If the number of nodes in a zone is less than LB, its nodes are merged with neighbouring zones based on a weighted score called the weighted merge score (WMS). If instead the number exceeds UB, the BS divides the zone into sub-zones using one of the split methods. Several split methods are presented, such as horizontal, vertical, diagonal 45°, and diagonal 135°. Depending on various factors, a non-probabilistic method is used for cluster head election to attain energy efficiency. The authors report that the approach improves the network's stability and lifetime and performs better than existing protocols. However, since it was applied to a fixed number of nodes, it is not addressed how a large network would respond.
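The merge and split decision described above reduces to a comparison against the two bounds. The Python sketch below shows only that decision; the bound values and names are hypothetical, and the weighted merge score and the geometry of the split methods are not modelled because the text does not specify them.

```python
def zone_action(node_count: int, lower_bound: int, upper_bound: int) -> str:
    """Classify a zone as 'merge', 'split', or 'keep' from its node count."""
    if node_count < lower_bound:
        return "merge"  # too sparse: nodes join a neighbouring zone
    if node_count > upper_bound:
        return "split"  # too dense: the BS divides the zone into sub-zones
    return "keep"


# Made-up zone populations with LB = 3 and UB = 8.
zones = {"z1": 2, "z2": 5, "z3": 11}
print({zone: zone_action(count, 3, 8) for zone, count in zones.items()})
```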
(iii) Cycle-Based Data Aggregation Scheme: It is a grid-based method in which the whole network is divided into two-dimensional square grid cells of a given size. The head of each cell is linked to another to form a chain. In each cycle, the cell head with the highest energy level acts as the cycle head, as selected by the base station. Each cell head simply forwards information towards the cycle head. Traffic is reduced through the cycle head, and energy use is lower because only one cycle head is responsible for direct communication with the base station. The problem with the method is that a cycle head located far from the base station consumes more energy because of the long distance and may die soon. In addition, far-away nodes suffer from the same problem and may partition the network by dropping out of the chain.
(iv) Distributed Uniform Clustering Algorithm: The method is used to distribute CHs uniformly and reduce differences in cluster size. Not every grid is a cluster. In random deployment, overlapping regions are often identified, which further helps to reduce the cluster size.
(v) Grid and Genetic Amalgamation Algorithm for Clustering: The method combines a grid with a genetic algorithm in the sensor network. The grid is built from the positions of the sensing nodes, and the grid midpoints are then calculated using the degree of membership in the genetic algorithm. The dimensionality of high-dimensional samples is reduced, and they are then mapped onto a two-dimensional space. Finally, across the various grids, data is transmitted to the sink through the cluster midpoints.
(vi) Path-Based Data Aggregation: It is a method that uses a single chain built on a grid. The chain is constructed by linking the cell heads of the farthest row from left to right and then those of the next farthest row from right to left. This is repeated until the row nearest the BS is reached. The authors ensured that the choice of cell heads depends on energy, which further extends the life of the network.
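The chain order described above can be illustrated with a small Python sketch that visits grid cells row by row in alternating directions. It assumes that rows are indexed so that row 0 is nearest the BS and the highest row is farthest from it; the indexing and the function name are choices made for this example.

```python
def serpentine_chain(rows: int, cols: int) -> list:
    """Return (row, col) cells in chain order, starting from the farthest row."""
    chain = []
    for step, row in enumerate(range(rows - 1, -1, -1)):
        # Even steps run left to right, odd steps run right to left.
        columns = range(cols) if step % 2 == 0 else range(cols - 1, -1, -1)
        chain.extend((row, col) for col in columns)
    return chain


print(serpentine_chain(2, 3))
# [(1, 0), (1, 1), (1, 2), (0, 2), (0, 1), (0, 0)]
```

Alternating the direction keeps consecutive cell heads in the chain adjacent on the grid, so each hop stays short.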
(vii) Sectoring of Grid: It focuses on load and energy consumption under uniform and random deployment of the network's sensing devices. The whole network is divided into equal-sized grids and further divided into regions, each of which represents a cluster. The sensing node located closest to the centre of a cluster becomes the cluster head.
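The head selection rule described above, in which the node nearest the cluster centre becomes CH, can be sketched as follows. The node positions and the centre are made-up values, and `head_nearest_centre` is a hypothetical name.

```python
import math


def head_nearest_centre(nodes: dict, centre: tuple) -> str:
    """Return the ID of the node closest to the cluster centre.

    `nodes` maps a node ID to its (x, y) position.
    """
    return min(nodes, key=lambda node_id: math.dist(nodes[node_id], centre))


cluster = {"a": (1, 1), "b": (4, 5), "c": (9, 9)}
print(head_nearest_centre(cluster, centre=(5, 5)))
```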
(viii) Grid-Based Reliable Routing (GBRR): In [32], the authors introduced a reliable grid-based routing method. In this method, virtual clusters are formed based on grids. To achieve scalability for dense, large-scale, randomly deployed sensors, the method combines the benefits of cluster-based and grid-based techniques. An active node is chosen as the head node to avoid CH depletion. The method then computes effective routes within and between clusters. In addition, the source node does not have to send through the head node; if the route to the BS is strong, it can bypass the head node. Since one cluster can consist of several grids, the region it covers is much larger than that of clusters with a single grid.
(ix) Cluster Head Selection Using Analytical Network Process (ANP): For random node deployment, this method divides the entire network into grids (zones). For topology construction, the GHND approach is followed to distribute nodes uniformly over the zones. Five parameters are considered for selecting the zone head (ZH): residual energy, distance from the neighbouring sensor nodes, distance from the centre point of the zone, number of times the node has been elected as CH, and merged nodes. These parameters were ranked by means of pair-wise comparison in the ANP model [33].

3(a) Design Challenges in Clustering
There are several challenges in developing and implementing clustering algorithms. In many applications, recharging the battery or replacing the sensor node is not feasible.
Several points need to be taken into account when developing clustering algorithms: the process of cluster formation, the choice of CH, and well-balanced clusters. A cluster is a group of neighbouring nodes. In most techniques, the choice of CH depends on various factors, such as the residual energy and the position of the sensing devices.

3(b) Clustering Factors
The factors that influence the process of cluster formation are as follows:
(i) Cluster Count (CC): Cluster formation and CH selection lead to different cluster counts. It is a primary factor in clustering.
(ii) Formation of CH: In the centralized approach, cluster formation is handled by the BS, whereas in the distributed approach clusters are formed with no coordination. Hybrid strategies, which combine the advantages of the two, are also used.
(iii) Intra-cluster Communication: It is the communication within a cluster, conducted by the selected CH.
(iv) Mobility: In a static network, the nodes and CHs are stationary, which results in stable clusters. In addition, static node positions make intra-cluster and inter-cluster management easier. Mobile nodes and CHs, by contrast, change cluster membership over time, so their clusters require high maintenance.
(v) Types of Network: Networks are of two types, heterogeneous and homogeneous. In a heterogeneous network, CHs are equipped with greater communication capability and resources. In homogeneous networks, all nodes are of the same type, and a few of them are selected as CHs by an effective algorithm.
(vi) Cluster Head Election: The efficiency of the system depends on CH selection. In heterogeneous environments, the CHs are predefined. In homogeneous cases, CH selection depends on several factors, such as distance from the other nodes and from the centre, energy level, etc., or a probabilistic approach can be used.
(vii) Multilevel Cluster Hierarchy: Many algorithms use multilevel clustering to decrease energy consumption.

3(c) Clustering Approaches Taxonomy
The protocols for clustering are classified in several ways, such as:
(i) Homogeneous and Heterogeneous Networks: Homogeneous systems comprise sensor devices of the same type with the same processing and hardware capabilities. One of the sensor nodes becomes the cluster head depending on a few values, i.e. residual energy and distance from the centre of the cluster. In a heterogeneous system, nodes are of two types: CH nodes, which have higher processing capabilities and energy and are used within a cluster, and ordinary sensor nodes, which have lower processing capabilities and whose function is to sense data.
(ii) Centralized or Distributed: Centralized algorithms are mostly used for limited applications. Either the CH or the BS partitions the network and forms the clusters. In distributed algorithms, by contrast, the sensor nodes themselves are used in order to achieve flexibility, quick implementation, and fast convergence. These algorithms are mainly used in homogeneous networks.
(iii) Static and Dynamic Clustering: In a WSN, clustering may be static or dynamic depending on the application. In static clustering, cluster organization and CH decisions are fixed: once clusters are created and the cluster head is chosen, they remain unchanged.
Sometimes, however, CHs are changed periodically in static clustering to achieve high energy efficiency. In dynamic clustering, by contrast, CHs are reselected and the clusters are reorganized, which offers high energy efficiency. It suits applications where the topology changes frequently and clusters need to be rearranged when such changes occur, which results in better energy use.

(iv) Probabilistic and Non-probabilistic Approaches:
In a probabilistic approach, a prior probability is assigned to each node to decide which nodes become CHs, or some other random selection procedure is used. The probabilities assigned to nodes serve as the primary criterion. During CH reselection or cluster reorganization, other secondary criteria can also be used to improve energy use and increase the lifespan of the WSN. These methods also have fast execution and convergence times and limit the number of messages exchanged. In non-probabilistic clustering, deterministic criteria are adopted for CH selection and cluster formation or reorganization. These methods depend mostly on data obtained from connected neighbours and need many messages to be transmitted, resulting in worse time complexity than the probabilistic approaches. On the other hand, non-probabilistic methods provide clusters that are more reliable, robust, and balanced, as selection depends on multiple criteria, such as residual energy, mobility, and transmission power.

Conclusion:
We have studied various hierarchical clustering algorithms for WSNs. We found that clustering reduces node energy usage, thereby extending the lifespan of the network. A literature survey has been presented on both hierarchical approaches, each of which has further types that have been presented in this paper. Every algorithm has its own pros and cons. The question that arises is when and where to use these algorithms to get better results. Algorithm selection depends entirely on the type of formulation to be used and the kind of network in question.

Figure 3: Communication in hierarchical clustering