# Performance Analysis of High Speed Inexact Speculative Adder

V.Kannan<sup>a</sup>, V.JaganNaveen<sup>b</sup>, V.Vijayakumar<sup>c</sup>, V.Ganesan<sup>d</sup>

<sup>a</sup>Professor, Department of ECE, GMR Institute of Technology, Rajam, AP, India
 <sup>b</sup>Professor, Department of ECE, GMR Institute of Technology, Rajam, AP, India
 <sup>c</sup>Assistant Professor, Department of ECE, Sathyabama Institute of Science and Technology, Chennai, TN, India
 <sup>d</sup>Assistant Professor, Department of ECE, Bharath Institute of Higher Education and Research, Chennai, TN, India


**Abstract:** This paper presents a high-speed model of the recent Inexact Speculation Adder (ISA) design. Fine-grain pipelining and clock gating are included in the architecture to improve its speed and reduce its power consumption, respectively. Simulations of the pipelined ISA, pipelined Speculator, PCLA and pipelined Compensator are presented. The pipelined ISA, pipelined Speculator and pipelined Compensator combined with the sleep, stack and sleepy stack power-saving techniques are also presented. HSPICE is used for simulation, and the delay, average power and power delay product are obtained and reported. It is observed that applying the sleep and stack low-power techniques reduces the power delay product by 80% compared to the carry look-ahead adder (CLA) based design of the ISA.

Keywords: Inexact Speculation Adder, Pipeline, PCLA, Speculator, Compensator.

#### 1. Introduction

In the modern electronics era, sophisticated signal processing is implemented in VLSI circuits. Such circuits not only demand complex computation but also consume a large amount of energy. The need for low-power circuits is therefore pressing, since a very large number of operations are performed at very high frequency, which in turn requires cooling. Nowadays most electronic equipment is handheld and portable, and it runs on power stored in a battery. To ensure proper operation and reliability, the design of portable devices must take peak power into account. The battery charge, however, depends on the time-averaged power, which is the more important quantity, and the operating life of portable equipment depends on that charge.

There are four sources of power dissipation in digital CMOS circuits: switching power, short-circuit power, leakage power and static power. Present-day digital circuits rely heavily on arithmetic operations such as addition to produce an output. Digital adders (**Subodh Wairya et al., 2011**) are therefore widely used in electronic applications, and the applications that need digital adders include multipliers, microprocessors, DSP processors, etc. Since processors execute millions of instructions per second, the performance of multipliers should be considered in terms of speed and power consumption. The performance of some adders is summarized as follows. Ripple carry adders have a compact design but are slow (**G. Sasi et al., 2019**). Carry look-ahead adders occupy more area but are fast (**Jagannath Samanta et al., 2013**). The carry select adder offers a balance between these two adders (**Mohammad Khadir et al., 2020**).

A hybrid adder design based on carry look-ahead or carry select adders has been presented (**Yuke Wang et al., 2002 and Padmanabhan Balasubramanian et al., 2018**) in order to improve the speed of addition. Area-efficient and power-efficient logic design remains one of the emerging areas of research in VLSI system design. The speed of addition in digital adders is limited by carry propagation. In an elementary adder, a carry is generated and propagated to the next position only after the sums of the previous bit positions are complete. To reduce the carry propagation delay, most computational systems use the carry save look-ahead adder (**Ravikumar A Javali et al., 2014**), which generates multiple carries independently, selects the appropriate carry and generates the sum.

The present era demands high-speed adders, though power and area are equally important. A high-speed, low-power adder can be obtained using approximate and inexact circuit techniques (Vincent Camus et al., 2015). In such circuits, accuracy can be traded off to enhance speed and power by means of speculation. Adders of this type are referred to as inexact speculative adders (ISA). The literature reports many related adder designs (R. Priya et al., 2013, Shivani Parmer et al., 2012, I-Chyn Wey et al., 2012, B. Ramkumar et al., 2012 and S. Manjui et al., 2013). However, the speed of such adders still needs to be improved while keeping the error to a minimum. This work concentrates on a pipelined carry look-ahead adder (PCLA) based ISA, which is further enhanced by adopting low-power techniques such as clock gating, the stack approach, the sleep approach and the sleepy stack approach without changing the architecture.

#### 2. Proposed Architecture of Inexact Speculation Adder

Figure 1 shows the block diagram and data flow of the traditional ISA for n-bit addition. To enhance the speed of the circuit, the adder unit is modified to use 4-bit CLAs. This architecture consists of the following sub-blocks. 1) Speculator and Adder Blocks: The two n-bit operands to be added are expressed as $A = \{A_0, A_1, \ldots, A_{n-1}\}$ and $B = \{B_0, B_1, \ldots, B_{n-1}\}$, while the carry input, carry output and sum are expressed as $C_{in}$, $C_{out}$ and $S = \{S_0, S_1, \ldots, S_{n-1}\}$, respectively.



Figure 1. Block diagram of Inexact Speculative Adder

Fig. 2 shows the circuit diagram of the adder design with the speculator. This block uses CLA logic to speculate the carry output for every 4-bit adder block. Speculation is carried out on the r most significant bits of each block, where r is smaller than the block size x (i.e., r < x = 4). The resulting errors may be positive or negative and are introduced by the input carry (0 or 1) assumed for each speculator block. The output carry of each speculator block is denoted by *Cso*, and it is fed as the input carry to the succeeding adder block. Hence the 4-bit adder blocks do not need to wait for the carry input arising from the previous 4-bit adder block.



Rather, the concerned speculator blocks provide the required carry simultaneously for all adder blocks to perform simultaneous additions. The computation in Speculator block is based on the equation as follows:

$$P_i = A_i \oplus B_i; \quad G_i = A_i \cdot B_i; \quad C_{i+1} = G_i + (P_i \cdot C_i) \qquad (1)$$

where the (i+1)th carry bit is calculated from the carry ($C_i$), generate ($G_i$) and propagate ($P_i$) signals of the *i*th bit. Although this block lies on the critical path of the ISA architecture, it computes the carry for only a few bits and therefore does not add much delay. At the same time, the CLA logic in the adder block adds the 4-bit inputs using the equation shown below.

$$S_i = P_i \oplus C_i \qquad (2)$$
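Equations (1) and (2) can be illustrated with a small behavioural model. The following is a minimal sketch, not the circuit used in this work: the function name `cla_4bit` is hypothetical, and the carries are evaluated with a loop over the recurrence of Equation (1), whereas a hardware CLA expands the same recurrence into parallel logic.

```python
def cla_4bit(a, b, cin=0):
    """Add two 4-bit values using the propagate/generate terms of Eq. (1)-(2).

    Returns (sum, carry_out), where sum is a 4-bit integer.
    """
    a_bits = [(a >> i) & 1 for i in range(4)]
    b_bits = [(b >> i) & 1 for i in range(4)]

    p = [x ^ y for x, y in zip(a_bits, b_bits)]  # P_i = A_i xor B_i
    g = [x & y for x, y in zip(a_bits, b_bits)]  # G_i = A_i and B_i

    c = [cin]
    for i in range(4):
        # C_{i+1} = G_i + (P_i . C_i)
        c.append(g[i] | (p[i] & c[i]))

    # S_i = P_i xor C_i, packed back into an integer
    s = sum((p[i] ^ c[i]) << i for i in range(4))
    return s, c[4]
```

For instance, `cla_4bit(9, 7)` yields a 4-bit sum of 0 with a carry-out of 1, which corresponds to 16.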

The sum output of each parallel adder block is not necessarily exact, because speculated carry inputs are used to perform the addition. The compensator block, shown in Figure 1, corrects such sum values. The adder and compensator blocks incur the maximum delay in the architecture. Pipelining is used to reduce this delay, and the sleep, stack and sleepy stack techniques are used to reduce power. Figure 3 shows the architecture of the compensator block used in the ISA. In this block, the carry output of every 4-bit adder block is compared with the speculated carry by means of an XOR gate.

The XOR gate thus generates an error flag (fe) at its output, which enables one of two compensation techniques: error correction or error reduction. If the XOR gate output is '0', the sum value is routed directly to the final output. If it is '1', an error has occurred, which may be positive or negative.
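The interaction between speculated carries and the error flag can be sketched behaviourally as follows. This is an illustrative model under stated assumptions rather than the simulated circuit: the function name `isa_add` and its parameters are hypothetical, each speculator is assumed to use an input carry of 0, and no compensation is applied, so the function only reports where the flags would be raised.

```python
def isa_add(a, b, n=16, x=4, r=2):
    """Block-wise addition with speculated carries (no compensation).

    n: operand width, x: block size, r: number of MSBs of the previous
    block inspected by the speculator (r < x).
    Returns (approximate_sum, error_flags), one flag per block boundary.
    """
    mask = (1 << x) - 1
    result = 0
    flags = []
    prev_cout = 0  # carry-out of the previous adder block

    for k in range(n // x):
        a_blk = (a >> (k * x)) & mask
        b_blk = (b >> (k * x)) & mask

        if k == 0:
            c_spec = 0  # the first block receives the true carry-in (0)
        else:
            # Speculate from the r MSBs of the previous block,
            # assuming (hypothetically) an input carry of 0.
            a_hi = ((a >> ((k - 1) * x)) & mask) >> (x - r)
            b_hi = ((b >> ((k - 1) * x)) & mask) >> (x - r)
            c_spec = (a_hi + b_hi) >> r
            # Error flag: XOR of speculated carry and actual carry-out
            flags.append(c_spec ^ prev_cout)

        total = a_blk + b_blk + c_spec
        result |= (total & mask) << (k * x)
        prev_cout = total >> x

    result |= prev_cout << n  # final carry-out
    return result, flags
```

When every flag is 0 the result equals the exact sum; a flag of 1 marks a block boundary where the compensator would have to correct or balance the output.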

A positive error indicates a sum that is too low, caused by speculating '0' instead of '1'. Similarly, a negative error indicates a sum that is too high, caused by speculating '1' instead of '0'. The compensation block applies an unsigned increment or decrement to a set of LSBs in the direction of the error. Correction can be performed only if no overflow occurs. If an overflow does occur, the compensation block instead balances a set of MSBs of the previous sub-adder in the direction opposite to that of the error.

The balancing is carried out considering the following condition:

$$2^n > \{2^{n-1} + 2^{n-2} + \dots + 2^0\} \qquad (3)$$

If the sum carries an error of magnitude $2^n$, it can be compensated by intentionally introducing errors in the lower-order bits in the opposite direction (for example, $S_{n-1} = 1 \rightarrow S_{n-1} = 0$). Thus, if all of the lower-order bits are balanced in the opposite direction, the overall error is reduced to 1, as shown below:

$$\{2^n - 2^{n-1} - 2^{n-2} - \dots - 2^0\} = 1 \qquad (4)$$
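The residual error of 1 follows from the standard geometric sum. As a short worked step, with $n = 3$ chosen only for illustration:

$$\sum_{k=0}^{n-1} 2^k = 2^n - 1 \quad \Rightarrow \quad 2^n - \sum_{k=0}^{n-1} 2^k = 1, \qquad \text{e.g. } 2^3 - (2^2 + 2^1 + 2^0) = 8 - 7 = 1$$

This best case assumes that all $n$ lower-order bits can be flipped in the balancing direction; if some of them already hold the balanced value, the residual error is correspondingly larger.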

The correction or balancing in the compensation block is completed before the 4-bit adder block finishes calculating the sum of all the other bits. An important characteristic of this ISA is that neither the compensation nor the pre-computation of the error correction lies on its critical path. However, the XOR gate, multiplexer and de-multiplexer of the compensation block do lie on the critical path of the ISA. In the proposed design, clock gating is used to control the clock signal fed to every stage. With this modification, the idle stages of the architecture are isolated from clock switching, which significantly reduces the power consumption. Such gating is valid only during the beginning and ending phases of the addition process (Padmini G. Kaushik et al., 2013 and Ashutosh Gupta et al., 2016). Various power reduction techniques are also included in the design.
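The effect of clock gating on an idle stage can be pictured with a small behavioural model. This is a sketch under assumptions, not the gating circuit of the proposed design: the class `GatedRegister` is hypothetical, and counting the clock edges that reach the register is used only as a rough proxy for switching activity, not as a power model.

```python
class GatedRegister:
    """Pipeline register whose clock is gated by an enable signal."""

    def __init__(self):
        self.q = 0
        self.clock_events = 0  # clock edges that actually reach the register

    def tick(self, d, enable):
        """Model one clock cycle; the gated clock is (clk AND enable)."""
        if enable:
            self.clock_events += 1
            self.q = d          # register captures new data
        return self.q           # otherwise it holds its previous value


# Usage: a stage that is active for only 4 of 10 cycles
reg = GatedRegister()
for cycle in range(10):
    reg.tick(d=cycle, enable=(3 <= cycle < 7))
```

In this usage the register sees 4 clock edges instead of 10, which is the kind of saving the gating aims for while a stage is idle.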

#### 2.1. Fine-Grain Pipelined Architecture

Fig. 4 shows the architecture of the pipelined Inexact Speculative Adder. Let $\partial_{comp}$, $\partial_{4b\text{-}adder}$ and $\partial_{spec}$ be the combinational delays of the compensator, 4-bit adder and speculator blocks in a conventional ISA architecture. The carry is speculated for every 4-bit adder block, and the adder block calculates the local sum based on it. A faulty speculation can then be identified by comparing the speculated carry-in with the carry-out of the preceding 4-bit adder.

At the same time, the correction and balancing operations are performed by the compensator block. The delay on the critical path of the ISA, due to the speculator of the *i*th instance and the 4-bit adder and compensator of the (i+1)th instance, is expressed as follows:

$$\partial_{critical} = (\partial_{spec})_{i} + (\partial_{4b\text{-}adder})_{i+1} + (\partial_{comp})_{i+1} \qquad (5)$$

The detailed critical path delay, in terms of the internal architectures of the speculator and compensator blocks, is given below:

$$\partial_{critical} = (2 \times \partial_{xor} + \partial_{and})_{i} + (\partial_{4b\text{-}adder})_{i+1} + (\partial_{xor} + \partial_{demux} + \partial_{mux})_{i+1} \qquad (6)$$

where $\partial_{demux}$, $\partial_{mux}$, $\partial_{xor}$ and $\partial_{and}$ are the combinational delays of the de-multiplexer, multiplexer, logical XOR gate and logical AND gate, respectively. The critical path delay can be reduced by introducing fine-grain pipelining in the ISA architecture, which yields a fast VLSI architecture.

The pipelining process is explained using the architecture shown in Figure 4, in which the traditional blocks have been replaced by the pipelined 4-bit CLA (PCLA), pipelined compensator (PCOMP) and pipelined speculator (PSPEC) units.



Figure 4. Architecture of pipelined Inexact Speculative Adder

Two pipeline stages are present in each of the sub-blocks PSPEC, PCLA and PCOMP, which reduces their delay. The overall architecture consists of five pipeline stages and six levels of registers, as shown in figure 5.

The frequency of operation of the proposed ISA is decided by the critical path delay, which is given by the expression

$$\partial_{crt\text{-}prop} = \partial_{clk\text{-}Q(ff)} + \partial_{xor} + 3 \times \partial_{and} + \partial_{setup(ff)} \qquad (7)$$

where $\partial_{setup(ff)}$ and $\partial_{clk\text{-}Q(ff)}$ represent the setup time of the capture flip-flop and the clock-to-Q delay of the launch flip-flop, respectively, in the design. The maximum operating clock frequency that can be obtained by this ISA architecture is therefore given by

$$F_{max} \leq 1/\{\partial_{crt\text{-}prop} - \partial_{skew}\} \qquad (8)$$

where $\partial_{skew}$ denotes the clock skew.
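To see how the delay expressions combine, the following sketch evaluates Equations (6), (7) and (8) numerically. The function names and all delay values are hypothetical placeholders chosen for illustration; they are not measurements from this work.

```python
def conventional_critical_path(d_xor, d_and, d_mux, d_demux, d_adder4):
    """Eq. (6): speculator + 4-bit adder + compensator delays (in ps)."""
    d_spec = 2 * d_xor + d_and          # speculator of instance i
    d_comp = d_xor + d_demux + d_mux    # compensator of instance i+1
    return d_spec + d_adder4 + d_comp


def pipelined_critical_path(d_clk_q, d_xor, d_and, d_setup):
    """Eq. (7): register-to-register delay of the pipelined ISA (in ps)."""
    return d_clk_q + d_xor + 3 * d_and + d_setup


def max_frequency_ghz(d_crt_prop_ps, d_skew_ps):
    """Eq. (8): upper bound on the clock frequency, in GHz.

    Delays are in picoseconds, so 1000 / ps gives GHz.
    """
    return 1000.0 / (d_crt_prop_ps - d_skew_ps)


# Hypothetical gate delays in picoseconds (illustrative only)
conv = conventional_critical_path(d_xor=30, d_and=20, d_mux=35,
                                  d_demux=35, d_adder4=200)
pipe = pipelined_critical_path(d_clk_q=40, d_xor=30, d_and=20, d_setup=25)
f_max = max_frequency_ghz(pipe, d_skew_ps=5)
```

With these placeholder numbers the pipelined register-to-register path is much shorter than the conventional combinational path, which is the mechanism by which fine-grain pipelining raises the achievable clock frequency.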

#### 2.2. Low Power Techniques

Several VLSI techniques are available for achieving low power and a better trade-off between power, delay and their product. Each technique offers an efficient way to reduce leakage power without altering the architecture. The low-power techniques adopted in this work are 1) the SLEEP approach, 2) the STACK approach and 3) the SLEEPY STACK approach. Figures 5, 6 and 7 show the sleep, stack and sleepy stack approaches, respectively.

In the SLEEP approach (Se Hun Kim et al., 2006 and Jun Cheol Park et al., 2006), a "sleep" PMOS transistor is placed between VDD and the pull-up network, and a "sleep" NMOS transistor is placed between GND and the pull-down network. These sleep transistors disconnect the circuit from the power rails when it is idle and are turned on when it is active. Leakage power is thus reduced effectively by cutting off the power source.



In the STACK approach, transistor stacking is employed for further leakage reduction. Each transistor in the pull-up and pull-down networks is divided into two half-size transistors (**Ali Peiravi et al., 2012**). This division does not affect the W/L ratio of the circuit (**K. Roy et al., 2003**). Dividing the pull-up PMOS network provides the transistor stacking, which gives better leakage savings. This method therefore provides better performance by suppressing sub-threshold leakage current and DIBL leakage (**D. Vijayalakshmi et al., 2016**).

The SLEEPY STACK approach combines the sleep and stack approaches (**Sagar Daf et al., 2015**). As in the stack approach, each existing transistor is divided into two half-size transistors. A sleep transistor is then added in parallel to one of the divided transistors.

The stacked transistors thus suppress the leakage current, and the sleep transistors turn off the power during sleep mode. The delay is also reduced during active mode, since each sleep transistor is in parallel with one of the stacked transistors.



Figure 7. Sleepy Stack Approach

However, the drawback of this approach is the large area it occupies, since three transistors are needed in place of each single transistor.

#### 3. Results and Discussions

The existing and proposed architectures are simulated using the HSPICE simulator. The outputs of the existing and proposed inexact speculative adders are analysed, and the results show that the proposed architecture reduces the delay compared with the existing system. Synthesis is done using Xilinx ISE. Figure 8 shows the simulation output of a 4-bit pipelined carry look-ahead adder. Figure 9 shows the simulation output of a 4-bit pipelined carry look-ahead adder using the sleep technique. Figure 10 shows the simulation output of the pipelined compensator. Figure 11 shows the simulation output of the pipelined compensator using the sleep technique. Figure 12 shows the simulation output of the pipelined speculator using the sleep technique.





Table 1. Power Delay Analysis of Existing System

| Device | Average Power (mW) | Delay (ps) | Power Delay Product (pJ) |
|---|---|---|---|
| Adder | 0.07 | 268.45 | 17.735 |
| Pipelined | 1.245 | 500.95 | 31.05 |
| Pipelined compensator | 3.49 | 1.65 | 0.286 |

Table 2. Power Delay Analysis of proposed ISA using sleep technique

| Device | Average Power (mW) | Delay (ps) | Power Delay Product (pJ) |
|---|---|---|---|
| Inexact Speculative Adder | 0.80'73 | 17.418 | 2.574 |
| 4-bit PCLA | 0.78559 | 198.12 | 28.5 |
| Pipelined Speculator | 0.06059 | 341.39 | 15.388 |
| Pipelined compensator | 0.119 | .6 | 0.03646 |

Table 3. Power Delay Analysis of proposed ISA using stack technique

| Device | Average Power (mW) | Delay (ps) | Power Delay Product (pJ) |
|---|---|---|---|
| Inexact Speculative Adder | 0.725 | 1.2994 | 0.175 |
| 4-bit PCLA | 0.6871 | 149.46 | 18.80 |
| Pipelined Speculator | 0.03484 | 250.29 | 5.875 |
| Pipelined compensator | 0.08391 | 4.229 | 0.2390 |

Table 4. Power Delay Analysis of proposed ISA using sleepy stack technique

| Device | Average Power (mW) | Delay (ps) | Power Delay Product (pJ) |
|---|---|---|---|
| Inexact Speculative Adder | 0.7548 | 7.659 | 1.0588 |
| 4-bit PCLA | 0.7155 | 250.5 | 32.827 |
| Pipelined Speculator | 0.04220 | 498.4 | 0.1417 |
| Pipelined compensator | 0.09435 | 1.7199 | 10.933 |

Figure 14 shows the simulation output of the pipelined inexact speculative adder of the existing system, and Figure 15 shows the output of the pipelined inexact speculative adder of the proposed system. Figure 16 shows the simulation output of the 4-bit pipelined carry look-ahead adder (PCLA) using the stack technique. Figure 17 shows the simulation output of the pipelined compensator (PCOMP) using the stack technique. Table 1 lists the parameters and values obtained for the power delay analysis of the existing system. Table 2 lists those obtained for the proposed ISA using the sleep technique, and Table 3 lists those obtained for the proposed ISA using the stack technique.



Table 4 lists the parameters and values obtained for the power delay analysis of the proposed ISA using the sleepy stack technique. Figure 18 shows the power delay analysis of the existing ISA. Figure 19 shows the power delay analysis of the proposed ISA with the sleep technique; the delay and the power delay product are reduced in the proposed system with the sleep technique. Figure 20 shows the power delay analysis of the proposed ISA with the stack technique; the delay and the power delay product are also reduced with the stack technique. Figure 21 shows the power delay analysis of the proposed ISA with the sleepy stack technique, for which a further reduction in the delay and the power delay product is observed. An increase in power is observed in all cases, which may be due to the extra circuitry added to obtain a lower delay. At the same time, a drastic decrease in delay is observed in the proposed system, which leads to an increase in speed.

#### 4. Conclusion

This paper presents a high-speed model of the recent ISA design. Fine-grain pipelining and clock gating are included in the architecture to improve its speed and reduce its power consumption, respectively. Simulation results of the existing ISA, Speculator, CLA and Compensator are presented, along with simulation results of the proposed system with pipelining applied to the ISA, Speculator, CLA and Compensator. The results of the pipelined ISA, pipelined Speculator, pipelined CLA and pipelined Compensator combined with the SLEEP, STACK and SLEEPY STACK power-saving techniques are also presented. It is observed that the average power consumption is higher in the proposed system because of the additional circuitry included to reduce the delay. However, the delay is considerably reduced in the proposed system with pipelining and power reduction techniques.

#### References

1. Subodh Wairya, Rajendra Kumar Nagaria, and Sudarshan Tiwari (2011), Performance Analysis of High Speed Hybrid CMOS Full Adder Circuits for Low Voltage VLSI Design, VLSI Design, 2012, 1-18.
2. G. Sasi, G. Athisha, S. Surya Prakash (2019), Performance Comparison for Ripple Carry Adder Using Various Logic Design, International Journal of Innovative Technology and Exploring Engineering, 8(4S2), 372-377.
3. Jagannath Samanta, Mousam Halder, Bishnu Prasad De (2013), Performance Analysis of High Speed Low Power Carry Look-Ahead Adder Using Different Logic Styles, International Journal of Soft Computing and Engineering, 2(6), 330-336.
4. Mohammad Khadir, Kancharapu Chaitanya, S. Sushma, V. Preethi, Vallabhuni Vijay (2020), Design of carry select adder based on a compact carry look ahead unit using 18nm finFET technology, Journal of Critical Reviews, 7(6), 1164-1171.
5. Yuke Wang, C. Pai, Xian Song (2002), The design of hybrid carry-lookahead/carry-select adders, IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing, 49(1), 16-24.
6. Padmanabhan Balasubramanian and Nikos Mastorakis (2018), Performance Comparison of Carry-Lookahead and Carry-Select Adders Based on Accurate and Approximate Additions, Electronics, 7(369), 1-12.
7. Ravikumar A Javali, Ramanath J Nayak, Ashish M Mhetar, Manjunath C Lakkannavar (2014), Design of high speed carry save adder using carry lookahead adder, IEEE International Conference on Circuits, Communication, Control and Computing, 2014, 33-36.
8. Vincent Camus, Jeremy Schlachter, Christian Enz (2015), Energy-Efficient Inexact Speculative Adder with High Performance and Accuracy Control, IEEE International Symposium on Circuits and Systems (ISCAS), 45-48.
9. R. Priya and J. Senthil Kumar (2013), Enhanced area efficient architecture for 128 bit Modified CSLA, International Conference on Circuits, Power and Computing Technologies, 989-992.
10. Shivani Parmer and Kirat pal Singh (2012), Design of high speed hybrid carry select adder, IEEE International Advance Computing Conference (IACC), 1656-1663.
11. I-Chyn Wey, Cheng-Chen Ho, Yi-Sheng Lin, and Chien-Chang Peng (2012), An Area-Efficient Carry Select Adder Design by Sharing the Common Boolean Logic Term, Proceedings of the International MultiConference of Engineers and Computer Scientist, II.
12. B. Ramkumar and Harish M Kittur (2012), Low-Power and Area-Efficient Carry Select Adder, IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 20(2), 371-375.
13. S. Manjui, V. Sornagopae (2013), An Efficient SQRT Architecture of Carry Select Adder Design by Common Boolean Logic, IEEE International Conference on Emerging Trends in VLSI, Embedded System, Nano Electronics and Telecommunication System (ICEVENT), 2013.
14. Padmini G. Kaushik, Sanjay M. Gulhane, Athar Ravish Khan (2013), Dynamic Power Reduction of Digital Circuits by Clock Gating, International Journal of Advancements in Technology, 4(1), 79-88.
15. Ashutosh Gupta, Shruti Murgai, Anmol Gulati, and Pradeep Kumar (2016), Power efficient, clock gated multiplexer based full adder cell using 28 nm technology, AIP Conference Proceedings, 3-6.
16. Se Hun Kim, V.J. Mooney (2006), Sleepy Keeper: a New Approach to Low-leakage Power VLSI Design, IEEE International conference on VLSI, 367-372.
17. Jun Cheol Park and Vincent J. Mooney III (2006), Sleepy Stack Leakage Reduction, IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 14(11), 1250-1263.
18. Ali Peiravi, Mohammad Asyaei (2012), Robust low leakage controlled keeper by current-comparison domino for wide fan-in gates, INTEGRATION, the VLSI Journal, 45, 22-32.
19. K. Roy, S. Mukhopadhyay, H. Mahmoodi-meimand (2003), Leakage tolerant mechanisms and leakage reduction techniques in deep-submicron CMOS circuits, Proceedings of the IEEE, 91(2), 305-327.
20. D. Vijayalakshmi, Dr P.C Kishore Raja (2016), Leakage Power Reduction Techniques in CMOS VLSI Circuits - A Survey, International Journal of Scientific Development and Research, 1(5), 717-722.
21. Sagar Daf, Priya Charles (2015), Sleepy Stack Approach for Leakage Reduction of Low Power Flip Flop, International Journal of Advanced Research in Electrical, Electronics and Instrumentation Engineering, 4(6), 5341-5347.