

(An ISO 3297: 2007 Certified Organization)

Vol. 5, Issue 5, May 2016

# **Design of Low Power Adder in ALU Using Flexible Charge Recycling Dynamic Circuit**

Sudha H<sup>1</sup>, Lavanya T<sup>2</sup>

Associate Professor, Dept. of ECE, BIT College, Bangalore, Karnataka, India<sup>1</sup>

PG Student [VLSI and E.S], Dept. of ECE, BIT College, Bangalore, Karnataka, India<sup>2</sup>

**ABSTRACT**: Dynamic circuits are widely used in the data paths and critical components of microprocessors because of their advantages in speed and area. Their power consumption, however, is significant because of their switching activity. In this paper, we propose a flexible charge recycling design methodology and a dynamic circuit selection algorithm to achieve high power efficiency in the data path. Simulation results show that the power consumption of an ALU (Arithmetic and Logic Unit) using the proposed technique is reduced by 50% to 60% compared with a conventional ALU. The design is implemented in 22 nm technology.

**KEYWORDS:** charge recycling, low power, n-type dynamic circuit, p-type dynamic circuit.

### **I.INTRODUCTION**

The number of transistors on a chip has grown continuously over the past decade as technology has advanced. A higher transistor count increases both the transistor density and the power consumption of the microprocessor, so power and area are the main issues in circuit design and must be addressed to ensure high performance. In a modern microprocessor, the data path performs the computing operations, and the speed of its critical path determines the operating frequency of the microprocessor. The data path is also one of the most active components and accounts for a significant share of the total power consumption. This is a problem for computation-intensive applications such as multicore multimedia processors and digital signal processors. Hence, a low power data path is highly desirable in a modern microprocessor.

Dynamic circuits are widely used in the critical data paths and critical components of microprocessors because of their speed and area advantages [1], [2]. The n-type dynamic circuit is applied to the on-chip memory and the arithmetic and logic unit (ALU) of some microprocessors to minimize latency [1], [3]. High performance microprocessors use dynamic circuits because they can meet tighter timing constraints than static CMOS. The performance advantage of dynamic circuits comes from low switching thresholds, reduced input capacitance, and the efficient implementation of complex gates. Dynamic circuits are therefore used for logic blocks with strict timing requirements in high frequency processors.

In this paper, a novel p-type/n-type dynamic circuit selection (PNS) algorithm and a flexible charge recycling (FCR) methodology are proposed to obtain a low power data path in modern microprocessors.

### **II.RELATED WORK**

Earlier research used n-p dynamic circuits, which combine n-type and p-type dynamic circuits. An n-type dynamic circuit uses high-speed NMOS transistors to achieve high performance. A p-type dynamic circuit uses slower PMOS transistors, so its speed is lower, but its power efficiency is better because the gate and subthreshold leakage currents of PMOS transistors are suppressed. The n-p dynamic circuit has been proposed as a race-free dynamic CMOS technique for pipelined circuits [4]. It has a lower intrinsic delay and requires less silicon area than static CMOS logic because its logic is more compact.

A binding algorithm based on a framework for low-leakage data paths has been proposed, but it is effective only for some applications, incurs a considerable speed loss, and is not suitable for high performance applications [5]. A macro-driven data path design methodology has been developed that generates different topologies for different macros [6]. Three more methods have been developed for synthesizing dynamic circuits, but they cover only n-type dynamic circuits and do not include p-type dynamic circuits. Crosstalk-aware and speed-aware synthesis methodologies have also been presented, but neither of them considers power efficiency [7]-[8].

Finally, a dynamic data path has been synthesized automatically, but it requires a significant silicon area. A common limitation of these techniques is that they do not effectively explore the potential for low power offered by combining different types of dynamic circuits.

#### **III.PROPOSED METHODOLOGY**

The proposed PNS-FCR methodology, which explores power saving opportunities for data path circuits, is presented in this section. The three-step PNS-FCR design methodology is depicted in Fig. 1.



Fig.1. Proposed PNS-FCR methodology

#### Design Flow for a Data Path Based on PNS-FCR:

As shown in Fig. 2, the design flow for a data path based on PNS-FCR is as follows:

1) First, the gate library based on p-type/n-type dynamic circuits is built. The two types of each gate occupy a similar layout area to avoid an area penalty.

2) Based on the gate library, the appropriate type of gates is selected using PNS to implement the data path or critical path, satisfying the performance requirements of different applications.

3) Next, the FCR is utilized to achieve high power efficiency in critical path by inserting the charge recycling paths between two independent gates or two neighboring gates. Note that the FCR is a tradeoff between power, performance, and silicon area.

4) Then, the proposed PNS-FCR is applied to the noncritical paths. The critical path is typically much longer than the noncritical paths in the data path, and therefore the gates in a noncritical path use the p-type version for power efficiency. However, if a noncritical path formed entirely of p-type gates is slower than the critical path, n-type gates are inserted to meet the delay constraint based on PNS, and then the FCR is used to enhance the power efficiency.

5) Finally, the routing is completed manually or by CAD tools.




#### **PN Selection Algorithm**

Based on the gate library, the appropriate type of gate is selected to implement a data path, as shown in Fig. 1. To satisfy the performance requirements of different applications, a PNS algorithm is introduced based on the multidimensional multiple-choice 0-1 knapsack problem (MMKP) [9]. The delay of the critical path determines the performance of the data path. The gates in the critical path are required to meet the performance constraint, while the gates in the noncritical paths use a low power version to enhance power efficiency.
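The selection can be pictured as a multiple-choice knapsack: each gate position offers exactly two choices (n-type or p-type), and the goal is the lowest total power whose total delay stays within the budget. The following is only a minimal sketch under stated assumptions, not the paper's PNS algorithm. It uses exhaustive search, which is practical only for short paths, and the names `GATE_LIBRARY` and `select_gate_types` as well as the delay and power numbers are hypothetical.

```python
from itertools import product

# Hypothetical gate library (arbitrary units): the n-type version is
# faster, the p-type version consumes less power.
GATE_LIBRARY = {
    "n": {"delay": 2, "power": 5},
    "p": {"delay": 3, "power": 3},
}


def select_gate_types(num_gates, delay_budget, library=GATE_LIBRARY):
    """Pick one type per gate so that total power is minimal and the
    total path delay does not exceed delay_budget.

    Returns (assignment, total_power, total_delay), or None when no
    assignment meets the delay budget. Exhaustive search over all
    2**num_gates assignments, so intended for illustration only.
    """
    best = None
    for assignment in product(sorted(library), repeat=num_gates):
        delay = sum(library[t]["delay"] for t in assignment)
        power = sum(library[t]["power"] for t in assignment)
        if delay <= delay_budget and (best is None or power < best[1]):
            best = (assignment, power, delay)
    return best


if __name__ == "__main__":
    print(select_gate_types(num_gates=4, delay_budget=10))
```

With these illustrative numbers, a four-gate path with a delay budget of 10 ends up with two n-type and two p-type gates: every p-type gate saves power but adds delay, so the budget caps how many can be used.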

#### Flexible Charge Recycling technique:

A key design issue in low power data paths is exploring the choice of different power efficient n-type and p-type gates. Accordingly, the FCR is proposed to achieve high power efficiency, as shown in Fig. 2.

Consider a critical data path through two cascaded gates. The first gate is an n-type dynamic gate and the second is a p-type dynamic gate. During the precharge stage (CLK = 0), the dynamic node of the n-type gate, Nn, is precharged to Vdd through transistor Pcn, while the dynamic node of the p-type gate, Np, is discharged to ground through transistor Ncp.



Fig.2. FCR technique

During the evaluation stage, if the required input combination is applied, Nn is discharged to ground and Np is charged to Vdd. Otherwise, the high state of Nn and the low state of Np are maintained until the next precharge stage. Once such an evaluation completes, Nn has gone from high to low and Np from low to high. In the following precharge stage, both nodes consume dynamic power, because Nn is recharged from Vdd and Np is discharged to ground. If a switch is inserted between the two dynamic gates, Nn is partly charged by Np through a charge recycling path, thereby reducing the dynamic power.
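A first-order estimate of the saving follows from ideal charge sharing. This is a simplified sketch, not a result from the paper: the node capacitances $C_n$ (at Nn) and $C_p$ (at Np) are symbols introduced here, the switch is assumed ideal, and the two nodes are assumed to equalize before the supply finishes the precharge. In the real circuit the supply and Np charge Nn at the same time, so the figures below are an upper bound on the saving.

With Nn at 0 and Np at $V_{dd}$ when the switch closes, both nodes settle at

$$V_{eq} = \frac{C_p}{C_n + C_p}\, V_{dd}$$

The energy drawn from the supply to precharge Nn, without and with the recycling path, is then

$$E_{conv} = C_n V_{dd}^2, \qquad E_{FCR} = C_n V_{dd}\,(V_{dd} - V_{eq}) = \frac{C_n}{C_n + C_p}\, C_n V_{dd}^2$$

For equal node capacitances ($C_n = C_p$), $V_{eq} = V_{dd}/2$ and $E_{FCR} = \tfrac{1}{2} E_{conv}$ under these idealized assumptions.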

A dynamic full adder is taken as an example, as shown in Fig. 2. When the input vector of the full adder is (1, 1, 0), (1, 0, 1), or (0, 1, 1), Nn has been discharged to ground and Np has been charged to Vdd at the end of the evaluation stage, and the switch is turned on. A charge recycling path between Nn and Np is then established.
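The three vectors listed above are exactly the inputs containing two ones, for which a full adder produces Sum = 0 and Carry = 1. Reading that output pattern as the condition for the recycling path is an interpretation, not something the paper states. The short sketch below simply enumerates the truth table to confirm the match; the names `full_adder` and `recycling_vectors` are illustrative.

```python
from itertools import product


def full_adder(a, b, cin):
    """Return (sum, carry_out) of a 1-bit full adder."""
    s = a ^ b ^ cin
    cout = (a & b) | (a & cin) | (b & cin)
    return s, cout


# Input vectors (a, b, cin) for which Sum = 0 and Carry = 1.
recycling_vectors = [
    v for v in product((0, 1), repeat=3) if full_adder(*v) == (0, 1)
]

if __name__ == "__main__":
    print(recycling_vectors)
```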

With the FCR cell, CLKB enables the recycling path during the precharge stage. Consequently, two sources, Vdd and Np, charge Nn simultaneously, which makes the precharge much faster than in the conventional circuit, where Vdd alone charges Nn.

The additional capacitance Cr between the dynamic nodes Nn and Np, introduced by the added charge recycling path, has a negligible effect on the evaluation speed. Accordingly, in the evaluation stage, the voltage waveforms of Nn and Np without and with the FCR cell almost overlap. A charge recycling path can exist between two independent gates as well as between two neighboring gates. If r p-type gates and q n-type gates are selected for a critical path, min(r, q) charge recycling paths can be inserted to reduce power.
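The min(r, q) bound follows because each recycling path needs one n-type gate and one p-type gate. As a minimal sketch, the helper below pairs the gates of a path in their order of appearance. That pairing policy and the name `pair_recycling_paths` are assumptions made for illustration; the paper does not specify how the pairs are chosen.

```python
def pair_recycling_paths(gate_types):
    """Pair n-type and p-type gates of a path for charge recycling.

    gate_types is a sequence of "n" / "p" labels, one per gate.
    Returns a list of (n_index, p_index) pairs. zip() stops at the
    shorter list, so the number of pairs is min(r, q).
    """
    n_gates = [i for i, t in enumerate(gate_types) if t == "n"]
    p_gates = [i for i, t in enumerate(gate_types) if t == "p"]
    return list(zip(n_gates, p_gates))


if __name__ == "__main__":
    # Three n-type gates and two p-type gates give two recycling paths.
    print(pair_recycling_paths(["n", "p", "n", "n", "p"]))
```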




#### **IV.EXPERIMENTAL RESULTS**

#### Verification of FCR:

To verify the effectiveness of FCR, a full adder with clock delay was designed, since such an adder is usually employed along the critical path of a data path. A full adder with FCR includes one n-type Sum cell, one p-type Carry cell (dynamic full adder), and one FCR cell to enhance power efficiency (Fig. 2).

Fig. 3 and Fig. 4 show the schematic designs of the adder circuit without and with FCR, respectively. The output waveforms show how the voltages at the dynamic nodes vary with respect to the CLK signal.



Fig.3. Schematic design of full adder circuit

Fig. 3 shows a dynamic full adder, which consumes less power than a conventional adder. The circuit has three inputs and two outputs. The full adders are connected as a ripple carry adder, in which the carry out of one adder is fed to the next adder and the final carry out is obtained at the last adder. This adder is implemented in the ALU to achieve high efficiency.
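The ripple carry arrangement described above can be sketched behaviorally as a chain of 1-bit full adders. This models only the logic function, not the dynamic circuit or its timing; the function names and the LSB-first bit ordering are choices made for this illustration.

```python
def full_adder(a, b, cin):
    """Return (sum, carry_out) of a 1-bit full adder."""
    s = a ^ b ^ cin
    cout = (a & b) | (a & cin) | (b & cin)
    return s, cout


def ripple_carry_add(x_bits, y_bits, cin=0):
    """Add two equal-length bit lists given LSB first.

    The carry out of each stage feeds the carry in of the next stage.
    Returns (sum_bits, carry_out) with sum_bits LSB first.
    """
    sum_bits = []
    carry = cin
    for a, b in zip(x_bits, y_bits):
        s, carry = full_adder(a, b, carry)
        sum_bits.append(s)
    return sum_bits, carry


if __name__ == "__main__":
    # 4-bit example: 11 + 6 = 17, i.e. sum bits 0001 with carry out 1.
    print(ripple_carry_add([1, 1, 0, 1], [0, 1, 1, 0]))
```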



Fig.4. Schematic design of full adder with FCR circuit




Fig. 4 shows the schematic of the adder circuit with the proposed technique. In this circuit, the FCR cell provides a charge recycling path between the two dynamic circuits. The FCR cell acts as a switch between the dynamic circuits, which leads to low switching activity and high efficiency of the circuit.



Fig.5. Waveforms of CLK and dynamic nodes of FCR cell

Fig. 5 shows the simulation results for the FCR cell. As the clock signal toggles, the dynamic nodes of the circuit change according to the inputs of the adder.

| Technology node | Avg. Power  | Max. Power | Timing |
|-----------------|-------------|------------|--------|
| 32nm            | 1.1243e-004 | 2.057e-004 | 8.08s  |
| 22nm            | 7.4344e-005 | 1.386e-005 | 1.50s  |

Table 1. Power and delay comparison

The simulation results are listed in Table 1, and the corresponding waveforms are shown in Fig. 5.
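To put the tabulated average power values in relative terms, the small calculation below computes the reduction from the 32nm entry to the 22nm entry. It uses only the two average power figures from Table 1; the table does not state units, so the values are treated as being in the same unspecified unit. This ratio compares technology nodes and is separate from the comparison against a conventional ALU quoted in the abstract. The helper name `relative_reduction` is illustrative.

```python
# Average power values copied from Table 1 (units as reported there).
avg_power = {"32nm": 1.1243e-4, "22nm": 7.4344e-5}


def relative_reduction(baseline, improved):
    """Fractional reduction of `improved` relative to `baseline`."""
    return (baseline - improved) / baseline


reduction = relative_reduction(avg_power["32nm"], avg_power["22nm"])

if __name__ == "__main__":
    # Roughly one third lower average power at 22nm than at 32nm.
    print(f"Average power reduction from 32nm to 22nm: {reduction:.1%}")
```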

#### **V.CONCLUSION**

A novel methodology is presented in this paper for designing dynamic circuits in the functional units of modern processors. The proposed PNS-FCR methodology achieves high power efficiency while satisfying specific timing constraints. Simulation results show the power consumption of the adder in different submicron technologies. The proposed adder consumes less power than the conventional adder while achieving high performance. The proposed technique is implemented in an ALU, which consumes less power and achieves higher efficiency than the conventional ALU. This methodology can be extended to static CMOS, pass gate, transmission gate, tristate gate, and other logic families.

#### REFERENCES

- [1] H. McIntyre et al., "Design of the two-core x86–64 AMD 'Bulldozer' module in 32 nm SOI CMOS," IEEE J. Solid-State Circuits, vol. 47, no. 1, pp. 164–176, Jan. 2012.
- [2] M. Golden, S. Arekapudi, and J. Vinh, "40-entry unified out-of-order scheduler and integer execution unit for the AMD Bulldozer x86–64 core," in IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers, Feb. 2011, pp. 80–82.
- [3] R. Riedlinger et al., "A 32 nm, 3.1 billion transistor, 12 wide issue Itanium processor for mission-critical servers," IEEE J. Solid-State Circuits, vol. 47, no. 1, pp. 177–193, Jan. 2012.
- [4] K. Limniotis, Y. Tsiatouhas, T. Haniotakis, and A. Arapoyanni, "A design technique for energy reduction in NORA CMOS logic," IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 53, no. 12, pp. 2647–2655, Dec. 2006.




- [5] C. Gopalakrishnan and S. Katkoori, "KnapBind: An area-efficient binding algorithm for low-leakage datapaths," in Proc. 21st Int. Conf. Comput. Design, Oct. 2003, pp. 430–435.
- [6] M. Nemani and V. Tiwari, "Macro-driven circuit design methodology for high-performance datapaths," in Proc. ACM/IEEE Design Autom. Conf., Jun. 2003, pp. 661–666.
- [7] Y.-Y. Liu and T. Hwang, "Crosstalk-aware domino-logic synthesis," IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., vol. 26, no. 6, pp. 1115–1161, Jun. 2007.
- [8] T. J. Thorp, G. S. Yee, and C. M. Sechen, "Design and synthesis of dynamic circuits," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 11, no. 1, pp. 141–149, Feb. 2003.
- [9] H. Kellerer, U. Pferschy, and D. Pisinger, Knapsack Problems. Berlin, Germany: Springer-Verlag, 2004.
- [10] J. Wang, N. Gong, S. Geng, L. Hou, W. Wu, and L. Dong, "Low power and high performance Zipper domino circuits with charge recycle path," in Proc. IEEE 9th Int. Conf. Solid-State Integr.-Circuit Technol., Oct. 2008, pp. 2172–2175.