Open Access

Comparative Analysis of Heterogeneous Adders: Evaluating Performance across 12-bit, 14-bit, and 16-bit Configurations

Feb 20, 2025


Introduction

Data and signal processors, along with controller-based systems, heavily depend on data manipulation for their functionality. Arithmetic operations, including addition, subtraction, multiplication, and division, are fundamental and widely utilized in various digital systems. These operations play a crucial role in architectures such as DSPs, microprocessors, microcontrollers, and data processing units. Among these operations, adders stand out as indispensable components in arithmetic circuits, particularly in processor-based systems [1].

To mitigate challenges in digital design, low-power and area-efficient design techniques are essential. Area-efficient designs result in smaller chips, which lowers cost and device weight and makes electronic devices more portable. Lowering the average power consumption improves circuit reliability and reduces cooling requirements, which in turn decreases packaging and cooling costs [2].

Reference [3] demonstrates that the section-carry based carry look ahead adder (SCBCLA) design exhibits better optimization compared to the conventional carry look ahead adder (CCLA) design. Achieving superior optimization in design metrics is facilitated by the use of a heterogeneous Carry Look-Ahead (CLA) architecture when compared to a homogeneous CLA architecture [4]. The study in [5] focuses on the flexibility of reconfigurable design, exploring various combinations of adder variants to achieve low power consumption and a small area for enhanced design performance.

In this paper, several heterogeneous adder architectures are designed and compared on the basis of area and power. The comparison uses simulation results and implementations of the designs in VIVADO 2017.1. The heterogeneous adders are described in VHDL [6], a hardware description language that allows different abstraction levels to be combined within the same model and therefore offers a versatile design flow for heterogeneous adder structures.

The paper is organized as follows: Section II gives an overview of related work, providing context and insights from existing research. Section III presents a concise explanation of the proposed model. Section IV details the performance evaluation, including the assessment criteria and results. Lastly, Section V concludes the paper and summarizes the key findings and implications of the study.

Related Work
RCA: Homogeneous Adder

A ripple carry adder (RCA) is built by connecting full adders in sequence, so that the carry output of one full adder serves as the carry input of the next stage. The full adder is the fundamental building block, and an n-bit parallel adder therefore requires n full adders. The architecture is called an RCA because the carry bits ripple sequentially through each successive full adder. An RCA is quick to lay out, but its operating speed is comparatively modest: each full adder must wait for the carry bit from the preceding full adder, which makes the computation sequential and potentially slow [7].

In Fig. 1, the n-bit inputs are denoted as A and B, C0 represents the carry input, S signifies the n-bit outputs, and Cn represents the carry output. The intermediate carry signals, labeled as C1, C2, up to Cn-1 in the diagram, are integral components referred to as signals in VHDL code. These intermediate carry signals play a vital role in the internal workings of the adder, facilitating the accurate calculation of the carry output Cn based on the input and intermediate carry values.

Figure 1. n-bit RCA

The primary drawback of the RCA is that its delay increases as the bit length grows, which makes it less suitable for adding wide operands. The principal contributor to this delay is carry propagation, so the carry delay from input to output must be computed for effective performance evaluation. For an n-bit RCA, the carry delay $T_c$ can be expressed as

$$T_c = T_{FA}\big((A_0, B_0) \to C_0\big) + (n-2) \times T_{FA}\big(C_{in} \to C_{out}\big) + T_{FA}\big(C_{in} \to S_{out}(n-1)\big) \qquad (1)$$

Here, TFA denotes the delay of a full adder along the pathway from its designated input to its output [8].
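The ripple behaviour described above can be sketched as a small bit-level model. This is only an illustrative Python sketch, not the VHDL used in the paper; the helper names (`full_adder`, `ripple_carry_add`, `to_bits`, `from_bits`) are hypothetical, and bit lists are assumed to be least-significant-bit first.

```python
def full_adder(a, b, cin):
    """One-bit full adder: returns (sum, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


def ripple_carry_add(a_bits, b_bits, c0=0):
    """Add two equal-length bit lists (LSB first) by chaining full adders.

    Each stage consumes the carry produced by the previous stage, which is
    why the carry delay grows linearly with the number of bits.
    """
    carry = c0
    sum_bits = []
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)
        sum_bits.append(s)
    return sum_bits, carry  # carry is Cn, the final carry output


def to_bits(value, n):
    """Split an integer into n bits, LSB first."""
    return [(value >> i) & 1 for i in range(n)]


def from_bits(bits):
    """Rebuild an integer from an LSB-first bit list."""
    return sum(bit << i for i, bit in enumerate(bits))


s_bits, c_out = ripple_carry_add(to_bits(0x74B, 12), to_bits(0xCA9, 12), c0=1)
print(hex(from_bits(s_bits)), c_out)
```

The loop makes the dependency explicit: stage i cannot produce its outputs until `carry` from stage i-1 is available.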

CLA: Homogeneous Adder

A carry look-ahead adder (CLA) improves operating speed by reducing the time required to determine the carry bits. In contrast to an RCA, where each bit must wait for the previous carry to be calculated, the CLA computes one or more carry bits earlier than the sum, thereby reducing the waiting time for the higher-order bits [9–13]. Understanding the CLA involves manipulating the Boolean expressions of a full adder [14–16]. The key signals in a full adder, the propagate $P_i$ and the generate $G_i$, are given in equation (2):

$$\begin{aligned} P_i &= x_{in} \oplus y_{in} && \text{(carry propagate)} \\ G_i &= x_{in} \cdot y_{in} && \text{(carry generate)} \end{aligned} \qquad (2)$$

Both the propagate and generate signals depend solely on the input bits, so they are valid after a single gate delay. Equation (3) gives the resulting expressions for the sum output and the carry output of the CLA [17,18]:

$$\begin{aligned} \text{Sum}_{out} &= S_i = P_i \oplus C_i \\ \text{Carry}_{out} &= C_{i+1} = G_i + P_i \cdot C_i \end{aligned} \qquad (3)$$

The equations illustrate two scenarios in which a carry signal will be produced:

If both input bits, xin and yin, are 1.

If either xin or yin is 1, and simultaneously, the carry-in signal is also 1.

Applying the above equations to a 4-bit adder gives the following expressions [17]:

$$\begin{aligned} C_1 &= G_0 + P_0 C_0 \\ C_2 &= G_1 + P_1 C_1 = G_1 + P_1 G_0 + P_1 P_0 C_0 \\ C_3 &= G_2 + P_2 C_2 = G_2 + P_2 G_1 + P_2 P_1 G_0 + P_2 P_1 P_0 C_0 \\ C_4 &= G_3 + P_3 C_3 = G_3 + P_3 G_2 + P_3 P_2 G_1 + P_3 P_2 P_1 G_0 + P_3 P_2 P_1 P_0 C_0 \end{aligned} \qquad (4)$$

Similarly, the general expression is given in equation (5):

$$C_{i+1} = G_i + P_i G_{i-1} + P_i P_{i-1} G_{i-2} + \ldots + P_i P_{i-1} \cdots P_2 P_1 G_0 + P_i P_{i-1} \cdots P_1 P_0 C_0 \qquad (5)$$
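The expanded carry expression can be checked with a short Python sketch. It evaluates every carry directly from the generate and propagate signals and the initial carry, without reusing the previous carry, and forms each sum bit as the propagate signal XORed with the carry into that bit. The function names are hypothetical and the code is an illustration under these assumptions, not the paper's VHDL.

```python
def to_bits(value, n):
    """Split an integer into n bits, LSB first."""
    return [(value >> i) & 1 for i in range(n)]


def cla_carries(x_bits, y_bits, c0=0):
    """Return (P, [C0, C1, ..., Cn]) using the expanded look-ahead form."""
    p = [x ^ y for x, y in zip(x_bits, y_bits)]  # carry propagate
    g = [x & y for x, y in zip(x_bits, y_bits)]  # carry generate
    carries = [c0]
    for i in range(len(p)):
        # C(i+1) = G_i + P_i*G_(i-1) + ... + P_i*...*P_1*G_0 + P_i*...*P_0*C_0
        c_next = g[i]
        chain = 1  # running product P_i * P_(i-1) * ... * P_j
        for j in range(i, -1, -1):
            chain &= p[j]
            lower = g[j - 1] if j > 0 else c0
            c_next |= chain & lower
        carries.append(c_next)
    return p, carries


def cla_add(x_bits, y_bits, c0=0):
    """Sum bits are P_i xor C_i; the carry out is the last look-ahead carry."""
    p, carries = cla_carries(x_bits, y_bits, c0)
    sums = [p_i ^ c_i for p_i, c_i in zip(p, carries)]
    return sums, carries[-1]


sums, c4 = cla_add(to_bits(0b1011, 4), to_bits(0b0110, 4), c0=1)
print(sums, c4)
```

In hardware the inner loop corresponds to one wide AND-OR network per carry, so all carries are available in parallel once the propagate and generate signals are valid.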

Fig. 2 illustrates how the structure of a CLA can be segmented into three main components: the propagate/generate generator, the sum generator, and the carry generator.

Figure 2. n-bit CLA

Proposed Model

In this study, 12-bit, 14-bit, and 16-bit heterogeneous adders are built from different combinations of m-bit CLAs and n-bit RCAs, as outlined in Table 1. The design and evaluation are carried out in VIVADO 2017.1, covering simulation, RTL synthesis, and implementation, to generate power summary and utilization summary reports [19–22]. The growing need for low-power components in diverse computing applications, as highlighted in [19], emphasizes flexibility and portability. This paper implements the heterogeneous adder designs with a focus on identifying the design with the lowest power consumption. Utilization summary reports are also generated to compare area and to determine the design with the least area utilization. The internal structure of the proposed model is depicted in Fig. 3.

Figure 3. Internal structure of the proposed model
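A behavioural sketch of such a heterogeneous adder is given below in Python. It rests on an assumption that the text does not state explicitly: the n-bit RCA handles the lower bits and its carry output feeds the m-bit CLA that handles the upper bits. All function names are hypothetical. The `combinations` helper enumerates the CLA/RCA splits considered in the paper (CLA widths from 4 bits upward in steps of 2).

```python
def rca_block(a, b, cin, width):
    """Ripple-carry block: full adders chained bit by bit."""
    total, carry = 0, cin
    for i in range(width):
        x, y = (a >> i) & 1, (b >> i) & 1
        total |= (x ^ y ^ carry) << i
        carry = (x & y) | (carry & (x ^ y))
    return total, carry


def cla_block(a, b, cin, width):
    """Look-ahead block modelled with the generate/propagate recurrence.

    This is functionally equivalent to the flattened look-ahead logic;
    real hardware would expand the recurrence into parallel AND-OR terms.
    """
    total, carries = 0, [cin]
    for i in range(width):
        x, y = (a >> i) & 1, (b >> i) & 1
        p, g = x ^ y, x & y
        total |= (p ^ carries[i]) << i
        carries.append(g | (p & carries[i]))
    return total, carries[-1]


def hetero_add(a, b, cin, m_cla, n_rca):
    """Assumed arrangement: n-bit RCA on the low bits, m-bit CLA on the high bits."""
    mask = (1 << n_rca) - 1
    low, c_mid = rca_block(a & mask, b & mask, cin, n_rca)
    high, c_out = cla_block(a >> n_rca, b >> n_rca, c_mid, m_cla)
    return (high << n_rca) | low, c_out


def combinations(total_bits):
    """(m, n) pairs: m-bit CLA from 4 up to total_bits - 2 in steps of 2."""
    return [(m, total_bits - m) for m in range(4, total_bits - 1, 2)]


for m, n in combinations(12):
    print(m, n, hetero_add(0x74B, 0xCA9, 1, m, n))
```

Every split produces the same arithmetic result; the splits differ only in how the logic maps to hardware, which is what the power and area comparison measures.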

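Table 1. Possible combinations

12-bit

| m-bit CLA | n-bit RCA |
|---|---|
| 4 | 8 |
| 6 | 6 |
| 8 | 4 |
| 10 | 2 |

14-bit

| m-bit CLA | n-bit RCA |
|---|---|
| 4 | 10 |
| 6 | 8 |
| 8 | 6 |
| 10 | 4 |
| 12 | 2 |

16-bit

| m-bit CLA | n-bit RCA |
|---|---|
| 4 | 12 |
| 6 | 10 |
| 8 | 8 |
| 10 | 6 |
| 12 | 4 |
| 14 | 2 |

Performance Evaluation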

The primary challenge for any VLSI designer lies in creating systems that operate at higher speeds while consuming minimal power and occupying minimal physical area. Therefore, accurate performance estimation becomes indispensable in the design process. This involves predicting and optimizing factors such as speed, power consumption, and area utilization to achieve an optimal balance and meet the specific requirements of the given application or system.

Power Dissipation

Static CMOS gates are known for their low power dissipation when idle. In static CMOS, the total power dissipation has two components, static and dynamic:

$$P_{Total} = P_{static} + P_{dynamic} \qquad (6)$$

Static dissipation: $P_{static} = I_{static} V_{DD}$

Dynamic dissipation: $P_{dynamic} = \alpha C V_{DD}^{2} f$

Here, the activity factor is $\alpha = 1$ for a clock signal, since it rises and falls in every cycle. In the VIVADO tool, the dynamic power is reported as

$$P_{dynamic} = P_{Signals} + P_{Logic} + P_{IO} \qquad (7)$$
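As a worked illustration of these relations, the Python sketch below sums the three dynamic components and infers the static share as the difference from the total. The numbers are the values reported for the 4RCA+8CLA row of the 12-bit power table later in the paper; the function names are hypothetical, and the inputs to `dynamic_power` in the last line are made-up illustrative values, not measurements.

```python
def dynamic_power(alpha, capacitance_f, vdd_v, freq_hz):
    """Switching power: alpha * C * VDD^2 * f, in watts."""
    return alpha * capacitance_f * vdd_v ** 2 * freq_hz


def power_breakdown(total_w, signals_w, logic_w, io_w):
    """Return (dynamic, static) power in watts, rounded to 3 decimals.

    dynamic = signals + logic + IO, and static = total - dynamic.
    """
    dynamic = signals_w + logic_w + io_w
    static = total_w - dynamic
    return round(dynamic, 3), round(static, 3)


# 4RCA+8CLA, 12-bit: total 8.141 W, signals 0.279 W, logic 0.074 W, IO 7.686 W
print(power_breakdown(8.141, 0.279, 0.074, 7.686))  # (8.039, 0.102)

# Hypothetical node: alpha = 1, C = 1 pF, VDD = 1 V, f = 100 MHz
print(dynamic_power(1.0, 1e-12, 1.0, 100e6))
```

The breakdown shows that IO power dominates the reported dynamic power in these designs, so differences between adder combinations appear mainly in the third decimal place of the signal and logic terms.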

Delay Estimation

In digital design, critical paths are the paths within a circuit that impose the tightest constraints on overall timing performance. These paths have the longest propagation delays and therefore determine the maximum achievable operating frequency of the circuit. For an N-bit RCA,

$$t_{Ripple} = t_{pg} + (N-1)\,t_{AO} + t_{xor} \qquad (8)$$

where $t_{pg}$ is the delay of the 1-bit generate/propagate gates, $t_{AO}$ is the AND-OR gate delay, and $t_{xor}$ is the delay of the final sum XOR.

The delay of a CLA depends on how it is organized into k groups of n bits each. The delay includes the calculation of the carry-out of each group and can be expressed as in equation (9) [23–25]:

$$t_{CLA} = t_{PG} + t_{PG(n)} + \left[(n-1) + (k-1)\right] t_{AO} + t_{xor} \qquad (9)$$

where $t_{PG(n)}$ is the delay of the AND-OR gate chain that computes the generate signal of an n-bit group.
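The two delay models can be compared numerically with a short Python sketch. The gate delays used here are hypothetical unit values chosen only for illustration, and the CLA is assumed to be organized as k groups of n bits each; none of these figures come from the paper's measurements.

```python
def ripple_delay(n_bits, t_pg, t_ao, t_xor):
    """Critical-path delay of an N-bit ripple carry adder."""
    return t_pg + (n_bits - 1) * t_ao + t_xor


def cla_delay(k_groups, n_per_group, t_pg, t_pg_n, t_ao, t_xor):
    """Critical-path delay of a CLA with k groups of n bits each."""
    return t_pg + t_pg_n + ((n_per_group - 1) + (k_groups - 1)) * t_ao + t_xor


# Hypothetical unit delays (arbitrary time units), not taken from the paper
T_PG, T_PG_N, T_AO, T_XOR = 1, 3, 2, 1

print(ripple_delay(16, T_PG, T_AO, T_XOR))          # 32
print(cla_delay(4, 4, T_PG, T_PG_N, T_AO, T_XOR))   # 17
```

With these assumed delays, a 16-bit adder built as four 4-bit look-ahead groups has roughly half the critical-path delay of a 16-bit ripple chain, which is the trade-off against the extra look-ahead logic.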

12-bit Adder Presentation
$$\begin{array}{lr} A & 011101001011 \\ B & 110010101001 \\ C_{in} & +\;1 \\ \hline \text{Sum} & 1001111110101 \end{array}$$

The different combinations of the 12-bit heterogeneous adder are simulated and synthesized using the VIVADO tool, and the results are tabulated.
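The 12-bit test vector can be cross-checked with ordinary integer arithmetic. This short Python check is independent of any adder architecture and only confirms the expected 13-bit result (12 sum bits plus the carry out).

```python
a = int("011101001011", 2)
b = int("110010101001", 2)
cin = 1

total = a + b + cin
sum_bits = format(total, "013b")  # 12 sum bits plus the carry-out bit

print(sum_bits)  # 1001111110101
```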

14-bit Adder Presentation
$$\begin{array}{lr} A & \text{3B3D} \\ B & \text{CB3A} \\ C_{in} & +\;1 \\ \hline \text{Sum} & \text{1287A} \end{array}$$

The different combinations of the 14-bit heterogeneous adder are simulated and synthesized using the VIVADO tool, and the results are tabulated.

16-bit Adder Presentation
$$\begin{array}{lr} A & \text{B790} \\ B & \text{8763} \\ C_{in} & +\;1 \\ \hline \text{Sum} & \text{13EF4} \end{array}$$

The different combinations of the 16-bit heterogeneous adder are simulated and synthesized using the VIVADO tool, and the results are tabulated.

Tabulation and Comparison

The power and area analyses of the 12-bit adder for the different combinations are given in Table 2 and Table 3, respectively.

Table 2. Power calculation of 12-bit adder

| Type | Total power (W) | Signal power PS (W) | Logic power PL (W) | IO power PI (W) | Dynamic power PD = PS + PL + PI (W) |
|---|---|---|---|---|---|
| 12CLA | 8.279 | 0.292 | 0.084 | 7.801 | 8.177 |
| 2RCA+10CLA | 8.161 | 0.278 | 0.080 | 7.701 | 8.059 |
| 4RCA+8CLA | 8.141 | 0.279 | 0.074 | 7.686 | 8.039 |
| 6RCA+6CLA | 8.145 | 0.282 | 0.078 | 7.683 | 8.043 |
| 8RCA+4CLA | 8.145 | 0.283 | 0.077 | 7.683 | 8.043 |
| 12RCA | 8.280 | 0.293 | 0.084 | 7.801 | 8.178 |

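Table 3. Area analysis of 12-bit adder

| Type | LUTs (of 41000) | Slices (of 10250) | Cells | Nets |
|---|---|---|---|---|
| 12CLA | 12 | 4 | 29 | 97 |
| 2RCA+10CLA | 11 | 5 | 61 | 48 |
| 4RCA+8CLA | 10 | 5 | 46 | 80 |
| 6RCA+6CLA | 10 | 4 | 38 | 72 |
| 8RCA+4CLA | 10 | 5 | 30 | 64 |
| 12RCA | 12 | 4 | 12 | 49 |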

From Table 2, the heterogeneous adder built from a 4-bit RCA and an 8-bit CLA has the lowest total on-chip power consumption, 8.141 W. In Table 3, the heterogeneous adder with a 6-bit CLA and a 6-bit RCA uses the fewest slices among the heterogeneous combinations and hence has the least area utilization.
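The selection described above amounts to taking a minimum over the heterogeneous rows of each table. A minimal Python sketch, using the 12-bit total power and slice counts reported in the tables (the dictionary names are hypothetical):

```python
# Total on-chip power (W) of the heterogeneous 12-bit combinations
power_12bit = {
    "2RCA+10CLA": 8.161,
    "4RCA+8CLA": 8.141,
    "6RCA+6CLA": 8.145,
    "8RCA+4CLA": 8.145,
}

# Slice counts of the same combinations
slices_12bit = {
    "2RCA+10CLA": 5,
    "4RCA+8CLA": 5,
    "6RCA+6CLA": 4,
    "8RCA+4CLA": 5,
}

best_power = min(power_12bit, key=power_12bit.get)
best_area = min(slices_12bit, key=slices_12bit.get)

print(best_power, power_12bit[best_power])  # 4RCA+8CLA 8.141
print(best_area, slices_12bit[best_area])   # 6RCA+6CLA 4
```

The same two-line selection applies to the 14-bit and 16-bit tables once their rows are entered.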

The power and area analyses of the 14-bit adder for the different combinations are tabulated in Table 4 and Table 5, respectively.

Table 4. Power calculation of 14-bit adder

| Type | Total power (W) | Signal power PS (W) | Logic power PL (W) | IO power PI (W) | Dynamic power PD = PS + PL + PI (W) |
|---|---|---|---|---|---|
| 14CLA | 9.865 | 0.523 | 0.087 | 9.146 | 9.756 |
| 2RCA+12CLA | 9.510 | 0.352 | 0.092 | 8.959 | 9.403 |
| 6RCA+8CLA | 9.489 | 0.349 | 0.092 | 8.942 | 9.383 |
| 8RCA+6CLA | 9.483 | 0.345 | 0.089 | 8.942 | 9.376 |
| 10RCA+4CLA | 9.482 | 0.345 | 0.089 | 8.941 | 9.375 |
| 4RCA+10CLA | 9.495 | 0.341 | 0.086 | 8.944 | 9.371 |
| 14RCA | 9.623 | 0.360 | 0.096 | 9.059 | 9.515 |

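Table 5. Area analysis of 14-bit adder

| Type | LUTs (of 41000) | Slices (of 10250) | Cells | Nets |
|---|---|---|---|---|
| 14CLA | 14 | 7 | 48 | 128 |
| 2RCA+12CLA | 13 | 6 | 64 | 104 |
| 6RCA+8CLA | 12 | 5 | 48 | 88 |
| 8RCA+6CLA | 12 | 6 | 40 | 80 |
| 10RCA+4CLA | 12 | 6 | 32 | 72 |
| 4RCA+10CLA | 12 | 6 | 71 | 111 |
| 14RCA | 14 | 5 | 14 | 57 |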

From Table 4, the heterogeneous adder built from a 10-bit RCA and a 4-bit CLA has the lowest total on-chip power consumption, 9.482 W. In Table 5, the heterogeneous adder with a 6-bit RCA and an 8-bit CLA uses the fewest slices among the heterogeneous combinations and hence has the least area utilization.

The power and area analyses of the 16-bit adder for the different combinations are given in Table 6 and Table 7, respectively.

Table 6. Power calculation of 16-bit adder

| Type | Total power (W) | Signal power PS (W) | Logic power PL (W) | IO power PI (W) | Dynamic power PD = PS + PL + PI (W) |
|---|---|---|---|---|---|
| 16CLA | 10.971 | 0.422 | 0.118 | 10.319 | 10.859 |
| 2RCA+14CLA | 10.848 | 0.410 | 0.108 | 10.218 | 10.736 |
| 4RCA+12CLA | 10.827 | 0.408 | 0.104 | 10.202 | 10.714 |
| 6RCA+10CLA | 10.830 | 0.410 | 0.108 | 10.200 | 10.718 |
| 8RCA+8CLA | 10.833 | 0.415 | 0.106 | 10.200 | 10.721 |
| 10RCA+6CLA | 10.828 | 0.412 | 0.105 | 10.200 | 10.717 |
| 12RCA+4CLA | 10.832 | 0.416 | 0.104 | 10.199 | 10.719 |
| 16RCA | 10.962 | 0.418 | 0.114 | 10.316 | 10.848 |

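Table 7. Area analysis of 16-bit adder

| Type | LUTs (of 41000) | Slices (of 10250) | Cells | Nets |
|---|---|---|---|---|
| 16CLA | 16 | 7 | 37 | 129 |
| 2RCA+14CLA | 15 | 7 | 74 | 120 |
| 4RCA+12CLA | 14 | 7 | 83 | 129 |
| 6RCA+10CLA | 14 | 6 | 58 | 104 |
| 8RCA+8CLA | 14 | 7 | 50 | 96 |
| 10RCA+6CLA | 14 | 7 | 42 | 88 |
| 12RCA+4CLA | 14 | 7 | 34 | 80 |
| 16RCA | 16 | 6 | 16 | 65 |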

From Table 6, it is observed that the 4-bit RCA + 12-bit CLA heterogeneous adder architecture design has the least total on-chip power consumption of 10.827 W.

In Table 7, the heterogeneous adder (6-bit RCA + 10-bit CLA) has the fewest slices, hence the least area utilization. Finally, Tables 8 and 9 summarize the best combinations for power and area analysis of 12-bit, 14-bit, and 16-bit heterogeneous adders, respectively.

Table 8. Best power analysis of 12-bit, 14-bit, and 16-bit heterogeneous adders

| Type | Combination | Total power (W) |
|---|---|---|
| 12-bit | 4RCA+8CLA | 8.141 |
| 14-bit | 10RCA+4CLA | 9.482 |
| 16-bit | 4RCA+12CLA | 10.827 |

Table 9. Best area analysis of 12-bit, 14-bit, and 16-bit heterogeneous adders

| Type | Combination | Slices |
|---|---|---|
| 12-bit | 6RCA+6CLA | 4 |
| 14-bit | 6RCA+8CLA | 5 |
| 16-bit | 6RCA+10CLA | 6 |

Conclusion

This paper presents the design and analysis of 12-bit, 14-bit, and 16-bit heterogeneous adders built from several combinations of m-bit CLA and n-bit RCA blocks. The best model for power is the one with the lowest total on-chip power, and the best model for area is the one with the fewest slices. For the 12-bit adders, the 8-bit CLA + 4-bit RCA combination has the lowest total on-chip power, and the 6-bit CLA + 6-bit RCA combination has the least area utilization. For the 14-bit adders, the 4-bit CLA + 10-bit RCA combination has the lowest total on-chip power, and the 8-bit CLA + 6-bit RCA combination has the least area utilization. For the 16-bit adders, the 12-bit CLA + 4-bit RCA combination has the lowest total on-chip power, and the 10-bit CLA + 6-bit RCA combination has the least area utilization. These results show how the power and area efficiency of heterogeneous adder architectures vary with bit width and support the selection of a design according to specific performance criteria. The proposed heterogeneous adders, with an appropriate combination of RCA and CLA, can therefore be applied in advanced digital systems that require high speed, small area, and low power.