

# DESIGNING MULTIPLIER OF FPGA USING LOW POWER TECHNIQUES

# Dr. Lokesh Kumar<sup>1</sup>, Anil Kumar<sup>2</sup>, Mr. Jitendra Singh<sup>3</sup>

<sup>1</sup>Assistant Professor, Usha Martin University, Ranchi, Jharkhand.

<sup>2</sup>Assistant Professor, Department of Electrical and Electronics Engineering, Mangalayatan University, Beswan, Uttar Pradesh.

<sup>3</sup>Assistant Professor, Department of Electrical and Electronics Engineering, Mangalayatan University, Beswan, Uttar Pradesh.

**Article Info**

Volume 83

Page Number: 39-43

Publication Issue: September/October 2020

**Article History**

Article Received: 4 June 2020

Revised: 18 July 2020

Accepted: 20 August 2020

**Publication**: 15 September 2020

#### **Abstract**

Approximate computing can be useful for error-tolerant, data-intensive applications such as signal and image processing, computer vision, and artificial intelligence. Approximate computing circuits are being studied as a way to reduce the effort required to process data. This paper shows that approximate multipliers based on partial-product truncation can be implemented in FPGAs. The performance of the proposed multiplier is compared with that of an approximate multiplier built on exact computation in terms of power consumption, accuracy, and delay. The approximate design is energy-efficient while retaining a high degree of accuracy. Comparison with the conventional direct truncation method shows how much the proposed model improves performance, and the approximate multiplier design proved to be the more energy-efficient of the two.

**Key Words:** Power consumption, power estimation, FPGA, high-level power estimation, ASIC, power modeling, tools.

## Introduction

Nowadays every circuit faces the problem of power consumption, and for portable devices battery life is the primary goal. Multiplication is a crucial arithmetic operation, and the execution time of a DSP system is dominated by the multiplication process. As a result, power is now given equal weight to other factors such as area and speed. As feature sizes shrink and chip density and operating frequency increase, power consumption becomes a growing concern. Low-power multipliers have been extensively researched at the technological, physical, circuit, and logic levels. These low-level techniques are not specific to multiplier modules; other types of modules can use them as well. Furthermore, data switching patterns have a direct impact on power consumption. However, low-level power optimization does not allow application-specific data properties to be taken into account. This work therefore concentrates on algorithm development using existing hardware, with digital computer arithmetic for logic design as a focus.

## **Literature Review**

**STEFANIA PERRI, FANNY SPAGNOLO, FABIO FRUSTACI, PASQUALE CORSONELLO (2020)** This paper presents a novel method for developing efficient power-of-two multipliers using existing FPGAs. Fixed-point power-of-two multiplications may be used to reduce the computational complexity of computationally intensive applications such as computer vision, deep learning, and many others. Modern FPGA devices provide faster IP cores built on embedded modules such as DSP blocks, and area-optimized IP cores built on look-up tables and flip-flops. Because of their restricted availability or low operating frequency, such IP cores cannot fully exploit an FPGA device's total processing capabilities.

**SALIM ULLAH (2020)** Multiplication is a common arithmetic operation in many programs, including multimedia processing and artificial neural networks. The use of multipliers in these applications affects energy, critical path latency, and resource usage. FPGA-based systems are particularly exposed to these problems. Because of these constraints, this letter presents an area-optimized, low-latency, and energy-efficient architecture for an accurate signed multiplier. Compared with the Vivado area-optimized multiplier IP, the proposed implementations are reported to use up to 40% less area and to reduce latency and energy by 70.0 percent.

**YEHYA NASSER, JORDANE LORANDEL (2020)** Power consumption is a major issue for electronic circuits. One way to approach this problem is to consider it early in the design phase so that several design options can be explored. The standard design flow requires that suitable models be available at the beginning of the process. Power modeling approaches can be used to find a correlation between power and other metrics. It is equally essential to use efficient power measurement techniques. This survey provides an overview of RTL to transistor-level power modelling and estimation methods for FPGAs and ASICs.

**GAURAV VERMA (2017)** This study presents power-saving strategies for some communication-centric architectures aimed at FPGAs. Most of the strategies described in the literature are employed only at the device level. XPower Analyzer can be used as a CAD tool to help reduce power consumption at the architectural level of the design hierarchy. The arithmetic and logic unit (ALU) circuits of portable wireless devices are examined using analytical methods. Xilinx ISE has been used to verify and implement the circuit on a Spartan-3E FPGA. According to the findings, power consumption decreased significantly.

## **Proposed system**

Fig. 1 shows the flow of the proposed system. The two 16-bit inputs are converted to fractional parts by locating the leading one bit of each operand, and each fractional part is approximated to 4 bits. The product of the 4-bit approximations is added to the other approximated terms, and the final outcome is obtained by shifting this sum.



Fig. 1. Flow chart of the proposed system.

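Every integer $N$ can be denoted by

$$N = 2^{k}\left(1 + \sum_{i=0}^{k-1} 2^{i-k} x_i\right). \tag{1}$$

Here $k$ represents the position of the leading one bit and $x_i$ represents the $i$<sup>th</sup> bit of $N$. The product of two operands $P$ and $Q$ can then be written as

$$P \times Q = 2^{k_P + k_Q} \times X_P \times X_Q. \tag{2}$$

$X_P$ and $X_Q$ are the bracketed factors of (1) for $P$ and $Q$, each of the form $1 + Y$ with a fractional part $0 \le Y < 1$. The approximate value is calculated from the fractional parts $Y_P$ and $Y_Q$ of $X_P$ and $X_Q$:

$$P \times Q = 2^{k_P + k_Q} \times (1 + Y_P + Y_Q + Y_P \times Y_Q). \tag{3}$$

Here $k_P$ is the leading one-bit position of $P$ and $k_Q$ is the leading one-bit position of $Q$.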
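The decomposition can be checked numerically. The following Python sketch splits an integer into its leading-one position and fractional part and confirms that the exact product equals $2^{k_P+k_Q}(1 + Y_P + Y_Q + Y_P \times Y_Q)$. The helper names `decompose` and `product_from_fractions` are illustrative and do not come from the paper; exact rational arithmetic from the standard `fractions` module is assumed so that no rounding enters the check.

```python
from fractions import Fraction


def decompose(n: int):
    """Split a positive integer into (k, Y) with n = 2**k * (1 + Y), 0 <= Y < 1."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    k = n.bit_length() - 1           # position of the leading one bit
    y = Fraction(n, 1 << k) - 1      # fractional part after the leading one
    return k, y


def product_from_fractions(p: int, q: int) -> Fraction:
    """Evaluate 2**(kP + kQ) * (1 + YP + YQ + YP*YQ) exactly."""
    kp, yp = decompose(p)
    kq, yq = decompose(q)
    return (1 << (kp + kq)) * (1 + yp + yq + yp * yq)


if __name__ == "__main__":
    p, q = 533, 6716                 # the two 16-bit operands used later in this section
    print(decompose(p), decompose(q))
    print(product_from_fractions(p, q) == p * q)
```

Because the identity is exact, any error in the final multiplier comes only from truncating $Y_P$ and $Y_Q$, not from the decomposition itself.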



Fig. 2. Dot diagram assuming t = 7 and h = 3.

Fig. 2 shows the dot diagram of the term $1 + (Y_P)_t + (Y_Q)_t + (Y_P)_{APX} \times (Y_Q)_{APX}$, where t = 7 and h = 3.

Now, the approximation of (3) may be expressed as

$$P \times Q \approx 2^{k_P + k_Q} \times \left(1 + Y_P + Y_Q + (Y_P)_{APX} \times (Y_Q)_{APX}\right).$$

To increase the speed, $Y_P$ and $Y_Q$ are reduced to $t$ bits:

$$P \times Q \approx (P \times Q)_{APX} = 2^{k_P + k_Q} \times \left(1 + (Y_P)_t + (Y_Q)_t + (Y_P)_{APX} \times (Y_Q)_{APX}\right).$$

The example that follows multiplies a 16-bit X by a 16-bit Y. The input parameters are (h, t), where h is the height and t is the fraction part.



Ex: 1011\_1000\_001 can be represented as 1.011\_1000\_001 × 2<sup>10</sup>, where the digits after the binary point (shown in green in the original) are the fraction part.

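X = 0000\_0010\_0001\_0101 and Y = 0001\_1010\_0011\_1100, with bit weights $2^{15}, 2^{14}, 2^{13}, \ldots, 2^{1}, 2^{0}$.

Find the first '1' from the MSB of each operand and note its bit position. For X the first '1' is at $2^{9}$, so $k_P = 9$. For Y the first '1' is at $2^{12}$, so $k_Q = 12$.

The bits that follow the leading one are the fractional parts. Assume t = 7 and h = 3. Keeping the first t bits gives

$(Y_P)_t$ = 0000101, $(Y_Q)_t$ = 1010001.

For the APX terms, take the first 3 bits of $(Y_P)_t$ and $(Y_Q)_t$ and pad a '1' at the LSB side:

$(Y_P)_{APX}$ = 0001, $(Y_Q)_{APX}$ = 1011.

Final computation = $1 + (Y_P)_t + (Y_Q)_t + (Y_P)_{APX} \times (Y_Q)_{APX}$, where the product $(Y_P)_{APX} \times (Y_Q)_{APX}$ is an 8-bit output:

1\_0000\_0000 + 0000\_1010 ($(Y_P)_t$, pad 0 at LSB side) + 1010\_0010 ($(Y_Q)_t$, pad 0 at LSB side) + 0000\_1011 ($(Y_P)_{APX} \times (Y_Q)_{APX}$) = 01 1011 0111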

This sum is then shifted left by 13 positions. As a result, the final value indicated by the output 01 1011 0111 is 3,596,288. The exact output is 3,579,628, represented by the number 11 0110 1001 1110 1110 1100. The difference is 16,660, a relative error of about 0.47% (below 1%), which corresponds to an accuracy of about 99.5%. The parameters t and h can be tuned so that a high degree of precision is obtained while energy and speed remain acceptable. The proposed multiplication method applies to unsigned operands. For signed multipliers, one strategy is to take the absolute value of the input operands and then carry out the computation described above.
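The steps of the worked example can be sketched in Python. This is a minimal software model under stated assumptions, not the authors' hardware description: the function names `leading_one`, `fraction_bits`, and `approx_multiply` are hypothetical, the operands are assumed to be unsigned integers, and all terms are aligned to a common number of fractional bits before being summed.

```python
def leading_one(n: int) -> int:
    """Bit position of the most significant '1' of a positive integer."""
    return n.bit_length() - 1


def fraction_bits(n: int, k: int, width: int) -> int:
    """First `width` bits after the leading one of n, zero-padded on the right."""
    frac = n - (1 << k)                      # the k bits below the leading one
    return frac >> (k - width) if k >= width else frac << (width - k)


def approx_multiply(p: int, q: int, t: int = 7, h: int = 3) -> int:
    """Approximate p * q for unsigned integers using parameters (h, t)."""
    if not 0 < h <= t:
        raise ValueError("expected 0 < h <= t")
    if p < 0 or q < 0:
        raise ValueError("operands must be unsigned")
    if p == 0 or q == 0:
        return 0
    kp, kq = leading_one(p), leading_one(q)
    # Fractional parts truncated to t bits.
    yp_t = fraction_bits(p, kp, t)
    yq_t = fraction_bits(q, kq, t)
    # APX terms: first h bits with a '1' padded at the LSB side (h + 1 bits each).
    yp_apx = ((yp_t >> (t - h)) << 1) | 1
    yq_apx = ((yq_t >> (t - h)) << 1) | 1
    # Align every term to the same number of fractional bits before adding.
    scale = max(t, 2 * (h + 1))
    total = (
        (1 << scale)                                     # the constant 1
        + ((yp_t + yq_t) << (scale - t))                 # truncated fractional parts
        + ((yp_apx * yq_apx) << (scale - 2 * (h + 1)))   # small approximate product
    )
    # Apply the 2**(kP + kQ) factor as a final shift.
    shift = kp + kq - scale
    return total << shift if shift >= 0 else total >> -shift


if __name__ == "__main__":
    x, y = 0b0000_0010_0001_0101, 0b0001_1010_0011_1100
    approx, exact = approx_multiply(x, y, t=7, h=3), x * y
    print(approx, exact, abs(approx - exact) / exact)
```

With t = 7 and h = 3 the common scale is 8 fractional bits, so the final shift is $k_P + k_Q - 8 = 13$ for the operands above, matching the 13-position shift in the example.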

## **Software implementations**

"Field-programmable" refers to a circuit that can be reconfigured by its designer after it has been manufactured. The FPGA configuration is generally specified using a hardware description language (HDL). Circuit diagrams, of the kind used to specify ASIC designs, have also been employed. FPGAs can implement any logical function that an ASIC can perform. Many applications benefit from the ability to update functionality after shipping, partial reconfiguration of a portion of the design, and lower non-recurring engineering costs compared with an ASIC design, despite the typically higher unit cost. The simulated output is given in Fig. 3.



Fig.3. Simulated output

To put it another way, an FPGA is essentially a one-chip programmable breadboard: it contains programmable components called "logic blocks" and a set of reconfigurable interconnects that allow the blocks to be "wired together." Logic blocks can implement simple logic gates such as XOR as well as more complicated functions. In addition to the logic itself, FPGA logic blocks often include memory elements, which may be simple flip-flops or more complete blocks of memory. The FPGA (field-programmable gate array) design market is growing rapidly. The increased complexity of FPGA architectures means that the devices can now be used in a much broader range of applications than before. The latest generation of FPGAs is moving away from a simple "logic only" architecture to one with dedicated blocks for specialised purposes. The designer must become familiar with the various architectures and their capabilities, and also needs an efficient way to evaluate the performance of a design when targeting different technologies. With the most recent offerings from the main FPGA vendors in view, this article briefly reviews the most recent advances before looking at the need for a suitable synthesis tool that can target the same design to these different architectures.



## **Result and discussion**



**Fig.4 Power Consumption Report** 

Fig. 4 shows that the total thermal power dissipation is about 61.31 mW. It is made up of a core dynamic dissipation of 0.01 mW, a core static dissipation of about 46.15 mW, and an I/O thermal power dissipation of about 15.15 mW.
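The breakdown can be checked with a few lines of Python. This is only an arithmetic sketch over the three figures quoted from the power report; the dictionary keys and the helper name `breakdown` are illustrative.

```python
# Power components in mW, as quoted from the Fig. 4 report.
power_mw = {"core dynamic": 0.01, "core static": 46.15, "I/O": 15.15}


def breakdown(components):
    """Return the total and each component's share of it in percent."""
    total = sum(components.values())
    shares = {name: 100.0 * value / total for name, value in components.items()}
    return total, shares


if __name__ == "__main__":
    total, shares = breakdown(power_mw)
    print(f"total = {total:.2f} mW")
    for name, share in shares.items():
        print(f"{name}: {share:.1f}%")
```

The components add up to the reported 61.31 mW, and static dissipation accounts for roughly three quarters of the total, so the dynamic power of the multiplier logic itself is a very small part of the reported figure.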



Fig.5 Area Utilization Report

Fig. 5 shows that the design uses 409 of 5,136 logic elements (8%) and 136 of 5,136 logic registers (3%). In total, 67 of 183 pins are used (37%). The virtual pin and memory counts are zero.

Table 1 "Trade-off analysis of approximate DCT over DCT on Quartus II hardware synthesis using the Cyclone II family"

| DCT model                                | Area (LEs used) | Speed (MHz) | Total power dissipation (mW) |
|------------------------------------------|-----------------|-------------|------------------------------|
| Conventional direct truncation multiplier | 1197            | 109.33      | 150.10                       |
| Approximated DCT                         | 409             | 164.58      | 61.31                        |
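The relative gains implied by Table 1 can be computed directly. The sketch below assumes the table values as given, with the proposed design's power taken as the 61.31 mW reported with Fig. 4; the variable and function names are illustrative.

```python
def percent_change(old: float, new: float) -> float:
    """Signed change from old to new, in percent of old."""
    return 100.0 * (new - old) / old


# Values from Table 1 (proposed power taken as 61.31 mW, per the Fig. 4 report).
conventional = {"area_le": 1197, "fmax_mhz": 109.33, "power_mw": 150.10}
proposed = {"area_le": 409, "fmax_mhz": 164.58, "power_mw": 61.31}

changes = {key: percent_change(conventional[key], proposed[key]) for key in conventional}

if __name__ == "__main__":
    for key, value in changes.items():
        print(f"{key}: {value:+.1f}%")
```

On these numbers the proposed design uses roughly 66% fewer logic elements, runs about 51% faster, and dissipates about 59% less power, which is in line with the rounded figures quoted in the Conclusion.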

## Conclusion

When the operands are truncated according to the parameters h and t, the multiplier is efficient in both area and energy. The error is further reduced by rounding to the nearest odd number. FPGA hardware synthesis confirms the balance between performance and scalability: "Speed increased by 50%, area decreased by 70%, and energy was cut by 60%." The technique demonstrates gains in delay, power, and efficiency, and it may be extended to various algorithms for both signed and unsigned data in the 16- and 32-bit range. Investigating rounding patterns can further help with the active partial product rows.

## References

- 1. Stefania Perri, Fanny Spagnolo, Fabio Frustaci, Pasquale Corsonello (2020), "Parallel architecture of power-of-two multipliers for FPGAs," first published 28 February 2020, https://doi.org/10.1049/iet-cds.2019.0246.
- 2. Salim Ullah (2020), "Energy-Efficient Low-Latency Signed Multiplier for FPGA-Based Hardware Accelerators," 1943-0663 © 2020 IEEE.
- 3. Yehya Nasser, Jordane Lorandel (2020), "RTL to Transistor Level Power Modelling and Estimation Techniques for FPGA and ASIC: A Survey," HAL Id: hal-02866921, https://hal.archives-ouvertes.fr/hal-02866921, submitted on 12 Jun 2020.
- 4. Gaurav Verma (2017), "Analysis of Low Power Consumption Techniques on FPGA for Wireless Devices," Wireless Personal Communications 95(2), July 2017, DOI:10.1007/s11277-016-3896-2.
- 5. S. Reda and A. N. Nowroz, "Power modeling and characterization of computing devices," Foundations and Trends® in Electronic Design Automation, vol. 6, no. 2, pp. 121–216, 2012.
- 6. A. B. Darwish, M. A. El-Moursy, and M. A. Dessouky, Power Modeling and Characterization. Cham: Springer International Publishing, 2020, pp. 47–57.
- 7. Y. Nasser, C. Sau, J.-C. Prevotet, T. Fanni, F. Palumbo, M. Hélard, and L. Raffo, "Neupow: Artificial neural networks for power and behavioral modeling of arithmetic components in 45nm asics technology," in Proceedings of the 16th ACM International Conference on Computing Frontiers, ser. CF '19. New York, NY, USA: ACM, 2019, pp. 183–189.
- 8. D. Bellizia, D. Cellucci, V. Di Stefano, G. Scotti, and A. Trifiletti, "Novel measurements setup for attacks exploiting static power using dc pico-ammeter," in 2017 European Conference on Circuit Theory and Design (ECCTD). IEEE, 2017, pp. 1–4.

- 9. P.-E. Gaillardon, E. Beigne, S. Lesecq, and G. D. Micheli, "A survey on low-power techniques with emerging technologies: From devices to systems," J. Emerg. Technol. Comput. Syst., vol. 12, no. 2, pp. 12:1–12:26, Sep. 2015.
- 10. A. Nafkha and Y. Louet, "Accurate measurement of power consumption overhead during fpga dynamic partial reconfiguration," in 2016 International Symposium on Wireless Communication Systems (ISWCS), Sept 2016, pp. 586–591.
- 11. M. A. Rihani, F. Nouvel, J. Prevotet, M. Mroue, J. Lorandel, and Y. Mohanna, "Dynamic and partial reconfiguration power consumption runtime measurements analysis for zynq soc devices," in 2016 International Symposium on Wireless Communication Systems (ISWCS), Sept 2016, pp. 592–596.
- 12. R. Jevtic and C. Carreras, "Power measurement methodology for fpga devices," IEEE Transactions on Instrumentation and Measurement, vol. 60, no. 1, pp. 237–247, 2011.
- 13. J. Oliver, F. Veirano, D. Bouvier, and E. Boemo, "A low cost system for self measurements of power consumption in field programmable gate arrays," Journal of Low Power Electronics, vol. 13, no. 1, 2017.
- 14. J. J. Davis, E. Hung, J. M. Levine, E. A. Stott, P. Y. Cheung, and G. A. Constantinides, "KAPow: High-accuracy, low-overhead online per-module power estimation for FPGA designs," ACM Transactions on Reconfigurable Technology and Systems, 2018. Z. Lin, S. Sinha, and W. Zhang, "An Ensemble Learning