Accelerating Inference on Binary Neural Networks with Digital RRAM Processing
Abstract
The need for efficient Convolutional Neural Networks (CNNs) targeting embedded systems led to the popularization of Binary Neural Networks (BNNs), which significantly reduce execution time and memory requirements by representing the operands using only one bit. Moreover, since 90% of the operations executed by CNNs and BNNs are convolutions, a quest has begun for custom accelerators that optimize the convolution operation and reduce data movement, among which Resistive Random Access Memory (RRAM)-based accelerators have proven to be of particular interest. This work presents a custom Binary Dot Product Engine (BDPE) for BNNs that exploits the low-level compute capabilities enabled by RRAMs. This new engine accelerates the inference phase of BNNs by locally storing the most frequently used kernels and performing the binary convolutions using RRAM devices and optimized custom circuitry. Results show that the novel BDPE improves performance by 11.3% and energy efficiency by 7.4%, and reduces the number of memory accesses by 10.7%, at a cost of less than 0.3% additional die area.
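To make the binary-convolution formulation concrete, the sketch below shows the standard XNOR-popcount identity commonly used by BNN engines: with operands in {-1, +1} packed as {0, 1} bits, a dot product reduces to counting bit agreements. This is a minimal software illustration of the general technique, not the authors' RRAM circuit; the word width of 64 and the example operand values are assumptions for demonstration.

```c
#include <stdint.h>
#include <stdio.h>

/* Binary dot product over 64 binarized operands packed into one word.
 * Bits encode {-1, +1} as {0, 1}. XNOR marks positions where the bits
 * agree, so: dot = matches - mismatches = 2 * popcount(~(a ^ b)) - 64.
 * Illustrative sketch of the XNOR-popcount formulation; the BDPE
 * realizes the equivalent computation with RRAM devices in hardware. */
static int binary_dot64(uint64_t a, uint64_t b) {
    uint64_t xnor = ~(a ^ b);                  /* 1 where bits agree */
    int matches = __builtin_popcountll(xnor);  /* count agreements   */
    return 2 * matches - 64;
}

int main(void) {
    uint64_t kernel = 0xF0F0F0F0F0F0F0F0ULL;   /* example binarized kernel */
    uint64_t input  = 0xFF00FF00FF00FF00ULL;   /* example binarized input  */
    printf("dot = %d\n", binary_dot64(kernel, input)); /* prints dot = 0 */
    return 0;
}
```

Because the multiply-accumulate collapses to a bitwise XNOR followed by a population count, a single wide word operation replaces dozens of scalar multiplications, which is what makes in-memory RRAM implementations of this step attractive.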