
EdgeDRNN: Enabling Low-latency Recurrent Neural Network Edge Inference


Gao, Chang; Rios-Navarro, Antonio; Chen, Xi; Delbruck, Tobi; Liu, Shih-Chii (2020). EdgeDRNN: Enabling Low-latency Recurrent Neural Network Edge Inference. In: 2020 2nd IEEE International Conference on Artificial Intelligence Circuits and Systems (AICAS), Genova, Italy, 31 August 2020 - 2 September 2020.

Abstract

This paper presents EdgeDRNN, a Gated Recurrent Unit (GRU)-based recurrent neural network (RNN) accelerator designed for portable edge computing. EdgeDRNN adopts the spiking-neural-network-inspired delta network algorithm to exploit temporal sparsity in RNNs, reducing off-chip memory access by a factor of up to 10× with tolerable accuracy loss. Experimental results on a 10-million-parameter, 2-layer GRU-RNN with weights stored in DRAM show that EdgeDRNN computes an inference in under 0.5 ms. Drawing 2.42 W of wall-plug power on an entry-level, USB-powered FPGA board, it achieves latency comparable to a 92 W NVIDIA 1080 GPU, and it outperforms the NVIDIA Jetson Nano, Jetson TX2, and Intel Neural Compute Stick 2 in latency by 6×. For a batch size of 1, EdgeDRNN achieves a mean effective throughput of 20.2 GOp/s and a wall-plug power efficiency over 4× higher than all other platforms.
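The temporal-sparsity idea behind the delta network algorithm can be illustrated with a short sketch: an input element only contributes to the matrix-vector product when it has changed by more than a threshold since its last contribution, so the corresponding weight columns (and their DRAM fetches) can be skipped. This is a minimal NumPy illustration of that principle, not the EdgeDRNN implementation; the function name, threshold value, and state-keeping scheme are illustrative assumptions.

```python
import numpy as np

def delta_matvec(W, x, x_last, partial, theta=0.1):
    """One delta-network style matrix-vector update.

    Only columns of W whose input changed by at least `theta`
    since the last *accepted* value contribute; small deltas are
    skipped, which is the source of the memory-access savings.
    (Illustrative sketch, not the actual EdgeDRNN datapath.)
    """
    delta = x - x_last
    active = np.abs(delta) >= theta              # temporal-sparsity mask
    # Accumulate only the active columns into the running partial sum.
    partial = partial + W[:, active] @ delta[active]
    # Inactive inputs keep their last accepted value for the next step,
    # so sub-threshold changes accumulate until they cross theta.
    x_last = np.where(active, x, x_last)
    return partial, x_last
```

Because the partial sum is carried across timesteps and skipped deltas accumulate in `x_last`, the result tracks the dense product with an error bounded by the threshold, which is the "tolerable accuracy loss" traded for the up-to-10× reduction in weight fetches.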



Additional indexing

Item Type: Conference or Workshop Item (Paper), refereed, original work
Communities & Collections: 07 Faculty of Science > Institute of Neuroinformatics
Dewey Decimal Classification: 570 Life sciences; biology
Scopus Subject Areas: Physical Sciences > Artificial Intelligence; Physical Sciences > Computer Science Applications; Physical Sciences > Hardware and Architecture; Physical Sciences > Electrical and Electronic Engineering
Language: English
Event End Date: 2 September 2020
Deposited On: 15 Feb 2021 15:22
Last Modified: 16 Feb 2021 21:02
Publisher: IEEE
ISBN: 9781728149226
OA Status: Green
Publisher DOI: https://doi.org/10.1109/aicas48895.2020.9074001

Download

Green Open Access

Content: Accepted Version
Filetype: PDF
Size: 390 kB