Event-driven Pipeline for Low-latency Low-compute Keyword Spotting and Speaker Verification System


Ceolini, Enea; Anumula, Jithendar; Braun, Stefan; Liu, Shih-Chii (2019). Event-driven Pipeline for Low-latency Low-compute Keyword Spotting and Speaker Verification System. In: ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brighton, United Kingdom, 12 May 2019 - 17 May 2019, IEEE.

Abstract

This work presents an event-driven acoustic sensor processing pipeline to power a low-resource voice-activated smart assistant. The pipeline includes four major steps: localization, source separation, keyword spotting (KWS), and speaker verification (SV). The pipeline is driven by a front-end binaural spiking silicon cochlea sensor. The timing information carried by the output spikes of the cochlea provides spatial cues for localization and source separation. Spike features are generated with low latency from the separated source spikes and are used by both KWS and SV, which rely on state-of-the-art deep recurrent neural network architectures with a small memory footprint. Evaluation on a self-recorded event dataset based on TIDIGITS shows accuracies of over 93% and 88% on KWS and SV, respectively, with a minimum system latency of 5 ms on a resource-limited device.
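
The abstract states that spike timing from the two ears supplies the spatial cues for localization. As a rough illustration only, and not the authors' method, the sketch below estimates an interaural time difference (ITD) by histogramming pairwise spike-time differences between a left and a right channel and picking the most populated bin. The function name `estimate_itd_us`, the microsecond timestamps, and the `max_lag_us` and `bin_us` values are all assumptions made for this example.

```python
from collections import Counter


def estimate_itd_us(left_spikes_us, right_spikes_us, max_lag_us=800, bin_us=50):
    """Estimate an ITD from two spike trains (hypothetical helper, illustration only).

    Returns the centre of the most populated bin of (left - right) spike-time
    differences, in microseconds, or None if no spike pair lies within max_lag_us.
    A positive value means the left spikes tend to arrive later than the right ones.
    """
    counts = Counter()
    for t_left in left_spikes_us:
        for t_right in right_spikes_us:
            lag = t_left - t_right
            # Only pairs closer together than the assumed maximum lag are counted.
            if abs(lag) <= max_lag_us:
                counts[round(lag / bin_us)] += 1
    if not counts:
        return None
    # Most populated bin wins; ties go to the lag closest to zero.
    best_bin, _ = max(counts.items(), key=lambda kv: (kv[1], -abs(kv[0])))
    return best_bin * bin_us


# Assumed toy data: left-channel spikes lag the right channel by about 200 us.
right = [1000, 3000, 5000, 7000]
left = [1210, 3190, 5205, 7195]
itd = estimate_itd_us(left, right)
```

With this toy data every within-window pair falls into the bin centred on 200 us, so `itd` is 200. A real event-driven system would process spikes incrementally per frequency channel rather than comparing all pairs at once.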

Statistics

Citations

6 citations in Web of Science®
8 citations in Scopus®

Downloads

30 downloads since deposited on 11 Feb 2020
29 downloads since 12 months

Additional indexing

Item Type: Conference or Workshop Item (Paper), refereed, original work
Communities & Collections: 07 Faculty of Science > Institute of Neuroinformatics
Dewey Decimal Classification: 570 Life sciences; biology
Scopus Subject Areas: Physical Sciences > Software; Physical Sciences > Signal Processing; Physical Sciences > Electrical and Electronic Engineering
Language: English
Event End Date: 17 May 2019
Deposited On: 11 Feb 2020 15:15
Last Modified: 27 Jan 2022 01:10
Publisher: IEEE
ISBN: 9781479981311
OA Status: Green
Publisher DOI: https://doi.org/10.1109/icassp.2019.8683669

Download

Green Open Access

Content: Published Version
Filetype: PDF
Size: 273kB