Streaming keyword spotting on mobile devices

@article{Rybakov2020StreamingKS,
  title={Streaming keyword spotting on mobile devices},
  author={Oleg Rybakov and Natasha Kononenko and Niranjan A. Subrahmanya and Mirk{\'o} Visontai and S. Laurenzo},
  journal={ArXiv},
  year={2020},
  volume={abs/2005.06720}
}
In this work we explore the latency and accuracy of keyword spotting (KWS) models in streaming and non-streaming modes on mobile phones. Converting a neural network (NN) model from non-streaming mode (the model receives the whole input sequence and then returns the classification result) to streaming mode (the model receives a portion of the input sequence and classifies it incrementally) may require manual model rewriting. We address this by designing a TensorFlow/Keras-based library which allows automatic conversion of…
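The difference between the two modes described in the abstract can be illustrated with a minimal sketch. It is not the paper's library or its API: the names `conv1d_non_streaming` and `StreamingConv1D`, the kernel values, and the input frames are all hypothetical, and a single causal 1D convolution stands in for a full KWS model. The non-streaming function sees the whole sequence at once, while the streaming class receives one frame per call and keeps the frames it still needs in an internal buffer.

```python
from collections import deque


def conv1d_non_streaming(frames, kernel):
    """Causal 1D convolution over the whole sequence at once (zero left-padding)."""
    k = len(kernel)
    padded = [0.0] * (k - 1) + list(frames)
    return [
        sum(w * x for w, x in zip(kernel, padded[t:t + k]))
        for t in range(len(frames))
    ]


class StreamingConv1D:
    """Same convolution, fed one frame at a time (hypothetical illustration).

    The last len(kernel) frames are kept as internal state in a ring buffer,
    so each call only needs the newest frame.
    """

    def __init__(self, kernel):
        self.kernel = list(kernel)
        # State starts as zeros, matching the zero padding used above.
        self.state = deque([0.0] * len(kernel), maxlen=len(kernel))

    def step(self, frame):
        self.state.append(frame)  # the oldest frame drops out automatically
        return sum(w * x for w, x in zip(self.kernel, self.state))


# Made-up input frames and kernel weights, for illustration only.
frames = [0.5, -1.0, 2.0, 0.0, 1.5]
kernel = [0.25, 0.5, 0.25]

full = conv1d_non_streaming(frames, kernel)
layer = StreamingConv1D(kernel)
streamed = [layer.step(x) for x in frames]

# Both modes should produce the same per-frame outputs.
assert all(abs(a - b) < 1e-12 for a, b in zip(full, streamed))
```

In this sketch, the streaming version has to carry explicit state between calls, which the non-streaming version never needs. Rewriting every layer of a model this way by hand is the kind of manual work that the abstract says automatic conversion is meant to avoid.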
8 Citations
Keyword Transformer: A Self-Attention Model for Keyword Spotting
Learning Efficient Representations for Keyword Spotting with Triplet Loss
Hardware Aware Training for Efficient Keyword Spotting on General Purpose and Specialized Hardware
AST: Audio Spectrogram Transformer
Efficient Keyword Spotting through long-range interactions with Temporal Lambda Networks
AutoKWS: Keyword Spotting with Differentiable Architecture Search
Real-time Speech Frequency Bandwidth Extension
