Environmental Sound Recognition on Embedded Systems: From FPGAs to TPUs

Jurgen Vandendriessche, Nick Wouters, Bruno Alves da Silva, Mimoun Lamrini, Mohamed Yassin Chkouri, Abdellah Touhafi
In recent years, Environmental Sound Recognition (ESR) has become a relevant capability for urban monitoring applications. The techniques for automated sound recognition often rely on machine learning approaches, which have increased in complexity in order to achieve higher accuracy. Nonetheless, such machine learning techniques often have to be deployed on resource- and power-constrained embedded devices, which has become a challenge with the adoption of deep learning approaches based on…

A Comparison of Deep Learning Inference Engines for Embedded Real-time

A comparison of four deep learning inference engines for real-time audio classification on the CPU of an embedded single-board computer (TensorFlow Lite, TorchScript, ONNX Runtime, and RTNeural) shows that all four can execute neural network models in real time with appropriate code practices, but that execution time varies between engines and models.

On the Challenges of Embedded Real-time Music Information Retrieval

This paper identifies and discusses the challenges and limitations of embedded real-time MIR, discusses potential solutions, and demonstrates their validity by presenting an embedded real-time classifier of expressive acoustic guitar techniques.

Embedded Sensing System for Recognizing Citrus Flowers Using Cascaded Fusion YOLOv4-CF + FPGA

A lightweight object recognition model based on cascade fusion, YOLOv4-CF, is proposed. It recognizes multiple object types in their natural environments, such as citrus buds, citrus flowers, and gray mold, and achieves excellent representation capability through an improved cascade fusion network and a multi-scale feature fusion block.

Research of Digital-Analog Conversion Method for Reproduction of Mechanical Oscillations

This paper considers the development of a software-hardware complex for an alternative method of sound reproduction by means of air: a mechanical device with connected fans that distributes air flows, together with the software that operates it.



Evaluation of Classical Machine Learning Techniques towards Urban Sound Recognition on Embedded Systems

This evaluation provides a realistic estimate of the accuracy and execution time that can be expected when performing urban sound classification on embedded devices. A cascade approach is also proposed that combines ML techniques by exploiting characteristics of current embedded devices such as pipelined or multi-threaded execution.

MosAIc: A Classical Machine Learning Multi-Classifier Based Approach against Deep Learning Classifiers for Embedded Sound Classification

Experimental results show that classical machine learning classifiers can be combined to achieve results similar to those of deep learning models, and even to outperform them in accuracy, at the cost of a longer classification time.

Utilization of FPGA for Onboard Inference of Landmark Localization in CNN-Based Spacecraft Pose Estimation

This paper investigates the use of a hybrid Field Programmable Gate Array and Systems-on-Chip device for efficient onboard inferencing of the Convolutional Neural Network (CNN) part of spacecraft pose estimation methods.

Toolflows for Mapping Convolutional Neural Networks on FPGAs

A survey of the existing CNN-to-FPGA toolflows is presented, comprising a comparative study of their key characteristics, which include the supported applications, architectural choices, design space exploration methods, and achieved performance.

hls4ml: An Open-Source Codesign Workflow to Empower Scientific Low-Power Machine Learning Devices

Hls4ml, an open-source software-hardware co-design workflow to interpret and translate machine learning algorithms for implementation in FPGAs and ASICs specifically to support domain scientists, is developed.

Automatic heterogeneous quantization of deep neural networks for low-latency inference on the edge for particle detectors

A method for designing optimally heterogeneously quantized versions of deep neural network models for minimum-energy, high-accuracy, nanosecond inference and fully automated deployment on chip is introduced.

Improving Post Training Neural Quantization: Layer-wise Calibration and Integer Programming

Two pipelines are introduced, advanced and light, where the former involves minimizing the quantization errors of each layer by optimizing its parameters over the calibration set and using integer programming to optimally allocate the desired bit-width for each layer while constraining accuracy degradation or model compression.

Machine Learning Algorithms for Environmental Sound Recognition: Towards Soundscape Semantics

This paper investigates methods aiming at the automatic recognition and classification of discrete environmental sounds, for the purpose of subsequently applying these methods to the recognition of

ESC: Dataset for Environmental Sound Classification

A new annotated collection of 2,000 short clips comprising 50 classes of various common sound events, and an abundant unified compilation of 250,000 unlabeled auditory excerpts extracted from recordings available through the Freesound project are presented.