Corpus ID: 208910817

Visualizing Deep Neural Networks for Speech Recognition with Learned Topographic Filter Maps

@article{Krug2019VisualizingDN,
  title={Visualizing Deep Neural Networks for Speech Recognition with Learned Topographic Filter Maps},
  author={Andreas Krug and Sebastian Stober},
  journal={ArXiv},
  year={2019},
  volume={abs/1912.04067}
}
The uninformative ordering of artificial neurons in Deep Neural Networks complicates visualizing activations in deeper layers. This is one reason why the internal structure of such models is so unintuitive. In neuroscience, brain activity can be visualized by highlighting active regions. Inspired by those techniques, we train a convolutional speech recognition model in which the filters are arranged on a 2D grid and neighboring filters are similar to each other. We show how those…
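The abstract's central idea, filters living on a 2D grid and being encouraged to resemble their grid neighbours, can be illustrated with a small regularization term. The following PyTorch sketch is only an illustration under assumed shapes and an assumed squared-difference penalty; it is not the loss or architecture actually used in the paper.

```python
import torch
import torch.nn as nn

def topographic_penalty(conv: nn.Conv1d, grid_h: int, grid_w: int) -> torch.Tensor:
    """Penalize dissimilarity between filters that are neighbours on a 2D grid (illustrative only)."""
    w = conv.weight                                       # (out_channels, in_channels, kernel_size)
    assert w.shape[0] == grid_h * grid_w, "filters must fill the grid exactly"
    grid = w.reshape(grid_h, grid_w, -1)                  # one flattened filter per grid cell
    horiz = (grid[:, 1:] - grid[:, :-1]).pow(2).mean()    # left/right neighbours
    vert = (grid[1:, :] - grid[:-1, :]).pow(2).mean()     # up/down neighbours
    return horiz + vert

# hypothetical layer sizes: 40 input features, 64 filters arranged on an 8x8 grid
conv = nn.Conv1d(in_channels=40, out_channels=64, kernel_size=11)
reg = topographic_penalty(conv, grid_h=8, grid_w=8)
# total_loss = task_loss + lambda_topo * reg   # a weighted penalty would be added to the training objective
```

How the paper actually enforces neighbourhood similarity is described in the full text; the sketch only conveys the general idea of a topographic filter arrangement.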


Analyzing and Visualizing Deep Neural Networks for Speech Recognition with Saliency-Adjusted Neuron Activation Profiles

TLDR
SNAPs provide a flexible framework for analyzing and visualizing Deep Neural Networks that does not depend on visually interpretable data; the paper demonstrates how to utilize SNAPs for understanding fully-convolutional ASR models.

Interpreting intermediate convolutional layers of generative CNNs trained on waveforms

TLDR
Using the proposed technique, the authors argue that averaging over feature maps after ReLU activation in each transposed convolutional layer yields interpretable time-series data.
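The averaging step named in this TLDR is simple to state in code. The snippet below is a minimal NumPy sketch with made-up shapes and random data, not the authors' implementation:

```python
import numpy as np

# stand-in for post-ReLU activations of one (transposed-)convolutional layer: (channels, time)
activations = np.maximum(np.random.randn(128, 16000), 0.0)

# averaging over the channel axis yields a single time series that can be plotted or listened to
averaged = activations.mean(axis=0)   # shape: (16000,)
```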

Interpreting Intermediate Convolutional Layers In Unsupervised Acoustic Word Classification

  • G. Beguš, Alan Zhou
  • Computer Science
    ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
  • 2022
TLDR
A technique is proposed to visualize and interpret intermediate layers of unsupervised deep convolutional networks by averaging over individual feature maps in each convolutional layer and inferring the underlying distributions of words with non-linear regression techniques.

Revisiting Transposed Convolutions for Interpreting Raw Waveform Sound Event Recognition CNNs by Sonification

  • Computer Science
  • 2021
TLDR
This work proposes sonification, a method to interpret intermediate feature representations of sound event recognition convolutional neural networks trained on raw waveforms by mapping these representations back into the discrete-time input signal domain, highlighting substructures in the input that maximally activate a feature map as intelligible acoustic events.

Interpreting intermediate convolutional layers of CNNs trained on raw speech

TLDR
Using the proposed technique, one can analyze how linguistically meaningful units in speech get encoded in different convolutional layers by linearly interpolating individual latent variables to marginal levels outside of the training range.
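As a hedged sketch of the latent-interpolation probe this TLDR describes, the snippet below sweeps one latent variable to values beyond its training range; the generator here is a random placeholder, not the trained CNN from the cited paper:

```python
import numpy as np

rng = np.random.default_rng(0)
weights = rng.standard_normal((100, 16000))

def generate(z: np.ndarray) -> np.ndarray:
    """Placeholder generator: maps a latent vector to a waveform-like array."""
    return np.tanh(z @ weights)

z = rng.standard_normal(100)                 # a single latent vector
outputs = []
for value in np.linspace(-4.0, 4.0, 9):      # marginal levels, including values outside the training range
    z_probe = z.copy()
    z_probe[0] = value                       # linearly interpolate a single latent variable
    outputs.append(generate(z_probe))
```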

Understanding Adaptive, Multiscale Temporal Integration In Deep Speech Recognition Systems

TLDR
The method provides a straightforward and general-purpose toolkit for understanding temporal integration in black-box machine learning models and suggests that deep speech recognition systems use a common motif to encode the hierarchical structure of speech: integrating across short, time-yoked windows at early layers and long, structure-yoked windows at later layers.

Exploration of Interpretability Techniques for Deep COVID-19 Classification using Chest X-ray Images

TLDR
Five different deep learning models and their ensemble are used to classify COVID-19, pneumonia, and healthy subjects from chest X-ray images; qualitative results show the ResNets to be the most interpretable models.

References


Neuron Activation Profiles for Interpreting Convolutional Speech Recognition Models

TLDR
An alternative method is presented that analyzes neuron activations over whole data sets, investigates whether phonemes are implicitly learned as an intermediate representation for predicting graphemes, and shows that similarities between phonetic categories are reflected in the clustering of time-independent NAPs.
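As a rough sketch of the class-wise activation averaging behind NAPs, with toy data and a placeholder layer standing in for the real ASR model (all shapes and labels are assumptions for illustration):

```python
import numpy as np
from collections import defaultdict

rng = np.random.default_rng(0)
proj = rng.standard_normal((40, 64))

def layer_activations(features: np.ndarray) -> np.ndarray:
    """Placeholder network layer: (time, 40) features -> (time, 64) activations."""
    return np.maximum(features @ proj, 0.0)

# toy data set of (feature matrix, class label) pairs standing in for labeled speech segments
dataset = [(rng.standard_normal((50, 40)), label) for label in ["a", "e", "i"] for _ in range(5)]

per_class = defaultdict(list)
for features, label in dataset:
    per_class[label].append(layer_activations(features).mean(axis=0))   # time-averaged activation vector

naps = {label: np.mean(vectors, axis=0) for label, vectors in per_class.items()}
# the class-wise profiles in `naps` can now be compared or clustered
```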

Introspection for convolutional automatic speech recognition

TLDR
This work investigates the application of common introspection techniques from computer vision to an Automatic Speech Recognition (ASR) task and proposes normalized averaging of aligned inputs (NAvAI): a data-driven method to reveal learned patterns for prediction of specific classes.
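The TLDR only names NAvAI; the exact alignment and normalization are described in the paper itself. Purely as an assumed illustration of averaging input segments aligned at the frames where a class is predicted, one might write:

```python
import numpy as np

def aligned_average(inputs: np.ndarray, prediction_frames: list[int], half_window: int = 20) -> np.ndarray:
    """Average normalized input segments centred on frames where the model predicts a given class."""
    segments = []
    for t in prediction_frames:
        segment = inputs[t - half_window : t + half_window]                 # (2*half_window, features)
        if segment.shape[0] == 2 * half_window:
            segment = (segment - segment.mean()) / (segment.std() + 1e-8)   # simple per-segment normalization
            segments.append(segment)
    return np.mean(segments, axis=0)

spectrogram = np.random.randn(1000, 40)                                     # stand-in input features
pattern = aligned_average(spectrogram, prediction_frames=[120, 430, 750])
```

The window size, normalization, and way of choosing prediction frames are all assumptions made for the example, not the authors' procedure.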

Understanding Neural Networks Through Deep Visualization

TLDR
This work introduces several new regularization methods that combine to produce qualitatively clearer, more interpretable visualizations of convolutional neural networks.

Going deeper with convolutions

We propose a deep convolutional neural network architecture codenamed Inception that achieves the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14).

Learning invariant features through topographic filter maps

TLDR
This work proposes a method that automatically learns feature extractors in an unsupervised fashion by simultaneously learning the filters and the pooling units that combine multiple filter outputs together.

Representational Similarity Analysis – Connecting the Branches of Systems Neuroscience

TLDR
A new experimental and data-analytical framework called representational similarity analysis (RSA) is proposed, in which multi-channel measures of neural activity are quantitatively related to each other and to computational theory and behavior by comparing RDMs.
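RSA compares systems via representational dissimilarity matrices (RDMs). A minimal, generic sketch with random stand-in data (not tied to any experiment in this paper or the cited one):

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
layer_responses = rng.standard_normal((20, 64))    # 20 stimuli x 64 model units
brain_responses = rng.standard_normal((20, 128))   # 20 stimuli x 128 measurement channels

rdm_model = pdist(layer_responses, metric="correlation")   # condensed RDM: pairwise dissimilarities
rdm_brain = pdist(brain_responses, metric="correlation")

similarity, _ = spearmanr(rdm_model, rdm_brain)   # rank correlation between the two representational geometries
```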

The quantitative extraction and topographic mapping of the abnormal components in the clinical EEG.

  • Z. Koles
  • Medicine
    Electroencephalography and clinical neurophysiology
  • 1991

An Introduction to the Event-Related Potential Technique

TLDR
In An Introduction to the Event-Related Potential Technique, Steve Luck offers the first comprehensive guide to the practicalities of conducting ERP experiments in cognitive neuroscience and related fields, including affective neuroscience and experimental psychopathology.

Monographs of the Society for Research in Child Development

McCall, Robert B.; Appelbaum, Mark I.; and Hogarty, Pamela S. Developmental Changes in Mental Performance. Monographs of the Society for Research in Child Development, 1973, 38 (3, Serial No. 150).

Adaptation of the event-related potential technique for analyzing artificial neural networks

  • Cognitive Computational Neuroscience (CCN).
  • 2017