Optimisation of phonetic aware speech recognition through multi-objective evolutionary algorithms

@article{Bird2020OptimisationOP,
  title={Optimisation of phonetic aware speech recognition through multi-objective evolutionary algorithms},
  author={Jordan J. Bird and Elizabeth F. Wanner and Anik{\'o} Ek{\'a}rt and Diego Resende Faria},
  journal={Expert Syst. Appl.},
  year={2020},
  volume={153},
  pages={113402}
}
Optimizing Arabic Speech Distinctive Phonetic Features and Phoneme Recognition Using Genetic Algorithm
TLDR
The aim of this work is to consider methods to reduce the size of the feature vector employed for distinctive phonetic feature and phoneme recognition, which will lead to reduced computational complexity of the recognition algorithm and improved recognition accuracy.
LSTM and GPT-2 Synthetic Speech Transfer Learning for Speaker Recognition to Overcome Data Scarcity
TLDR
It is argued that speaker classification can be improved by utilising a small amount of user data but with exposure to synthetically-generated MFCCs which then allow the networks to achieve near maximum classification scores.
Domain decomposition of finite element models utilizing eight meta-heuristic algorithms: A comparative study
TLDR
The k-median of a graph is used to decompose the domain (mesh) of the continuous two- and three-dimensional finite element models.

References

Phoneme aware speech recognition through evolutionary optimisation
TLDR
A preliminary study on Artificial Neural Network (ANN) and Hidden Markov Model (HMM) methods of classification for Human Speech Recognition through Diphthong Vowel sounds in the English Phonetic Alphabet with a specific focus on evolutionary optimisation of bio-inspired classification methods.
Hybrid speech recognition with Deep Bidirectional LSTM
TLDR
The hybrid approach with DBLSTM appears to be well suited for tasks where acoustic modelling predominates, and the improvement in word error rate over the deep network is modest, despite a great increase in frame-level accuracy.
Random Forests of Phonetic Decision Trees for Acoustic Modeling in Conversational Speech Recognition
  • Jian Xue, Yunxin Zhao
  • Computer Science
    IEEE Transactions on Audio, Speech, and Language Processing
  • 2008
TLDR
A novel technique for constructing phonetic decision trees (PDTs) for acoustic modeling in conversational speech recognition, which uses random forests to train a set of PDTs for each phone state unit and obtains multiple acoustic models accordingly.
Connectionist Speech Recognition: A Hybrid Approach
From the Publisher: Connectionist Speech Recognition: A Hybrid Approach describes the theory and implementation of a method to incorporate neural network approaches into state-of-the-art continuous
A Hybrid of Deep CNN and Bidirectional LSTM for Automatic Speech Recognition
TLDR
A hybrid CNN-BLSTM architecture is proposed to exploit the spatial and temporal properties of the speech signal, to improve continuous speech recognition, and to overcome another shortcoming of CNNs: speaker-adapted features cannot be modeled directly in a CNN.
Phoneme recognition using time-delay neural networks
The authors present a time-delay neural network (TDNN) approach to phoneme recognition which is characterized by two important properties: (1) using a three-layer arrangement of simple computing
Lithuanian Speech Recognition Using Purely Phonetic Deep Learning
TLDR
An ASR system for the Lithuanian language is proposed, which is based on deep learning methods and can identify spoken words purely from their phoneme sequences; it is evaluated on an isolated speech recognition task and a long phrase recognition task.
Speech recognition with deep recurrent neural networks
TLDR
This paper investigates deep recurrent neural networks, which combine the multiple levels of representation that have proved so effective in deep networks with the flexible use of long range context that empowers RNNs.
Voice Recognition Algorithms using Mel Frequency Cepstral Coefficient (MFCC) and Dynamic Time Warping (DTW) Techniques
TLDR
This paper presents the viability of MFCC to extract features and DTW to compare the test patterns, and explains why the alignment is important for better performance.