Mongolian Speech Recognition Based on Deep Neural Networks

Hui Zhang, Feilong Bao, Guanglai Gao
Mongolian is an influential language, and better Mongolian Large Vocabulary Continuous Speech Recognition (LVCSR) systems are required. Recently, speech recognition research has improved substantially with the introduction of Deep Neural Networks (DNNs). In this study, a DNN-based Mongolian LVCSR system is built. Experimental results show that the DNN-based models outperform the conventional models, which are based on Gaussian Mixture Models (GMMs), for Mongolian speech recognition, by a…

Comparison on Neural Network based acoustic model in Mongolian speech recognition

The Long Short-Term Memory (LSTM) model is found to be the best among them and sets a new record for Mongolian speech recognition.

Research on Mongolian Speech Recognition Based on FSMN

FSMN performs better than DNN in Mongolian ASR; using i-vector features combined with Fbank features as FSMN input, together with discriminative training, yields a relative word error rate reduction of 17.9% compared with the DNN baseline.

Research on Transfer Learning for Khalkha Mongolian Speech Recognition Based on TDNN

This paper investigates two weight transfer approaches to improve the performance of Khalkha Mongolian ASR based on Lattice-free Maximum Mutual Information (LF-MMI), and shows that weight transfer with out-of-domain Chahar speech can achieve large improvements over the baseline model on Khalkha speech.

Speech Recognition in Mongolian Language using a Neural Network with pre-processing Technique

A neural network model capable of recognizing a limited number of words in the Mongolian language is developed, and four words were chosen for the further design and creation of a special device with an audio interface.

Primi speech recognition based on deep neural network

The deep neural network not only handled large-vocabulary speech recognition but also achieved a recognition rate significantly higher than that of the traditional HMM.

Research on Khalkha Dialect Mongolian Speech Recognition Acoustic Model Based on Weight Transfer

This paper investigates different transfer learning methods for acoustic modeling in a Khalkha dialect Mongolian ASR system and shows that the optimal acoustic model is a chain TDNN based on weight transfer with the Chahar dialect as the source domain.

An evaluation of Mongolian data-driven Text-to-Speech

A first attempt to evaluate data-driven speech synthesis of Mongolian, trained on a 1500-sentence female speech corpus, using a phoneme confusion test that covers the full phoneme set of Mongolian.

Design and Implementation of Cyrillic Mongolian Speech Input System for Thyroid Ultrasound Report

  • Galbadrakh Gantumur
  • Computer Science
    IOP Conference Series: Materials Science and Engineering
  • 2020
A Cyrillic Mongolian speech input system for ultrasound examination reports is designed and developed; it uses the developing Mongolian speech recognition technology to replace manual input with speech input.

A Method of Segmentation and Recognition of Wa Voice Keyword based on Deep Neural Network

  • Yong Zhao, Juxiang Zhou, Jun Wang
  • Computer Science, Education
    2019 IEEE International Conference on Computer Science and Educational Informatization (CSEI)
  • 2019
To address the shortage of speech resources for the Wa language, an automatic segmentation technique and a speech recognition algorithm based on a deep neural network are designed.

Towards End-to-End Speech Recognition with Deep Multipath Convolutional Neural Networks

Experimental results show that the newly proposed MCNN-CTC structure reduces the error rate arising from the construction of the end-to-end acoustic model, compared with traditional HMM-based or DCNN-CTC-based models, while showing strong generalization performance.



A Mongolian Speech Recognition System Based on HMM

A Mongolian large vocabulary continuous speech recognition system is introduced, and the experimental results indicate that the design of the models for Mongolian speech recognition is rational and correct.

Improving of Acoustic Model for the Mongolian Speech Recognition System

The basic resources of the Mongolian speech recognition system are optimized; as a result, word and sentence recognition accuracy are greatly improved and overall system performance is better.

Researching of Speech Recognition Oriented Mongolian Acoustic Model

A Mongolian context-dependent acoustic model based on decision trees is proposed, and decision-tree-based state tying is applied to acoustic model design in Mongolian speech recognition.

Segmentation-based Mongolian LVCSR approach

A segmentation-based Mongolian Large Vocabulary Continuous Speech Recognition (LVCSR) approach is proposed. Results show that, by converting most out-of-vocabulary words into their corresponding in-vocabulary forms, the proposed approach effectively recognizes most Mongolian words and greatly alleviates the data sparseness problem in the language model.

A design and implementation of HMM based Mongolian speech recognition system

The design and development of a speech recognition system for the Mongolian language based on Hidden Markov Models (HMMs) is described, and the performance of isolated word recognition with context-independent and context-dependent models is evaluated.

Recurrent neural network based language model

Results indicate that it is possible to obtain around a 50% reduction in perplexity by using a mixture of several RNN LMs, compared to a state-of-the-art backoff language model.

Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups

This article provides an overview of progress and represents the shared views of four research groups that have had recent successes in using DNNs for acoustic modeling in speech recognition.

Deep Recurrent Neural Networks for Acoustic Modelling

A novel deep Recurrent Neural Network model for acoustic modelling in Automatic Speech Recognition (ASR) that combines a Deep Neural Network with Time Convolution (TC), followed by a Bidirectional Long Short-Term Memory (BLSTM), and a final DNN.

Comparison of feedforward and recurrent neural network language models

A simple and efficient method to normalize language model probabilities across different vocabularies is proposed, and it is shown how to speed up training of recurrent neural networks by parallelization.

ImageNet classification with deep convolutional neural networks

A large, deep convolutional neural network was trained to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes and employed a recently developed regularization method called "dropout" that proved to be very effective.