• Corpus ID: 59356101

Phonetically balanced Bangla speech corpus

Firoj Alam, Rabia Sultana, Shammur Absar, Mumit Khan
This paper describes the development of a phonetically balanced Bangla speech corpus. Building speech applications such as text-to-speech and speech recognition requires a phonetically balanced speech database in order to obtain natural output. We describe the text collection procedure, text normalization, grapheme-to-phoneme (G2P) conversion, and optimal text selection using a greedy selection method followed by hand pruning.
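As a rough illustration of the greedy selection step mentioned in the abstract, the sketch below repeatedly picks the sentence that adds the most not-yet-covered units. It is a minimal sketch under stated assumptions, not the authors' procedure: the choice of diphones as the coverage unit, the function names `diphones` and `greedy_select`, and the toy phone symbols are all hypothetical, and the input is assumed to have already passed through normalization and G2P conversion.

```python
from typing import Dict, List, Set, Tuple


def diphones(phones: List[str]) -> Set[Tuple[str, str]]:
    """Return the set of adjacent phone pairs in one transcribed sentence."""
    return set(zip(phones, phones[1:]))


def greedy_select(corpus: Dict[str, List[str]]) -> List[str]:
    """Greedily choose sentence ids until every diphone in the corpus is covered.

    `corpus` maps a sentence id to its phone sequence (assumed G2P output).
    Ties are broken by insertion order of the dictionary.
    """
    units = {sid: diphones(phones) for sid, phones in corpus.items()}
    uncovered: Set[Tuple[str, str]] = set().union(*units.values())
    selected: List[str] = []
    while uncovered:
        # Pick the sentence contributing the most still-uncovered diphones.
        best = max(units, key=lambda sid: len(units[sid] & uncovered))
        selected.append(best)
        uncovered -= units[best]
    return selected


# Toy example with made-up phone symbols (not real Bangla transcriptions).
toy = {
    "s1": ["a", "b", "c"],
    "s2": ["c", "d"],
    "s3": ["a", "b"],
}
chosen = greedy_select(toy)
```

In this toy run, `s1` is chosen first because it covers two diphones, then `s2` covers the remaining one, and `s3` is never needed. A hand-pruning pass like the one the abstract mentions would follow this automatic step.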

ASR for low-resourced languages: Building a phonetically balanced Romanian speech corpus

All the language processing steps taken in order to obtain a proper set of phrases are described, some important aspects regarding Romanian phonetics are discussed and the phrase selection mechanism is emphasized.

SUST TTS Corpus: A phonetically-balanced corpus for Bangla text-to-speech synthesis

A large-scale, phonetically-balanced speech corpus containing more than 30 hours of speech for Bangla speech synthesis, and a synthetic voice which is comparable to the state-of-the-art TTS systems is obtained.

Recent Advancement in Speech Recognition for Bangla: A Survey

This paper presents a brief survey of notable work on the development of Automatic Speech Recognition (ASR) systems for the Bangla language. It discusses information of available speech corpora

Development of IIITH Hindi-English Code Mixed Speech Database

The design and development of IIITH Hindi-English code mixed (IIITH-HE-CM) text and corresponding speech corpus are presented and a large vocabulary code-mixing speech recognition system is developed based on a deep neural network (DNN) architecture.

SUST Bangla Emotional Speech Corpus (SUBESCO): An audio-only emotional speech corpus for Bangla

Kappa statistics and intra-class correlation coefficient scores indicated a high level of inter-rater reliability and consistency in the evaluation of this corpus, which is the largest emotional speech corpus available for the Bangla language.

An evaluation of sentence selection methods on the different phone-sized units for constructing Indonesian speech corpus

Based on the experiments, LTM + Greedy by Suyanto produces a smaller number of sentences that contain a large number of phone units.

An evaluation of sentence selection methods on the different phone-sized units for constructing Indonesian speech corpus

Collecting phonetically balanced text corpus is an important step to develop automatic speech recognition and text-to-speech systems. A corpus should have a small number of sentences but contains all

TTS for Low Resource Languages: A Bangla Synthesizer

A process for streamlining the bootstrapping of TTS systems for under-resourced languages by using crowdsourcing to collect data from multiple ordinary speakers and employing statistical techniques to construct multi-speaker acoustic models using Long Short-Term Memory Recurrent Neural Network and Hidden Markov Model approaches is proposed.

KSU rich Arabic speech database

A rich and comprehensive Arabic speech database that can be used in many areas of Arabic and non-Arabic speech processing research, such as speaker and speech recognition, speech analysis, accent identification, and ethnic group or nationality recognition.

Acoustic analysis of Bangla vowel inventory

This paper describes the acoustic characteristics of Bangla vowels, obtained by analyzing the recordings of male and female voices. First, the duration of each phoneme was identified by averaging

Methods for optimal text selection

This work addresses how one can take advantage of control over the content of the speech data base, by discussing a number of variants of “greedy” text selection methods and showing their application in a variety of examples.

The CMU Arctic speech databases

The CMU Arctic databases designed for the purpose of speech synthesis research, which consist of approximately 1200 phonetically balanced English utterances, are distributed as free software, without restriction on commercial or non-commercial use.

Design of an optimal continuous speech database for text-to-speech synthesis considered as a set covering problem

The optimization of such a database according to phonetic criteria is presented, where a large corpus of texts is assembled, phonetized automatically and condensed to retain only 10 tokens of the most frequent triphonemes.

The Greedy Algorithm and its Application to the Construction of a Continuous Speech Database

The performance of the greedy and spitting methods is comparable; nevertheless, greedy is slightly better and, above all, less time-consuming than its inverse, while the pair exchange method is more time-consuming.

Handbook of standards and resources for spoken language systems

This handbook provides easy access to current practice and requirements in the main spoken language technologies.

Building voices in the Festival speech synthesis system (2000)