Phonetically balanced Bangla speech corpus
@inproceedings{Alam2011PhoneticallyBB, title={Phonetically balanced Bangla speech corpus}, author={Firoj Alam and Rabia Sultana and Shammur Absar and Mumit Khan}, year={2011} }
This paper describes the development of a phonetically balanced Bangla speech corpus. Construction of speech applications such as text-to-speech and speech recognition requires a phonetically balanced speech database in order to obtain natural output. Here we describe the text collection procedure, text normalization, G2P (grapheme-to-phoneme) conversion, and optimal text selection using a greedy selection method and hand pruning.
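The greedy selection step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `greedy_select`, the choice of diphones (adjacent phone pairs) as the coverage unit, and the input format (each sentence mapped to a phone sequence, as a G2P step might produce) are all assumptions made for illustration, and the hand-pruning stage is not modelled.

```python
from typing import Dict, List, Set


def greedy_select(sentences: Dict[str, List[str]], max_sentences: int) -> List[str]:
    """Pick sentences that add the most not-yet-covered phone units.

    `sentences` maps each sentence to its phone sequence (hypothetical G2P
    output). Coverage units are diphones, which is an assumption of this sketch.
    """
    # Precompute the set of diphones (adjacent phone pairs) for each sentence.
    units: Dict[str, Set[tuple]] = {
        text: set(zip(phones, phones[1:])) for text, phones in sentences.items()
    }
    covered: Set[tuple] = set()
    selected: List[str] = []
    while len(selected) < max_sentences:
        # Choose the unselected sentence contributing the most new units.
        best = max(
            (s for s in units if s not in selected),
            key=lambda s: len(units[s] - covered),
            default=None,
        )
        if best is None or not (units[best] - covered):
            break  # no remaining sentence improves coverage
        selected.append(best)
        covered |= units[best]
    return selected
```

Each pass rescans all remaining sentences, which is simple but quadratic in corpus size; a large text collection would likely need an index from units to sentences instead.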
15 Citations
ASR for low-resourced languages: Building a phonetically balanced Romanian speech corpus
- Linguistics
- 2012 Proceedings of the 20th European Signal Processing Conference (EUSIPCO)
- 2012
All the language processing steps taken in order to obtain a proper set of phrases are described, some important aspects regarding Romanian phonetics are discussed and the phrase selection mechanism is emphasized.
SUST TTS Corpus: A phonetically-balanced corpus for Bangla text-to-speech synthesis
- Computer Science, Linguistics
- Acoustical Science and Technology
- 2021
A large-scale, phonetically-balanced speech corpus containing more than 30 hours of speech for Bangla speech synthesis, and a synthetic voice which is comparable to the state-of-the-art TTS systems is obtained.
Recent Advancement in Speech Recognition for Bangla: A Survey
- Computer Science
- 2021
This paper presents a brief study of notable work on the development of Automatic Speech Recognition (ASR) systems for the Bangla language. It discusses information on available speech corpora…
Development of IIITH Hindi-English Code Mixed Speech Database
- Computer Science
- SLTU
- 2018
The design and development of IIITH Hindi-English code mixed (IIITH-HE-CM) text and corresponding speech corpus are presented and a large vocabulary code-mixing speech recognition system is developed based on a deep neural network (DNN) architecture.
SUST Bangla Emotional Speech Corpus (SUBESCO): An audio-only emotional speech corpus for Bangla
- Psychology
- PLoS ONE
- 2021
Kappa statistics and intra-class correlation coefficient scores indicated a high level of inter-rater reliability and consistency in the evaluation of this corpus, which is the largest emotional speech corpus available for the Bangla language.
An evaluation of sentence selection methods on the different phone-sized units for constructing Indonesian speech corpus
- Education
- Int. J. Speech Technol.
- 2020
Based on the experiments, LTM + Greedy by Suyanto produces a smaller number of sentences that contain a large number of phone units.
An evaluation of sentence selection methods on the different phone-sized units for constructing Indonesian speech corpus
- Education
- International Journal of Speech Technology
- 2019
Collecting a phonetically balanced text corpus is an important step in developing automatic speech recognition and text-to-speech systems. A corpus should have a small number of sentences but contain all…
TTS for Low Resource Languages: A Bangla Synthesizer
- Computer Science
- LREC
- 2016
A process for streamlining the bootstrapping of TTS systems for under-resourced languages by using crowdsourcing to collect data from multiple ordinary speakers and employing statistical techniques to construct multi-speaker acoustic models using Long Short-Term Memory Recurrent Neural Network and Hidden Markov Model approaches is proposed.
KSU rich Arabic speech database
- Computer Science
- 2013
A rich and comprehensive Arabic speech database that can be used in many areas of Arabic and non-Arabic speech processing research, such as speaker and speech recognition, speech analysis, accent identification, and ethnic group or nationality recognition.
Building Statistical Parametric Multi-speaker Synthesis for Bangladeshi Bangla
- Computer Science
- SLTU
- 2016
7 References
Acoustic analysis of Bangla vowel inventory
- Linguistics
- 2008
This paper describes the acoustic characteristics of Bangla vowels, obtained by analyzing the recordings of male and female voices. First, the duration of each phoneme was identified by averaging…
Methods for optimal text selection
- Computer Science
- EUROSPEECH
- 1997
This work addresses how one can take advantage of control over the content of the speech data base, by discussing a number of variants of “greedy” text selection methods and showing their application in a variety of examples.
The CMU Arctic speech databases
- Computer Science
- SSW
- 2004
The CMU Arctic databases designed for the purpose of speech synthesis research, which consist of approximately 1200 phonetically balanced English utterances, are distributed as free software, without restriction on commercial or non-commercial use.
Design of an optimal continuous speech database for text-to-speech synthesis considered as a set covering problem
- Computer Science, Linguistics
- INTERSPEECH
- 2001
The optimization of such a database according to phonetic criteria is presented, where a large corpus of texts is assembled, phonetized automatically and condensed to retain only 10 tokens of the most frequent triphonemes.
The Greedy Algorithm and its Application to the Construction of a Continuous Speech Database
- Computer Science
- LREC
- 2002
The performance of the greedy and spitting methods is comparable; nevertheless, greedy is slightly better and, above all, less time-consuming than its inverse, while the pair exchange method is more time-consuming.
Handbook of standards and resources for spoken language systems
- Linguistics
- 1997
This handbook provides easy access to current practice and requirements in the main spoken language technologies.
Building voices in the Festival speech synthesis system
- 2000