Daniel Tihelka

This paper addresses automatic segmentation for Czech concatenative speech synthesis. A statistical approach to speech segmentation using hidden Markov models (HMMs) is applied in the baseline system. Several improvements to this system are then proposed to obtain more accurate segmentation results. These enhancements mainly…
This paper gives a survey of the current state of ARTIC – the modern Czech concatenative corpus-based text-to-speech system. All stages of the system design are described in the paper, including the acoustic unit inventory building process, text processing and speech production issues. Two versions of the system are presented: the single unit instance…
This paper presents recent improvements to ARTIC – the modern Czech corpus-based text-to-speech system. Because a statistical approach (using hidden Markov models) was applied to create the acoustic unit inventory, several improvements concerning acoustic unit modelling, clustering and segmentation have been made to increase the intelligibility of the…
This paper deals with automatic segmentation for Czech concatenative speech synthesis. A statistical approach to speech segmentation using hidden Markov models (HMMs) is applied in the baseline system [1]. Several experiments that concern various issues in the process of building the segmentation system, such as speech parameterization or HMM…
The paper describes the optimisation of the Viterbi search used in unit selection TTS, since performance still suffers with the large speech corpus necessary to achieve a high level of naturalness. To improve the search speed, a combination of sophisticated stopping schemes and pruning thresholds is incorporated into the baseline search. The optimised search…
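As a rough illustration of the kind of search this abstract refers to, the following is a minimal sketch of Viterbi unit selection with a simple pruning threshold. The function name, the two cost callbacks, and the `beam_width` parameter are assumptions made for this example. The paper's actual stopping schemes and thresholds are not given in the truncated abstract and are not reproduced here.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

U = TypeVar("U")  # type of a candidate unit (hypothetical; any object works)


def viterbi_unit_selection(
    candidates: Sequence[Sequence[U]],
    target_cost: Callable[[int, U], float],
    concat_cost: Callable[[U, U], float],
    beam_width: float = float("inf"),
) -> Tuple[List[U], float]:
    """Pick one unit per position, minimising target + concatenation cost.

    beam_width is an assumed pruning threshold: after each position,
    hypotheses costing more than (best + beam_width) are discarded.
    """
    if not candidates or any(len(c) == 0 for c in candidates):
        raise ValueError("every position needs at least one candidate")

    def prune(column):
        best = min(cost for cost, _, _ in column)
        return [entry for entry in column if entry[0] <= best + beam_width]

    # Each entry is (accumulated cost, unit, index of predecessor in previous column).
    columns = [prune([(target_cost(0, u), u, -1) for u in candidates[0]])]

    for t in range(1, len(candidates)):
        prev = columns[-1]
        column = []
        for u in candidates[t]:
            # Cheapest surviving predecessor for this candidate.
            join_cost, back = min(
                (p_cost + concat_cost(p_unit, u), i)
                for i, (p_cost, p_unit, _) in enumerate(prev)
            )
            column.append((join_cost + target_cost(t, u), u, back))
        columns.append(prune(column))

    # Backtrack from the cheapest final hypothesis.
    last = columns[-1]
    idx = min(range(len(last)), key=lambda i: last[i][0])
    total = last[idx][0]
    path: List[U] = []
    for column in reversed(columns):
        _, unit, back = column[idx]
        path.append(unit)
        idx = back
    path.reverse()
    return path, total
```

With the default `beam_width` of infinity nothing is pruned and the search is exact. A finite threshold reduces the number of predecessor comparisons per position, at the risk of discarding the hypothesis that would have turned out cheapest overall.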
The paper deals with the process of designing a phonetically and prosodically rich speech corpus for unit selection speech synthesis. Attention is given mainly to the recording and verification stages of the process. To ensure the highest possible quality and consistency of the recordings, a special recording environment consisting of a recording…
The present paper focuses on the current handling of target features in the unit selection approach, which essentially requires huge corpora. The paper outlines possible solutions based on measuring (dis)similarity among prosodic patterns. As a first step in this research, the feasibility of (dis)similarity estimation is examined on several intuitively chosen…