This paper presents a two-level lexical stress assignment model for out-of-vocabulary Slovenian words, used in our text-to-speech system. First, each vowel (and the consonant 'r') is classified as stressed or unstressed, and a type of lexical stress is assigned to every stressed vowel (and consonant 'r'). We applied a machine-learning technique…
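The per-vowel step described in the abstract above can be illustrated with a minimal sketch. It is not the authors' model: the two decision functions below are toy stand-ins for the learned classifiers, the label `"type-A"` is a hypothetical placeholder for a stress type, and treating every `r` as a candidate stress carrier is a simplifying assumption.

```python
from typing import Callable, List, Optional, Tuple

# Candidate stress carriers: the vowels plus 'r' (simplifying assumption:
# every 'r' is treated as a candidate, regardless of context).
NUCLEI = set("aeiour")


def candidate_positions(word: str) -> List[int]:
    """Return the indices of all candidate stress carriers in the word."""
    return [i for i, ch in enumerate(word.lower()) if ch in NUCLEI]


def label_nuclei(
    word: str,
    is_stressed: Callable[[str, int], bool],
    stress_type: Callable[[str, int], str],
) -> List[Tuple[int, Optional[str]]]:
    """Label each candidate with a stress type, or None if it is unstressed."""
    labels: List[Tuple[int, Optional[str]]] = []
    for pos in candidate_positions(word):
        if is_stressed(word, pos):
            labels.append((pos, stress_type(word, pos)))
        else:
            labels.append((pos, None))
    return labels


# Toy stand-ins for the learned classifiers (hypothetical, illustration only).
def toy_is_stressed(word: str, pos: int) -> bool:
    # Pretend the first candidate in the word always carries the stress.
    return pos == candidate_positions(word)[0]


def toy_stress_type(word: str, pos: int) -> str:
    # Placeholder label; the real stress-type inventory is not given here.
    return "type-A"


if __name__ == "__main__":
    print(label_nuclei("miza", toy_is_stressed, toy_stress_type))
```

In a real system the two callables would be replaced by trained models operating on features of the vowel and its context; the abstract does not say which features are used.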
We present a parsing algorithm that detects intra-clausal coordinations. The algorithm is based on machine-learning techniques and helps to decompose a large parsing problem into several smaller ones. Its performance was tested on the Slovene Dependency Treebank. Used together with the maximum spanning tree parsing algorithm it improved parsing…
The impact of clause and intraclausal coordination detection on dependency parsing of Slovene is examined. New methods based on machine learning and heuristic rules are proposed for clause and intraclausal coordination detection. They were included in a new dependency parsing algorithm, PACID. For evaluation, the Slovene Dependency Treebank was used. At…
The new Slovenian text-to-speech engine is a modular system consisting of four independent modules (text normalization, grapheme-to-phoneme conversion, prosody generation and segmental concatenation), which are pipelined together. Each module is responsible for one part of the problem of converting text into speech. That enables easy improvements of…
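The four-module pipeline described above can be sketched as a chain of independent stages, each consuming the previous stage's output. This is only an illustration of the architecture: the stage names follow the abstract, but every function body, the data shapes, and the fixed 100 ms per symbol are hypothetical placeholders, not the engine's behavior.

```python
from typing import Any, Callable, Dict, List

Stage = Callable[[Any], Any]


def build_pipeline(stages: List[Stage]) -> Stage:
    """Chain the stages so each one receives the previous stage's output."""
    def run(data: Any) -> Any:
        for stage in stages:
            data = stage(data)
        return data
    return run


# Placeholder stages (hypothetical bodies, for illustration only).
def text_normalization(text: str) -> List[str]:
    # Stand-in: lowercase the text and split it on whitespace.
    return text.lower().split()


def grapheme_to_phoneme(tokens: List[str]) -> List[List[str]]:
    # Stand-in: pretend each letter maps to exactly one phoneme.
    return [list(token) for token in tokens]


def prosody_generation(words: List[List[str]]) -> List[Dict[str, Any]]:
    # Stand-in: give every symbol a fixed duration of 100 ms.
    return [{"phonemes": w, "duration_ms": 100 * len(w)} for w in words]


def segmental_concatenation(units: List[Dict[str, Any]]) -> int:
    # Stand-in: return the total duration instead of an audio waveform.
    return sum(unit["duration_ms"] for unit in units)


tts = build_pipeline([
    text_normalization,
    grapheme_to_phoneme,
    prosody_generation,
    segmental_concatenation,
])

if __name__ == "__main__":
    print(tts("Dober dan"))
```

Because the stages share only their input and output types, any one of them can be replaced without touching the others, which is the property the abstract attributes to the modular design.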
This paper presents a text-to-speech (TTS) system capable of synthesizing continuous Slovenian speech. The system is based on the concatenation of basic speech units, diphones, using the TD-PSOLA technique improved with a variable-length linear interpolation process. Input text is processed by a series of modules, which are described in detail. A special…
We present a new dependency parsing algorithm based on the decomposition of large sentences into smaller units such as clauses and intraclausal coordinations. For the identification of these units, new methods combining machine learning techniques and heuristic rules were developed. The algorithm was evaluated on the Slovene dependency treebank text corpus.
  • T. Sef
  • Proceedings of 2004 International Symposium on…
  • 2004
One of the characteristics of the Slovenian language is that lexical stress can fall on almost any syllable of a word, which makes pronunciation very difficult. Some pronunciation rules exist, but their precision is not sufficient for efficient speech synthesis. Therefore a machine-learning technique (decision trees or boosted…