This paper addresses some of the issues encountered in the course of building a written language resource (called 'Peykare') for contemporary Persian. After defining five linguistic varieties and 24 different registers based on these linguistic varieties, we collected the texts for Peykare to do a linguistic analysis, including cross-register differences.
Persian is an Indo-European language that has borrowed its script from Arabic, a member of the Semitic language family. Since the Persian and Arabic scripts are so similar, problems arise when we want to process an electronic text. In this paper, some of the common problems encountered in practice while developing a corpus for Persian are discussed. The sources…
Generating pronunciation variants of words is an important applied topic in speech research and is used extensively in automatic speech segmentation and recognition systems. Decision trees are widely used to model pronunciation variants of words and sub-word units. In the case of word units and very large vocabulary, to train…
Persian clitic groups differ from words. Most importantly, a pitch accent (L+)H* is associated with the word-final (i.e. base-final) syllable of clitic groups, but with the word-final syllable of words, meaning that clitics remain outside the domain of the word. The pitch accent marks the stress, but we found no independent durational or spectral…
In this research, a Text-To-Speech system for the Farsi language has been implemented. The proposed synthesizer concatenates Farsi syllables in a TD-PSOLA manner. This paper concentrates mainly on investigating pitch variations in Farsi sentences and presenting some novel rules for modeling these variations. Based on the location of stressed…
Prosody is a suprasegmental feature of speech that has an undeniable role in human speech perception and generation. However, employing prosodic features in the CSR process is mostly difficult, and we should not expect large accuracy gains from using them. The main problem arises from the high dependency of prosodic patterns on factors like speakers,…
This study examines the effect of non-sentential context prosody pattern on lexical activation in Persian. For this purpose, a questionnaire including target and non-target words is used. The target words are homographs with two possible stress patterns belonging to different syntactic categories. Participants are asked to read the words aloud and note…
Morphological and syntactic annotation of multi-token units confronts several problems due to the concatenative nature of Persian script and its resulting orthographic variation. In the present paper, by analyzing the different collocation types of the tokens, the compositional, non-compositional and semi-compositional constructions are described and then, in…