Balázs Tarján

Learn More
In this paper, we re-evaluate morph (data-driven subword) and word lexical models used for large vocabulary continuous speech recognition of agglutinative languages. Since such speech recognition systems are applied mostly for information retrieval purposes we use evaluation metrics accordingly. Standard 3-gram language model with one million words(More)
Efficient large vocabulary continuous speech recognition of morphologically rich languages is a big challenge due to the rapid vocabulary growth. To improve the results various subword units called as morphs are applied as basic language elements. The improvements over the word baseline, however, are changing from negative to error rate halving across(More)
Morph-based language modeling has been efficiently applied in improving the accuracy of Large-Vocabulary Continuous Speech Recognition (LVCSR) systems especially in morphologically rich languages. However, the rate of improvements varies greatly and the underlying principles have been only superficially studied. Having a method that can predict the expected(More)
Under real-life conditions several factors may be present that make the automatic recognition of speech difficult. The most obvious examples are background noise, peculiarities of the speaker’s voice, sloppy articulation and strong emotional load. These all pose difficult problems for robust speech recognition, but it is not exactly clear how much each(More)
The improvement achieved by changing the basis of speech recognition from words to morphs (various sub-word units) varies greatly across tasks and languages. We make an attempt to explore the source of this variability by the investigation of three LVCSR tasks corresponding to three speech genres of a highly agglutinative language. Novel, press conference(More)
This paper introduces our work and results related to a multiple language continuous speech recognition task. The aim was to design a system that introduces tolerable amount of recognition errors for point of interest words in voice navigational queries even in the presence of real-life traffic noise. Additional challenges were that no task-specific(More)
In this paper, the application of LVCSR (Large Vocabulary Continuous Speech Recognition) technology is investigated for real-time, resource-limited broadcast close captioning. The work focuses on transcribing live broadcast conversation speech to make such programs accessible to deaf viewers. Due to computational limitations, real time factor (RTF) and(More)
Latin had served as an official language across Europe from the Roman Empire until the 19th century. As a result, vast amount of Latin language historical documents (charters, account books) survived from the Middle Ages, waiting for recovery. In the digitization process, tremendous human efforts are needed for the transliteration of textual content, as the(More)