This paper proposes a classifier based on a hidden Markov model that can be used for part-of-speech tagging of Slavic languages such as Slovak, Czech, or Polish. These languages are highly inflectional and morphologically rich, and they have very large vocabularies. The probability matrices of the classical hidden Markov model are ...
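As a point of reference for the abstract above, the following is a minimal sketch of Viterbi decoding with a classical first-order hidden Markov model tagger. It is not the classifier proposed in the paper, whose details are cut off here. The function name `viterbi`, the `floor` parameter, the tag set, and all probabilities are illustrative assumptions.

```python
import math

def viterbi(words, tags, start_p, trans_p, emit_p, floor=1e-12):
    """Return the most probable tag sequence for `words` under a first-order HMM.

    Probabilities missing from the tables are replaced by `floor`
    (an assumed, crude stand-in for proper smoothing).
    """
    def lp(p):
        return math.log(max(p, floor))

    if not words:
        return []

    # best[i][tag] = (log-probability of the best path ending in `tag`, previous tag)
    best = [{t: (lp(start_p.get(t, 0.0)) + lp(emit_p[t].get(words[0], 0.0)), None)
             for t in tags}]
    for w in words[1:]:
        column = {}
        for t in tags:
            # Pick the predecessor tag that maximizes the path score into `t`.
            prev, score = max(
                ((p, best[-1][p][0] + lp(trans_p[p].get(t, 0.0))) for p in tags),
                key=lambda pair: pair[1],
            )
            column[t] = (score + lp(emit_p[t].get(w, 0.0)), prev)
        best.append(column)

    # Backtrace from the best final tag.
    path = [max(tags, key=lambda t: best[-1][t][0])]
    for column in reversed(best[1:]):
        path.append(column[path[-1]][1])
    path.reverse()
    return path

# Toy model with made-up numbers (hypothetical, for illustration only).
tags = ["NOUN", "VERB"]
start_p = {"NOUN": 0.8, "VERB": 0.2}
trans_p = {"NOUN": {"NOUN": 0.3, "VERB": 0.7},
           "VERB": {"NOUN": 0.8, "VERB": 0.2}}
emit_p = {"NOUN": {"pes": 0.6, "psa": 0.4},
          "VERB": {"hryzie": 1.0}}

result = viterbi(["pes", "hryzie", "psa"], tags, start_p, trans_p, emit_p)
print(result)
```

With a large, highly inflected vocabulary, the emission table `emit_p` becomes very sparse, which is the kind of difficulty the abstract points to for the classical model.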
The presented corpus is a first attempt to create a representative sample of contemporary Slovak from various domains that supports easy searching and automated processing. This first version of the corpus contains words, automatic morphological and named-entity annotations, and transcriptions of abbreviations and numerals. An integral part of ...
This paper describes recent advances in the statistical modeling of the Slovak language for transcription of dictated, semi-spontaneous, and spontaneous conversational speech, such as judicial readings, broadcast news, TV and radio shows, parliament proceedings, educational talks and lectures, or interactive conversations. During the last months, ...
Speech technologies have the potential to simplify human-machine interaction as well as communication between people. The use of speech technology applications is growing continuously. Each speech recognition system, which stands at the heart of every speech application, besides its algorithmic complexity, is strongly language ...
This paper describes the design, development and evaluation of the Slovak dictation system for the judicial domain. The speech is recorded using a close-talk microphone and the dictation system is used for online or offline automatic transcription. The system provides an automatic dictation tool for Slovak people working in the judicial domain, and can in ...
This paper describes the design, development and evaluation of the Slovak dictation system for the judicial domain. The speech is recorded using a close-talk microphone and the dictation system is used for on-line or off-line automatic transcription. The system provides an automatic dictation tool in Slovak for the employees of the Ministry of Justice of ...
The robustness of n-gram language models depends on the quality of the text data on which they were trained. Text corpora collected from various sources, such as web pages or electronic documents, cover many possible topics. In order to build efficient and robust domain-specific language models, it is necessary to separate ...
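To illustrate why the domain of the training text matters, here is a minimal sketch of a bigram language model with add-one smoothing, evaluated by perplexity on in-domain and out-of-domain text. It is not the method of the paper, whose details are cut off here. The function names `train_bigram` and `perplexity`, the `<unk>` handling, and the toy sentences are assumptions made for illustration.

```python
import math
from collections import Counter

def train_bigram(sentences):
    """Collect history and bigram counts from whitespace-tokenized sentences."""
    histories, bigrams = Counter(), Counter()
    for s in sentences:
        toks = ["<s>"] + s.split() + ["</s>"]
        histories.update(toks[:-1])          # every token that precedes another
        bigrams.update(zip(toks, toks[1:]))
    # Predictable vocabulary: training words plus end marker and unknown token.
    vocab = {w for s in sentences for w in s.split()} | {"</s>", "<unk>"}
    return histories, bigrams, vocab

def perplexity(model, sentences):
    """Perplexity of `sentences` under an add-one smoothed bigram model."""
    histories, bigrams, vocab = model
    v = len(vocab)
    log_sum, n = 0.0, 0
    for s in sentences:
        words = [w if w in vocab else "<unk>" for w in s.split()]
        toks = ["<s>"] + words + ["</s>"]
        for h, w in zip(toks, toks[1:]):
            p = (bigrams[(h, w)] + 1) / (histories[h] + v)
            log_sum += math.log(p)
            n += 1
    return math.exp(-log_sum / n)

# Toy "judicial" training text (hypothetical data).
model = train_bigram(["the court ruled", "the court adjourned"])
in_domain = perplexity(model, ["the court ruled"])
out_of_domain = perplexity(model, ["the team scored"])
print(in_domain, out_of_domain)
```

The model assigns lower perplexity to text that resembles its training data, which is why mixing unrelated topics into a training corpus can weaken a domain-specific model.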
The Slovak NLP Demo is intended to provide an overview of state-of-the-art NLP tools, in which each processing step can be tried on arbitrary text, results can be reviewed, and feedback can be sent to the authors. The demo is accessible on a website, where arbitrary text can be entered using a form. Processing tools can be selected and the processed text will be ...