This paper presents specifications and requirements for the creation and validation of large lexica needed in Automatic Speech Recognition (ASR), Text-to-Speech (TTS) and statistical Speech-to-Speech Translation (SST) systems. The language resources are created and validated within the scope of the EU project LC-STAR (Lexica and Corpora for …
This paper describes the results of work carried out for ELRA during 2003-2004. It describes the methodology for validating written language resources (WLRs), specifically lexica, which was developed for ELRA and tested on a few resources in the ELRA catalogue. It discusses the importance of key issues in lexicon creation and validation such as the …
In this paper we present general strategies concerning Language Resources (LRs) – Written, Spoken and, recently, Multimodal – as developed within the ENABLER Thematic Network. LRs are a central component of the so-called "linguistic infrastructure" (the other key element being Evaluation), necessary for the development of any Human Language Technology …
This paper presents the experience and insights gained from developing and applying methodologies for quick quality checks (QQC) of third-party language resources, based on the existing methodologies for full validation, which were documented in validation manuals under contract for ELRA during 2003-2004. The types of resources are Spoken Language Resources …
This paper presents an overview of two pilot studies conducted at the University of Copenhagen during the spring of 2013. The studies are based on the use of the Let’sMT! platform, which is also presented in an overview. The purpose of the studies was to investigate whether experiments with the Let’sMT! platform would be adequate for the students as a …
LC-STAR II is a follow-up to the EU-funded project LC-STAR (Lexica and Corpora for Speech-to-Speech Translation Components, IST-2001-32216). LC-STAR II develops large lexica containing information for speech processing in ten languages, targeting especially automatic speech recognition and text-to-speech synthesis but also other applications like …