Niklas Paulsson

The paper presents a comprehensive overview of existing data for the evaluation of spoken content processing in a multimedia framework for the French language. We focus on the ETAPE corpus, which will be made publicly available by ELDA at the end of 2012, after completion of the evaluation, and recall existing resources resulting from previous evaluation…
The purpose of this work was to evaluate whether the alcohol concentration in breath can be measured by a multisensor array, i.e. an electronic nose. The most important aspects were to clarify the technical advantages and disadvantages and whether the technique is at all suitable for forensic breath alcohol analysis. Even though the system set-up was far from…
NEMLAR (Network for Euro-Mediterranean LAnguage Resource and human language technology development and support) is a project supported by the EC, with partners from Europe and the Middle East, whose objective is to build a network of specialized partners to promote and support the development of Arabic Language Resources in the…
Broadcast news is a very rich source of Language Resources that has been exploited to develop and assess a large set of Human Language Technologies. Some examples include systems to: automatically produce text transcriptions of spoken data; identify the language of a text; translate a text from one language to another; identify topics in the news and…
This paper describes the collection and transcription of a large set of Arabic broadcast news speech data. A total of more than 2000 hours of data was transcribed. The transcription factor for the broadcast news data was reduced by using a method such as Quick Rich Transcription (QRTR) as well as by reducing the number of quality controls performed…
The goal of the LILA project was the collection of speech databases over cellular telephone networks for five languages in three Asian countries. Three languages were recorded in India: Hindi by first language speakers, Hindi by second language speakers and Indian English. Furthermore, Mandarin was recorded in China and Korean in South Korea. The databases…