Thomas Eckart

The Leipzig Corpora Collection offers free online access to 136 monolingual dictionaries enriched with statistical information. In this paper we describe current advances of the project in collecting and processing text data automatically for a large number of languages. Our main interest lies in languages of “low density”, where only little text data exists…
The quality of statistical measurements on corpora is strongly related to a strict definition of the measuring process and to corpus quality. In the case of multiple result inspections, an exact measurement of previously specified parameters ensures compatibility of the different measurements performed by different researchers on possibly different objects.…
The effects of total parenteral nutrition (TPN) versus enteral nutrition (TEN) were studied in 34 patients following major neurosurgery. Measurements were made of resting energy expenditure (REE), urea production rate (UPR), visceral proteins, parameters of liver and pancreas function, as well as gastrointestinal absorption. To predict nutritional status, …
Large textual resources are the basis for a variety of applications in the field of corpus linguistics. For most languages spoken by large user groups, a comprehensive set of these corpora is constantly generated and exploited. Unfortunately, for modern Indian languages there are still shortcomings that interfere with systematic text analysis. This paper…
Providing both user-friendly and machine-readable interfaces to digital resources is one of the key tasks of highly integrated research infrastructures like CLARIN. The presented implementation of the Canonical Text Service Protocol (CTS) covers many of the associated problems, like dealing with varying levels of text granularity, persistent identification…
Since 2011 the comprehensive, electronically available sources of the Leipzig Corpora Collection have been used consistently for the compilation of high-quality word lists. The underlying corpora include newspaper texts, Wikipedia articles and other randomly collected Web texts. For many of the languages featured in this collection, it is the first…
From 2004 to 2016 the Leipzig Linguistic Services (LLS) existed as a SOAP-based cyberinfrastructure of atomic micro-services for the Wortschatz project, which covered different-sized textual corpora in more than 230 languages. The LLS were developed in 2004 and went live in 2005 in order to provide a webservice-based API to these corpus databases. In 2006, …