Louise Deléger

Learn More
We translated an existing English negation lexicon (NegEx) to Swedish, French, and German and compared the lexicon on corpora from each language. We observed Zipf's law for all languages, i.e., a few phrases occur a large number of times, and a large number of phrases occur fewer times. Negation triggers "no" and "not" were common for all languages;(More)
We present our participation in Task 1a of the 2013 CLEFeHEALTH Challenge, whose goal was the identification of disorder named entities from electronic medical records. We developed a supervised CRF model that based on a rich set of features learns to predict disorder named entities. The CRF system uses external knowledge from specialized biomedical(More)
PURPOSE Medical language, as many technical languages, is rich with morphologically complex words, many of which take their roots in Greek and Latin--in which case they are called neoclassical compounds. Morphosemantic analysis can help generate definitions of such words. The similarity of structure of those compounds in several European languages has also(More)
Whereas multilingual comparable corpora have been used to identify translations of words or terms, monolingual corpora can help identify paraphrases. The present work addresses paraphrases found between two different discourse types: specialized and lay texts. We therefore built comparable corpora of specialized and lay texts in order to detect equivalent(More)
Nowadays a large amount of health information is available to the public, but medical language is often difficult for lay people to understand. Developing means to make medical information more comprehensible is therefore a real need. In this regard, a useful resource would be a corpus of specialized and lay paraphrases. To this end we built comparable(More)
This paper presents the Bacteria Biotope task of the BioNLP Shared Task 2016, which follows the previous 2013 and 2011 editions. The task focuses on the extraction of the locations (biotopes and geographical places) of bacteria from PubMed abstracts and the characterization of bacteria and their associated habitats with respect to reference knowledge(More)
In previous work, we presented a preliminary study to identify paraphrases between technical and lay discourse types from medical corpora dedicated to the French language. In this paper, we test the hypothesis that the same kinds of paraphrases as for French can be detected between English technical and lay discourse types and report the adaptation of our(More)
OBJECTIVE (1) To evaluate a state-of-the-art natural language processing (NLP)-based approach to automatically de-identify a large set of diverse clinical notes. (2) To measure the impact of de-identification on the performance of information extraction algorithms on the de-identified documents. MATERIAL AND METHODS A cross-sectional study that included(More)
OBJECTIVE While essential for patient care, information related to medication is often written as free text in clinical records and, therefore, difficult to use in computerized systems. This paper describes an approach to automatically extract medication information from clinical records, which was developed to participate in the i2b2 2009 challenge, as(More)