François-Michel Lang

MetaMap is a widely available program providing access to the concepts in the Unified Medical Language System (UMLS) Metathesaurus from biomedical text. This study reports on MetaMap's evolution over more than a decade, concentrating on those features arising out of the research needs of the biomedical informatics community both within and outside of the(More)
We describe a natural language processing system (Enhanced SemRep) to identify core assertions on pharmacogenomics in Medline citations. Extracted information is represented as semantic predications covering a range of relations relevant to this domain. The specific relations addressed by the system provide greater precision than that achievable with(More)
The NLM LHC team approached the cohort selection task of the 2011 Medical Records Track as a question answering problem. We developed 60 training topics and then manually converted those topics to question frames. We started with the evidence-based medicine well-formed question frame and expanded it to explicitly capture temporal and causal relations. We(More)
Genomic functional information is valuable for biomedical research. However, such information frequently needs to be extracted from the scientific literature and structured in order to be exploited by automatic systems. Natural language processing is increasingly used for this purpose although it inherently involves errors. A postprocessing strategy that(More)
We propose a methodology using underspecified semantic interpretation to process comparative constructions in MEDLINE citations, concentrating on two structures that are prevalent in the research literature reporting on clinical trials for drug therapies. The method exploits an existing semantic processor, SemRep, which constructs predications based on the(More)
This paper presents SPQR (Selectional Pattern Queries and Responses), a module of the PUNDIT text-processing system designed to facilitate the acquisition of domain-specific semantic information, and to improve the accuracy and efficiency of the parser. SPQR operates by interactively and incrementally collecting information about the semantic(More)
OBJECTIVE: The authors used the i2b2 Medication Extraction Challenge to evaluate their entity extraction methods, contribute to the generation of a publicly available collection of annotated clinical notes, and start developing methods for ontology-based reasoning using structured information generated from the unstructured clinical narrative. DESIGN(More)
The NLM team used the relevance judgments for the 2011 Medical Records track (which focused on finding patients eligible for clinical studies) to analyze the components of our 2011 systems. The analysis showed that the components provided moderate improvements over the baseline (established by submitting 2011 topics ‘as is’ to Lucene) for some topics and did(More)