Data Set Used
The automatic extraction of chemical information from text requires the recognition of chemical entity mentions as one of its key steps. When developing supervised named entity recognition (NER) systems, the availability of a large, manually annotated text corpus is desirable. Furthermore, large corpora permit the robust evaluation and comparison of…
This paper describes the creation of a gold standard for chemistry-disease relations in patent texts. We start with an automated annotation of named entities from the chemistry (e.g. "propranolol") and disease (e.g. "hypertension") domains, as well as from related domains such as methods and substances. After that, domain-relevant relations between these…
This paper describes OCMiner, a high-performance semantic text processing system for large document collections of scientific publications, and its performance regarding chemical named entity recognition in patent texts within the BioCreative V CHEMDNER-Patents challenge which was set up for this purpose. OCMiner permits adjusting the quality of annotation…
We present OCMiner, a high-performance text processing system for large document collections of scientific publications. Several linguistic options allow adjusting the quality of annotation results which can be specialized and fine-tuned for the recognition of Life Science terms. Recognized terms are mapped to semantic concepts which are ontologically…
Introduction
• natural language discourses are structured: relations between utterances at various levels
• basic distinction between coherence and cohesion
  coherence: rhetorical relations between text segments
  cohesion: anaphoric relations between discourse entities
1 Bridging Anaphora
• in a bridging anaphor, an entity introduced in a discourse stands in…
the resolution of the reference of pronominal anaphors, definite descriptions, and bridging anaphors.