The automatic extraction of chemical information from text requires the recognition of chemical entity mentions as one of its key steps. When developing supervised named entity recognition (NER) systems, the availability of a large, manually annotated text corpus is desirable. Furthermore, large corpora permit the robust evaluation and comparison of…
Ontology-based semantic text analysis methods allow the automatic extraction of knowledge relationships and data from text documents. In this review, we have applied these technologies to the systematic analysis of pharmaceutical patents. Hierarchical concepts from the knowledge domains of chemical compounds, diseases and proteins were used to annotate…
This paper describes the creation of a gold standard for chemistry-disease relations in patent texts. We start with an automated annotation of named entities from the domains of chemistry (e.g. "propranolol") and diseases (e.g. "hypertension"), as well as from related domains such as methods and substances. After that, domain-relevant relations between these…
the resolution of reference for pronominal anaphora, definite descriptions, and bridging anaphora.
We present OCMiner, a high-performance text processing system for large document collections of scientific publications. Several linguistic options allow adjusting the quality of annotation results which can be specialized and fine-tuned for the recognition of Life Science terms. Recognized terms are mapped to semantic concepts which are ontologically…
This paper describes OCMiner, a high-performance semantic text processing system for large document collections of scientific publications, and its performance regarding chemical named entity recognition in patent texts within the BioCreative V CHEMDNER-Patents challenge which was set up for this purpose. OCMiner permits adjusting the quality of annotation…