• Publications
The BioScope corpus: biomedical texts annotated for uncertainty, negation and their scopes
TLDR
A corpus annotation project that has produced a freely available resource for research on handling negation and uncertainty in biomedical texts, which is also a good resource for the linguistic analysis of scientific and clinical texts.
The CoNLL-2010 Shared Task: Learning to Detect Hedges and their Scope in Natural Language Text
TLDR
This paper provides a general overview of the CoNLL-2010 Shared Task, including the annotation protocols of the training and evaluation datasets, the exact task definitions, the evaluation metrics employed and the overall results.
The BioScope corpus: annotation for negation, uncertainty and their scope in biomedical texts
TLDR
A corpus annotation project that has produced the BioScope corpus, a freely available resource for research on handling negation and uncertainty in biomedical texts, which consists of medical free texts, biological full papers and biological scientific abstracts.
Universal Dependencies 2.1
TLDR
The annotation scheme is based on (universal) Stanford dependencies, Google universal part-of-speech tags, and the Interset interlingua for morphosyntactic tagsets.
Overview of the SPMRL 2013 Shared Task: A Cross-Framework Evaluation of Parsing Morphologically Rich Languages
TLDR
This paper presents and analyzes parsing results obtained by the task participants, and provides an analysis and comparison of the parsers across languages and frameworks, reported for gold input as well as more realistic parsing scenarios.
Cross-Genre and Cross-Domain Detection of Semantic Uncertainty
TLDR
A unified subcategorization of semantic uncertainty is introduced, since different domain applications can apply different uncertainty categories, and domain adaptation for training the models offers an efficient solution for cross-domain and cross-genre semantic uncertainty recognition.
(Re)ranking Meets Morphosyntax: State-of-the-art Results from the SPMRL 2013 Shared Task
TLDR
This paper describes the IMS-SZEGED-CIS contribution to the SPMRL 2013 Shared Task, and makes significant improvements through high quality preprocessing and (re)ranking on top of strong baselines.
Automatic construction of rule-based ICD-9-CM coding systems
TLDR
The results demonstrate that hand-crafted systems – which proved to be successful in ICD-9-CM coding – can be reproduced by replacing several laborious steps in their construction with machine learning models.
Joint Morphological and Syntactic Analysis for Richly Inflected Languages
TLDR
Evaluation on five morphologically rich languages shows consistent improvements in both morphological and syntactic accuracy for joint prediction over a pipeline model, with further improvements thanks to lexical constraints and word clusters.
Universal Dependencies 2.0 – CoNLL 2017 Shared Task Development and Test Data
TLDR
This release contains the test data used in the CoNLL 2017 shared task on parsing Universal Dependencies, and complements the UD 2.0 release with 18 new parallel test sets and 4 test sets in surprise languages.