The BioScope corpus: biomedical texts annotated for uncertainty, negation and their scopes
- V. Vincze, György Szarvas, Richárd Farkas, G. Móra, J. Csirik
- Linguistics, BMC Bioinformatics
- 19 November 2008
A corpus annotation project that has produced a freely available resource for research on handling negation and uncertainty in biomedical texts, which is also a good resource for the linguistic analysis of scientific and clinical texts.
The CoNLL-2010 Shared Task: Learning to Detect Hedges and their Scope in Natural Language Text
- Richárd Farkas, V. Vincze, G. Móra, J. Csirik, György Szarvas
- Computer Science, CoNLL Shared Task
- 15 July 2010
A general overview of the CoNLL-2010 Shared Task, including the annotation protocols of the training and evaluation datasets, the exact task definitions, the evaluation metrics employed and the overall results is provided.
The BioScope corpus: annotation for negation, uncertainty and their scope in biomedical texts
- György Szarvas, V. Vincze, Richárd Farkas, J. Csirik
- Linguistics, Workshop on Biomedical Natural Language…
- 19 June 2008
A corpus annotation project that has produced the BioScope corpus, a freely available resource for research on handling negation and uncertainty in biomedical texts, which consists of medical free texts, biological full papers and biological scientific abstracts.
What helps where – and why? Semantic relatedness for knowledge transfer
- Marcus Rohrbach, Michael Stark, György Szarvas, Iryna Gurevych, B. Schiele
- Computer Science, IEEE Computer Society Conference on Computer…
- 13 June 2010
This work addresses the question of how to automatically decide which information to transfer between classes without any human intervention, and taps into linguistic knowledge bases to provide the semantic link between sources (what) and targets (where) of knowledge transfer.
Hedge Classification in Biomedical Texts with a Weakly Supervised Selection of Keywords
- György Szarvas
- Computer Science, Annual Meeting of the Association for…
- 1 June 2008
This paper demonstrates the importance of hedge classification experimentally in two real-life scenarios, namely the ICD-9-CM coding of radiology reports and gene name entity extraction from scientific texts, and develops a maxent-based solution for both the free text and scientific text processing tasks.
The Multilingual Amazon Reviews Corpus
- Phillip Keung, Y. Lu, György Szarvas, Noah A. Smith
- Computer Science, Conference on Empirical Methods in Natural…
- 6 October 2020
The paper proposes using mean absolute error (MAE) instead of classification accuracy for this task, since MAE accounts for the ordinal nature of the ratings.
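The difference between the two metrics named in this summary can be illustrated with a minimal Python sketch. The star ratings below are invented for illustration and are not taken from the corpus, and the function and variable names are assumptions. The sketch only shows why MAE separates predictions that accuracy treats as equally wrong.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label exactly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_error(y_true, y_pred):
    """Average absolute distance between true and predicted ratings."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical 1-5 star ratings (illustrative only).
true_stars = [5, 5, 1, 3]
near_misses = [4, 4, 2, 3]  # wrong predictions are off by one star
far_misses = [1, 1, 5, 3]   # wrong predictions are off by four stars

acc_near = accuracy(true_stars, near_misses)
acc_far = accuracy(true_stars, far_misses)
mae_near = mean_absolute_error(true_stars, near_misses)
mae_far = mean_absolute_error(true_stars, far_misses)

# Accuracy is identical for both prediction sets; MAE is not.
print(acc_near, acc_far)
print(mae_near, mae_far)
```

Both prediction sets get one of four ratings exactly right, so accuracy cannot tell them apart, while MAE penalizes the four-star misses more heavily than the one-star misses.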
Cross-Genre and Cross-Domain Detection of Semantic Uncertainty
- György Szarvas, V. Vincze, Richárd Farkas, G. Móra, Iryna Gurevych
- Computer Science, International Conference on Computational Logic
- 1 June 2012
A unified subcategorization of semantic uncertainty is introduced, since different domain applications can apply different uncertainty categories, and domain adaptation for training the models is found to offer an efficient solution for cross-domain and cross-genre semantic uncertainty recognition.
Automatic construction of rule-based ICD-9-CM coding systems
The results demonstrate that hand-crafted systems – which proved to be successful in ICD-9-CM coding – can be reproduced by replacing several laborious steps in their construction with machine learning models.
Methods and results of the Hungarian WordNet project
This paper presents a complete outline of the results of the Hungarian WordNet (HuWN) project: the construction process of the general vocabulary Hungarian WordNet ontology, its validation and…
State-of-the-art anonymization of medical records using an iterative machine learning framework.
A de-identification model that can successfully remove personal health information (PHI) from discharge records to make them conform to the guidelines of the Health Insurance Portability and Accountability Act is developed.