Unsupervised Ontology Acquisition from Plain Texts: The OntoGain System
The OntoGain system allows for transformation of the derived ontology into standard OWL statements. Results are compared both to hand-crafted ontologies and to a state-of-the-art system in two domains: medicine and computer science.
The CLIN27 Shared Task: Translating Historical Text to Contemporary Language for Improving Automatic Linguistic Annotation
The CLIN27 shared task evaluates the effect of translating historical text to modern text with the goal of improving the quality of the output of contemporary natural language processing tools appl
Word Semantic Similarity for Morphologically Rich Languages
This work proposes a methodology for selective stemming based on a semantic distortion metric, and finds that the reduction in estimation error differs depending on whether a word is used as a feature or as a target word.
Enrichment and Structuring of Archival Description Metadata
This paper proposes enriching existing cultural heritage metadata with automatically generated semantic content descriptors, using automatic term recognition and term clustering techniques for knowledge acquisition and content-based document classification.
Proceedings of the 5th ACL-HLT Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities
A central issue for the majority of contributions to this LaTeCH workshop has been the problem of linguistic processing for historical language varieties and the respective resource development and tool adaptation.
Automatic document indexing in large medical collections
AMTEx is presented, specifically designed for the automatic indexing of documents in large medical collections such as MEDLINE, the premier bibliographic database of the U.S. National Library of Medicine (NLM).
Ontology-enablement of a system for semantic annotation of digital documents
The recent enhancement of the CAFETIERE formalism with the ability to link natural language words and phrases in textual documents with instances and classes from a language-enabled ontology is described.
Making Sense of Violence Risk Predictions Using Clinical Notes
Violence risk assessment in psychiatric institutions enables interventions to avoid violence incidents. Clinical notes written by practitioners and available in electronic health records (EHR) are
UvT: The UvT Term Extraction System in the Keyphrase Extraction Task
In the UvT implementation, an extended noun phrase rule set is used to take into consideration orthographic and morphological variation, term abbreviations and acronyms, and basic document structure information.