FarsiSum - A Persian Text Summarizer
FarsiSum is an attempt to create an automatic text summarization system for Persian that uses modules implemented in an existing summarizer geared towards the Germanic languages, a Persian stop-list in Unicode format and a small set of heuristic rules.
Stockholm EPR Corpus: A Clinical Database Used to Improve Health Care
A number of possible applications are described, including comorbidity networks, detection of hospital-acquired infections and adverse drug reactions, as well as diagnosis coding support.
Resource Lean and Portable Automatic Text Summarization
Today, with digitally stored information available in abundance, even for many minor languages, this information must by some means be filtered and extracted in order to avoid drowning in it.
Summaries and the Process of Summarization from Evaluation of Automatic Text Summarization - a Practical Implementation
Text summarization (or rather, automatic text summarization) is the technique where a computer automatically creates an abstract, or summary, of one or more texts.
Characteristics of Finnish and Swedish intensive care nursing narratives: a comparative analysis to support the development of clinical language technologies
A comparison of characteristics in Finnish and Swedish free-text nursing narratives from intensive care creates a framework for characterising and comparing clinical text and lays the groundwork for developing clinical language technologies.
Synonym Extraction of Medical Terms from Clinical Text Using Combinations of Word Space Models
It is demonstrated how synonyms of medical terms can be extracted automatically from a large corpus of clinical text using distributional semantics, with combinations of Random Indexing and Random Permutation models effectively increasing the ability to identify synonymic relations between terms.
Exploitation of Named Entities in Automatic Text Summarization for Swedish
This paper presents a meta-modelling framework that automates the labor-intensive, and therefore time-consuming and expensive, process of manually cataloging named entities in a text.
Generation of Reference Summaries
This thesis contributes a novel approach to highly portable automatic text summarization, coupled with methods for building the needed corpora, both for training and evaluation on the new language.
The Stockholm EPR Corpus – Characteristics and Some Initial Findings
The characteristics of the Stockholm Electronic Patient Record Corpus (the SEPR Corpus), an important resource for research on clinical data, are described. The corpus has features that are very interesting from a linguistic point of view, such as domain-specific compounds and abbreviations, and various narratives.