Corpus ID: 10760208

Automated Transformation of Semi-Structured Text Elements

Authors: Johannes Heurix, Antonio Rella, Stefan Fenz, Thomas Neubauer
Interconnected systems, such as electronic health records (EHR), have considerably improved the handling and processing of health information while keeping costs at a controlled level. Since the EHR stores virtually all data in digitized form, personal medical documents are easily and swiftly available when needed. However, multiple formats and differences in the health documents managed by various health care providers severely reduce the efficiency of the data sharing process. This paper…

Recognition and pseudonymisation of medical records for secondary use
Introduces MEDSEC, a system that automatically converts paper-based health records into de-personalised and pseudonymised documents that secondary users can access without compromising the patients’ privacy.
Recognition and privacy preservation of paper-based health records.
A system for the recognition and privacy preservation of personal data in paper-based health records is presented, with the aim of providing clinical studies with medical data gained from existing paper-based health records.
Protecting Anonymity in Data-Driven Biomedical Science
An overview of the most important and well-researched approaches and open research problems in this area is provided, with the goal of serving as a starting point for further investigation.


Viewpoint Paper: Repurposing the Clinical Record: Can an Existing Natural Language Processing System De-identify Clinical Notes?
The authors tested the ability of MedLEE to remove protected health information (PHI) by comparing 100 outpatient clinical notes with the corresponding XML-tagged output, and found that PHI in the output was highly transformed, potentially making re-identification more difficult.
A multi-lingual architecture for building a normalised conceptual representation from medical language.
The results obtained and the issues raised when implementing key principles of MENELAS are discussed, including the principle that the output of natural language analysis must be a normalised conceptual representation of medical information.
Research Paper: A General Natural-language Text Processor for Clinical Radiology
Development of a general natural-language processor that identifies clinical information in narrative reports and maps that information into a structured representation containing clinical terms, using radiology as the test domain.
Extracting information from textual documents in the electronic health record: a review of recent research.
Performance of information extraction systems with clinical text has improved since the last systematic review in 1995, but they are still rarely applied outside of the laboratory they have been developed in.
Identification of patient name references within medical documents using semantic selectional restrictions
The proposed algorithm is based on estimating the fitness of candidate patient name references to a set of semantic selectional restrictions that place tight contextual requirements upon candidate words in the report text and are determined automatically from a manually tagged corpus of training reports.
Extracting Diagnoses from Discharge Summaries
A program for extracting the diagnoses and procedures from the past medical history and discharge diagnoses in the discharge summary of a case and coding these using SNOMED-CT in the UMLS using a limited amount of natural language processing.
Information Extraction
A taxonomy of the field is created along various dimensions derived from the nature of the extraction task, the techniques used for extraction, the variety of input resources exploited, and the type of output produced to survey techniques for optimizing the various steps in an information extraction pipeline.
A Natural Language Processing System to Extract and Code Concepts Relating to Congestive Heart Failure from Chest Radiology Reports
Results indicate that the system, which extracts and codes clinical concepts related to congestive heart failure from 39,000 chest radiology reports, had high specificity, recall and precision for each of the concepts it is designed to detect.
A methodology for the pseudonymization of medical data