INTRODUCTION Systematic approaches to dealing with missing values in record linkage are still lacking. This article compares the ad hoc treatment of unknown comparison values as 'unequal' with other, more sophisticated approaches. The methods were evaluated empirically on real-world data and on simulated data derived from them.
INTRODUCTION Supervised record linkage methods often require a clerical review to obtain informative training data. Active learning prompts the user to label data with special characteristics in order to minimise review costs. We conducted an empirical evaluation to investigate whether a simple active learning strategy using binary…
Cleansing data of synonyms and homonyms is a relevant task in fields where high data quality is crucial, for example in disease registries and medical research networks. Record linkage provides methods for minimizing synonym and homonym errors, thereby improving data quality. We focus on the case of homonym errors (in the following denoted…
  • Frank P. Schmidt, Mathias Basner, Gunnar Kröger, Stefanie Weck, Boris Schnorbus, Axel Muttray +5 others
  • 2013
AIMS Aircraft noise disturbs sleep, and long-term exposure has been shown to be associated with increases in the prevalence of hypertension and an overall increased risk for myocardial infarction. The exact mechanisms responsible for these cardiovascular effects remain unclear. METHODS AND RESULTS We performed a blinded field study in 75 healthy…
Availability of and access to data and biosamples are essential in medical and translational research, where their reuse and repurposing by the wider research community can maximize their value and accelerate discovery. However, sharing human-related data or samples is complicated by ethical, legal, and social sensitivities. The specific ethical and legal…
BACKGROUND It has been claimed that endoscopic calcaneoplasty offers some advantages over open techniques in the surgical treatment of Haglund's deformity due to reduced postoperative complications such as stiffness and pain. Bony over-resection places patients at risk of these complications. The resulting question with regard to the quantitative differences…
Molecular data, e.g. arising from microarray technology, is often used for predicting survival probabilities of patients. For multivariate risk prediction models on such high-dimensional data, there are established techniques that combine parameter estimation and variable selection. One big challenge is to incorporate interactions into such prediction…
Record linkage or deduplication deals with the detection and deletion of duplicates in and across files. For this task, this paper introduces and evaluates two new machine-learning methods (bumping and multiview) together with bagging, a tree-based ensemble approach. Whereas bumping is also a tree-based approach, multiview is based on the…