A database de-identification framework to enable direct queries on medical data for secondary use.
@article{Erdal2012ADD,
title={A database de-identification framework to enable direct queries on medical data for secondary use.},
author={Barbaros Selnur Erdal and Jie Liu and J Ding and J. Chen and Clay B. Marsh and Jyoti Kamal and Bradley D. Clymer},
journal={Methods of information in medicine},
year={2012},
volume={51},
number={3},
pages={229--41}
}
OBJECTIVE
To qualify the use of patient clinical records as non-human-subject research, electronic medical record data must be de-identified so that the risk of exposing protected health information is minimal. This study demonstrated a robust framework for de-identifying structured data that can be applied to any relational data source.
METHODS
Using a real-world clinical data warehouse, a pilot implementation covering limited subject areas was used to…
12 Citations
Using patient lists to add value to integrated data repositories
- Medicine · J. Biomed. Informatics
- 2014
Clinical records anonymisation and text extraction (CRATE): an open-source software system
- Computer Science, Medicine · BMC Medical Informatics and Decision Making
- 2017
Creation and management of a research database from sensitive clinical records with secure pseudonym generation, full-text indexing, and a consent-to-contact process is possible and practical using entirely free and open-source software.
iGAS: A framework for using electronic intraoperative medical records for genomic discovery
- Medicine · J. Biomed. Informatics
- 2017
Knowledge Management and Informatics Considerations for Comparative Effectiveness Research: A Case-driven Exploration
- Computer Science · Medical care
- 2013
The informatics challenges commonly encountered by those conducting CER studies include issues related to data information and knowledge management as well as those related to people and organizational issues (eg, sociotechnical factors and organizational factors).
Complexity and the will to transform.
- Computer Science · Methods of information in medicine
- 2012
This issue of Methods contains specific as well as wider research topics, which give the perspective that health informatics is moving towards more clinically oriented, multi-source, interoperable, more usable, secure and safer health information systems that can integrate inpatient and outpatient data.
Use of Electronic Health-Related Datasets in Nursing and Health-Related Research
- Medicine · Western journal of nursing research
- 2015
This review article aims to discuss Electronic Health-Related Datasets (EHRDs) in terms of types, features, advantages, limitations, and possible use in nursing and health-related research.
Modular design, application architecture, and usage of a self-service model for enterprise data delivery: The Duke Enterprise Data Unified Content Explorer (DEDUCE)
- Computer Science · J. Biomed. Informatics
- 2014
A Theoretical Multi-level Privacy Protection Framework for Biomedical Data Warehouses
- Computer Science · EUSPN/ICTH
- 2015
Protecting privacy in a clinical data warehouse
- Computer Science · Health Informatics J.
- 2015
The proposed privacy protection approach is scalable to clinical data warehouse construction with any size of medical data and can be secure enough to keep the confidential data from leaking to the outside world.
68 References
Toward a Fully De-identified Biomedical Information Warehouse
- Computer Science · AMIA
- 2009
Findings on performance evaluation of different de-identification schemes that may be used to ensure regulatory compliance while also facilitating practical database updating and querying are reported.
Automatic de-identification of textual documents in the electronic health record: a review of recent research
- Computer Science · BMC medical research methodology
- 2010
A review of recent research in automated de-identification of narrative text documents from the electronic health record finds methods based on dictionaries performed better with PHI that is rarely mentioned in clinical text, but are more difficult to generalize.
Automated de-identification of free-text medical records
- Medicine · BMC Medical Informatics Decis. Mak.
- 2008
Describes an automated Perl-based de-identification software package that is usable on most free-text medical records (e.g., nursing notes, discharge summaries, X-ray reports) and is sufficiently generalized and customizable to handle text files of any format.
Evaluating Common De-Identification Heuristics for Personal Health Information
- Medicine · Journal of medical Internet research
- 2006
Existing Canadian federal and provincial privacy laws help explain why it is difficult to create an identification data set for the whole population, and there is a strong case for not disclosing the high-risk variables and their combinations identified here.
HIDE: An Integrated System for Health Information DE-identification
- Computer Science · 2008 21st IEEE International Symposium on Computer-Based Medical Systems
- 2008
This paper presents a prototype system for de-identifying health information, including both structured and unstructured data, and deploys a conditional random fields based technique for extracting identifying attributes from unstructured data and a k-anonymization based technique for structured data while preserving maximum data utility.
Concept-match medical data scrubbing. How pathology text can be used in research.
- Medicine · Archives of pathology & laboratory medicine
- 2003
Computerized scrubbing can render the textual portion of a pathology report harmless for research purposes, and this article addresses the problem of data scrubbing.
Viewpoint Paper: Evaluating the State-of-the-Art in Automatic De-identification
- Computer Science · J. Am. Medical Informatics Assoc.
- 2007
An overview of this de-identification challenge is provided, the data and the annotation process are described, the evaluation metrics are explained, the nature of the systems that addressed the challenge is discussed, the results of received system runs are analyzed, and directions for future research are identified.
A Privacy-Preserving Framework for Integrating Person-Specific Databases
- Computer Science · Privacy in Statistical Databases
- 2008
This paper develops protocols that enable data holders to merge personal records, thus creating larger profiles and diminishing duplication, and presents an extension to the protocol that permits the revelation of k-anonymous demographics, such that the administrator can perform joins more efficiently.
Evaluating the Risk of Re-identification of Patients from Hospital Prescription Records.
- Medicine · The Canadian journal of hospital pharmacy
- 2009
A formal risk analysis at one hospital produced a clinically relevant data set that also protects patient privacy and allows the hospital pharmacy to explicitly manage the risks of breach of patient privacy.







