Data Cleaning: Detecting, Diagnosing, and Editing Data Abnormalities
@article{vandenBroeck2005DataCD, title={Data Cleaning: Detecting, Diagnosing, and Editing Data Abnormalities}, author={Jan van den Broeck and Solveig Argeseanu Cunningham and Roger Eeckels and Kobus Herbst}, journal={PLoS Medicine}, year={2005}, volume={2} }
In this policy forum the authors argue that data cleaning is an essential part of the research process, and should be incorporated into study design.
367 Citations
Methods for Cleaning and Managing a Nurse-Led Registry
- Medicine
- The Journal of neuroscience nursing : journal of the American Association of Neuroscience Nurses
- 2020
The methods described provide a structured way for nurses and their collaborators to clean and manage registries; they resulted in high-quality data, as confirmed by missing data analysis.
Better Reporting, Better Research: Guidelines and Guidance in PLoS Medicine
- Medicine
- PLoS medicine
- 2008
PLoS Medicine announces a new section: Guidelines and Guidance, where guidelines and guidance for medical practice are presented for the first time.
Statistics Corner: Data Cleaning-I
- Medicine
- Journal of Postgraduate Medicine, Education and Research
- 2019
The investigator faced a dilemma over whether to share the data with a statistician before or after cleaning, and found some answers regarding the role and responsibilities of the investigator in data cleaning.
Assumptions made when preparing drug exposure data for analysis have an impact on results: An unreported step in pharmacoepidemiology studies
- Psychology
- Pharmacoepidemiology and drug safety
- 2018
This study aimed to develop a framework to define and document drug data preparation and to examine the impact of different assumptions on results.
Data Cleaning and Data Visualization Systems for Learning Analytics
- Education
- 2020
This work presents up-to-date findings and outcomes of research, design, and development projects at the InterLabs Research Institute at Bradley University, focused on the analysis and testing of effective systems to clean and visualize student academic performance data for learning analytics.
Too Much Information: Research Issues Associated With Large Databases
- Medicine
- Clinical nurse specialist CNS
- 2013
Registries and administrative databases provide healthcare researchers with increasing opportunities to address a wide variety of important practice and patient care questions, and researchers are encouraged to explore large data sets to improve patient safety and quality of care.
Targeting Non-obvious Errors in Death Certificates
- Linguistics
- 2008
Mortality statistics are much used although their accuracy is often questioned, and current methods only capture obvious errors in death certification.
A systematic approach to initial data analysis is good research practice.
- Medicine
- The Journal of thoracic and cardiovascular surgery
- 2016
Making a distinction between data cleaning and central monitoring in clinical trials
- Computer Science
- Clinical trials
- 2021
Early clinical trials collected data on punch cards and then on paper, but now, with increasing use of electronic data capture to replace paper forms, staff at trial sites are entering data directly into databases and are prompted in real time with automated data checks.
References
Analysis of Incomplete Multivariate Data
- Mathematics, Computer Science
- 1997
Chapters on the normal model, methods for categorical data, loglinear models, methods for mixed data, inference by data augmentation, and methods for normal data provide insights into the construction of categorical and mixed data models.
Clinical Data Management
- Medicine
- 1994
From the Publisher:
The first comprehensive volume on the subject of clinical data management, this book contains concise, well-researched information covering all aspects of data management from…
Post-randomisation exclusions: the intention to treat principle and excluding patients from analysis
- Medicine
- BMJ : British Medical Journal
- 2002
The authors consider the circumstances when it may be possible to exclude patients from the analysis of data in clinical trials, even in an intention to treat trial.
Missing data
- Mathematics
- Amyotrophic lateral sclerosis and other motor neuron disorders : official publication of the World Federation of Neurology, Research Group on Motor Neuron Diseases
- 2004
The importance of missing data in RCTs is emphasized, how the problem can be handled in an unbiased way by imputation procedures is discussed, and some recommendations for trial design and conduct are made that are tailored to RCTs for ALS.
Data Base Error Trapping and Prediction
- Computer Science
- 1991
This work develops and analyzes models for a class of problems involving inferences about uncertain numbers of errors in data bases and generates inferences in terms of predictive distributions for the numbers of undetected errors.
Attrition in longitudinal studies. How to deal with missing data.
- Mathematics
- Journal of clinical epidemiology
- 2002
A product perspective on total data quality management
- Business
- CACM
- 1998
The purpose of this TDQM methodology is to deliver high-quality information products (IP) to information consumers; it aims to facilitate the implementation of an organization’s overall data quality policy, formally expressed by top management.
Editing data: what difference do consistency checks make?
- Education
- American journal of epidemiology
- 2000
The authors examined five possible approaches to handling data inconsistencies and the effect that each has on point estimates of current cigarette use in a self-administered school-based survey of tobacco use, attitudes, and behaviors in Florida.
Practical statistics for medical research
- Medicine
- 1990
Practical Statistics for Medical Research is a problem-based text for medical researchers, medical students, and others in the medical arena who need to use statistics but have no specialized mathematics background.