Major flaws in “Identification of individuals by trait prediction using whole-genome sequencing data”
@article{Erlich2017MajorFI,
  title={Major flaws in “Identification of individuals by trait prediction using whole-genome sequencing data”},
  author={Yaniv Erlich},
  journal={bioRxiv},
  year={2017}
}
Genetic privacy is an area of active research. While it is important to identify new risks, it is equally crucial to supply policymakers with accurate information based on scientific evidence. Recently, Lippert et al. (PNAS, 2017) investigated the status of genetic privacy using trait-predictions from whole genome sequencing. The authors sequenced a cohort of about 1000 individuals and collected a range of demographic, visible, and digital traits such as age, sex, height, face morphology, and a…
20 Citations
No major flaws in “Identification of individuals by trait prediction using whole-genome sequencing data”
- Biology · bioRxiv
- 2017
It is shown that not only faces but also a wide range of phenotypes and demographic variables may be derived from DNA; the main contribution of Lippert et al. is an algorithm that identifies genomes of individuals by combining multiple DNA-based predictive models for a myriad of traits.
Identification of individuals by trait prediction using whole-genome sequencing data
- Biology · Proceedings of the National Academy of Sciences
- 2017
A maximum entropy algorithm is developed that integrates multiple predictions to determine which genomic samples and phenotype measurements originate from the same person; the results may have far-reaching ethical and legal implications.
Idéfix: identifying accidental sample mix-ups in biobanks using polygenic scores
- Biology · bioRxiv
- 2021
Idéfix, a method for identifying accidental sample mix-ups in biobanks using polygenic scores, is described; it can already be used to identify a high-quality set of participants who are very unlikely to reflect sample mix-ups and who could therefore be offered a pharmacogenetic passport.
Ensuring privacy and security of genomic data and functionalities
- Computer Science · Briefings Bioinform.
- 2020
The genome privacy problem is discussed, and relevant privacy attacks that have been used to breach the privacy of an individual are reviewed and classified into identity tracing, attribute disclosure, and completion attacks.
Artificial Intelligence and the Weaponization of Genetic Data
- Biology
- 2020
The ways in which data science is improving genetics are outlined, along with how that can ultimately lead to its weaponization and to the broader social welfare risk associated with bio-warfare.
Machine learning and genomics: precision medicine versus patient privacy
- Computer Science · Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences
- 2018
How breaches in patient privacy can occur is reviewed, recent developments in computational data protection are presented, and how they can be combined with legal and ethical perspectives to provide secure frameworks for genomic data sharing is discussed.
DNA Based Methods in Intelligence - Moving Towards Metagenomics
- Biology
- 2020
Existing DNA intelligence tools applied to forensic science and the application of microbial forensics and metagenomics are discussed, along with the challenges and concerns that future developments entail.
Toward a Risk-Utility Data Governance Framework for Research Using Genomic and Phenotypic Data in Safe Havens: Multifaceted Review
- Medicine · Journal of Medical Internet Research
- 2020
A proportionate data governance framework is proposed to promote the safe, socially acceptable use of genomic and phenotypic data in safe havens to safeguard privacy and retain data utility for research.
Toward a Risk-Utility Data Governance Framework for Research Using Genomic and Phenotypic Data in Safe Havens: Multifaceted Review (Preprint)
- Medicine
- 2019
Recommendations toward a risk-utility model with a flexible suite of controls to safeguard privacy and retain data utility for research in safe havens can contribute toward a proportionate data governance framework that promotes the safe, socially acceptable use of genomic and phenotypic data in safe havens.
Facial recognition from DNA using face-to-DNA classifiers
- Computer Science · Nature Communications
- 2019
Another proof of concept for biometric authentication is established by using multiple face-to-DNA classifiers, each classifying given faces by a DNA-encoded aspect (sex, genomic background, individual genetic loci) or by a DNA-inferred aspect (BMI, age).
6 References
No major flaws in “Identification of individuals by trait prediction using whole-genome sequencing data”
- Biology · bioRxiv
- 2017
It is shown that not only faces but also a wide range of phenotypes and demographic variables may be derived from DNA; the main contribution of Lippert et al. is an algorithm that identifies genomes of individuals by combining multiple DNA-based predictive models for a myriad of traits.
Identification of individuals by trait prediction using whole-genome sequencing data
- Biology · Proceedings of the National Academy of Sciences
- 2017
A maximum entropy algorithm is developed that integrates multiple predictions to determine which genomic samples and phenotype measurements originate from the same person; the results may have far-reaching ethical and legal implications.
Identifying Personal Genomes by Surname Inference
- Biology · Science
- 2013
It is reported that surnames can be recovered from personal genomes by profiling short tandem repeats on the Y chromosome (Y-STRs) and querying recreational genetic genealogy databases. It is also shown that a combination of a surname with other types of metadata, such as age and state, can be used to triangulate the identity of the target.
Bayesian method to predict individual SNP genotypes from gene expression data
- Biology · Nature Genetics
- 2012
A Bayesian approach that predicts SNP genotypes from RNA expression data alone is developed, and it is shown that the predicted genotypes can accurately and uniquely identify individuals in large populations.
Defining the role of common variation in the genomic and biological architecture of adult human height
- Biology · Nature Genetics
- 2014
The results indicate a genetic architecture for human height that is characterized by a very large but finite number of causal variants, including mTOR, osteoglycin and binding of hyaluronic acid.
Routes for breaching and protecting genetic privacy
- Computer Science · Nature Reviews Genetics
- 2014
An overview of genetic privacy breaching strategies is presented, outlining the principles of each technique, the underlying assumptions, and their technological complexity and maturation, as well as highlighting different cases that are relevant to genetic applications.