An integrated map of structural variation in 2,504 human genomes

@article{Sudmant2015AnIM,
  title={An integrated map of structural variation in 2,504 human genomes},
  author={Peter H. Sudmant and Tobias Rausch and Eugene J. Gardner and Robert E. Handsaker and Alexej Abyzov and John Huddleston and Yan Zhang and Kai Ye and Goo Jun and Markus Hsi-Yang Fritz and Miriam K. Konkel and Ankit Malhotra and Adrian M. St{\"u}tz and Xinghua Shi and Francesco Paolo Casale and Jieming Chen and Fereydoun Hormozdiari and Gargi Dayama and Ken Chen and Maika Malig and Mark J.P. Chaisson and Klaudia Walter and Sascha Meiers and Seva Kashin and Erik P Garrison and Adam Auton and Hugo Y. K. Lam and Xinmeng Jasmine Mu and Can Alkan and Danny Antaki and Taejeong Bae and Eliza Cerveira and Peter S. Chines and Zechen Chong and Laura Clarke and Elif Dal and Li Ding and Sarah B. Emery and Xian Fan and Madhusudan Gujral and Fatma Kahveci and Jeffrey M. Kidd and Y. Kong and Eric-Wubbo Lameijer and Shane A. McCarthy and Paul Flicek and Richard A. Gibbs and Gabor T. Marth and Christopher E. Mason and Androniki Menelaou and Donna M. Muzny and Bradley J. Nelson and Amina Noor and Nicholas F. Parrish and Matthew Pendleton and Andrew Quitadamo and Benjamin Raeder and Eric E. Schadt and Mallory Romanovitch and Andreas Schlattl and Robert P. Sebra and Andrey A. Shabalin and Andreas Untergasser and Jerilyn A. Walker and Min Wang and Fuli Yu and Chengsheng Zhang and Jing Zhang and Xiangqun Zheng-Bradley and Wanding Zhou and Thomas Zichner and Jonathan Sebat and Mark A. Batzer and Steven Mccarroll and Ryan E. Mills and Mark B. Gerstein and Ali Bashir and Oliver Stegle and Scott E. Devine and Charles Lee and Evan E. Eichler and Jan O. Korbel},
  journal={Nature},
  year={2015},
  volume={526},
  pages={75--81}
}
Structural variants are implicated in numerous diseases and make up the majority of varying nucleotides among human genomes. Here we describe an integrated set of eight structural variant classes comprising both balanced and unbalanced variants, which we constructed using short-read DNA sequencing data and statistically phased onto haplotype blocks in 26 human populations. Analysing this set, we identify numerous gene-intersecting structural variants exhibiting population stratification and… 
A high-quality human reference panel reveals the complexity and distribution of genomic structural variants
TLDR
This work analyses whole genome sequencing data of 769 individuals from 250 Dutch families, and provides a haplotype-resolved map of 1.9 million genome variants across 9 different variant classes, including novel forms of complex indels, and retrotransposition-mediated insertions of mobile elements and processed RNAs.
A high-quality reference panel reveals the complexity and distribution of structural genome changes in a human population
TLDR
This work analyzes whole genome sequencing data of 769 individuals from 250 Dutch families and provides a haplotype-resolved map of 1.9 million genome variants across 9 different variant classes, including novel forms of complex indels, and retrotransposition-mediated insertions of mobile elements and processed RNAs.
Mapping and characterization of structural variation in 17,795 human genomes
TLDR
Structural variants in more than 17,000 human genomes are mapped and characterized using whole-genome sequencing, showing how this type of variation contributes to rare deleterious coding and noncoding alleles.
Haplotype-resolved diverse human genomes and integrated analysis of structural variation
TLDR
A recently developed computational pipeline that combines long-read technology and single-cell template strand sequencing (Strand-seq) to generate fully phased diploid genome assemblies without guidance of a reference genome or use of parent-child trio information is leveraged.
Multi-platform discovery of haplotype-resolved structural variation in human genomes
TLDR
A suite of long-read, short-read, and strand-specific sequencing technologies, optical mapping, and variant discovery algorithms is applied to comprehensively analyze three trios to define the full spectrum of human genetic variation in a haplotype-resolved manner.
Mapping and characterization of structural variation in 17,795 deeply sequenced human genomes
TLDR
A cloud-based pipeline is used to map and characterize SV in 17,795 deeply sequenced human genomes from common disease trait mapping studies and exploit this resource to infer the dosage sensitivity of genes and non-coding elements, revealing strong trends related to regulatory element class, conservation and cell-type specificity.
Network-based analysis of allele frequency distribution among multiple populations identifies adaptive genomic structural variants
TLDR
A new method to identify potentially adaptive structural variants based on a network-based analysis that incorporates genotype frequency data from 26 populations simultaneously is developed, which identifies 577 structural variants that show high population distribution and introduces evolutionary models that may better explain the complex evolution of structural variants.
Multiethnic catalog of structural variants and their translational impact for disease phenotypes across 19,652 genomes
TLDR
The significance of SVs when assessing genotype-phenotype associations and the importance of ethnic diversity in study design are demonstrated by analyzing SVs across 19,652 individuals and the translational impact on 4,156 aptamer-based proteomic measurements across 4,021 multi-ethnic samples.

References

Towards a comprehensive structural variation map of an individual human genome
TLDR
The results indicate that a large number of structural variants have been unreported in the individual genomes published to date, and necessitate they be actively studied in health-related analyses of personal genomes.
Origins and functional impact of copy number variation in the human genome
TLDR
It is concluded that the heritability void left by genome-wide association studies will not be accounted for by common CNVs, and 30 loci with CNVs that are candidates for influencing disease susceptibility are identified.
A haplotype map of the human genome
TLDR
A public database of common variation in the human genome: more than one million single nucleotide polymorphisms for which accurate and complete genotypes have been obtained in 269 DNA samples from four populations, including ten 500-kilobase regions in which essentially all information about common DNA variation has been extracted.
A map of human genome variation from population-scale sequencing
TLDR
The pilot phase of the 1000 Genomes Project is presented, designed to develop and compare different strategies for genome-wide sequencing with high-throughput platforms, and the location, allele frequency and local haplotype structure of approximately 15 million single nucleotide polymorphisms, 1 million short insertions and deletions, and 20,000 structural variants are described.
A Comprehensive Map of Mobile Element Insertion Polymorphisms in Humans
TLDR
A direct comparison of MEI and SNP diversity levels suggests a differential mobile element insertion rate among populations, and a comprehensive map of 7,380 MEI polymorphisms from the 1000 Genomes Project whole-genome sequencing data is presented.
Mapping copy number variation by population scale genome sequencing
TLDR
A map of unbalanced SVs is constructed based on whole genome DNA sequencing data from 185 human genomes, integrating evidence from complementary SV discovery approaches with extensive experimental validations, and serves as a resource for sequencing-based association studies.
Transcriptome and genome sequencing uncovers functional variation in humans
TLDR
Sequencing and deep analysis of messenger RNA and microRNA from lymphoblastoid cell lines of 462 individuals from the 1000 Genomes Project, the first uniformly processed high-throughput RNA-sequencing data from multiple human populations with high-quality genome sequences, discovers extremely widespread genetic variation affecting the regulation of most genes.
Linkage disequilibrium and heritability of copy-number polymorphisms within duplicated regions of the human genome.
TLDR
A combination of BAC-based and high-density customized oligonucleotide arrays were used to resolve the molecular basis of structural rearrangements and underscore the need for complete maps of genetic variation in duplication-rich regions of the genome.
Characteristics of de novo structural changes in the human genome.
Small insertions and deletions (indels) and large structural variations (SVs) are major contributors to human genetic diversity and disease. However, mutation rates and characteristics of de novo