A comprehensive map of genetic variation in the world’s largest ethnic group - Han Chinese

Charleston W. K. Chiang, Serghei Mangul, Christopher R. Robles, Warren W. Kretzschmar, Na Cai, Kenneth S. Kendler, Sriram Sankararaman, Jonathan Flint
As are most non-European populations around the globe, the Han Chinese are relatively understudied in population and medical genetics studies. From low-coverage whole-genome sequencing of 11,670 Han Chinese women we present a catalog of 25,057,223 variants, including 548,401 novel variants that are seen at least 10 times in our dataset. Individuals from our study come from 19 out of 22 provinces across China, allowing us to study population structure, genetic ancestry, and local adaptation in… 
Testing for Hardy-Weinberg Equilibrium in Structured Populations using NGS Data
This work proposes a method that takes population structure into account when testing for Hardy-Weinberg equilibrium (HWE), so that other factors causing deviations from HWE can be detected, and shows the effectiveness of this method on NGS data, as well as on genotype data, for both simulated and real datasets.
Characterization of structural chromosomal variants by massively parallel sequencing
Five studies focused on the analysis of structural variants (SV) using whole-genome sequencing (WGS) developed and tested tools suitable for WGS SV analysis in a clinical setting, and validated the use of SV calling from WGS as a routine test in rare disease diagnostics.
Testing for Hardy–Weinberg equilibrium in structured populations using genotype or low‐depth next generation sequencing data
A method that takes population structure into account when testing for HWE, so that other factors causing deviations from HWE can be detected, is shown to be effective on low-depth NGS data, as well as on genotype data, for both simulated and real data sets, where the use of genotype likelihoods accounts for genotype uncertainty.


Genetic structure of the Han Chinese population revealed by genome-wide SNP variation.
Using over 350,000 genome-wide autosomal SNPs in over 6000 Han Chinese samples from ten provinces of China, this study revealed a one-dimensional "north-south" population structure and a close correlation between geography and the genetic structure of the Han Chinese.
Genomic dissection of population substructure of Han Chinese and its implication in association studies.
Examination of population substructures in a diverse set of over 1700 Han Chinese samples collected from 26 regions across China showed that most differentiated genes among clusters are involved in cardiac arteriopathy, and signals indicating significant differences among Han Chinese subpopulations should be carefully explained.
Population structure of Han Chinese in the modern Taiwanese population based on 10,000 participants in the Taiwan Biobank project.
An investigation of the population structure of Han Chinese on this Pacific island, using genotype data of 591,048 SNPs in an initial freeze of 10,801 unrelated Taiwan Biobank (TWB) participants, finds that the Taiwanese Han Chinese cluster into three cline groups and that this T group is genetically distinct from neighbouring Southeast Asians and Austronesian tribes but similar to other southern Han Chinese.
11,670 whole-genome sequences representative of the Han Chinese population from the CONVERGE project
The China, Oxford and Virginia Commonwealth University Experimental Research on Genetic Epidemiology (CONVERGE) project on Major Depressive Disorder (MDD) sequenced 11,670 female Han Chinese at
Ancient DNA Reveals That the Genetic Structure of the Northern Han Chinese Was Shaped Prior to 3,000 Years Ago
The results show that the ancient people of Hengbei bore a strong genetic resemblance to present-day northern Han Chinese and were genetically distinct from other present-day Chinese populations and two ancient populations.
Natural positive selection and north–south genetic diversity in East Asia
Two of the regions that emerged are found in HLA class I and II, suggesting that the HLA imputation panel from the HapMap may not be directly applicable to every Chinese sample, which has important implications for autoimmune studies that plan to impute the classical HLA alleles to fine-map the SNP association signals.
The ADH1B Arg47His polymorphism in East Asian populations and expansion of rice domestication in history
The regionally restricted enrichment of the class I alcohol dehydrogenase sequence polymorphism (ADH1B Arg47His) in southern China and the adjacent areas suggests Darwinian positive selection on this genetic locus during Neolithic times, though the driving force is yet to be disclosed.
Genetic signatures of high-altitude adaptation in Tibetans
Jian Yang, Zi Jin, +13 authors, J. Qu. Biology, Medicine. Proceedings of the National Academy of Sciences, 2017.
The largest genome-wide study in Tibetans to date detects signatures of natural selection at nine gene loci, two of which are strongly associated with blood phenotypes in present-day Tibetans, shows the genetic relatedness of Tibetans to other ethnic groups in China, and estimates the divergence time between Tibetans and Han.
Diversification of the ADH1B Gene during Expansion of Modern Humans
The dating of the H7 expansion may help understand the selective force on the ADH1B gene, and age estimates of the haplogroups based on the STRPs agree with the time of the migration events estimated by other studies.
Whole-genome sequence variation, population structure and demographic history of the Dutch population
The Genome of the Netherlands (GoNL) Project is described, in which the whole genomes of 250 Dutch parent-offspring families were sequenced and a haplotype map of 20.4 million single-nucleotide variants and 1.2 million insertions and deletions was constructed.