The complete sequence of a human genome

@article{Nurk2021TheCS,
  title={The complete sequence of a human genome},
  author={Sergey Nurk and Sergey Koren and Arang Rhie and Mikko Rautiainen and Andrey V. Bzikadze and Alla Mikheenko and Mitchell R. Vollger and Nicolas Altemose and Lev I Uralsky and Ariel Gershman and Sergey S. Aganezov and Savannah J Hoyt and Mark E. Diekhans and Glennis A. Logsdon and Michael Alonge and Stylianos E. Antonarakis and Matthew Borchers and Gerard G. Bouffard and Shelise Y. Brooks and Gina V. Caldas and Haoyu Cheng and Chen-Shan Chin and William Chow and Leonardo Gomes de Lima and Philip C. Dishuck and Richard Durbin and Tatiana Dvorkina and Ian T. Fiddes and Giulio Formenti and Robert S. Fulton and Arkarachai Fungtammasan and Erik K. Garrison and Patrick G. S. Grady and Tina A Graves-Lindsay and Ira M. Hall and Nancy F. Hansen and Gabrielle Hartley and Marina Haukness and Kerstin Howe and Michael W. Hunkapiller and Chirag Jain and Miten Jain and Erich D. Jarvis and Peter Kerpedjiev and Melanie Kirsche and Mikhail Kolmogorov and Jonas Korlach and Milinn Kremitzki and Heng Li and Valerie V B Maduro and Tobias Marschall and Ann M. McCartney and Jennifer McDaniel and Danny E. Miller and Jim Mullikin and Eugene Wimberly Myers and Nathan D. Olson and Benedict J. Paten and Paul Peluso and Pavel A. Pevzner and David Porubsky and Tamara Potapova and Evgeny I. Rogaev and Jeffrey A. Rosenfeld and Steven L. Salzberg and Valerie A. Schneider and Fritz J. Sedlazeck and Kishwar Shafin and Colin J. Shew and Alaina Shumate and Yumi Sims and Arian F. A. Smit and Daniela C. Soto and Ivan Sovi{\'c} and Jessica M. Storer and Aaron M. Streets and Beth A. Sullivan and Françoise Thibaud-Nissen and James Torrance and Justin Wagner and Brian P. Walenz and Aaron M. Wenger and Jonathan M. D. Wood and Chunlin Xiao and Stephanie M. Yan and Alice C Young and Samantha Zarate and Urvashi Surti and Rajiv C. McCoy and Megan Y. Dennis and Ivan A. Alexandrov and Jennifer L. Gerton and Rachel J. O’Neill and Winston Timp and Justin M. Zook and Michael C. Schatz and Evan E. 
Eichler and Karen H. Miga and Adam M. Phillippy},
  journal={Science (New York, N.Y.)},
  year={2022},
  volume={376},
  pages={44--53}
}
In 2001, Celera Genomics and the International Human Genome Sequencing Consortium published their initial drafts of the human genome, which revolutionized the field of genomics. While these drafts and the updates that followed effectively covered the euchromatic fraction of the genome, the heterochromatin and many other complex regions were left unfinished or erroneous. Addressing this remaining 8% of the genome, the Telomere-to-Telomere (T2T) Consortium has finished the first truly complete 3… 
Automated assembly of high-quality diploid human reference genomes
TLDR
Developing a combination of all the top-performing methods, this work generated the first high-quality diploid reference assembly, containing only ∼4 gaps per chromosome, with most chromosomes within ±1% of CHM13’s length.
A complete reference genome improves analysis of human genetic variation
TLDR
This work demonstrates how the T2T-CHM13 reference genome universally improves read mapping and variant calling for 3,202 and 17 globally diverse samples sequenced with short and long reads, respectively.
Segmental duplications and their variation in a complete human genome
TLDR
The first comprehensive view of human segmental duplication (SD) organization is presented, based on a complete telomere-to-telomere human genome (T2T-CHM13); it reveals unprecedented patterns of structural heterozygosity and massive evolutionary differences in SD organization between humans and their closest living relatives.
Complete genomic and epigenetic maps of human centromeres
TLDR
An extensive study of newly assembled peri/centromeric sequences representing 6.2% of the first complete, telomere-to-telomere human genome assembly (T2T-CHM13) is presented, which provides an unprecedented atlas of human centromeres to guide future studies of their complex and critical functions as well as their unique evolutionary dynamics.
A classical revival: Human satellite DNAs enter the genomics era.
  • N. Altemose
  • Biology
    Seminars in cell & developmental biology
  • 2022
TLDR
This review provides an account of the history and current understanding of HSat1-3, with a view towards future studies of their evolution and roles in health and disease.
A reference-quality, fully annotated genome from a Puerto Rican individual
TLDR
This work describes the assembly and annotation of a second individual genome, from a Puerto Rican individual whose DNA was collected as part of the Human Pangenome project; it is the first true reference genome created from an individual of African descent.
Prospects of telomere‐to‐telomere assembly in barley: Analysis of sequence gaps in the MorexV3 reference genome
TLDR
It is found that almost all centromeric sequences and 45S ribosomal DNA repeat arrays were absent from the MorexV3 pseudomolecules and that the majority of sequence gaps can be attributed to assembly breakdown in long stretches of satellite repeats, but missing sequences cannot fully account for the difference between assembly size and flow cytometric genome size estimates.
Gaps and complex structurally variant loci in phased genome assemblies
TLDR
It is found that trio-based approaches using HiFi are the current gold standard, although chromosome-wide phasing accuracy is comparable when using Strand-seq instead of parental data, and that 6-7 Mbp of DNA are incorrectly orientated per haplotype irrespective of whether trio-free or trio-based approaches are employed.
Enrichment of centromeric DNA from human cells
TLDR
This work has developed a technique, named CenRICH, to enrich for centromeric DNA from human cells based on selective restriction digestion and size fractionation, and shows that this approach has great potential for making sequencing of centromeric DNA more affordable and efficient and for single DNA molecule studies.
Assembly-free discovery of human novel sequences using long reads
TLDR
This study designed an Assembly-Free Novel Sequence (AF-NS) approach to identify novel sequences from Oxford Nanopore Technology long reads and revealed their association with the binding motifs of transcription factors.