The Sequence of the Human Genome

  title={The Sequence of the Human Genome},
  author={J. Craig Venter and Mark D. Adams and Eugene Wimberly Myers and Peter W. Li and Richard J. Mural and Granger G. Sutton and Hamilton O. Smith and Mark Yandell and Cheryl A. Evans and Robert A. Holt and Jeannine D. Gocayne and Peter G. Amanatides and Richard Ballew and Daniel H. Huson and Jennifer R. Wortman and Qing Zhang and Chinnappa D. Kodira and Xiangqun H. Zheng and Lin Chen and Marian Skupski and G. M. Subramanian and Paul D. Thomas and Jinghui Zhang and George L. Gabor Miklos and Catherine R. Nelson and Samuel E. Broder and Andrew G. Clark and Joe Nadeau and Victor A. McKusick and Norton D. Zinder and Arnold J. Levine and Richard J. Roberts and M I Simon and Carolyn W. Slayman and Michael W. Hunkapiller and Randall A. Bolanos and Arthur L. Delcher and Ian M. Dew and Daniel P. Fasulo and Michael Flanigan and Liliana D. Florea and Aaron L. Halpern and Sridhar Hannenhalli and Saul A. Kravitz and Samuel Levy and Clark M. Mobarry and Knut Reinert and Karin A. Remington and Jane Abu-Threideh and Ellen M. Beasley and Kendra Biddick and Vivien R. Bonazzi and Rhonda Brandon and Michele Cargill and Ishwar Chandramouliswaran and Rosane Charlab and Kabir Chaturvedi and Zuoming Deng and Valentina Di Francesco and Patrick Dunn and Karen Eilbeck and Carlos Evangelista and Andrei E. Gabrielian and Wei Chang Gan and Wangmao Ge and Fangcheng Gong and Zhiping Gu and Ping Guan and Thomas J. Heiman and Maureen E. Higgins and Rui-Ru Ji and Zhaoxi Ke and Karen A. Ketchum and Zhongwu Lai and Yi-Ting Lei and Zhenya Li and Jiayin Li and Yong Liang and Xiaoying Lin and Fu Lu and Gennady V. Merkulov and Natalia Milshina and Helen M. Moore and Ashwinikumar K. Naik and Vaibhav A. Narayan and Beena A Neelam and Deborah R. Nusskern and Douglas B. Rusch and Steven L. Salzberg and Wei Shao and Bixiong Chris Shue and Jingtao Sun and Zhen Y. 
Wang and Aihui Wang and Xin Wang and Jian Wang and Ming Hui Wei and Ron Wides and Chunlin Xiao and Chunhua Yan and Alison Yao and Jane Ye and Ming Zhan and Weiqing Zhang and Hongyu Zhang and Qing Feng Zhao and Lian-rong Zheng and Fei Zhong and Wenyan Zhong and Shiaoping C. Zhu and Shaying Zhao and Dennis A Gilbert and Suzanna Baumhueter and Gene Spier and Christine Jacobson Carter and Anibal Cravchik and Trevor Woodage and Feroze Ali and Huijin An and Aderonke Awe and Danita Baldwin and H. Maria Baden and Mary Barnstead and Ian Barrow and Karen Beeson and Dana Busam and Amy L. Carver and Angela Center and Ming-Ming Cheng and Liz Curry and Steven Danaher and Lionel B. Davenport and Raymond Desilets and Susanne Malchau Dietz and Kristina L. Dodson and Lisa Doup and Steve Ferriera and Neha Garg and Andres Gluecksmann and Britney Hart and Jason Haynes and Charles Haynes and Cheryl R. Heiner and Suzanne Lynn Hladun and Damon Hostin and Jarrett T. Houck and T. Howland and Chinyere Ibegwam and Jeffery E. Johnson and Francis Kalush and Lesley Kline and Shashi Koduru and Amy Love and Felecia Mann and David Scott May and Steven McCawley and Tina C. McIntosh and Ivy McMullen and Mee C. Moy and Linda P Moy and Brian Murphy and Keith A. Nelson and Cynthia Pfannkoch and Eric C. Pratts and Vinita Puri and Hina-Ur-Razaq Qureshi and Matt Reardon and Robert Rodriguez and Yu Hui Rogers and Deanna L. Romblad and Bob Ruhfel and Richard Scott and Cynthia D. Sitter and Michelle Smallwood and Erin Stewart and Renee Strong and Ellen Suh and Reginald William Thomas and Ni Ni Tint and Sukyee Tse and Claire Vech and Gary Wang and J. Gillis Wetter and Sherita M. Williams and Monica S. Williams and Sandra M. Windsor and E. Winn-Deen and Keriellen Wolfe and Jayshree Zaveri and K. Zaveri and Josep F. Abril and Roderic Guig{\'o} and Michael J. Campbell and Kimmen Sjolander and Brian Karlak and Anish Kejariwal and Huaiyu Mi and B. 
Lazareva and Thomas Hatton and Apurva Narechania and Karen Diemer and Anushya Muruganujan and Nan Guo and Shinji Sato and Vineet Bafna and Sorin Istrail and Ross Lippert and Russell Schwartz and Brian P. Walenz and Shibu Yooseph and David R. Allen and Anand Basu and J. W. Baxendale and L. W. Blick and Marcelo Caminha and John Carnes-Stine and Parris Caulk and Yen-Hui Chiang and My D. Coyne and Carl Dahlke and Anne Deslattes Mays and Maria Dombroski and Michael Donnelly and Dale Ely and Shiva Esparham and Carl Fosler and Harold C. Gire and Stephen A. Glanowski and Kenneth Glasser and Anna Glodek and Mark Gorokhov and Kennedy Terence Graham and Barry Gropman and Michael Harris and Jeremy Heil and Scott Henderson and Jeffrey Hoover and Donald Jennings and Catherine Jordan and James Jordan and John R. Kasha and Leonid Kagan and Cheryl L. Kraft and Alexander Levitsky and Mark Lewis and Xiangjun Liu and John Lopez and Daniel S. Ma and William H. Majoros and Joseph Antonio McDaniel and Sean Murphy and Matthew Newman and Trung Nguyen and Ngoc Thanh Nguyen and Marc Nodell and Sue Pan and Jim Peck and Marshall W. Peterson and William L. Rowe and Robert Sanders and John L. Scott and Michael Simpson and Thomas Smith and A. Sprague and Timothy B. Stockwell and Russell J. Turner and Eli Venter and Mei Wang and Meiyuan Wen and David Wu and Mitchell M. Wu and Ashley Xia and Ali Zandieh and Xiaohong Zhu},
  pages={1304 - 1351}
A 2.91-billion base pair (bp) consensus sequence of the euchromatic portion of the human genome was generated by the whole-genome shotgun sequencing method. The 14.8-billion bp DNA sequence was generated over 9 months from 27,271,853 high-quality sequence reads (5.11-fold coverage of the genome) from both ends of plasmid clones made from the DNA of five individuals. Two assembly strategies—a whole-genome assembly and a regional chromosome assembly—were used, each combining sequence data from… 
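The figures quoted in the abstract can be cross-checked with simple arithmetic. The sketch below is only an illustration, not a calculation taken from the paper: it assumes that fold coverage means total sequenced bases divided by genome size, and the function and constant names are invented for this example.

```python
# Figures quoted in the abstract; the names are illustrative, not from the paper.
TOTAL_BASES = 14.8e9      # bp of sequence generated
NUM_READS = 27_271_853    # high-quality sequence reads
FOLD_COVERAGE = 5.11      # stated coverage of the genome
CONSENSUS_BP = 2.91e9     # length of the euchromatic consensus sequence


def mean_read_length(total_bases: float, num_reads: int) -> float:
    """Average number of bases per read."""
    return total_bases / num_reads


def implied_genome_size(total_bases: float, fold_coverage: float) -> float:
    """Genome size implied by the assumption coverage = total bases / genome size."""
    return total_bases / fold_coverage


if __name__ == "__main__":
    read_len = mean_read_length(TOTAL_BASES, NUM_READS)
    genome = implied_genome_size(TOTAL_BASES, FOLD_COVERAGE)
    print(f"mean read length: {read_len:.0f} bp")
    print(f"implied genome size: {genome / 1e9:.2f} Gb")
    print(f"coverage relative to the consensus: {TOTAL_BASES / CONSENSUS_BP:.2f}-fold")
```

Under that assumption, the stated coverage implies a genome size close to, but not exactly equal to, the 2.91-billion bp consensus length. The small difference probably reflects rounding or a slightly different genome-size estimate behind the coverage figure.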

Whole-genome shotgun assembly and comparison of human genome assemblies

The analysis of WGSA shows 97% order and orientation agreement with NCBI Build 34, where most of the 3% of sequence out of order is due to scaffold placement problems as opposed to assembly errors within the scaffolds themselves.

Computational comparison of human genomic sequence assemblies for a region of chromosome 4.

It is shown that even in a problematic region, existing software tools can be used with high-quality mapping data to produce genomic sequence contigs with a low rate of rearrangements.

Assembly of a pan-genome from deep sequencing of 910 humans of African descent

A deeply sequenced dataset of 910 individuals, all of African descent, is used to construct a set of DNA sequences that is present in these individuals but missing from the reference human genome, demonstrating that the African pan-genome contains ~10% more DNA than the current human reference genome.

A Comparison of Whole-Genome Shotgun-Derived Mouse Chromosome 16 and the Human Genome

Comparison of the structure and protein-coding potential of Mmu 16 with that of the homologous segments of the human genome identifies regions of conserved synteny with human chromosomes (Hsa) 3, 8, 12, 16, 21, and 22.

Assembly and compositional analysis of human genomic data

A dynamic-programming based approach to sequence assembly validation and detection of large-scale polymorphisms within a population that is made possible through the availability of large human sequence contigs is discussed.

Human Genome Sequencing Using Unchained Base Reads on Self-Assembling DNA Nanoarrays

A genome sequencing platform that achieves efficient imaging and low reagent consumption with combinatorial probe anchor ligation chemistry to independently assay each base from patterned nanoarrays of self-assembling DNA nanoballs is described.

On the sequencing of the human genome

Computational analysis finds that a perfect tiling path with 2-fold coverage is sufficient to recover virtually the entirety of a genome assembly and concludes that the assembly primarily depended on the HGP's sequence-tagged site maps, BAC maps, and clone-based sequences.

NotI flanking sequences: a tool for gene discovery and verification of the human genome.

The results of the study suggest that the existing tools for computational determination of CpG islands fail to identify a significant fraction of functional CpG islands, and unmethylated DNA stretches with a high frequency of CpG dinucleotides can be found even in regions with low CG content.

Comparing vertebrate whole-genome shotgun reads to the human genome.

Methods for using cross-species whole-genome shotgun (WGS) sequence for genome annotation are described and shown to give a 23-fold enrichment for coding regions compared with noncoding regions in the human genome.

Sequence and structural variation in a human genome uncovered by short-read, massively parallel ligation sequencing using two-base encoding.

Dozens of mutations previously described in OMIM and hundreds of nonsynonymous single-nucleotide and structural variants in genes previously implicated in disease are identified in this individual.



A High-Resolution Radiation Hybrid Map of the Human Genome Draft Sequence

We have constructed a physical map of the human genome by using a panel of 90 whole-genome radiation hybrids (the TNG panel) in conjunction with 40,322 sequence-tagged sites (STSs) derived from

Overlapping genomic sequences: a treasure trove of single-nucleotide polymorphisms.

The results clearly indicate that developing SNP markers from overlapping genomic sequence is highly efficient and cost effective, requiring only the two simple steps of developing STSs around the known SNPs and characterizing them in the appropriate populations.

The DNA sequence of human chromosome 22

The sequence of the euchromatic part of human chromosome 22 is reported, which consists of 12 contiguous segments spanning 33.4 megabases, contains at least 545 genes and 134 pseudogenes, and provides the first view of the complex chromosomal landscapes that will be found in the rest of the genome.

Is Whole Human Genome Sequencing Feasible?

One may think of the shotgun approach as delivering a collection of R reads that constitute a random sample of contiguous subsequences of the source sequence, each of length approximately \(\bar{L}_R\).

Human whole-genome shotgun sequencing.

This article outlines an alternative approach to sequencing the human and other large genomes, which it is argued is less costly and more informative than the clone-by-clone approach.

The mosaic structure of human pericentromeric DNA: a strategy for characterizing complex regions of the human genome.

Comparisons with existing sequence and physical maps for the human genome suggest that many of these BACs map to regions of the genome with sequence gaps, which indicates that large portions of pericentromeric DNA are virtually devoid of unique sequences.

Human BAC ends quality assessment and sequence analyses.

The annotation results of BESs for the contents of available genomic sequences, sequence tagged sites, expressed sequence tags, protein encoding regions, and repeats indicate that this resource will be valuable in many areas of genome research.

Complementary DNA sequencing: expressed sequence tags and human genome project

Automated partial DNA sequencing was conducted on more than 600 randomly selected human brain complementary DNA (cDNA) clones to generate expressed sequence tags (ESTs), which will facilitate the tagging of most human genes in a few years at a fraction of the cost of complete genomic sequencing.

Against a whole-genome shotgun.

It is argued here that the whole-genome shotgun proposed by Weber and Myers satisfies neither the high probability of success nor the decreased cost of any such approach.

An SNP map of the human genome generated by reduced representation shotgun sequencing

A simple but powerful method for creating SNP maps, called reduced representation shotgun (RRS) sequencing, is described; it facilitates the rapid, inexpensive construction of SNP maps in biomedically and agriculturally important species.