Corpus ID: 88502193

Chapter 2 Sequence Assembly

Xiaoqiu Huang
We describe an efficient method for assembling short reads into long sequences. In this method, a hashing technique is used to compute overlaps between short reads, allowing base mismatches in the overlaps. Then an overlap graph is constructed, with each vertex representing a read and each edge representing an overlap. The overlap graph is explored by graph algorithms to find unique paths of reads representing contigs. The consensus sequence of each contig is constructed by…
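The pipeline outlined in the abstract (hash-based overlap detection with mismatches, an overlap graph, and unique paths as contigs) can be illustrated with a minimal Python sketch. This is not the chapter's implementation: the function names (`find_overlaps`, `unique_paths`, `merge_path`), the parameters (`k`, `min_len`, `max_mismatches`), and the toy reads are all assumptions made for illustration. The sketch seeds each candidate overlap on an exact k-mer hit, so mismatches are tolerated only outside the seed, and it ignores reverse complements and contained reads. The final merge step is a naive layout that keeps the left read's bases in each overlap; it does not attempt the consensus construction that the truncated abstract goes on to describe.

```python
from collections import defaultdict

def find_overlaps(reads, k=4, min_len=6, max_mismatches=1):
    """Return {(i, j): length} where a suffix of reads[i] overlaps a prefix of reads[j]."""
    # Hash every k-mer of every read to the (read, offset) positions where it occurs.
    index = defaultdict(list)
    for i, read in enumerate(reads):
        for pos in range(len(read) - k + 1):
            index[read[pos:pos + k]].append((i, pos))

    overlaps = {}
    for j, read_j in enumerate(reads):
        # Seed: look up the first k-mer of read j (exact match required for the seed).
        for i, pos in index.get(read_j[:k], []):
            if i == j or pos == 0:
                continue  # skip self hits and reads that would be contained in read j
            length = len(reads[i]) - pos  # candidate overlap is reads[i][pos:]
            if length < min_len or length >= len(read_j):
                continue
            # Count mismatches over the whole candidate overlap.
            mismatches = sum(a != b for a, b in zip(reads[i][pos:], read_j))
            if mismatches <= max_mismatches:
                overlaps[(i, j)] = max(length, overlaps.get((i, j), 0))
    return overlaps

def unique_paths(n_reads, overlaps):
    """Chain reads along edges whose source has one successor and whose target has one predecessor."""
    succ, pred = defaultdict(list), defaultdict(list)
    for i, j in overlaps:
        succ[i].append(j)
        pred[j].append(i)

    def unambiguous(i, j):
        return len(succ[i]) == 1 and len(pred[j]) == 1

    paths = []
    for start in range(n_reads):
        # A read reached by an unambiguous edge belongs to the interior of some path.
        if len(pred[start]) == 1 and unambiguous(pred[start][0], start):
            continue
        path = [start]
        while len(succ[path[-1]]) == 1 and unambiguous(path[-1], succ[path[-1]][0]):
            path.append(succ[path[-1]][0])
        paths.append(path)
    return paths  # limitation: reads lying on a pure cycle are not reported

def merge_path(reads, path, overlaps):
    """Naive layout: append the non-overlapping tail of each successive read."""
    contig = reads[path[0]]
    for i, j in zip(path, path[1:]):
        contig += reads[j][overlaps[(i, j)]:]
    return contig

# Toy reads (hypothetical); the second read has one mismatching base inside its overlap with the first.
reads = ["ACGTTGCAAGGC", "CAAGGATTACCG", "TTACCGATAGCTA"]
overlaps = find_overlaps(reads)
paths = unique_paths(len(reads), overlaps)
contigs = [merge_path(reads, p, overlaps) for p in paths]
print(overlaps, paths, contigs)
```

With `max_mismatches=0` the first overlap is rejected and the same reads split into two paths, which shows how the mismatch tolerance affects contig length in this toy setting.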


A contig assembly program based on sensitive detection of fragment overlaps.
An effective computer program for assembling DNA fragments, the contig assembly program (CAP), has been developed. Performance tests of the program on fragment data from genomic sequencing projects produced satisfactory results.
PCAP: a whole-genome assembly program.
The PCAP program has several features to address efficiency and accuracy issues in assembly. Generation of a consensus sequence for a contig is based on an alignment of reads in the contig, in which both base quality values and coverage information are used to determine every consensus base.
Velvet: algorithms for de novo short read assembly using de Bruijn graphs.
Velvet represents a new approach to assembly that can leverage very short reads in combination with read pairs to produce useful assemblies and is in close agreement with simulated results without read-pair information.
CAP3: A DNA sequence assembly program.
The third generation of the CAP sequence assembly program is described. It can clip 5' and 3' low-quality regions of reads and uses forward-reverse constraints to correct assembly errors and link contigs.
Ray: Simultaneous Assembly of Reads from a Mix of High-Throughput Sequencing Technologies
A parallel short-read assembler, called Ray, is described, which has been developed to assemble reads obtained from a combination of sequencing platforms, and its performance is compared to other assemblers on simulated and real datasets.
A sequence assembly and editing program for efficient management of large projects.
We describe a sequence assembly and editing program for managing large and small projects. It is being used to sequence complete cosmids and has substantially reduced the time taken to process the…
An Eulerian path approach to DNA fragment assembly
  • P. Pevzner, Haixu Tang, Michael Waterman
  • Biology, Medicine
  • Proceedings of the National Academy of Sciences of the United States of America
  • 2001
This work abandons the classical “overlap–layout–consensus” approach in favor of a new EULER algorithm that, for the first time, resolves the 20-year-old “repeat problem” in fragment assembly.
Short read fragment assembly of bacterial genomes.
A new Eulerian assembler is presented that generates nearly optimal short read assemblies of bacterial genomes, along with an approach for assembling reads under the popular hybrid protocol, in which short reads and long Sanger-based reads are combined.
Consed: a graphical tool for sequence finishing.
Consed is a finishing tool that attempts to implement principles of shotgun sequencing by using error probabilities from phred and phrap as an objective criterion to guide the entire finishing process.
TIGR Assembler: A New Tool for Assembling Large Shotgun Sequencing Projects
A new approach to assembling large, random shotgun sequencing projects has been developed. The TIGR Assembler overcomes several major obstacles to assembling such projects: the large number…