Learn More
A 2.91-billion base pair (bp) consensus sequence of the euchromatic portion of the human genome was generated by the whole-genome shotgun sequencing method. The 14.8-billion bp DNA sequence was generated over 9 months from 27,271,853 high-quality sequence reads (5.11-fold coverage of the genome) from both ends of plasmid clones made from the DNA of five(More)
Anopheles gambiae is the principal vector of malaria, a disease that afflicts more than 500 million people and causes more than 1 million deaths each year. Tenfold shotgun sequence coverage was obtained from the PEST strain of A. gambiae and assembled into scaffolds that span 278 million base pairs. A total of 91% of the genome was organized in 303(More)
The ciliate Tetrahymena thermophila is a model organism for molecular and cellular biology. Like other ciliates, this species has separate germline and soma functions that are embodied by distinct nuclei within a single cell. The germline-like micronucleus (MIC) has its genome held in reserve for sexual reproduction. The soma-like macronucleus (MAC), which(More)
Aspergillus fumigatus is exceptional among microorganisms in being both a primary and opportunistic pathogen as well as a major allergen. Its conidia production is prolific, and so human respiratory tract exposure is almost constant. A. fumigatus is isolated from human habitats and vegetable compost heaps. In immunocompromised individuals, the incidence of(More)
UNLABELLED We describe two new Generalized Hidden Markov Model implementations for ab initio eukaryotic gene prediction. The C/C++ source code for both is available as open source and is highly reusable due to their modular and extensible architectures. Unlike most of the currently available gene-finders, the programs are re-trainable by the end user. They(More)
The ENCyclopedia Of DNA Elements (ENCODE) Project aims to identify all functional elements in the human genome sequence. The pilot phase of the Project is focused on a specified 30 megabases (È1%) of the human genome sequence and is organized as an international consortium of computational and laboratory-based scientists working to develop and apply(More)
BACKGROUND Transcription initiation is a key component in the regulation of gene expression. mRNA 5' full-length sequencing techniques have enhanced our understanding of mammalian transcription start sites (TSSs), revealing different initiation patterns on a genomic scale. RESULTS To identify TSSs in Drosophila melanogaster, we applied a hierarchical(More)
BACKGROUND Predicting complete protein-coding genes in human DNA remains a significant challenge. Though a number of promising approaches have been investigated, an ideal suite of tools has yet to emerge that can provide near perfect levels of sensitivity and specificity at the level of whole genes. As an incremental step in this direction, it is hoped that(More)
The high degree of similarity between the mouse and human genomes is demonstrated through analysis of the sequence of mouse chromosome 16 (Mmu 16), which was obtained as part of a whole-genome shotgun assembly of the mouse genome. The mouse genome is about 10% smaller than the human genome, owing to a lower repetitive DNA content. Comparison of the(More)
All metazoan eukaryotes express microRNAs (miRNAs), roughly 22-nucleotide regulatory RNAs that can repress the expression of messenger RNAs bearing complementary sequences. Several DNA viruses also express miRNAs in infected cells, suggesting a role in viral replication and pathogenesis. Although specific viral miRNAs have been shown to autoregulate viral(More)