• Publications
GENCODE: the reference human genome annotation for The ENCODE Project.
This work examined the completeness of the transcript annotation and found that 35% of transcriptional start sites are supported by CAGE clusters, 62% of protein-coding genes have annotated polyA sites, and over one-third of GENCODE protein-coding genes are supported by peptide hits derived from mass spectrometry spectra submitted to Peptide Atlas.
Initial sequencing and comparative analysis of the mouse genome.
The results of an international collaboration to produce a high-quality draft sequence of the mouse genome are reported, and an initial comparative analysis of the mouse and human genomes is presented, describing some of the insights that can be gleaned from the two sequences.
Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project
Functional data from multiple, diverse experiments performed on a targeted 1% of the human genome as part of the pilot phase of the ENCODE Project are reported, providing convincing evidence that the genome is pervasively transcribed, such that the majority of its bases can be found in primary transcripts.
Evolution of genes and genomes on the Drosophila phylogeny
These genome sequences augment the formidable genetic tools that have made Drosophila melanogaster a pre-eminent model for animal genetics, and will further catalyse fundamental research on mechanisms of development, cell biology, genetics, disease, neurobiology, behaviour, physiology and evolution.
The Genome Sequence of Caenorhabditis briggsae: A Platform for Comparative Genomics
Comparisons of the two genomes show extensive colinearity, and the rate of divergence appears to be higher in the chromosomal arms than in the centers, findings that will help to understand the evolutionary forces that mold nematode genomes.
The ENCODE (ENCyclopedia Of DNA Elements) Project
The ENCyclopedia Of DNA Elements (ENCODE) Project is organized as an international consortium of computational and laboratory-based scientists working to develop and apply high-throughput approaches for detecting all sequence elements that confer biological function.
Genome sequence of the Brown Norway rat yields insights into mammalian evolution
The first comprehensive analysis of the genome sequence of the Brown Norway (BN) rat strain, the third complete mammalian genome to be deciphered, is reported, and three-way comparisons with the human and mouse genomes resolve details of mammalian evolution.
Sequence and comparative analysis of the chicken genome provide unique perspectives on vertebrate evolution
A draft genome sequence of the red jungle fowl, Gallus gallus, provides a new perspective on vertebrate genome evolution, while also improving the annotation of mammalian genomes.
An Efficient, Probabilistically Sound Algorithm for Segmentation and Word Discovery
  • M. Brent
  • Computer Science
  • Machine Learning
  • 1 February 1999
Experiments on phonemic transcripts of spontaneous speech by parents to young children suggest that the model-based, unsupervised algorithm for recovering word boundaries in a natural-language text from which they have been deleted is more effective than other proposed algorithms, at least when utterance boundaries are given and the text includes a substantial number of short utterances.
Identification of Functional Elements and Regulatory Circuits by Drosophila modENCODE
Two studies identified highly occupied target (HOT) regions of the nematode and fly genomes, where DNA was bound by more than 15 of the transcription factors analyzed, and characterized the expression of related genes.