Learn More
We report the generation and analysis of functional data from multiple, diverse experiments performed on a targeted 1% of the human genome as part of the pilot phase of the ENCODE Project. These data have been further integrated and augmented by a number of evolutionary and computational analyses. Together, our results advance the collective knowledge about(More)
Eukaryotic cells make many types of primary and processed RNAs that are found either in specific subcellular compartments or throughout the cells. A complete catalogue of these RNAs is not yet available and their characteristic subcellular localizations are also poorly understood. Because RNA represents the direct output of the genetic information encoded(More)
To understand the biology and evolution of ruminants, the cattle genome was sequenced to about sevenfold coverage. The cattle genome contains a minimum of 22,000 genes, with a core set of 14,345 orthologs shared among seven mammalian species of which 1217 are absent or undetected in noneutherian (marsupial or monotreme) genomes. Cattle-specific evolutionary(More)
BACKGROUND Vertebrate odorant receptors comprise at least three types of G protein-coupled receptors (GPCRs): the OR, V1R, and V2R/V2R-like receptors, the latter group belonging to the C family of GPCRs. These receptor families are thought to receive chemosensory information from a wide spectrum of odorant and pheromonal cues that influence critical animal(More)
U12-type introns are spliced by the U12-dependent spliceosome and are present in the genomes of many higher eukaryotic lineages including plants, chordates and some invertebrates. However, due to their relatively recent discovery and a systematic bias against recognition of non-canonical splice sites in general, the introns defined by U12-type splice sites(More)
The vast majority of the biology of a newly sequenced genome is inferred from the set of encoded proteins. Predicting this set is therefore invariably the first step after the completion of the genome DNA sequence. Here we review the main computational pipelines used to generate the human reference protein-coding gene sets.
Alternative splicing (AS) has the potential to greatly expand the functional repertoire of mammalian transcriptomes. However, few variant transcripts have been characterized functionally, making it difficult to assess the contribution of AS to the generation of phenotypic complexity and to study the evolution of splicing patterns. We have compared the AS of(More)
Legumes are the third largest family of angiosperms and the second most important crop class. Legume genomes have been shaped by extensive large-scale gene duplications, including an approximately 58 million year old whole genome duplication shared by most crop legumes. We report the genome and the transcription atlas of coding and non-coding genes of a(More)
Eukaryotic cells make many types of primary and processed RNAs that are found either in specific subcellular compartments or throughout the cells. A complete catalogue of these RNAs is not yet available and their characteristic subcellular localizations are also poorly understood. Because RNA represents the direct output of the genetic information encoded(More)