Learn More
The GENCODE Consortium aims to identify all gene features in the human genome using a combination of computational analysis, manual annotation, and experimental validation. Since the first public release of this annotation data set, few new protein-coding loci have been added, yet the number of alternative splicing transcripts annotated has steadily(More)
Eukaryotic cells make many types of primary and processed RNAs that are found either in specific subcellular compartments or throughout the cells. A complete catalogue of these RNAs is not yet available and their characteristic subcellular localizations are also poorly understood. Because RNA represents the direct output of the genetic information encoded(More)
While the long noncoding RNAs (ncRNAs) constitute a large portion of the mammalian transcriptome, their biological functions has remained elusive. A few long ncRNAs that have been studied in any detail silence gene expression in processes such as X-inactivation and imprinting. We used a GENCODE annotation of the human genome to characterize over a thousand(More)
We present a fast mapping-based algorithm to compute the mappability of each region of a reference genome up to a specified number of mismatches. Knowing the mappability of a genome is crucial for the interpretation of massively parallel sequencing experiments. We investigate the properties of the mappability of eukaryotic DNA/RNA both as a whole and at the(More)
UNLABELLED AutoGRAPH is an interactive web server for automatic multi-species comparative genomics analyses based on personal datasets or pre-inserted public datasets. This program automatically identifies conserved segments (CS) and breakpoint regions, assesses the conservation of marker/gene order between organisms, constructs synteny maps for two to(More)
The extraordinary phenotypic diversity of dog breeds has been sculpted by a unique population history accompanied by selection for novel and desirable traits. Here we perform a comprehensive analysis using multiple test statistics to identify regions under selection in 509 dogs from 46 diverse breeds using a newly developed high-density genotyping array(More)
Alternative splicing (AS) has the potential to greatly expand the functional repertoire of mammalian transcriptomes. However, few variant transcripts have been characterized functionally, making it difficult to assess the contribution of AS to the generation of phenotypic complexity and to study the evolution of splicing patterns. We have compared the AS of(More)
The transcriptome of a cell is represented by a myriad of different RNA molecules with and without protein-coding capacities. In recent years, advances in sequencing technologies have allowed researchers to more fully appreciate the complexity of whole transcriptomes, showing that the vast majority of the genome is transcribed, producing a diverse(More)
MOTIVATION Genome maps are fundamental to the study of an organism and essential in the process of genome sequencing which in turn provides the ultimate map of the genome. The increased number of genomes being sequenced offers new opportunities for the mapping of closely related organisms. We propose here an algorithmic formalization of a genome comparison(More)
By limiting sequencing to those sequences transcribed as mRNA, whole exome sequencing is a cost-efficient technique often used in disease-association studies. We developed two target enrichment designs based on the recently released annotation of the canine genome: the exome-plus design and the exome-CDS design. The exome-plus design combines the exons of(More)