Learn More
Genomic structural variants (SVs) are abundant in humans, differing from other forms of variation in extent, origin and functional impact. Despite progress in SV characterization, the nucleotide resolution architecture of most SVs remains unknown. We constructed a map of unbalanced SVs (that is, copy number variants) based on whole genome DNA sequencing(More)
Protein-protein interaction (PPI) networks of many organisms share global topological features such as degree distribution, k-hop reachability, betweenness and closeness. Yet, some of these networks can differ significantly from the others in terms of local structures: e.g. the number of specific network motifs can vary significantly among PPI networks.(More)
UNLABELLED Recent years have witnessed an increase in research activity for the detection of structural variants (SVs) and their association to human disease. The advent of next-generation sequencing technologies make it possible to extend the scope of structural variation studies to a point previously unimaginable as exemplified by the 1000 Genomes(More)
Human genomes are now being rapidly sequenced, but not all forms of genetic variation are routinely characterized. In this study, we focus on Alu retrotransposition events and seek to characterize differences in the pattern of mobile insertion between individuals based on the analysis of eight human genomes sequenced using next-generation sequencing.(More)
MOTIVATION Comrad is a novel algorithmic framework for the integrated analysis of RNA-Seq and whole genome shotgun sequencing (WGSS) data for the purposes of discovering genomic rearrangements and aberrant transcripts. The Comrad framework leverages the advantages of both RNA-Seq and WGSS data, providing accurate classification of rearrangements as(More)
MOTIVATION High-throughput sequencing of tumor samples has shown that most tumors exhibit extensive intra-tumor heterogeneity, with multiple subpopulations of tumor cells containing different somatic mutations. Recent studies have quantified this intra-tumor heterogeneity by clustering mutations into subpopulations according to the observed counts of DNA(More)
Somatic variants can be used as lineage markers for the phylogenetic reconstruction of cancer evolution. Since somatic phylogenetics is complicated by sample heterogeneity, novel specialized tree-building methods are required for cancer phylogeny reconstruction. We present LICHeE (Lineage Inference for Cancer Heterogeneity and Evolution), a novel method(More)
BACKGROUND Non-coding RNAs (ncRNAs) have important functional roles in the cell: for example, they regulate gene expression by means of establishing stable joint structures with target mRNAs via complementary sequence motifs. Sequence motifs are also important determinants of the structure of ncRNAs. Although ncRNAs are abundant, discovering novel ncRNAs on(More)
MOTIVATION Accurately predicting and genotyping indels longer than 30 bp has remained a central challenge in next-generation sequencing (NGS) studies. While indels of up to 30 bp are reliably processed by standard read aligners and the Genome Analysis Toolkit (GATK), longer indels have still resisted proper treatment. Also, discovering and genotyping longer(More)