We develop a statistical tool, SNVer, for calling common and rare variants in the analysis of pooled or individual next-generation sequencing (NGS) data. We formulate variant calling as a hypothesis testing problem and employ a binomial-binomial model to test the significance of the observed allele frequency against sequencing error. SNVer reports one single overall…
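As a rough illustration of testing an observed allele count against sequencing error, the sketch below computes a one-sided binomial tail p-value at a single site. It is a deliberately simplified single-binomial test, not SNVer's binomial-binomial model; the function name `binomial_tail_pvalue` and the fixed per-base `error_rate` are assumptions made for this example only.

```python
from math import comb


def binomial_tail_pvalue(alt_reads: int, depth: int, error_rate: float) -> float:
    """Return P(X >= alt_reads) for X ~ Binomial(depth, error_rate).

    Null hypothesis (assumed for this sketch): every non-reference read at the
    site is a sequencing error occurring independently with probability
    `error_rate`. A small p-value suggests the site may carry a real variant.
    """
    if depth < 0 or not 0 <= alt_reads <= depth:
        raise ValueError("require 0 <= alt_reads <= depth")
    if not 0.0 <= error_rate <= 1.0:
        raise ValueError("error_rate must be in [0, 1]")
    # Sum the binomial probability mass from alt_reads up to depth.
    return sum(
        comb(depth, k) * error_rate**k * (1.0 - error_rate) ** (depth - k)
        for k in range(alt_reads, depth + 1)
    )


if __name__ == "__main__":
    # Hypothetical site: 8 alternate reads out of 100, assumed 1% error rate.
    print(binomial_tail_pvalue(8, 100, 0.01))
```

A single alternate read at the same depth would be unremarkable under this null, while eight would be hard to attribute to error alone. Pooled samples and per-base quality scores would need a richer model than this sketch.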
Congenital heart disease (CHD) is the most frequent birth defect, affecting 0.8% of live births. Many cases occur sporadically and impair reproductive fitness, suggesting a role for de novo mutations. Here we compare the incidence of de novo mutations in 362 severe CHD cases and 264 controls by analysing exome sequencing of parent-offspring trios. CHD cases…
Eukaryotes have two types of spliceosomes, composed of either major (U1, U2, U4, U5, U6) or minor (U11, U12, U4atac, U6atac; <1%) snRNPs. The high conservation of minor introns, typically one amidst many major introns in several hundred genes, despite their poor splicing, has been a long-standing enigma. Here, we discovered that the low abundance minor…
To facilitate the clinical implementation of genomic medicine by next-generation sequencing, it will be critically important to obtain accurate and consistent variant calls on personal genomes. Multiple software tools for variant calling are available, but it is unclear how comparable these tools are or what their relative merits in real-world scenarios…
The motor neuron (MN) degenerative disease, spinal muscular atrophy (SMA), is caused by deficiency of SMN (survival motor neuron), a ubiquitous and indispensable protein essential for biogenesis of snRNPs, key components of pre-mRNA processing. However, SMA's hallmark MN pathology, including neuromuscular junction (NMJ) disruption and sensory-motor circuitry…
BACKGROUND: Gene expression studies of peripheral blood mononuclear cells from patients with systemic lupus erythematosus (SLE) have demonstrated a type I interferon signature and increased expression of inflammatory cytokine genes. Studies of patients with Aicardi-Goutières syndrome, commonly cited as a single gene model for SLE, have suggested that…
MOTIVATION: Next-generation RNA sequencing offers an opportunity to investigate the transcriptome at an unprecedented scale. Recent studies have revealed widespread alternative polyadenylation (polyA) in eukaryotes, leading to various mRNA isoforms differing in their 3' untranslated regions (3'UTR), through which the stability, localization and translation of…
Advances in next-generation sequencing technology have made it possible to comprehensively interrogate the entire spectrum of genomic variations including rare variants. They may help capture the remaining genetic heritability which has not been fully explained by previous genome-wide association studies. Here we performed a gene-based genome-wide scan to…
We consider the problem of identifying disease-associated genomic regions in genome-wide association studies (GWAS). It is shown that conventional single-SNP analysis can be greatly improved by (i) exploiting the spatial dependency and (ii) conducting set-wise analysis. The SNP set association problem can be conceptualized as the problem of simultaneously…