Learn More
Crohn's disease and ulcerative colitis, the two common forms of inflammatory bowel disease (IBD), affect over 2.5 million people of European ancestry, with rising prevalence in other populations. Genome-wide association studies and subsequent meta-analyses of these two diseases as separate phenotypes have implicated previously unsuspected mechanisms, such(More)
We develop a statistical tool SNVer for calling common and rare variants in analysis of pooled or individual next-generation sequencing (NGS) data. We formulate variant calling as a hypothesis testing problem and employ a binomial-binomial model to test the significance of observed allele frequency against sequencing error. SNVer reports one single overall(More)
MOTIVATION Identification of a transcription factor binding sites is an important aspect of the analysis of genetic regulation. Many programs have been developed for the de novo discovery of a binding motif (collection of binding sites). Recently, a scoring function formulation was derived that allows for the comparison of discovered motifs from different(More)
MOTIVATION Genome-wide association studies (GWAS) interrogate common genetic variation across the entire human genome in an unbiased manner and hold promise in identifying genetic variants with moderate or weak effect sizes. However, conventional testing procedures, which are mostly P-value based, ignore the dependency and therefore suffer from loss of(More)
MOTIVATION A central problem in genomic research is the identification of genes and pathways involved in diseases and other biological processes. The genes identified or the univariate test statistics are often linked to known biological pathways through gene set enrichment analysis in order to identify the pathways involved. However, most of the procedures(More)
Autoimmune diseases (AIDs) are polygenic diseases affecting 7-10% of the population in the Western Hemisphere with few effective therapies. Here, we quantify the heritability of paediatric AIDs (pAIDs), including JIA, SLE, CEL, T1D, UC, CD, PS, SPA and CVID, attributable to common genomic variations (SNP-h(2)). SNP-h(2) estimates are most significant for(More)
To facilitate the clinical implementation of genomic medicine by next-generation sequencing, it will be critically important to obtain accurate and consistent variant calls on personal genomes. Multiple software tools for variant calling are available, but it is unclear how comparable these tools are or what their relative merits in real-world scenarios(More)
Genome-wide association studies (GWAS) have been fruitful in identifying disease susceptibility loci for common and complex diseases. A remaining question is whether we can quantify individual disease risk based on genotype data, in order to facilitate personalized prevention and treatment for complex diseases. Previous studies have typically failed to(More)
The motor neuron (MN) degenerative disease, spinal muscular atrophy (SMA) is caused by deficiency of SMN (survival motor neuron), a ubiquitous and indispensable protein essential for biogenesis of snRNPs, key components of pre-mRNA processing. However, SMA's hallmark MN pathology, including neuromuscular junction (NMJ) disruption and sensory-motor circuitry(More)
High-throughout genomic data provide an opportunity for identifying pathways and genes that are related to various clinical phenotypes. Besides these genomic data, another valuable source of data is the biological knowledge about genes and pathways that might be related to the phenotypes of many complex diseases. Databases of such knowledge are often called(More)