Learn More
We present a method for spotting words in the wild, i.e., in real images taken in unconstrained environments. Text found in the wild has a surprising range of difficulty. At one end of the spectrum, Optical Character Recognition (OCR) applied to scanned pages of well formatted printed text is one of the most successful applications of computer vision to(More)
Genome-wide patterns of variation across individuals provide a powerful source of data for uncovering the history of migration, range expansion, and adaptation of the human species. However, high-resolution surveys of variation in genotype, haplotype and copy number have generally focused on a small number of population groups. Here we report the analysis(More)
Autism spectrum disorders (ASDs) are childhood neurodevelopmental disorders with complex genetic origins. Previous studies focusing on candidate genes or genomic regions have identified several copy number variations (CNVs) that are associated with an increased risk of ASDs. Here we present the results from a whole-genome CNV study on a cohort of 859 ASD(More)
High-throughput sequencing platforms are generating massive amounts of genetic variation data for diverse genomes, but it remains a challenge to pinpoint a small subset of functionally important variants. To fill these unmet needs, we developed the ANNOVAR tool to annotate single nucleotide variants (SNVs) and insertions/deletions, such as examining their(More)
Published genomewide association (GWA) studies typically analyze and report single-nucleotide polymorphisms (SNPs) and their neighboring genes with the strongest evidence of association (the "most-significant SNPs/genes" approach), while paying little attention to the rest. Borrowing ideas from microarray data analysis, we demonstrate that pathway-based(More)
We introduce novel profile-based string kernels for use with support vector machines (SVMs) for the problems of protein classification and remote homology detection. These kernels use probabilistic profiles, such as those produced by the PSI-BLAST algorithm, to define position-dependent mutation neighborhoods along protein sequences for inexact matching of(More)
The discovery of microRNAs (miRNAs) as a new class of regulators of gene expression has triggered an explosion of research activities, but has left many unanswered questions about how this regulation functions and how it is integrated with other regulatory mechanisms. A number of miRNAs have been found to be present in plasma and other body fluids of humans(More)
We describe a computational protocol for the ARACNE algorithm, an information-theoretic method for identifying transcriptional interactions between gene products using microarray expression profile data. Similar to other algorithms, ARACNE predicts potential functional associations among genes, or novel functions for uncharacterized genes, by identifying(More)