Biological sequences are composed of long strings of alphabetic letters rather than arrays of numerical values. Lack of a natural underlying metric for comparing such alphabetic data significantly inhibits sophisticated statistical analyses of sequences, modeling structural and functional aspects of proteins, and related problems. Herein, we use…
BACKGROUND: Experimental designs that take advantage of high-throughput sequencing to generate datasets include RNA sequencing (RNA-seq), chromatin immunoprecipitation sequencing (ChIP-seq), sequencing of 16S rRNA gene fragments, metagenomic analysis and selective growth experiments. In each case the underlying data are similar and are composed of counts of…
BACKGROUND: Bacterial vaginosis (BV), the most common vaginal condition of reproductive-aged women, is associated with a highly diverse and heterogeneous microbiota. Here we present a proof-of-principle analysis to uncover the function of the microbiota using meta-RNA-seq to uncover genes and pathways that potentially differentiate healthy vaginal microbial…
BACKGROUND: Women living with HIV and co-infected with bacterial vaginosis (BV) are at higher risk for transmitting HIV to a partner or newborn. It is poorly understood which bacterial communities constitute BV or the normal vaginal microbiota among this population and how the microbiota associated with BV responds to antibiotic treatment. METHODS AND…
Accurate identification of specific groups of proteins by their amino acid sequence is an important goal in genome research. Here we combine information theory with fuzzy logic search procedures to identify sequence signatures or predictive motifs for members of the Myc-Max-Mad transcription factor network. Myc is a well-known oncoprotein, and this family…
We developed a low-cost, high-throughput microbiome profiling method that uses combinatorial sequence tags attached to PCR primers that amplify the rRNA V6 region. Amplified PCR products are sequenced using an Illumina paired-end protocol to generate millions of overlapping reads. Combinatorial sequence tagging can be used to examine hundreds of samples…
Assisted reproductive technologies (ARTs) are becoming increasingly prevalent and are generally considered to be safe medical procedures. However, evidence indicates that embryo culture may adversely affect the developmental potential and overall health of the embryo. One of the least studied but most important areas in this regard is the effects of embryo…
BACKGROUND: There is currently no way to verify the quality of a multiple sequence alignment that is independent of the assumptions used to build it. Sequence alignments are typically evaluated by a number of established criteria: sequence conservation, the number of aligned residues, the frequency of gaps, and the probable correct gap placement. Covariation…
The main drawback of most cancer chemotherapy is its relatively low ability to target tumour cells versus normal cells. As a consequence, chemotherapy is usually connected with severe side effects due to the toxicity of traditional cytostatic agents towards normal tissues. A few years ago, the site-specific activation of non-toxic prodrugs in tumours has…
Experimental variance is a major challenge when dealing with high-throughput sequencing data. This variance has several sources: sampling replication, technical replication, variability within biological conditions, and variability between biological conditions. The high per-sample cost of RNA-Seq often precludes the large number of experiments needed to…