William L. Trimble

MG-RAST (http://metagenomics.anl.gov) is an open-submission data portal for processing, analyzing, sharing and disseminating metagenomic datasets. The system currently hosts over 200,000 datasets and is continuously updated. The volume of submissions has increased 4-fold over the past 24 months, now averaging 4 terabasepairs per month. In addition to…
We provide a novel method, DRISEE (duplicate read inferred sequencing error estimation), to assess sequencing quality (alternatively referred to as "noise" or "error") within and/or between sequencing samples. DRISEE provides positional error estimates that can be used to inform read trimming within a sample. It also provides global (whole sample) error…
We present the genomic sequence of the human pathogen Legionella pneumophila serogroup 12 strain 570-CO-H (ATCC 43290), a clinical isolate from the Colorado Department of Health, Denver, CO. This is the first example of a genome sequence of L. pneumophila from a serogroup other than serogroup 1. We highlight the similarities and differences relative to six…
Gene prediction algorithms (or gene callers) are essential tools for analyzing shotgun nucleic acid sequence data. Gene prediction is a ubiquitous step in sequence analysis pipelines; it reduces the volume of data by identifying the most likely reading frame for a fragment, permitting the out-of-frame translations to be ignored. In this study we evaluate…
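To illustrate what "identifying the most likely reading frame" means, here is a minimal Python sketch that translates a fragment in all six frames and keeps the frame with the longest stop-free stretch. The longest-open-run rule is a deliberately naive stand-in chosen for illustration; it is not the method of any gene caller evaluated in the study, and the function names (`six_frames`, `best_frame`) are hypothetical.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq):
    """Return the reverse complement of an uppercase ACGT string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons; codons with unknown bases become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frames(seq):
    """Map (strand, offset) to the translation of that reading frame."""
    frames = {}
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[(strand, offset)] = translate(s[offset:])
    return frames


def best_frame(seq):
    """Naive heuristic (an assumption of this sketch): pick the frame
    whose translation has the longest run without a stop codon."""
    frames = six_frames(seq)

    def longest_open_run(protein):
        return max(len(part) for part in protein.split("*"))

    return max(frames, key=lambda f: longest_open_run(frames[f]))


fragment = "ATGGCTTTAAAAGGTGAACTGCGT"
print(best_frame(fragment), six_frames(fragment)[best_frame(fragment)])
```

Once a frame is chosen, the other five translations can be dropped, which is the data reduction the abstract refers to. Real gene callers use statistical models of coding sequence rather than run length alone.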
Halomonas strain GFAJ-1 was reported in Science magazine to be a remarkable microbe for which there was "arsenate in macromolecules that normally contain phosphate, most notably nucleic acids." The draft genome of the bacterium was determined (NCBI accession numbers AHBC01000001 through AHBC01000103). It appears to be a typical gammaproteobacterium.
The democratized world of sequencing is leading to numerous data analysis challenges; MG-RAST addresses many of these challenges for diverse datasets, including amplicon datasets, shotgun metagenomes, and metatranscriptomes. The changes from version 2 to version 3 include the addition of a dedicated gene calling stage using FragGenescan, clustering of…
We reconstructed the complete 2.4 Mb genome of a previously uncultivated epsilonproteobacterium, Candidatus Sulfuricurvum sp. RIFRC-1, via assembly of short-read shotgun metagenomic data using a complexity reduction approach. Genome-based comparisons indicate the bacterium is a novel species within the Sulfuricurvum genus, which contains one cultivated…
The numerous classes of repeats often impede the assembly of genome sequences from the short reads provided by new sequencing technologies. We demonstrate a simple and rapid means to ascertain the repeat structure and total size of a bacterial or archaeal genome without the need for assembly by directly analyzing the abundances of distinct k-mers among…
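As a rough illustration of analyzing k-mer abundances without assembly, the Python sketch below counts k-mers across reads, builds the abundance spectrum (how many distinct k-mers occur at each abundance), and estimates genome size as total k-mer occurrences divided by the modal abundance. This is a generic textbook-style estimate written under simplifying assumptions (no reverse-complement canonicalization, a hypothetical `min_abundance` cutoff to discard likely error k-mers); it is not claimed to be the procedure used in the paper.

```python
from collections import Counter


def kmer_counts(reads, k):
    """Count every k-mer occurrence across all reads (forward strand only)."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def abundance_spectrum(counts):
    """Map abundance -> number of distinct k-mers seen that many times."""
    return Counter(counts.values())


def estimate_genome_size(counts, min_abundance=2):
    """Estimate genome size (in k-mers) as total occurrences / modal abundance.

    k-mers below min_abundance are treated as probable sequencing errors
    and ignored; the cutoff is an assumption of this sketch.
    """
    spectrum = abundance_spectrum(counts)
    solid = {a: n for a, n in spectrum.items() if a >= min_abundance}
    if not solid:
        raise ValueError("no k-mers at or above min_abundance")
    # Modal abundance approximates the coverage of single-copy sequence.
    peak = max(solid, key=lambda a: (solid[a], a))
    total = sum(a * n for a, n in solid.items())
    return total / peak


# Toy example: one 20-base "genome" sampled five times in full.
reads = ["ACGTTGCAAGGCTTACCGAT"] * 5
counts = kmer_counts(reads, 4)
print(dict(abundance_spectrum(counts)), estimate_genome_size(counts))
```

In such a spectrum, k-mers from repeated regions appear at multiples of the modal abundance, which is how repeat structure becomes visible without assembling the reads.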
We report a draft genome of Alcaligenes faecalis subsp. faecalis NCIB 8687, the betaproteobacterium from which the structure of arsenite oxidase was solved and in which the first "arsenate gene island" was identified. The draft genome of this opportunistic pathogen species comprises 3.9 Mb in 186 contigs, with the largest 15 contigs accounting for 90% of the total.