The complete genome of Rhodococcus sp. RHA1 provides insights into a catabolic powerhouse
TLDR
Overall, RHA1 appears to have evolved to simultaneously catabolize a diverse range of plant-derived compounds in an O2-rich environment and is established as an important model for studying actinomycete physiology.
Fusobacterium nucleatum infection is prevalent in human colorectal carcinoma.
TLDR
Overabundance of Fusobacterium sequences in tumor versus matched normal control tissue is verified by quantitative PCR analysis from a total of 99 subjects, and a positive association with lymph node metastasis is observed.
Assembling millions of short DNA sequences using SSAKE
TLDR
SSAKE is a tool for aggressively assembling millions of short nucleotide sequences by progressively searching a prefix tree for the longest possible overlap between any two sequences. It leverages the information in short sequence reads by stringently assembling them into contiguous sequences that can be used to characterize novel sequencing targets.
ABySS 2.0: resource-efficient assembly of large genomes using a Bloom filter.
TLDR
ABySS 2.0 implements algorithms that employ a Bloom filter, a probabilistic data structure, to represent a de Bruijn graph and reduce memory requirements. It is benchmarked using a Genome in a Bottle data set of 250-bp Illumina paired-end and 6-kbp mate-pair libraries from a single individual.
Mutational evolution in a lobular breast tumour profiled at single nucleotide resolution
TLDR
The data show that single nucleotide mutational heterogeneity can be a property of low or intermediate grade primary breast cancers and that significant evolution can occur with disease progression. Two new RNA-editing events that recode the amino acid sequence of SRP9 and COG3 are also revealed.
NanoSim: nanopore sequence read simulator based on statistical characterization
TLDR
NanoSim is a fast and scalable read simulator that captures the technology-specific features of ONT data and can be adjusted as nanopore sequencing technology improves. It is expected to play an enabling role in the field and to benefit the development of scalable next-generation sequencing technologies for long nanopore reads.
LINKS: Scalable, alignment-free scaffolding of draft genomes with long reads
TLDR
This work presents LINKS, the Long Interval Nucleotide K-mer Scaffolder algorithm, a method that uses the sequence properties of nanopore and other error-containing sequence data to scaffold high-quality genome assemblies without the need for read alignment or base correction.
Co-occurrence of anaerobic bacteria in colorectal carcinomas
TLDR
A polymicrobial signature of Gram-negative anaerobic bacteria is associated with colorectal carcinoma tissue and with over-expression of numerous host genes, including the gene encoding the pro-inflammatory chemokine Interleukin-8.
Complete haplotype sequence of the human immunoglobulin heavy-chain variable, diversity, and joining genes and characterization of allelic and copy-number variation.
TLDR
This work presents the most complete haplotype of IGHV, IGHD, and IGHJ gene regions derived from a single chromosome, representing an alternate assembly of ∼1 Mbp of high-quality finished sequence, and characterizes four large germline copy-number variants (CNVs).
Exhaustive T-cell repertoire sequencing of human peripheral blood samples reveals signatures of antigen selection and a directly measured repertoire size of at least 1 million clonotypes.
TLDR
It is shown that the sensitivity of sequence-based repertoire profiling is limited by both sequencing depth and sequencing accuracy, and a new, directly measured, lower limit on individual T-cell repertoire size is established.