On the widespread and critical impact of systematic bias and batch effects in single-cell RNA-Seq data

@article{Hicks2015OnTW,
  title={On the widespread and critical impact of systematic bias and batch effects in single-cell RNA-Seq data},
  author={Stephanie C. Hicks and Mingxiang Teng and Rafael A. Irizarry},
  journal={bioRxiv},
  year={2015},
  pages={025528}
}
Single-cell RNA-Sequencing (scRNA-Seq) has become the most widely used high-throughput method for transcription profiling of individual cells. [...] Key Result: We examined data from five published studies and found that systematic errors can explain a substantial percentage of observed cell-to-cell expression variability.
Batch effects and the effective design of single-cell gene expression studies
TLDR
The major source of variation in the gene expression data was genotype, but substantial variation was also observed between the technical replicates, indicating that UMI counts are not an unbiased estimator of gene expression levels.
Overcoming confounding plate effects in differential expression analyses of single-cell RNA-seq data
TLDR
It is demonstrated that failing to consider plate effects in the statistical model results in loss of type I error control, and a solution is proposed whereby counts are summed across all cells in each plate and the per-plate count sums are used in the DE analysis.
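The plate-summing step described in this summary can be sketched as follows. This is a minimal illustration only, not the authors' implementation: the function name `sum_counts_by_plate`, the dictionary-based input layout, and the toy counts are all assumptions made for the example.

```python
def sum_counts_by_plate(counts, plate_of_cell):
    """Sum per-gene counts over all cells that share a plate.

    counts        : dict mapping cell id -> list of per-gene counts (hypothetical layout)
    plate_of_cell : dict mapping cell id -> plate label
    Returns a dict mapping plate label -> list of summed per-gene counts.
    """
    plate_sums = {}
    for cell, gene_counts in counts.items():
        plate = plate_of_cell[cell]
        if plate not in plate_sums:
            # Start each plate with a zero vector of the right length.
            plate_sums[plate] = [0] * len(gene_counts)
        plate_sums[plate] = [a + b for a, b in zip(plate_sums[plate], gene_counts)]
    return plate_sums


# Toy data: three cells, three genes, two plates (made-up numbers).
counts = {"c1": [3, 0, 5], "c2": [1, 2, 0], "c3": [0, 4, 4]}
plates = {"c1": "plateA", "c2": "plateA", "c3": "plateB"}
plate_sums = sum_counts_by_plate(counts, plates)
```

Each plate then contributes a single count vector, so a downstream DE analysis would treat plates, rather than individual cells, as the units of replication.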
Correcting batch effects in single-cell RNA sequencing data by matching mutual nearest neighbours
TLDR
This work presents a new strategy for batch correction based on the detection of mutual nearest neighbours in the high-dimensional expression space and demonstrates the superiority of this approach over existing methods on a range of simulated and real scRNA-seq data sets.
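To make the idea of mutual nearest neighbours concrete, the sketch below finds pairs of cells, one from each batch, that lie in each other's k-nearest-neighbour sets. It covers only pair detection, not the correction step, and it is not the published implementation: the function names, the use of Euclidean distance, and the toy coordinates are assumptions for illustration.

```python
import math


def nearest(query, reference, k):
    """Return the indices of the k cells in `reference` closest to `query`."""
    order = sorted(range(len(reference)),
                   key=lambda j: math.dist(query, reference[j]))
    return set(order[:k])


def mutual_nearest_neighbours(batch_a, batch_b, k=1):
    """Return (i, j) pairs where cell i of batch_a and cell j of batch_b
    are each among the other's k nearest neighbours in the opposite batch."""
    a_to_b = [nearest(cell, batch_b, k) for cell in batch_a]
    b_to_a = [nearest(cell, batch_a, k) for cell in batch_b]
    return [(i, j)
            for i in range(len(batch_a))
            for j in sorted(a_to_b[i])
            if i in b_to_a[j]]


# Toy two-dimensional "expression" profiles (made-up values).
batch_a = [(0.0, 0.0), (5.0, 5.0)]
batch_b = [(0.5, 0.0), (5.0, 6.0), (20.0, 20.0)]
pairs = mutual_nearest_neighbours(batch_a, batch_b, k=1)
```

In this toy case the third cell of `batch_b` has a nearest neighbour in `batch_a`, but the relationship is not reciprocated, so it forms no pair.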
How to design a single-cell RNA-sequencing experiment: pitfalls, challenges and perspectives
TLDR
An overview is given of the currently available experimental and computational methods for handling single-cell RNA-sequencing data, and, based on their peculiarities, possible analysis frameworks are suggested for specific experimental designs.
Gene length and detection bias in single cell RNA sequencing protocols
TLDR
It is found that scRNA-seq datasets sequenced using a full-length transcript protocol exhibit gene length bias akin to bulk RNA-seq data, and, despite clear differences between UMI and full-length transcript data, it is illustrated that full-length and UMI data can be combined to reveal the underlying biology influencing expression of mESCs.
Experimental Considerations for Single-Cell RNA Sequencing Approaches
TLDR
The individual steps of a typical single-cell analysis workflow are delineated, from tissue procurement and cell preparation to platform selection and data analysis, and critical challenges in each of these steps are discussed, which will serve as a helpful guide to navigate the complex field of single-cell sequencing.
A step-by-step workflow for low-level analysis of single-cell RNA-seq data.
TLDR
This article describes a computational workflow for low-level analyses of scRNA-seq data, based primarily on software packages from the open-source Bioconductor project, which covers basic steps including quality control, data exploration and normalization, as well as more complex procedures such as cell cycle phase assignment.

References

SHOWING 1-10 OF 38 REFERENCES
Tackling the widespread and critical impact of batch effects in high-throughput data
TLDR
It is argued that batch effects (as well as other technical and biological artefacts) are widespread and critical to address, and experimental and computational approaches for doing so are reviewed.
MAST: A flexible statistical framework for assessing transcriptional changes and characterizing heterogeneity in single-cell RNA-seq data
TLDR
A new methodology to analyze single-cell transcriptomic data is presented that models the bimodality of single-cell expression within a coherent generalized linear modeling framework, and the cellular detection rate, the fraction of genes turned on in a cell, is introduced.
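The cellular detection rate defined in this summary, the fraction of genes turned on in a cell, can be computed as in the sketch below. The function name and the `threshold` parameter (a count above which a gene is treated as "on") are assumptions for illustration and are not taken from the MAST software.

```python
def cellular_detection_rate(gene_counts, threshold=0):
    """Fraction of genes in one cell whose count exceeds `threshold`.

    gene_counts : list of per-gene counts for a single cell
    threshold   : hypothetical cutoff; a gene counts as detected if count > threshold
    """
    if not gene_counts:
        raise ValueError("gene_counts must contain at least one gene")
    detected = sum(1 for count in gene_counts if count > threshold)
    return detected / len(gene_counts)


# Toy cell with four genes, two of which have nonzero counts (made-up values).
cdr = cellular_detection_rate([0, 3, 0, 7])
```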
CEL-Seq: single-cell RNA-Seq by multiplexed linear amplification.
TLDR
It is shown that CEL-Seq gives more reproducible, linear, and sensitive results than a PCR-based amplification method, and will be useful for transcriptomic analyses of complex tissues containing populations of diverse cell types.
Full-Length mRNA-Seq from single cell levels of RNA and individual circulating tumor cells
TLDR
Applying Smart-Seq to circulating tumor cells from melanomas, it is found that although gene expression estimates from single cells have increased noise, hundreds of differentially expressed genes could be identified using few cells per cell type.
mRNA-Seq whole-transcriptome analysis of a single cell
TLDR
A single-cell digital gene expression profiling assay with only a single mouse blastomere is described, which detected the expression of 75% more genes than microarray techniques and identified 1,753 previously unknown splice junctions called by at least 5 reads.
Computational and analytical challenges in single-cell transcriptomics
The development of high-throughput RNA sequencing (RNA-seq) at the single-cell level has already led to profound new discoveries in biology, ranging from the identification of novel cell types to the [...]
Single-cell RNA-seq: advances and future challenges
TLDR
An overview of the biological questions single-cell RNA-seq has been used to address, the major findings obtained from such studies, and current challenges and expected future developments in this booming field are provided.
Computational analysis of cell-to-cell heterogeneity in single-cell RNA-sequencing data reveals hidden subpopulations of cells
TLDR
It is shown that the single-cell latent variable model (scLVM) allows the identification of otherwise undetectable subpopulations of cells that correspond to different stages during the differentiation of naive T cells into T helper 2 cells.
Characterization of the single-cell transcriptional landscape by highly multiplex RNA-seq.
TLDR
This strategy will enable the unbiased discovery and analysis of naturally occurring cell types during development, adult physiology, and disease, and is demonstrated by analyzing the transcriptomes of 85 single cells of two distinct types.
Single cell RNA Seq reveals dynamic paracrine control of cellular variation
TLDR
This study highlights the importance of cell-to-cell communication in controlling cellular heterogeneity and reveals general strategies that multicellular populations can use to establish complex dynamic responses.