Emergent Statistical Laws in Single-Cell Transcriptomic Data

Silvia Lazzardi, Filippo Valle, Andrea Mazzolini, Antonio Scialdone, Michele Caselle, Matteo Osella
Large-scale data on single-cell gene expression have the potential to unravel the specific transcriptional programs of different cell types. The structure of these expression datasets suggests a similarity with several other complex systems that can be analogously described through the statistics of their basic building blocks. Transcriptomes of single cells are collections of messenger RNA abundances transcribed from a common set of genes just as books are different collections of words from a…


Quantifying E. coli Proteome and Transcriptome with Single-Molecule Sensitivity in Single Cells
System-wide analyses of protein and mRNA expression in individual cells with single-molecule sensitivity using a newly constructed yellow fluorescent protein fusion library for Escherichia coli found that almost all protein number distributions can be described by the gamma distribution with two fitting parameters which, at low expression levels, have clear physical interpretations as the transcription rate and protein burst size.
Single-cell proteomic analysis of S. cerevisiae reveals the architecture of biological noise
A strategy that pairs high-throughput flow cytometry and a library of GFP-tagged yeast strains to monitor rapidly and precisely protein levels at single-cell resolution is presented, revealing a remarkable structure to biological noise.
RNA sequencing reveals two major classes of gene expression levels in metazoan cells
RNA sequencing of mouse Th2 cells is used, coupled with a range of other techniques, to show that all genes can be separated, based on their expression abundance, into two distinct groups: one group composed of lowly expressed and putatively non-functional mRNAs, and the other of highly expressed mRNAs with active chromatin marks at their promoters.
Quantitative Analysis of Fission Yeast Transcriptomes and Proteomes in Proliferating and Quiescent Cells
This rich resource provides the first comprehensive reference for all RNA and most protein concentrations in a eukaryote under two key physiological conditions: cellular proliferation and quiescence.
Are There Laws of Genome Evolution?
  • E. Koonin
  • Biology, Computer Science
  • PLoS Comput. Biol.
  • 2011
The observed universal regularities do not appear to be shaped by selection but rather are emergent properties of gene ensembles, which might qualify as “laws of evolutionary genomics” in the same sense in which “law” is understood in modern physics.
Comprehensive integration of single cell data
This work presents a strategy for comprehensive integration of single cell data, including the assembly of harmonized references, and the transfer of information across datasets, and demonstrates how anchoring can harmonize in-situ gene expression and scRNA-seq datasets.
Bayesian inference of gene expression states from single-cell RNA-seq data.
A Bayesian normalization procedure called Sanity (SAmpling-Noise-corrected Inference of Transcription activitY) is derived from first principles and is shown to outperform other normalization methods on downstream tasks, such as finding nearest-neighbor cells and clustering cells into subtypes.
Universal features in the genome-level evolution of protein domains
A stochastic duplication/innovation model, in the class of the so-called 'Chinese restaurant processes', that explains this observation with two universal parameters, representing a minimal number of domains and the relative weight of innovation to duplication, and a model variant where new topologies are related to occurrence in genomic data, accounting for fold specificity.
Zipf's law in gene expression.
Using data from gene expression databases on various organisms and tissues, it is found that the abundances of expressed genes exhibit a power-law distribution with an exponent close to -1; i.e., they obey Zipf's law.
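As a rough illustration of what an exponent close to -1 means in practice, the sketch below sorts abundances by rank and fits a straight line to log(abundance) against log(rank). The function name `rank_abundance_slope` and the synthetic input are assumptions made for this example, not taken from the cited work, and a plain least-squares fit on log-log axes is only a crude estimator of a power-law exponent.

```python
import math

def rank_abundance_slope(abundances):
    """Least-squares slope of log(abundance) versus log(rank).

    Hypothetical helper for illustration: a slope near -1 is what
    a Zipf-like rank-abundance relation would produce.
    """
    # Keep positive values only and sort so that rank 1 is the most abundant.
    values = sorted((a for a in abundances if a > 0), reverse=True)
    if len(values) < 2:
        raise ValueError("need at least two positive abundances")
    xs = [math.log(rank) for rank in range(1, len(values) + 1)]
    ys = [math.log(v) for v in values]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Ordinary least squares: slope = cov(x, y) / var(x).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var

# Synthetic abundances (an assumption, not real data): abundance = 1000 / rank.
synthetic = [1000.0 / rank for rank in range(1, 501)]
slope = rank_abundance_slope(synthetic)
print(f"fitted slope: {slope:.3f}")
```

On real expression data the fit would normally be restricted to the range of ranks where the log-log plot looks linear, since the tails tend to deviate.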
Revealing the vectors of cellular identity with single-cell genomics
Single-cell genomics has now made it possible to create a comprehensive atlas of human cells. At the same time, it has reopened definitions of a cell's identity and of the ways in which identity is…