Inferring processes underlying B-cell repertoire diversity

Authors: Yuval Elhanati, Zachary Sethna, Quentin Marcou, Curtis G Callan, Thierry Mora, Aleksandra M. Walczak
Journal: Philosophical Transactions of the Royal Society B: Biological Sciences

We quantify the VDJ recombination and somatic hypermutation processes in human B-cells using probabilistic inference methods on high-throughput DNA sequence repertoires of human B-cell receptor heavy chains. Our analysis captures the statistical properties of the naive repertoire, first after its initial generation via VDJ recombination and then after selection for functionality. We also infer statistical properties of the somatic hypermutation machinery (exclusive of subsequent effects of… 

Mouse T cell repertoires as statistical ensembles: overall characterization and age dependence

Analysis of T cell sequence repertoires taken from the blood and thymus of mice of different ages quantifies the significant changes in this process that occur in development from embryo to young adult.

Evidence for Shaping of Light Chain Repertoire by Structural Selection

Probabilistic and deterministic models are used to infer and disentangle generation and selection of the light chain, using large samples of light chains sequenced from healthy donors and transgenic mice. The results suggest structural selection that maintains the size of the CDR3 within a limited range and prevents turns in the CDR3 region.

Learning the heterogeneous hypermutation landscape of immunoglobulins from high-throughput repertoire data

It is shown that hypermutations occurring concomitantly along B-cell lineages tend to co-localize, suggesting a possible mechanism for accelerating affinity maturation.

Likelihood-Based Inference of B Cell Clonal Families

An agglomerative algorithm to find a maximum likelihood clustering, two approximate algorithms with various trade-offs of speed versus accuracy, and a third, fast algorithm for finding specific lineages are described. Under simulation, these greatly improve upon existing clonal family inference methods, and they give significantly different clusters than previous methods when applied to two real data sets.

A Bayesian phylogenetic hidden Markov model for B cell receptor sequence analysis

A novel approach to Bayesian phylogenetic inference for BCR sequences that is based on a phylogenetic hidden Markov model (phylo-HMM) that naturally accounts for uncertainty in all unobserved variables, including the phylogenetic tree, via posterior distribution sampling.

Predicting the spectrum of TCR repertoire sharing with a data-driven model of recombination

It is shown that even for large cohorts the observed degree of sharing of TCR sequences between individuals is well predicted by a model accounting for the known quantitative statistical biases in the generation process, together with a simple model of thymic selection.

repgenHMM: a dynamic programming tool to infer the rules of immune receptor generation from sequence data

A hidden Markov model is presented that accounts for all plausible scenarios that can generate the receptor sequences. It can be used to generate synthetic sequences and to calculate the probability of generation of any receptor sequence, as well as the theoretical diversity of the repertoire.

Immunoglobulin Clonotype and Ontogeny Inference

High-throughput immune repertoire analysis with IGoR

A software tool, IGoR, calculates the likelihoods of potential V(D)J recombination and somatic hypermutation scenarios from raw immune sequence reads and outperforms existing tools in accuracy; the sample sizes needed for reliable repertoire characterization are also estimated.



Statistical inference of the generation probability of T-cell receptors from sequence repertoires

The probabilistic model predicts the generation probability of any specific CDR3 sequence by the primitive recombination process, allowing the potential diversity of the T-cell repertoire to be quantified and helping to explain why some sequences are shared between individuals.
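
To make the notion of a generation probability concrete, the following is a minimal sketch that sums the probabilities of every recombination scenario compatible with a sequence. Everything in it is an assumption made for illustration: the toy gene segments, the deletion and insertion distributions, the uniform inserted-nucleotide model, and the function name `generation_probability` are hypothetical and are not taken from the paper. It also uses brute-force enumeration on a V-J junction only, not the inference procedure of the authors.

```python
from itertools import product

# Hypothetical toy germline segments with usage probabilities (illustrative only).
V_GENES = {"V1": ("ACGT", 0.6), "V2": ("ACCA", 0.4)}
J_GENES = {"J1": ("TTGA", 1.0)}
# Assumed distributions for the number of deleted bases (V 3' end, J 5' end)
# and for the number of inserted bases.
P_DEL = {0: 0.5, 1: 0.3, 2: 0.2}
P_INS_LEN = {0: 0.5, 1: 0.3, 2: 0.2}
P_NT = 0.25  # assumed uniform probability of each inserted nucleotide


def generation_probability(seq):
    """Sum the probabilities of all toy scenarios that produce `seq`."""
    total = 0.0
    for (v_seq, p_v), (j_seq, p_j) in product(V_GENES.values(), J_GENES.values()):
        for del_v, del_j in product(P_DEL, P_DEL):
            v_part = v_seq[: len(v_seq) - del_v]  # V after 3' deletion
            j_part = j_seq[del_j:]                # J after 5' deletion
            n_ins = len(seq) - len(v_part) - len(j_part)
            if n_ins not in P_INS_LEN:
                continue  # insertion length impossible under the toy model
            if not (seq.startswith(v_part) and seq.endswith(j_part)):
                continue  # scenario does not reproduce the sequence
            total += (p_v * p_j * P_DEL[del_v] * P_DEL[del_j]
                      * P_INS_LEN[n_ins] * P_NT ** n_ins)
    return total


if __name__ == "__main__":
    for s in ("ACGTTTGA", "ACGA", "GGGG"):
        print(s, generation_probability(s))
```

Several distinct scenarios can yield the same sequence, which is why the probabilities are summed rather than taking only the most likely scenario.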

Quantifying evolutionary constraints on B-cell affinity maturation

It is found that the substitution process is conserved across individuals but varies significantly across gene segments, and a per-residue map of selection is derived, which provides a more nuanced view of the constraints on framework and variable regions.

Reconstructing a B-Cell Clonal Lineage. II. Mutation, Selection, and Affinity Maturation

The lineage exhibits a remarkably uniform rate of improvement of the effective affinity to influenza hemagglutinin (HA) over evolutionary time, increasing 1000-fold overall from the unmutated ancestor to the best of the observed antibodies.

Shaping of Human Germline IgH Repertoires Revealed by Deep Sequencing

The data suggest that developmental selection removes HCDR3 loops containing patches of hydrophobicity, which are commonly found in some auto-antibodies, and at least 69% of the initial productive IgH rearrangements are removed from the repertoire during B cell development.

Quantifying selection in immune receptor repertoires

A significant correlation between biases induced by VDJ recombination and inferred selection factors together with a reduction of diversity during selection suggest that natural selection acting on the recombination process has anticipated the selection pressures experienced during somatic evolution.

Models of Somatic Hypermutation Targeting and Substitution Based on Synonymous Mutations from High-Throughput Immunoglobulin Sequencing Data

Improved models of SHM targeting and substitution are produced that are based only on synonymous mutations and are thus independent of selection.

The Inference of Antigen Selection on Ig Genes

Side-by-side application of multinomial and binomial models on 86 previously established Ig sequences disclosed 8 discrepancies, leading to opposite statistical conclusions about Ag selection.

Strong intrinsic biases towards mutation and conservation of bases in human IgVH genes during somatic hypermutation prevent statistical analysis of antigen selection

The data suggest that analysis of the distribution of mutations in IgVH genes cannot be used reliably to state whether antigenic selection of the B‐cell carrying the genes occurred, and that intrinsic biases alone may be enough to give the appearance of selection.

Passenger transgenes reveal intrinsic specificity of the antibody hypermutation mechanism: clustering, polarity, and specific hot spots.

Somatic hypermutation in mice carrying an immunoglobulin kappa transgene exhibits specific base substitution preferences with transitions being favored over transversions and it is proposed that these substitution preferences can be used to discriminate intrinsic from antigen-selected hot spots.
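
The distinction between transitions and transversions mentioned above can be sketched as a simple count over an aligned germline/mutated pair. The function names and the example sequences are hypothetical, and the sketch assumes equal-length, gap-free, uppercase DNA strings; it is not the analysis used in the paper.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref, alt):
    """Return 'transition' or 'transversion' for a single-base change."""
    if ref == alt:
        raise ValueError("not a substitution")
    same_class = ({ref, alt} <= PURINES) or ({ref, alt} <= PYRIMIDINES)
    return "transition" if same_class else "transversion"


def count_substitutions(germline, mutated):
    """Count transitions and transversions between two aligned sequences."""
    if len(germline) != len(mutated):
        raise ValueError("sequences must be aligned and of equal length")
    counts = {"transition": 0, "transversion": 0}
    for ref, alt in zip(germline, mutated):
        if ref != alt:
            counts[classify_substitution(ref, alt)] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical aligned pair: A->G is a transition, G->T a transversion.
    print(count_substitutions("ACGT", "GCTT"))
```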