Benchmarking the performance of human antibody gene alignment utilities using a 454 sequence dataset

@article{Jackson2010BenchmarkingTP,
  title={Benchmarking the performance of human antibody gene alignment utilities using a 454 sequence dataset},
  author={Katherine J. L. Jackson and Scott D. Boyd and Bruno A. Ga{\"e}ta and Andrew M. Collins},
  journal={Bioinformatics},
  year={2010},
  volume={26},
  number={24},
  pages={3129--3130}
}
MOTIVATION Immunoglobulin heavy chain genes are formed by recombination of genes randomly selected from sets of IGHV, IGHD and IGHJ genes. Utilities have been developed to identify genes that contribute to observed VDJ rearrangements, but in the absence of datasets of known rearrangements, the evaluation of these utilities is problematic. We have analyzed thousands of VDJ rearrangements from an individual (S22) whose IGHV, IGHD and IGHJ genotype can be inferred from the dataset. Knowledge of… 

Assigning and visualizing germline genes in antibody repertoires
TLDR
IgSCUEAL (Immunoglobulin Subtype Classification Using Evolutionary ALgorithms) demonstrates the highest accuracy of V and J assignment amongst existing approaches, even when the reassorted sequence is highly mutated, and can successfully cluster sequences on the basis of shared V/J germline alleles.
Benchmarking immunoinformatic tools for the analysis of antibody repertoire sequences
Abstract Summary Antibody repertoires reveal insights into the biology of the adaptive immune system and empower diagnostics and therapeutics. There are currently multiple tools available for the
ImmunoGlobulin galaxy (IGGalaxy) for simple determination and quantitation of immunoglobulin heavy chain rearrangements from NGS
TLDR
IGGalaxy provides clinical researchers with detailed insight into the repertoire of the B-cell population per individual sequenced and between control and pathogenic genomes and is capable of analyzing alternative NGS data.
HTJoinSolver: Human immunoglobulin VDJ partitioning using approximate dynamic programming constrained by conserved motifs
TLDR
A novel approximate dynamic programming method that uses conserved immunoglobulin gene motifs to improve performance of aligning V-segments of rearranged immunoglobulin (Ig) genes and enhances the former JOINSOLVER algorithm by processing sequences with insertions and/or deletions (indels).
The Inference of Phased Haplotypes for the Immunoglobulin H Chain V Region Gene Loci by Analysis of VDJ Gene Rearrangements
TLDR
Analysis of rearrangement frequencies suggests that particular genes may have substantially different yet predictable propensities for rearrangement within different haplotypes, together with data highlighting the extent of haplotypic variation within the population.
Fast multiclonal clusterization of V(D)J recombinations from high-throughput sequencing
TLDR
New algorithms that process high-throughput sequencing data to extract unnamed V(D)J junctions and gather them into clones for quantification provide new insight into the analysis of high-power sequencing data for leukemia, and also to the quantitative assessment of any immunological profile.
DSab-origin: a novel IGHD sensitive VDJ mapping method and its application on antibody response after influenza vaccination
TLDR
This work filled in a computational gap in D segment assignment for VDJ germline gene identification in antibody research by presenting a D-sensitive mapping method called DSab-origin with accuracies around 90% in human monoclonal antibody data and average 95.8% in mouse data.
The Inference of Phased Haplotypes for the Immunoglobulin H Chain V Region Gene Loci by Analysis of VDJ Gene Rearrangements
TLDR
Analysis of rearrangement frequencies suggests that particular genes may have substantially different yet predictable propensities for rearrangements within different haplotypes, and suggests that there may be substantial variability in the available Ab repertoires of different individuals.
Reconstructing and mining the B cell repertoire with ImmunediveRsity
TLDR
ImmunediveRsity is a stand-alone pipeline primarily based in R programming for the integral analysis of B cell repertoire data generated by HTS, allowing the identification of previously validated antigen-specific antibodies, and revealing different and unexpected clonal diversity patterns in the post-immunization IgM and IgG compartments.
Analysis of human immunoglobulin VDJ and DJ rearrangements shows N region synthesis by concatenation of cytosine-rich strands preferentially originating from trimmed germline gene segments
TLDR
The data show that C-enriched N addition preferentially happens at trimmed 3’-ends of VH-, D-, and JH-gene segments indicating a dependency of the transferase mechanism upon the nuclease mechanism.

References

SHOWING 1-10 OF 21 REFERENCES
Reconsidering the human immunoglobulin heavy-chain locus: 1. An evaluation of the expressed human IGHD gene repertoire.
TLDR
A surprising lack of diversity in the available IGHD gene repertoire is confirmed, and restriction of the germline sequence databases to the functional set described here will substantially improve the accuracy of IGHD gene alignments and therefore the accuracy of analysis of the V-D-J junction.
iHMMune-align: hidden Markov model-based alignment and identification of germline genes in rearranged immunoglobulin gene sequences
TLDR
iHMMune-align provides a more accurate identification of component germline genes than other currently available IGH gene characterization programs, according to an evaluation of other immunoglobulin gene alignment utilities.
Many human immunoglobulin heavy‐chain IGHV gene polymorphisms have been reported in error
TLDR
A bioinformatic analysis of germline and rearranged immunoglobulin gene sequences is described, which casts doubt on the existence of a substantial proportion of reported germline polymorphisms and presents a revised repertoire of expressed IGHV genes, which should substantially improve the accuracy of immunoglobulin gene analysis.
Ab-origin: an enhanced tool to identify the sourcing gene segments in germline for rearranged antibodies
TLDR
Ab-origin is presented, a program designed by batch query against germline databases based on empirical knowledge, optimized scoring scheme and appropriate parameters, which outperformed all the other five popular tools in terms of prediction accuracy.
Individual Variation in the Germline Ig Gene Repertoire Inferred from Variable Region Gene Rearrangements
TLDR
The extent of genotypic variation between individuals is highlighted by an individual with aplastic anemia who appears to lack six contiguous IGHD genes on both chromosomes, and these deletions significantly alter the potential expressed IGH repertoire, and possibly immune function, in this individual.
SoDA: implementation of a 3D alignment algorithm for inference of antigen receptor recombinations
TLDR
A dynamic programming algorithm to perform reconstruction of the details of the recombinatorial process giving rise to each of the participating antigen receptor genes is developed and implemented as web-accessible software called SoDA (Somatic Diversification Analysis).
No evidence for the use of DIR, D–D fusions, chromosome 15 open reading frames or VH replacement in the peripheral repertoire was found on application of an improved algorithm, JointML, to 6329 human immunoglobulin H rearrangements
TLDR
JointML was shown to have a higher predictive performance for D‐gene assignment in mutated and unmutated sequences than four other publicly available programs.
Characterization of the Human Ig Heavy Chain Antigen Binding Complementarity Determining Region 3 Using a Newly Developed Software Algorithm, JOINSOLVER
TLDR
Analysis of the human CDR3H with JOINSOLVER has provided comprehensive information on the influences that shape this important Ag binding region of VH chains.
Measurement and Clinical Monitoring of Human Lymphocyte Clonality by Massively Parallel V-D-J Pyrosequencing
TLDR
It is shown that massively parallel DNA sequencing of rearranged immune receptor loci can provide direct detection and tracking of immune diversity and expanded clonal lymphocyte populations in physiological and pathological contexts.