HiC-spector: a matrix library for spectral and reproducibility analysis of Hi-C contact maps

@article{Yan2016HiCspectorAM,
  title={HiC-spector: a matrix library for spectral and reproducibility analysis of Hi-C contact maps},
  author={Koon-Kiu Yan and Galip G{\"u}rkan Yardımcı and Chengfei Yan and William Stafford Noble and Mark B. Gerstein},
  journal={Bioinformatics},
  year={2016},
  volume={33},
  pages={2199--2201}
}
Summary: Genome-wide proximity-ligation-based assays like Hi-C have opened a window to the 3D organization of the genome. In so doing, they present data structures that are different from conventional 1D signal tracks. To exploit the 2D nature of Hi-C contact maps, matrix techniques like spectral analysis are particularly useful. Here, we present HiC-spector, a collection of matrix-related functions for analyzing Hi-C contact maps. In particular, we introduce a novel reproducibility metric for…
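As a rough illustration of what spectral analysis of a contact map can look like, the sketch below builds the normalized graph Laplacian of a small symmetric matrix and finds its smallest eigenpair by power iteration. It is a generic textbook construction under stated assumptions, not HiC-spector's own code or its reproducibility metric (the summary above is truncated before the metric is described). The names `normalized_laplacian`, `smallest_eigenpair` and `toy_contacts` are hypothetical.

```python
import math


def normalized_laplacian(contacts):
    """Return L = I - D^{-1/2} W D^{-1/2} for a symmetric contact matrix W."""
    n = len(contacts)
    degree = [sum(row) for row in contacts]
    if any(d <= 0 for d in degree):
        raise ValueError("every bin needs at least one contact")
    inv_sqrt = [1.0 / math.sqrt(d) for d in degree]
    return [
        [(1.0 if i == j else 0.0) - inv_sqrt[i] * contacts[i][j] * inv_sqrt[j]
         for j in range(n)]
        for i in range(n)
    ]


def smallest_eigenpair(laplacian, iterations=2000):
    """Approximate the smallest eigenpair of L by power iteration on I - L/2.

    The eigenvalues of a normalized Laplacian lie in [0, 2], so I - L/2 has
    eigenvalues in [0, 1] and its dominant eigenvector is the eigenvector of
    L with the smallest eigenvalue.
    """
    n = len(laplacian)
    shifted = [
        [(1.0 if i == j else 0.0) - 0.5 * laplacian[i][j] for j in range(n)]
        for i in range(n)
    ]
    vec = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        nxt = [sum(shifted[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    # Rayleigh quotient v^T L v gives the eigenvalue estimate (v has unit norm).
    lv = [sum(laplacian[i][j] * vec[j] for j in range(n)) for i in range(n)]
    value = sum(vec[i] * lv[i] for i in range(n))
    return value, vec


# Made-up 4-bin contact map, for illustration only.
toy_contacts = [
    [0.0, 5.0, 2.0, 1.0],
    [5.0, 0.0, 4.0, 2.0],
    [2.0, 4.0, 0.0, 6.0],
    [1.0, 2.0, 6.0, 0.0],
]

toy_laplacian = normalized_laplacian(toy_contacts)
toy_value, toy_vector = smallest_eigenpair(toy_laplacian)
```

For a connected contact graph the smallest eigenvalue is zero and its eigenvector is proportional to the square root of each bin's total contacts; the higher eigenvectors are the ones that carry structural information.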

Measuring the reproducibility and quality of Hi-C data

This work assesses reproducibility and quality measures by varying sequencing depth, resolution and noise levels in Hi-C data from 13 cell lines, with two biological replicates each, as well as 176 simulated matrices, to identify low-quality experiments.

Binless normalization of Hi-C data provides significant interaction and difference detection independent of resolution

Binless, a method that allows for reproducible normalization of Hi-C data independent of its resolution, is developed and compared with other methods.

essHi-C: essential component analysis of Hi-C matrices

Systematic comparisons show that essHi-C improves the clarity of the interaction patterns, enhances the robustness against sequencing depth of topologically associating domains identification, allows the unsupervised clustering of experiments in different cell lines and recovers the cell-cycle phasing of single-cells based on Hi-C data.

Boost-HiC: computational enhancement of long-range contacts in chromosomal contact maps

This work proposes to use the sparse information contained in raw contact maps to infer high-confidence contact counts between all pairs of loci, and enables the detection of Hi-C patterns such as chromosomal compartments at a resolution that would otherwise be attainable only by sequencing the experimental Hi-C library a hundred times deeper.

Boost-HiC : Computational enhancement of long-range contacts in chromosomal contact maps

This work proposes to use the sparse information contained in raw contact maps to determine high-confidence contact frequency between all pairs of loci, and enables the detection of Hi-C patterns such as chromosomal compartments at a resolution that would otherwise be attainable only by sequencing the experimental Hi-C library a hundred times deeper.

The corrected gene proximity map for analyzing the 3D genome organization using Hi-C data

The Corrected Gene Proximity map is a map of the 3D structure of the genome on a global scale that allows the simultaneous analysis of intra- and inter-chromosomal interactions and of gene co-regulation and co-localization more effectively than the map obtained by raw contact frequencies, thus revealing hidden associations between global spatial positioning and gene expression.

SHAMAN: bin-free randomization, normalization and screening of Hi-C matrices

SHAMAN is introduced for performing Hi-C analysis at dynamic scales, without predefined resolution, while minimizing biases over very large datasets, and it is shown how contact preferences among regulatory elements, including promoters, enhancers and insulators, can be assessed with minimal bias by comparing pooled empirical and randomized matrices.

HPRep: Quantifying Reproducibility in HiChIP and PLAC-Seq Datasets

HPRep, a stratified and weighted correlation metric derived from normalized contact counts, is presented to quantify reproducibility in HiChIP and PLAC-seq data, and it is demonstrated that HPRep outperforms existing reproducibility measures developed for Hi-C data.

GenomeDISCO: A concordance score for chromosome conformation capture experiments using random walks on contact map graphs

A multi-scale concordance measure called GenomeDISCO (DIfferences between Smoothed COntact maps) is introduced for assessing the similarity of a pair of contact maps obtained from chromosome capture experiments; it accurately distinguishes biological replicates from samples obtained from different cell types.
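The idea of comparing contact maps after random-walk smoothing can be sketched as follows: row-normalize each map into a transition matrix, take a few steps of the walk, and measure how far apart the smoothed maps are. This is a minimal sketch of the general idea only; the function names, the fixed number of steps and the mean absolute difference used here are assumptions for illustration, not GenomeDISCO's published scoring formula.

```python
def transition_matrix(contacts):
    """Row-normalize a contact matrix so that each row sums to 1."""
    rows = []
    for row in contacts:
        total = sum(row)
        if total <= 0:
            raise ValueError("every bin needs at least one contact")
        rows.append([x / total for x in row])
    return rows


def matmul(a, b):
    """Multiply two dense matrices given as lists of lists."""
    inner = len(b)
    cols = len(b[0])
    return [
        [sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for i in range(len(a))
    ]


def smooth(contacts, steps):
    """Return the `steps`-step random-walk transition probabilities."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    p = transition_matrix(contacts)
    result = p
    for _ in range(steps - 1):
        result = matmul(result, p)
    return result


def mean_abs_difference(a, b):
    """Average absolute entry-wise difference between two equal-size matrices."""
    n = len(a)
    m = len(a[0])
    return sum(abs(a[i][j] - b[i][j]) for i in range(n) for j in range(m)) / (n * m)


# Made-up 3-bin maps, for illustration only.
map_a = [[1.0, 4.0, 1.0], [4.0, 1.0, 2.0], [1.0, 2.0, 1.0]]
map_b = [[2.0, 8.0, 2.0], [8.0, 2.0, 4.0], [2.0, 4.0, 2.0]]  # map_a at double depth
map_c = [[1.0, 1.0, 4.0], [1.0, 1.0, 2.0], [4.0, 2.0, 1.0]]  # different structure

same_structure = mean_abs_difference(smooth(map_a, 3), smooth(map_b, 3))
different_structure = mean_abs_difference(smooth(map_a, 3), smooth(map_c, 3))
```

Because of the row normalization, a map and a uniformly deeper-sequenced copy of it smooth to the same matrix, while maps with different contact patterns stay apart.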

TADCompare: An R Package for Differential and Temporal Analysis of Topologically Associated Domains

TADCompare is developed, a method for differential analysis of boundaries of interacting domains between two or more Hi-C datasets based on a spectral clustering-derived measure called the eigenvector gap, which enables a loci-by-loci comparison of boundary differences.

References

Iterative Correction of Hi-C Data Reveals Hallmarks of Chromosome Organization

A computational pipeline is presented that integrates a strategy to map sequencing reads with a data-driven method for iterative correction of biases, yielding genome-wide maps of relative contact probabilities; eigenvector decomposition of these maps provides insights into local chromatin states, global patterns of chromosomal interactions, and the conserved organization of human and mouse chromosomes.

Comprehensive Mapping of Long-Range Interactions Reveals Folding Principles of the Human Genome

Hi-C is described, a method that probes the three-dimensional architecture of whole genomes by coupling proximity-based ligation with massively parallel sequencing and demonstrates the power of Hi-C to map the dynamic conformations of entire genomes.

Genome architectures revealed by tethered chromosome conformation capture and population-based modeling

A computational method is developed to translate the TCC data into physical chromatin contacts in a population of three-dimensional genome structures, demonstrating that the indiscriminate properties of interchromosomal interactions are consistent with the well-known architectural features of the human genome.

Analysis methods for studying the 3D architecture of the genome

Computational tools to interpret Hi-C data are reviewed, including pipelines for mapping, filtering, and normalization, and methods for confidence estimation, domain calling, visualization, and three-dimensional modeling.

Exploring the three-dimensional organization of genomes: interpreting chromatin interaction data

Several types of statistical and computational approaches that have recently been developed to analyse chromatin interaction data are described.

Spatial organization of chromatin domains and compartments in single chromosomes

An imaging method for mapping the spatial positions of numerous genomic regions along individual chromosomes is developed and it is observed that chromosome folding deviates from the ideal fractal-globule model at large length scales and that TADs are largely organized into two compartments spatially arranged in a polarized manner in individual chromosomes.

A Fast Algorithm for Matrix Balancing

  • P. Knight, D. Ruiz
  • Computer Science
    Web Information Retrieval and Linear Algebra Algorithms
  • 2007
It is shown that while stationary iterative methods offer little or no improvement in many cases, a scheme using a preconditioned conjugate gradient method as the inner iteration can give quadratic convergence at low cost.
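To make the matrix-balancing problem concrete, here is a minimal sketch that scales a strictly positive square matrix so every row and column sums to 1. It uses the classic Sinkhorn-Knopp alternating scaling, a much simpler technique than the preconditioned conjugate-gradient scheme summarized above, and it is not the algorithm from that paper. The function name, tolerance and toy matrix are assumptions for illustration.

```python
def sinkhorn_knopp(matrix, max_iter=1000, tol=1e-12):
    """Balance a strictly positive square matrix by alternating scaling.

    Returns diag(r) * A * diag(c), whose row and column sums are
    (approximately) 1.
    """
    n = len(matrix)
    if any(len(row) != n for row in matrix):
        raise ValueError("matrix must be square")
    if any(x <= 0 for row in matrix for x in row):
        raise ValueError("this sketch assumes strictly positive entries")

    r = [1.0] * n
    c = [1.0] * n
    for _ in range(max_iter):
        # Rescale rows so that each row of diag(r) A diag(c) sums to 1 ...
        r = [1.0 / sum(matrix[i][j] * c[j] for j in range(n)) for i in range(n)]
        # ... then rescale columns, which slightly perturbs the row sums.
        c = [1.0 / sum(matrix[i][j] * r[i] for i in range(n)) for j in range(n)]
        row_error = max(
            abs(r[i] * sum(matrix[i][j] * c[j] for j in range(n)) - 1.0)
            for i in range(n)
        )
        if row_error < tol:
            break
    return [[r[i] * matrix[i][j] * c[j] for j in range(n)] for i in range(n)]


# Made-up symmetric matrix, for illustration only.
toy_matrix = [
    [4.0, 2.0, 1.0],
    [2.0, 5.0, 3.0],
    [1.0, 3.0, 6.0],
]
balanced = sinkhorn_knopp(toy_matrix)
```

The alternating scheme converges only linearly, which is the kind of cost that motivates faster inner-outer methods for large matrices.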