The Genomic HyperBrowser: inferential genomics at the sequence level

Authors: Geir Kjetil Sandve, Sveinung Gundersen, Halfdan Rydbeck, Ingrid Kristine Glad, Lars Holden, Marit Holden, Knut Liestøl, Trevor Clancy, Egil Ferkingstad, Morten Johansen, Vegard Nygaard, Eivind Tøstesen, Arnoldo Frigessi, Eivind Hovig
Journal: Genome Biology
Pages: R121

The immense increase in the generation of genomic scale data poses an unmet analytical challenge, due to a lack of established methodology with the required flexibility and power. We propose a first principled approach to statistical analysis of sequence-level genomic information. We provide a growing collection of generic biological investigations that query pairwise relations between tracks, represented as mathematical objects, along the genome. The Genomic HyperBrowser implements the… 

The Genomic HyperBrowser: an analysis web server for genome-scale data

The Genomic HyperBrowser is an open-ended web server for the analysis of genomic track data that supports a range of genomic investigations related to, e.g., gene regulation, disease association or epigenetic modifications of the genome.

GSuite HyperBrowser: integrative analysis of dataset collections across the genome and epigenome

GSuite HyperBrowser is the first comprehensive solution for integrative analysis of dataset collections across the genome and epigenome, an open-source system for streamlined acquisition and customizable statistical analysis of large collections of genome-wide datasets.

Retrieval of Genomic Data using PyTables

This thesis focuses primarily on the retrieval part of GTrackCore and on how to exploit the advantages of the PyTables library in that setting. It shows that a data model that stores data in a single HDF5 file makes data retrieval slower in some cases and faster in others.

EpiExplorer: live exploration and global analysis of large epigenomic datasets

EpiExplorer, a web tool for exploring genome and epigenome data on a genomic scale, is described, and its utility is demonstrated through a hypothesis-generating analysis of DNA hydroxymethylation in relation to public reference maps of the human epigenome.

Supplemental material for "The differential disease regulome"

An abstract representation of generic genomic elements by means of mathematical objects is constructed, with which the biologist can explore advanced and problem-specific null hypotheses that hierarchically preserve various degrees of track information, strengthening power and reducing false discoveries.

Online Resources for Genomic Analysis Using High-Throughput Sequencing

The use and applications of common file formats for coding and storing genomic data are described and several web-accessible open-source resources for the visualization and analysis of NGS data are considered.

Epigenomic annotation‐based interpretation of genomic data: from enrichment analysis to machine learning

The hierarchy of tools and methods reviewed here presents a practical guide for the interpretation of genome‐wide ROIs within an epigenomic context and discusses leading tools and machine learning methods utilizing epigenomic and 3D genome structure data.

Recommendations for the FAIRification of genomic track metadata

A first iteration of a draft standard for genomic track metadata, as well as the accompanying software ecosystem, can be adapted or extended to future needs of the research community regarding data, methods and tools, balancing the requirements of both data submitters and analytical end-users.

Cogito: automated and generic comparison of annotated genomic intervals

Cogito implements a novel approach to facilitate high-level overview analyses of complex datasets, and offers additional insights into the data without the need for a full, time-consuming reanalysis.



Galaxy: a platform for interactive large-scale genome analysis.

An interactive system, Galaxy, that combines the power of existing genome annotation databases with a simple Web portal to enable users to search remote resources, combine data from independent queries, and visualize the results.

The Integr8 project - a resource for genomic and proteomic data

Main aims are to store the relationships of biological entities to each other and to entries in other databases, to provide a framework that allows for new kinds of data to be integrated, and to offer an entity-centric view of complete genomes and proteomes.

EpiGRAPH: user-friendly software for statistical analysis and prediction of (epi)genomic data

EpiGRAPH's practical utility is demonstrated in a case study on monoallelic gene expression and its novel approach to reproducible bioinformatic analysis is described.

The Ensembl genome database project

The Ensembl database project provides a bioinformatics framework to organise biology around the sequences of large genomes. It is a comprehensive source of stable automatic…

The ENCODE (ENCyclopedia Of DNA Elements) Project

The ENCyclopedia Of DNA Elements (ENCODE) Project is organized as an international consortium of computational and laboratory-based scientists working to develop and apply high-throughput approaches for detecting all sequence elements that confer biological function.

NCBI Reference Sequences: current status, policy and new initiatives

The recent growth of the RefSeq database, recent changes to feature annotations and record types for eukaryotic (primarily vertebrate) species and policies regarding species inclusion and genome annotation are reported on.

The H-Invitational Database (H-InvDB), a comprehensive annotation resource for human genes and transcripts

H-InvDB, originally developed as an integrated database of the human transcriptome based on extensive annotation of large sets of full-length cDNA (FLcDNA) clones, now provides annotation for 120 558 human mRNAs extracted from the International Nucleotide Sequence Databases (INSD), in addition to 54 978 human FLcDNAs, in the latest release H-InvDB_4.6.

The vertebrate genome annotation (Vega) database

The Vertebrate Genome Annotation (Vega) database was first made public in 2004 and now contains comprehensive annotation on 20 of the 24 human chromosomes, four whole mouse chromosomes and around 40% of the zebrafish Danio rerio genome.

Genomic location analysis by ChIP‐Seq

This review discusses ChIP‐Seq technology, its possible pitfalls, data analysis and several early applications, and suggests that protein binding can be mapped in a truly genome‐wide manner with extremely high resolution.

Comprehensive Mapping of Long-Range Interactions Reveals Folding Principles of the Human Genome

Hi-C, a method that probes the three-dimensional architecture of whole genomes by coupling proximity-based ligation with massively parallel sequencing, is described, and its power to map the dynamic conformations of entire genomes is demonstrated.