Interactive visualization and analysis of large-scale sequencing datasets using ZENBU

Authors: Jessica Severin, Marina Lizio, Jayson Harshbarger, Hideya Kawaji, Carsten O. Daub, Yoshihide Hayashizaki, Nicolas Bertin, Alistair R. R. Forrest
Journal: Nature Biotechnology
… clustering (Fig. 1c and Supplementary Figs. 3 and 4), and collation (Fig. 1h and Supplementary Fig. 3). With an understanding of these atomic operations, more advanced users can combine them into complex processing scripts. These customized scripts can also be saved and shared, allowing efficient re-use of optimized analyses and views. To demonstrate some of ZENBU’s functionality, we show multiple views on the same underlying data, ENCODE RNA-seq experiments (ref. 2) loaded in BAM format (Fig. 1… 

SEQing: web-based visualization of iCLIP and RNA-seq data in an interactive python framework

SEQing is a customizable interactive dashboard for visualizing crosslink sites, obtained by iCLIP, on target genes of RNA-binding proteins; it can optionally be secured by a password.

svist4get: a simple visualization tool for genomic tracks from sequencing experiments

Svist4get is a command-line tool for customizable generation of publication-quality figures based on data from genomic signal tracks and is able to aggregate data from several tracks on a single plot along with the transcriptome annotation.

RNASeqBrowser: A genome browser for simultaneous visualization of raw strand specific RNAseq reads and UCSC genome browser custom tracks

A visualization tool that incorporates and extends the functionality of the UCSC genome browser and NGS visualization tools such as IGV. It is designed for ease of use by users with few bioinformatics skills and combines the features of many genome browsers in one platform.

Update of the FANTOM web resource: expansion to provide additional transcriptome atlases

The recent updates of both data and interfaces in the FANTOM web resource are reported, including expansion of the resource by employing different assays, which yielded additional atlases of long noncoding RNAs, miRNAs and their promoters.

The FANTOM5 Computation Ecosystem: Genomic Information Hub for Promoters and Active Enhancers.

This chapter presents use cases in which the FANTOM5 dataset has been reused, leading to new findings, and elaborates on the different computing applications developed to publish the data and enable reproducibility and discovery of new findings.

FANTOM5 CAGE profiles of human and mouse samples

In the FANTOM5 project, transcription initiation events across the human and mouse genomes were mapped at a single base-pair resolution and their frequencies were monitored by CAGE coupled with single-molecule sequencing to represent the consequence of transcriptional regulation in each analyzed state of mammalian cells.

Unsupervised analysis of multi-experiment transcriptomic patterns with SegRNA identifies unannotated transcripts

SegRNA, a new segmentation and genome annotation (SAGA) method, integrates data from multiple transcriptome profiling assays and generates the first unsupervised transcriptome annotation of the K562 chronic myeloid leukemia cell line.

CSI NGS Portal: An Online Platform for Automated NGS Data Analysis and Sharing

The CSI NGS Portal is an online platform that gathers established bioinformatics pipelines to provide fully automated NGS data analysis and sharing in a user-friendly website, helping researchers rapidly analyse their NGS data and share results with colleagues without the aid of a bioinformatician.

Deep Cap Analysis of Gene Expression (CAGE): Genome-Wide Identification of Promoters, Quantification of Their Activity, and Transcriptional Network Inference.

The use of CAGE is described for the identification of novel noncoding transcripts in mammalian cells, providing detailed information for basic data processing and advanced bioinformatics analyses.

A comprehensive overview of computational tools for RNA-seq analysis

This review provides a systemic overview of 235 available RNA-seq analysis tools across various domains published from 2008 to 2020 and discusses the interdisciplinary nature of bioinformatics involved in RNA sequencing, analysis, and software development without laboring over the biological details.



The UCSC genome browser and associated tools

The UCSC Genome Browser is a graphical viewer for genomic data that presents visualization of annotations mapped to genomic coordinates, and the ability to juxtapose annotations of many types facilitates inquiry-driven data mining.

Integrative Genomics Viewer

The sheer volume and scope of this flood of data pose a significant challenge to the development of efficient and intuitive visualization tools able to scale to very large data sets and to flexibly integrate multiple data types, including clinical data.

Visualizing genomes: techniques and challenges

This work provides a guide to genomic data visualization tools that facilitate analysis tasks by enabling researchers to explore, interpret and manipulate their data, and in some cases perform on-the-fly computations.

The GENCODE v7 catalog of human long noncoding RNAs: analysis of their gene structure, evolution, and expression.

The most complete human lncRNA annotation to date is presented, produced by the GENCODE consortium within the framework of the ENCODE project and comprising 9277 manually annotated genes producing 14,880 transcripts, and expression correlation analysis indicates that lncRNAs show particularly striking positive correlation with the expression of antisense coding genes.

Galaxy: a platform for interactive large-scale genome analysis.

An interactive system, Galaxy, that combines the power of existing genome annotation databases with a simple Web portal to enable users to search remote resources, combine data from independent queries, and visualize the results.

modMine: flexible access to modENCODE data

The modMine database described here has been built by the modENCODE Data Coordination Center to allow the broader research community to search for and download data sets of interest among the thousands generated by modENCODE.

The Sequence Alignment/Map format and SAMtools

Summary: The Sequence Alignment/Map (SAM) format is a generic alignment format for storing read alignments against reference sequences, supporting short and long reads (up to 128 Mbp) produced by

UTGB toolkit for personalized genome browsers

The UTGB (University of Tokyo Genome Browser) Toolkit is designed to meet three major requirements for personalization of genome browsers: easy installation of the system with minimum efforts, browsing locally stored data and rapid interactive design of web interfaces tailored to individual needs.

The Ensembl genome database project

The Ensembl database project provides a bioinformatics framework to organise biology around the sequences of large genomes. It is a comprehensive source of stable automatic

eHive: An Artificial Intelligence workflow system for genomic analysis

Background: The Ensembl project produces updates to its comparative genomics resources with each of its several releases per year. During each release cycle approximately two weeks are allocated to