A Comprehensive Infrastructure for Big Data in Cancer Research: Accelerating Cancer Research and Precision Medicine

Izumi V. Hinkson, Tanja Davidsen, Juli D. Klemm, Ishwar Chandramouliswaran, Anthony R. Kerlavage, W. Kibbe
Frontiers in Cell and Developmental Biology

Advancements in next-generation sequencing and other -omics technologies are accelerating the detailed molecular characterization of individual patient tumors, and driving the evolution of precision medicine. Cancer is no longer considered a single disease, but rather, a diverse array of diseases wherein each patient has a unique collection of germline variants and somatic mutations. Molecular profiling of patient-derived samples has led to a data explosion that could help us understand the… 

Rapid advancement in cancer genomic big data in the pursuit of precision oncology

Several efforts devoted to accomplishing precision oncology and applying big data for use in Indonesia are discussed, including utilizing open access genomic data in writing research articles.

Open Data to Support CANCER Science—A Bioinformatics Perspective on Glioma Research

This work illustrates the current state of the art with examples from glioma research, shows how open data can be used for cancer research in general, and points out several resources and tools that are readily available.

Finding cancer driver mutations in the era of big data research

This review considers both coding and non-coding driver mutations, and discusses how such mutations might be identified from cancer sequencing datasets, and some of the tools and databases that are available for the annotation of somatic variants and the identification of cancer driver genes.

Genomics-Guided Immunotherapy for Precision Medicine in Cancer.

Recent advances in molecular profiling, high-throughput sequencing, and computational efficiency have made immunogenomics the major tenet of precision medicine in cancer treatment.

Survey of main tools for querying and analyzing TCGA Data

  • Marzia Settino, M. Cannataro
  • Computer Science, Biology
    2018 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)
  • 2018
This survey provides an overview of the main technical and functional features of the most popular and innovative tools for querying and analyzing TCGA data, and offers an easy-to-use guideline that helps researchers choose the tools best suited to their needs, so that they can focus their efforts on research goals rather than on technical issues.

Precision Medicine Landscape of Genomic Testing for Patients With Cancer in the National Institutes of Health All of Us Database Using Informatics Approaches

Although not yet ubiquitous, diverse clinical genomic analyses in oncology can set the stage to grow the practice of precision medicine by integrating research patient data repositories, cancer data ecosystems, and biomedical informatics.

PRoBE the cloud toolkit: finding the best biomarkers of drug response within a breast cancer clinical trial

The cloud-based infrastructure, Patient Repository of Biomolecular Entities (PRoBE), has enabled a uniform data structure, more efficient analysis of valuable data, and increased collaboration between researchers, according to the authors.

New functionalities in the TCGAbiolinks package for the study and integration of cancer data from GDC and GTEx

New features and enhancements of TCGAbiolinks are introduced, including more accurate and flexible pipelines for differential expression analyses, different methods for tumor purity estimation and filtering, integration of normal samples from other platforms, and support for other genomics datasets, exemplified by the TARGET data.

Classifying Big DNA Methylation Data: A Gene-Oriented Approach

This work proposes an efficient data processing procedure that produces a gene-oriented organization of the data and enables supervised machine learning analysis with state-of-the-art methods on NGS DNA methylation experiments.



Comprehensive genomic characterization defines human glioblastoma genes and core pathways

The interim integrative analysis of DNA copy number, gene expression and DNA methylation aberrations in 206 glioblastomas reveals a link between MGMT promoter methylation and a hypermutator phenotype consequent to mismatch repair deficiency in treated glioblastomas, demonstrating that it can rapidly expand knowledge of the molecular basis of cancer.

Proteogenomics connects somatic mutations to signaling in breast cancer

It is demonstrated that proteogenomic analysis of breast cancer elucidates functional consequences of somatic mutations, narrows candidate nominations for driver genes within large deletions and amplified regions, and identifies therapeutic targets.

Implementing Genome-Driven Oncology

Proteogenomic characterization of human colon and rectal cancer

Integrated proteogenomic analysis provides functional context to interpret genomic abnormalities and affords a new paradigm for understanding cancer biology.

Comprehensive molecular portraits of human breast tumors

The ability to integrate information across platforms provided key insights into previously defined gene expression subtypes and demonstrated the existence of four main breast cancer classes when combining data from five platforms, each of which shows significant molecular heterogeneity.

A cloud-based workflow to quantify transcript-expression levels in public cancer compendia

This work quantified transcript-expression levels for 12,307 RNA-Sequencing samples from the Cancer Cell Line Encyclopedia and The Cancer Genome Atlas and created open-source Docker containers that include all the software and scripts necessary to process such data in the cloud and to collect performance metrics.

Collaboration to Accelerate Proteogenomics Cancer Care: The Department of Veterans Affairs, Department of Defense, and the National Cancer Institute's Applied Proteogenomics OrganizationaL Learning and Outcomes (APOLLO) Network

A tri‐federal initiative arising out of the Cancer Moonshot has resulted in the formation of a program to utilize advanced genomic and proteomic expression platforms on high‐quality human biospecimens in near‐real‐time in order to identify potentially actionable therapeutic molecular targets and accelerate novel clinical trials with biomarkers of prognostic and predictive value.

Integrated Genomic Analyses of Ovarian Carcinoma

It is reported that high-grade serous ovarian cancer is characterized by TP53 mutations in almost all tumours (96%); low prevalence but statistically recurrent somatic mutations in nine further genes including NF1, BRCA1, BRCA2, RB1 and CDK12; 113 significant focal DNA copy number aberrations; and promoter methylation events involving 168 genes.

Integrated genomic and molecular characterization of cervical cancer.

The extensive molecular characterization of 228 primary cervical cancers is reported, one of the largest comprehensive genomic studies of cervical cancer to date, and novel significantly mutated genes in cervical cancer are identified, revealing new potential therapeutic targets for cervical cancers.