Chado use case: storing genomic, genetic and breeding data of Rosaceae and Gossypium crops in Chado

@article{Jung2016ChadoUC,
  title={Chado use case: storing genomic, genetic and breeding data of Rosaceae and Gossypium crops in Chado},
  author={Sook Jung and Taein Lee and Stephen P. Ficklin and Jing Yu and Chun-Huai Cheng and Dorrie Main},
  journal={Database: The Journal of Biological Databases and Curation},
  year={2016},
  volume={2016}
}
  • Sook Jung, Taein Lee, D. Main
  • Published 2 May 2016
  • Computer Science, Medicine
  • Database: The Journal of Biological Databases and Curation
The Genome Database for Rosaceae (GDR) and CottonGen are comprehensive online data repositories that provide access to integrated genomic, genetic and breeding data through search, visualization and analysis tools for Rosaceae crops and Gossypium (cotton). These online databases use Chado, an open-source, generic and ontology-driven database schema for biological data, as the primary data storage platform. Chado is highly normalized and uses ontologies to indicate the ‘types’ of data. Therefore… 

Kiwifruit Genome Database (KGD): a comprehensive resource for kiwifruit genomics

The Kiwifruit Genome Database (KGD) currently contains all publicly available genome and gene sequences, gene annotations, biochemical pathways, transcriptome profiles derived from public RNA-Seq datasets, and comparative genomic analysis results such as syntenic blocks and homologous gene pairs between different kiwifruit genome assemblies.

MGIS: managing banana (Musa spp.) genetic resources information and high-throughput genotyping data

The Musa Germplasm Information System (MGIS), the database for global ex situ-held banana genetic resources, has been developed to address needs in a user-friendly way, and an interoperability layer has been implemented to facilitate links with complementary databases such as the Banana Genome Hub and the MusaBase breeding database.

Cucurbit Genomics Database (CuGenDB): a central portal for comparative and functional genomics of cucurbit crops

The Cucurbit Genomics Database has been developed using the Tripal toolkit and two new tools have been developed, a ‘SyntenyViewer’ to view genome synteny between different cucurbit species and an ‘RNA-Seq’ module to analyze and visualize gene expression profiles.

The NanDeSyn Database for Nannochloropsis systems and synthetic biology.

An international consortium has been initiated and presents a comprehensive, multi-omics resource database named Nannochloropsis Design and Synthesis (NanDeSyn), which features user-friendly interfaces hosting genomic resources with gene annotations, and transcriptomic and proteomic data for six Nannochloropsis species.

SpinachBase: a central portal for spinach genomics

SpinachBase provides a suite of analysis and visualization tools including a genome browser, sequence similarity searches with BLAST, functional enrichment and functional classification analyses and functions to query and retrieve gene sequences and annotations.

Extension modules for storage, visualization and querying of genomic, genetic and breeding data in Tripal databases

The use of materialized views in the Chado Search module enables better performance as well as flexibility of data modeling in Chado, allowing existing Tripal databases with different metadata types to utilize the module.

ATGC transcriptomics: a web-based application to integrate, explore and analyze de novo transcriptomic data

A web-based application, called ATGC transcriptomics, with a flexible and adaptable interface that allows users to work with new generation sequencing (NGS) transcriptomic analysis results using an ontology-driven database and simplifies data exploration, visualization, and integration for a better comprehension of the results.

Tripal MapViewer: A tool for interactive visualization and comparison of genetic maps

Tripal MapViewer is a new interactive tool for visualizing genetic map data, developed as a Tripal replacement for Comparative Map Viewer, which enables visualization of entire maps or linkage groups and features such as molecular markers, quantitative trait loci (QTLs) and heritable phenotypic markers.

References

CMD: a Cotton Microsatellite Database resource for Gossypium genomics

The collection of publicly available cotton SSR markers in a centralized, readily accessible and curated web-enabled database provides a more efficient utilization of microsatellite resources and will help accelerate basic and applied research in molecular breeding and genetic mapping in Gossypium spp.

The Banana Genome Hub

The importance of interoperability for data integration between existing information systems is discussed, and several use cases illustrate how the Banana Genome Hub can be used to study gene families.

The Sol Genomics Network (SGN)—from genotype to phenotype to breeding

The Sol Genomics Network is a web portal with genomic and phenotypic data and analysis tools for the Solanaceae family and close relatives; a new tool, the SGN VIGS tool, was recently implemented to improve Virus-Induced Gene Silencing (VIGS) constructs.

RosBREED: Enabling marker-assisted breeding in Rosaceae

RosBREED, funded for four years from September 2009, incorporates eight teams (Breeding, Socio-Economics, Pedigree-Based Analysis, Breeding Information Management System, Genomics, Genotyping, MAB Pipeline, and Extension) in a transdisciplinary framework that involves significant educational and outreach activities and stakeholder participation.

MTGD: The Medicago truncatula genome database.

The J. Craig Venter Institute (JCVI) has been involved in M. truncatula genome sequencing and annotation since 2002 and has maintained a web-based resource providing data to the community for this entire period, where it currently hosts the latest version of the genome (Mt4.0).

Tripal: a construction toolkit for online genome databases

Tripal provides simplified site development by merging the power of Drupal, a popular web Content Management System with that of Chado, a community-derived database schema for storage of genomic, genetic and other related biological data.

The Chado Natural Diversity module: a new generic database schema for large-scale phenotyping and genotyping data

Details of the Natural Diversity module, a new Chado module that strictly adheres to the Chado remit of being generic and ontology driven, are described, including the design approach, the relational schema and use cases implemented in several databases.

A Chado case study: an ontology-based modular schema for representing genome-associated biological information

Chado is a relational database schema now being used to manage biological knowledge for a wide variety of organisms, from human to pathogens, especially the classes of information that directly or indirectly can be associated with genome sequences or the primary RNA and protein products encoded by a genome.

Towards a Reference Plant Trait Ontology for Modeling Knowledge of Plant Traits and Phenotypes

The vision of a species-neutral and overarching Reference Plant Trait Ontology which would be the basis for linking the disparate knowledge domains and that will support data integration and data mining across species is presented.

Sybil: methods and software for multiple genome comparison and visualization.

A two-phase protein clustering algorithm, used to generate protein clusters suitable for analysis through Sybil, and a method for creating graphical displays of protein or gene clusters that span multiple genomes are described.