Benchmarking Database Performance for Genomic Data

@article{Khushi2015BenchmarkingDP,
  title={Benchmarking Database Performance for Genomic Data},
  author={Matloob Khushi},
  journal={Journal of Cellular Biochemistry},
  year={2015},
  volume={116}
}
  • Matloob Khushi
  • Published 1 June 2015
  • Computer Science
  • Journal of Cellular Biochemistry
Genomic regions represent features such as gene annotations, transcription factor binding sites and epigenetic modifications. Genomic operations such as identifying overlapping/non‐overlapping regions or nearest gene annotations are common research needs. The data can be saved in a database system for easy management; however, there is at present no comprehensive built‐in database algorithm to identify overlapping regions. Therefore I have developed a novel region‐mapping… 
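To make the overlap operation mentioned in the abstract concrete, the following is a minimal sketch of finding overlapping regions with a plain SQL join. It is not the region-mapping method of the paper, which the truncated abstract does not describe. The table layout, the column names (`chrom`, `start_pos`, `end_pos`), the function name `find_overlaps`, the sample data, and the use of closed intervals are all assumptions made for illustration.

```python
import sqlite3


def find_overlaps(regions_a, regions_b):
    """Return (a_id, b_id) pairs for regions that overlap on the same chromosome.

    Each region is a tuple (id, chrom, start_pos, end_pos). Intervals are
    treated as closed, which is an assumption of this sketch.
    """
    conn = sqlite3.connect(":memory:")
    for table in ("regions_a", "regions_b"):
        conn.execute(
            f"CREATE TABLE {table} "
            "(id TEXT, chrom TEXT, start_pos INTEGER, end_pos INTEGER)"
        )
    conn.executemany("INSERT INTO regions_a VALUES (?, ?, ?, ?)", regions_a)
    conn.executemany("INSERT INTO regions_b VALUES (?, ?, ?, ?)", regions_b)

    # Two closed intervals overlap when each one starts before the other ends.
    rows = conn.execute(
        """
        SELECT a.id, b.id
          FROM regions_a AS a
          JOIN regions_b AS b
            ON a.chrom = b.chrom
           AND a.start_pos <= b.end_pos
           AND b.start_pos <= a.end_pos
         ORDER BY a.id, b.id
        """
    ).fetchall()
    conn.close()
    return rows


# Hypothetical sample data: binding-site peaks and gene annotations.
peaks = [("p1", "chr1", 100, 200), ("p2", "chr1", 500, 600), ("p3", "chr2", 100, 200)]
genes = [("g1", "chr1", 150, 300), ("g2", "chr1", 700, 800), ("g3", "chr2", 250, 400)]

if __name__ == "__main__":
    print(find_overlaps(peaks, genes))
```

A join like this compares every pair of regions on a chromosome unless the database can use a suitable index, which is one reason the performance of such queries is worth benchmarking.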
Data Mining ENCODE Data Predicts a Significant Role of SINA3 in Human Liver Cancer
TLDR
A bioinformatics pipeline to investigate ChIP-seq DNA-binding protein datasets for the HepG2 liver cancer cell line, downloaded from the ENCODE project, showed a strong enrichment of DNA-binding protein SIN3A to activator and repressor, indicating that SIN3A has an important regulatory role in vital liver functions.
Predicting Functional Interactions Among DNA-Binding Proteins
TLDR
The significance of correlation between two transcription factors was unaffected by the peak-caller employed to identify transcription factor binding sites, and OCV measurements were used to develop a novel network map to study the correlation between twelve breast cancer cell-line datasets.
MinOmics, an Integrative and Immersive Tool for Multi-Omics Analysis
TLDR
This work uses proteomic data on 1417 proteins of the green microalga Chlamydomonas reinhardtii to investigate physicochemical parameters governing selectivity of three cysteine-based redox post translational modifications (PTM): glutathionylation, nitrosylation and disulphide bonds reduced by thioredoxins.
Automated classification and characterization of the mitotic spindle following knockdown of a mitosis-related protein
TLDR
Using the image analysis software tool MatQuantify, researchers can unambiguously test if disruption of a protein-of-interest changes metaphase spindle maintenance and thereby affects mitosis, and enables automated quantitative analysis of images of mitotic spindles.
A Novel Approach to Data Extraction on Hyperlinked Webpages
TLDR
15,000 web pages were downloaded using the in-house developed web-crawler and a nondeterministic finite automaton algorithm was designed to identify simple, complex, hyperlinked, or non-linked tables that could assist with performing better and stronger queries using the join operation.
MatCol: a tool to measure fluorescence signal colocalisation in biological systems
TLDR
MatCol has the ability to replace manual colocalisation counting, and the potential to be applied to a wide range of biological areas, and is validated in a biological setting.
PREDICTION BASED WORKLOAD PERFORMANCE EVALUATION FOR DISASTER MANAGEMENT SPATIAL DATABASE
  • N. Suryana, M. S. Rohman, F. Utomo
  • Computer Science
  • The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences
  • 2018
TLDR
The results of the study indicated that the prediction model for workload performance evaluation using CBR which is optimised by Hash Search technique for determining workload data on shortest path analysis via the employment of Dijkstra algorithm could be useful for the prediction of the incoming workload based on the status of the predetermined DBMS parameters.
An Electronic Healthcare Record Server Implemented in PostgreSQL.
TLDR
The five-part international standard for communicating healthcare records (ISO EN 13606) is used as the information basis for the design of the server, and some of the features that this standard demands that are provided by the server are described.
Evaluation of Functional Abilities in 0–6 Year Olds: An Analysis with the eEarlyCare Computer Application
TLDR
The use of computer applications together with Machine Learning techniques was shown to facilitate accurate diagnosis and therapeutic intervention and three clusters of functional development were found.
IMDB-Attire: A Novel Dataset for Attire Detection and Localization
TLDR
A unique dataset of ~8000 images from IMDb.com was created to address the challenge of real-world application of algorithm training for attire detection, with multiclass classification and attire object detection performed using customized deep learning architectures including YOLO and SSD.

References

Binding Sites Analyser (BiSA): Software for Genomic Binding Sites Archiving and Overlap Analysis
TLDR
Transcription factor DNA binding site analyser software (BiSA), for archiving of binding regions and easy identification of overlap with or proximity to other regions of interest, supported by a comprehensive database of publicly available transcription factor binding sites and histone modifications.
BEDTools: a flexible suite of utilities for comparing genomic features
TLDR
A new software suite for the comparison, manipulation and annotation of genomic features in Browser Extensible Data (BED) and General Feature Format (GFF) format, which allows the user to compare large datasets (e.g. next-generation sequencing data) with both public and custom genome annotation tracks.
AnnotateGenomicRegions: a web application
TLDR
AnnotateGenomicRegions is a web application that accepts genomic regions as input and outputs a selection of overlapping and/or neighboring genome annotations that can be used by biologists and bioinformaticians alike.
Bioinformatic analysis of cis-regulatory interactions between progesterone and estrogen receptors in breast cancer
TLDR
Investigating the overlapping cis-regulatory role of estrogen receptor alpha (ERα) and progesterone receptor (PR) in the T-47D breast cancer cell line found that ERα binding sites overlap with a subset of PR binding sites, suggesting that ERα and PR in general function independently at the molecular level, but that their activities converge on a specific subset of transcriptional targets.
Sequence features and chromatin structure around the genomic regions bound by 119 human transcription factors.
TLDR
An integrative analysis centered around 457 ChIP-seq data sets on 119 human TFs generated by the ENCODE Consortium identified highly enriched sequence motifs in most data sets, revealing new motifs and validating known ones.
Issues in bioinformatics benchmarking: the case study of multiple sequence alignment
TLDR
This work discusses the development of formal benchmarks, designed to represent the current problems encountered in the bioinformatics field, and considers several criteria for building good benchmarks and the advantages to be gained when they are used intelligently.
GenomicTools: a computational platform for developing high-throughput analytics in genomics
TLDR
GenomicTools is a flexible computational platform, comprising both a command-line set of tools and a C++ API, for the analysis and manipulation of high-throughput sequencing data such as DNA-seq, RNA-seq, ChIP-seq and MethylC-seq.
An Integrated Encyclopedia of DNA Elements in the Human Genome
TLDR
The Encyclopedia of DNA Elements project provides new insights into the organization and regulation of the authors' genes and genome, and is an expansive resource of functional annotations for biomedical research.