Association of genes to genetically inherited diseases using data mining

@article{PerezIratxeta2002AssociationOG,
  title={Association of genes to genetically inherited diseases using data mining},
  author={Carolina Perez-Iratxeta and Peer Bork and Miguel Andrade},
  journal={Nature Genetics},
  year={2002},
  volume={31},
  pages={316--319}
}
Although approximately one-quarter of the roughly 4,000 genetically inherited diseases currently recorded in respective databases (LocusLink, OMIM) are already linked to a region of the human genome, about 450 have no known associated gene. Finding disease-related genes requires laborious examination of hundreds of possible candidate genes (sometimes, these are not even annotated; see, for example, refs 3,4). The public availability of the human genome draft sequence has fostered new strategies… 

G2D: a tool for mining genes associated with disease

An algorithm is developed that prioritizes genes in a chromosomal region according to their possible relation to an inherited disease, combining data mining of biomedical databases with gene sequence analysis.

Literature and Genome Data Mining for Prioritizing Disease-Associated Genes

A method that combines literature mining, gene annotations, and sequence homology searches to prioritize candidate genes involved in a given genetic disorder is presented.

Computational approaches for disease gene identification

Three computational models are proposed for identifying candidate disease genes; they use ensemble learning to combine multiple diverse biological data sources and learning models for better predictive performance.

Improving disease gene prioritization using the semantic similarity of Gene Ontology terms

This work introduces MedSim, a novel approach for ranking candidate genes for a particular disease based on functional comparisons involving the Gene Ontology, which uses functional annotations of known disease genes for assessing the similarity of diseases as well as the disease relevance of candidate genes.

Candidate-Based Approaches to Identify Genetic Variation Influencing Type 2 Diabetes and Quantitative Traits

A computational system named CAESAR is developed that ranks all annotated human genes as candidates for a complex trait by using ontologies to semantically map natural language descriptions of the trait with a variety of gene-centric information sources.

Highly consistent patterns for inherited human diseases at the molecular level

A comparative analysis of genes reported to cause inherited diseases in humans in terms of their causative effects on physiology, their genetics and inheritance modes, the functional processes they are involved in and their expression profiles across a wide spectrum of tissues reveals that there are more extensive correlations between these attributes of genetic disease genes than previously appreciated.

A computational system to select candidate genes for complex human traits

A computational system named CAESAR is developed that ranks all annotated human genes as candidates for a complex trait by using ontologies to semantically map natural language descriptions of the trait with a variety of gene-centric information sources.

Prediction of Human Disease Genes by Human-Mouse Conserved Coexpression Analysis

The results demonstrate that conserved coexpression, even at the human-mouse phylogenetic distance, represents a very strong criterion to predict disease-relevant relationships among human genes.

Pinpointing disease genes through phenomic and genomic data fusion

pgFusion not only provided an effective way for prioritizing candidate genes, but also demonstrated feasible solutions to two fundamental questions in the analysis of big genomic data: the comparability of heterogeneous data and the integration of multiple types of data.

Constructing human phenome-interactome networks for the prioritization of candidate genes

A summary is given of how similar methods can readily be used to identify microRNAs potentially involved in complex diseases and to discover drugs that may target disease-associated proteins.

References

Characterization of single-nucleotide polymorphisms in coding regions of human genes

The cSNPs most likely to influence disease, those that alter the amino acid sequence of the encoded protein, are found at a lower rate and with lower allele frequencies than silent substitutions, which likely reflects selection acting against deleterious alleles during human evolution.

RefSeq and LocusLink: NCBI gene-centered resources

Together, RefSeq and LocusLink provide a non-redundant view of genes and other loci to support research on genes and gene families, variation, gene expression and genome annotation.

Autosomal Recessive Hypercholesterolemia Caused by Mutations in a Putative LDL Receptor Adaptor Protein

Six mutations are identified in a gene encoding a putative adaptor protein (ARH), which appears to have a tissue-specific role in LDLR function, as it is required in liver but not in fibroblasts.

Initial sequencing and analysis of the human genome

The results of an international collaboration to produce and make freely available a draft sequence of the human genome are reported and an initial analysis is presented, describing some of the insights that can be gleaned from the sequence.

Glutamate Dehydrogenase Deficiency in Cerebellar Degenerations: Clinical, Biochemical and Molecular Genetic Aspects

Glutamate dehydrogenase (GDH), an enzyme central to glutamate metabolism, is significantly reduced in patients with heterogeneous neurological disorders characterized by multiple system…

A novel pantothenate kinase gene (PANK2) is defective in Hallervorden-Spatz syndrome

It is shown that HSS is caused by a defect in a novel pantothenate kinase gene and a mechanism for oxidative stress in the pathophysiology of the disease is proposed.

Gapped BLAST and PSI-BLAST: a new generation of protein database search programs.

A new criterion for triggering the extension of word hits, combined with a new heuristic for generating gapped alignments, yields a gapped BLAST program that runs at approximately three times the speed of the original.

Fuzzy Set Theory - and Its Applications

The book updates the research agenda with chapters on possibility theory, fuzzy logic and approximate reasoning, expert systems, fuzzy control, fuzzy data analysis, decision making and fuzzy set models in operations research.

Online Mendelian Inheritance in Man 'OMIM'.

S. Amladi, Indian Journal of Dermatology, Venereology and Leprology, 2003 (Medicine)