• Corpus ID: 17167120

Algorithm for Finding Optimal Gene Sets in Microarray Prediction

  • J. Deutsch
  • Published 8 August 2001
  • Physics, Biology
  • arXiv: Biological Physics
Motivation: Microarray data has recently been shown to be efficacious in distinguishing closely related cell types that often appear in the diagnosis of cancer. It is useful to determine the minimum number of genes needed for such a diagnosis, both for clinical use and to determine the importance of specific genes for cancer. Here a replication algorithm is used for this purpose. It evolves an ensemble of predictors, all using different combinations of genes, to generate a set of optimal… 
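The truncated abstract does not spell out the replication algorithm, so the following is only a generic illustration of the idea it names: keep an ensemble of gene subsets, copy them with small random changes, and retain the subsets that predict best with the fewest genes. Everything here is an assumption made for illustration and not the paper's method: the leave-one-out nearest-neighbour scoring, the mutation rule, the function names (`loo_errors`, `score`, `mutate`, `evolve`), the parameter defaults, and the toy data.

```python
import random


def loo_errors(samples, labels, genes):
    """Count leave-one-out nearest-neighbour errors using only `genes`."""
    errors = 0
    for i, x in enumerate(samples):
        best_dist, best_label = None, None
        for j, y in enumerate(samples):
            if i == j:
                continue  # leave sample i out of its own prediction
            d = sum((x[g] - y[g]) ** 2 for g in genes)
            if best_dist is None or d < best_dist:
                best_dist, best_label = d, labels[j]
        if best_label != labels[i]:
            errors += 1
    return errors


def score(samples, labels, genes):
    # Lower is better: fewer errors first, then fewer genes (assumed criterion).
    return (loo_errors(samples, labels, genes), len(genes))


def mutate(genes, n_genes, rng):
    """Return a copy of `genes` with one gene dropped or one gene added."""
    genes = set(genes)
    if len(genes) > 1 and rng.random() < 0.5:
        genes.remove(rng.choice(sorted(genes)))
    else:
        # May pick a gene already in the set, in which case nothing changes.
        genes.add(rng.randrange(n_genes))
    return frozenset(genes)


def evolve(samples, labels, ensemble_size=20, generations=30, seed=0):
    """Evolve an ensemble of gene subsets; returns them sorted best first."""
    rng = random.Random(seed)
    n_genes = len(samples[0])
    start_size = min(3, n_genes)
    ensemble = [frozenset(rng.sample(range(n_genes), start_size))
                for _ in range(ensemble_size)]
    for _ in range(generations):
        # Replication step: every member produces one mutated copy, then
        # parents and copies compete and only the best are kept.
        candidates = ensemble + [mutate(g, n_genes, rng) for g in ensemble]
        candidates.sort(key=lambda g: score(samples, labels, g))
        ensemble = candidates[:ensemble_size]
    return ensemble


# Hypothetical toy data: gene 0 separates the two classes, genes 1-3 do not.
toy_samples = [
    [0.0, 0.3, 0.9, 0.1],
    [0.1, 0.8, 0.2, 0.7],
    [0.2, 0.1, 0.6, 0.4],
    [0.3, 0.6, 0.4, 0.9],
    [5.0, 0.2, 0.8, 0.2],
    [5.1, 0.9, 0.1, 0.6],
    [5.2, 0.0, 0.7, 0.5],
    [5.3, 0.7, 0.3, 0.8],
]
toy_labels = ["A"] * 4 + ["B"] * 4

toy_ensemble = evolve(toy_samples, toy_labels)
best_subset = toy_ensemble[0]
```

Because parents compete with their own copies, the best score in the ensemble can never get worse from one generation to the next. This sketch does not remove duplicate subsets, so a real implementation that wants a diverse set of near-optimal predictors would likely need an extra step to preserve variety.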

Mining Clusters for Knowledge: Finding Algorithm-Independent Groups in Microarray Data
The proposed methodology combines and distills the information generated by different types of cluster analysis, and produces a representative clustering structure, and is shown to outperform the naive choice of a single algorithm.
Bioinspired Learning for Microarray Gene Selection and Cancer Classification
One major application of microarray technology lies in cancer classification. Thus far, a significant amount of new discoveries have been made and new bio-markers for various cancers have been
Genetic algorithm-neural network : feature extraction for bioinformatics data
The research proposes an innovative hybridised model based on genetic algorithms (GAs) and artificial neural networks (ANNs) to extract the highly differentially expressed genes for a specific cancer pathology, and emphasises extracting informative features from a high-dimensional and highly complex data set rather than improving classification results.
Leukemia and small round blue-cell tumor cancer detection using microarray gene expression data set: Combining data dimension reduction and variable selection technique
Using gene expression data in cancer classification plays an important role in solving the fundamental problems relating to cancer diagnosis. Because of high throughput of gene expression
Automated Local Linear Embedding with an application to microarray data


How Many Genes are Needed for a Discriminant Microarray Data Analysis
The analysis of the leukemia data from the Whitehead/MIT group is a discriminant analysis, and it is observed that the performance of most of these weighted predictors on the testing set is gradually reduced as more genes are included, but a clear cutoff that separates good and bad prediction performance is not found.
Support vector machine classification and validation of cancer tissue samples using microarray expression data
Introduces a new method that uses support vector machines to analyse tissue samples for mislabeled or questionable tissue results, and shows that other machine learning methods also perform comparably to the SVM on many of those datasets.
Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays.
  • U. Alon, N. Barkai, +4 authors A. Levine
  • Biology, Medicine
    Proceedings of the National Academy of Sciences of the United States of America
  • 1999
A two-way clustering method is reported for analyzing a data set consisting of the expression patterns of different cell types, revealing broad coherent patterns that suggest a high degree of organization underlying gene expression in these tissues.
Coupled two-way clustering analysis of gene microarray data.
  • G. Getz, E. Levine, E. Domany
  • Biology, Physics
    Proceedings of the National Academy of Sciences of the United States of America
  • 2000
An algorithm, based on iterative clustering, that identifies subsets of the genes and samples such that when one of these is used to cluster the other, stable and significant partitions emerge.
Molecular classification of cancer: class discovery and class prediction by gene expression monitoring.
A generic approach to cancer classification based on gene expression monitoring by DNA microarrays is described and applied to human acute leukemias as a test case and suggests a general strategy for discovering and predicting cancer classes for other types of cancer, independent of previous biological knowledge.
Use of a cDNA microarray to analyse gene expression patterns in human cancer
Previously unrecognized alterations in the expression of specific genes provide leads for further investigation of the genetic basis of the tumorigenic phenotype of these cells.
Knowledge-based analysis of microarray gene expression data by using support vector machines.
  • M. P. Brown, W. Grundy, +5 authors D. Haussler
  • Computer Science, Medicine
    Proceedings of the National Academy of Sciences of the United States of America
  • 2000
A method of functionally classifying genes by using gene expression data from DNA microarray hybridization experiments, based on the theory of support vector machines (SVMs), to predict functional roles for uncharacterized yeast ORFs based on their expression data is introduced.
Comparative hybridization of an array of 21,500 ovarian cDNAs for the discovery of genes overexpressed in ovarian carcinomas.
One of the genes found, the gene HE4, was found to be expressed primarily in some ovarian cancers, and is thus a potential marker of ovarian carcinoma.
Distinctive gene expression patterns in human mammary epithelial cells and breast cancers.
  • C. Perou, S. Jeffrey, +11 authors D. Botstein
  • Biology, Medicine
    Proceedings of the National Academy of Sciences of the United States of America
  • 1999
The results support the feasibility and usefulness of this systematic approach to studying variation in gene expression patterns in human cancers as a means to dissect and classify solid tumors.
Monitoring gene expression profile changes in ovarian carcinomas using cDNA microarray.
Several genes that may have biological relevance in the process of ovarian carcinogenesis have been identified through the assembly and utilization of a 5766 member cDNA microarray to study the differences in gene expression between normal and neoplastic human ovarian tissues.