Ambitious projects aimed at cloning, mapping and sequencing the genomes of various organisms, including that of Homo sapiens, have been launched worldwide. In all cases, the fruits of these labours will provide a solid platform from which to attempt the larger goal of understanding how genomes result in the organisms they specify. The success of these …
For statistical design of an optimal filter, it is probabilistically advantageous to employ a large number of observation random variables; however, estimation error increases with the number of variables, so that variables not contributing to the determination of the target variable can have a detrimental effect. In linear filtering, determination involves …
Motivation: Expression-based analysis for large families of genes has recently become possible owing to the development of cDNA microarrays, which allow simultaneous measurement of transcript levels for thousands of genes. For each spot on a microarray, signals in two channels must be extracted from their backgrounds. This requires algorithms to extract …
cDNA microarrays provide simultaneous expression measurements for thousands of genes. These measurements are the result of processing images to recover the average signal intensity from each spot, which is composed of the pixels covering the area on which the cDNA detector has been deposited. The accuracy of the signal measurement depends on using an appropriate algorithm to process the …
A major goal in genomics is to understand how genes are regulated in different tissues, stages of development, diseases, and species. Mapping DNase I hypersensitive (HS) sites within nuclear chromatin is a powerful and well-established method of identifying many different types of regulatory elements, but in the past it has been limited to analysis of …
Sarcomas are a biologically complex group of tumors of mesenchymal origin. By using gene expression microarray analysis, we aimed to find clues to the cellular differentiation and oncogenic pathways active in these tumors as well as potential biomarkers and therapeutic targets. We examined 181 tumors representing 16 classes of human bone and soft tissue …
Data clustering methods have proven to be successful data mining techniques in the analysis of gene expression data. The cluster affinity search technique (CAST), developed by Ben-Dor et al. (1999), has been shown to cluster gene expression data well but has two drawbacks. First, the algorithm uses a fixed initial threshold value to start the …
This paper describes the internal workings of a novel UNL converter for the Chinese language. Three steps are involved in generating Chinese from UNL: first, the UNL expression is converted to a graph; second, the graph is converted to a number of trees; third, a top-down tree walk is performed to translate each subtree, and the results are composed to …
For small samples, classifier design algorithms typically suffer from overfitting. Given a set of features, a classifier must be designed and its error estimated. For small samples, an error estimator may be unbiased but, owing to a large variance, often give very optimistic estimates. This paper proposes mitigating the small-sample problem by designing …