Molecular profiling of tumors promises to advance the clinical management of cancer, but the benefits of integrating molecular data with traditional clinical variables have not been systematically studied. Here we retrospectively predict patient survival using diverse molecular data (somatic copy-number alteration, DNA methylation and mRNA, microRNA and …
We present an approach to cross-language retrieval that combines dense knowledge-based features and sparse word translations. Both feature types are learned directly from relevance rankings of bilingual documents in a pairwise ranking framework. In large-scale experiments for patent prior art search and cross-lingual retrieval in Wikipedia, our approach …
Automated annotation of protein function is challenging. As the number of sequenced genomes rapidly grows, the overwhelming majority of protein products can only be annotated computationally. If computational predictions are to be relied upon, it is crucial that the accuracy of these methods be high. Here we report the results from the first large-scale …
Generalized singular-value decomposition is used to separate multichannel electroencephalogram (EEG) into components found by optimizing a signal-to-noise quotient. These components are used to filter out artifacts. Short-time principal components analysis of time-delay embedded EEG is used to represent windowed EEG data to classify EEG according to which …
Tournament selection is a popular form of selection which is commonly used with genetic algorithms, genetic programming and evolutionary programming. However, tournament selection introduces a sampling bias into the selection process. We review analytic results and present empirical evidence that shows this bias has a significant impact on search …
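As an illustration of the kind of sampling effect at issue, the sketch below implements standard tournament selection and measures one well-known consequence of drawing contestants at random with replacement: in a single generation, a sizable share of the population is never drawn into any tournament. This is an assumption about which bias is meant, since the truncated abstract does not specify it. The function names `tournament_select` and `fraction_never_sampled` are hypothetical and chosen only for this example.

```python
import random


def tournament_select(fitness, k, rng):
    # Draw k contestants uniformly at random, with replacement,
    # and return the index of the fittest one.
    contestants = [rng.randrange(len(fitness)) for _ in range(k)]
    return max(contestants, key=lambda i: fitness[i])


def fraction_never_sampled(pop_size, k, rng):
    # One generation is modeled as pop_size tournaments of size k.
    # Track which individuals were ever drawn as contestants.
    seen = set()
    for _ in range(pop_size):
        seen.update(rng.randrange(pop_size) for _ in range(k))
    return 1 - len(seen) / pop_size


if __name__ == "__main__":
    rng = random.Random(0)
    pop_size = 10_000
    for k in (2, 3, 5):
        observed = fraction_never_sampled(pop_size, k, rng)
        # Probability that a given individual misses all k * pop_size draws.
        expected = (1 - 1 / pop_size) ** (k * pop_size)
        print(f"k={k}: observed {observed:.3f}, expected about {expected:.3f}")
```

Under this model the probability that a given individual is never drawn is (1 - 1/N)^(kN), which approaches e^(-k) for large N, so with binary tournaments roughly 13.5% of individuals get no chance to compete regardless of their fitness.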
MYCN amplification and overexpression are common in neuroendocrine prostate cancer (NEPC). However, the impact of aberrant N-Myc expression in prostate tumorigenesis and the cellular origin of NEPC have not been established. We define N-Myc and activated AKT1 as oncogenic components sufficient to transform human prostate epithelial cells to prostate …
The search space of Phrase-Based Statistical Machine Translation (PBSMT) systems can be represented as a directed acyclic graph (lattice). By exploring this search space, it is possible to analyze and understand the failures of PBSMT systems. Indeed, useful diagnoses can be obtained by computing the so-called oracle hypotheses, which are hypotheses …
MOTIVATION: A current challenge in understanding cancer processes is to pinpoint which mutations influence the onset and progression of disease. Toward this goal, we describe a method called PARADIGM-SHIFT that can predict whether a mutational event is neutral, gain- or loss-of-function in a tumor sample. The method uses a belief-propagation algorithm to …
Protein function prediction is an active area of research in bioinformatics. Yet, the transfer of annotation on the basis of sequence or structural similarity remains widely used as an annotation method. Most of today's machine learning approaches reduce the problem to a collection of binary classification problems: whether a protein performs a particular …