The area under the ROC (receiver operating characteristics) curve, or simply AUC, has been traditionally used in medical diagnosis since the 1970s. It has recently been proposed as an alternative single-number measure for evaluating the predictive ability of learning algorithms. However, no formal arguments were given as to why AUC should be preferred over …
Predictive accuracy has been used as the main and often only evaluation criterion for the predictive performance of classification learning algorithms. In recent years, the area under the ROC (Receiver Operating Characteristics) curve, or simply AUC, has been proposed as an alternative single-number measure for evaluating learning algorithms. In this paper, …
A parallel corpus is an indispensable resource for translation model training in statistical machine translation (SMT). Instead of collecting more and more parallel training corpora, this paper aims to improve SMT performance by exploiting the full potential of the existing parallel corpora. Two kinds of methods are proposed: offline data optimization and online …
Drug-target interaction (DTI) is the basis of drug discovery and design. It is time-consuming and costly to determine DTI experimentally. Hence, it is necessary to develop computational methods for the prediction of potential DTI. Based on complex network theory, three supervised inference methods were developed here to predict DTI and used for drug …
Predictive accuracy has been widely used as the main criterion for comparing the predictive ability of classification systems (such as C4.5, neural networks, and Naive Bayes). Most of these classifiers also produce probability estimations of the classification, but they are completely ignored in the accuracy measure. This is often taken for granted because …
Predictive accuracy has been used as the main and often only evaluation criterion for the predictive performance of classification or data mining algorithms. In recent years, the area under the ROC (Receiver Operating Characteristics) curve, or simply AUC, has been proposed as an alternative single-number measure for evaluating performance of learning …
Nonnegative Matrix Factorization (NMF) has been one of the most widely used clustering techniques for exploratory data analysis. However, since each data point enters the objective function with squared residue error, a few outliers with large errors easily dominate the objective function. In this article, we propose a Robust Manifold Nonnegative Matrix …
BACKGROUND: Polymorphisms in the cytotoxic T-lymphocyte antigen 4 (CTLA-4) gene have been implicated in susceptibility to cancer, but the many published studies have reported inconclusive results. The objective of the current study was to conduct a meta-analysis investigating the association between polymorphisms in the CTLA-4 gene and the risk of cancer. …
The presence of nitrogenous disinfection by-products (N-DBPs), including nitrosamines, cyanogen halides, haloacetonitriles, haloacetamides and halonitromethanes, in drinking water is of concern due to their high genotoxicity and cytotoxicity compared with regulated DBPs. Occurrence of N-DBPs is likely to increase if water sources become impacted by …
Complicated abdominal aortic aneurysm (AAA) is a major cause of mortality in elderly men. Ang II-dependent TGF-beta activity promotes aortic aneurysm progression in experimental Marfan syndrome. However, the role of TGF-beta in experimental models of AAA has not been comprehensively assessed. Here, we show that systemic neutralization of TGF-beta activity …