Turki Turki

Predicting a cancer patient's response to a drug from genomic information is an important problem in modern clinical oncology. This problem arises in part because many available drug sensitivity prediction algorithms do not consider better-quality cancer cell lines or the adoption of new feature representations, both of which lead to the accurate...
Instance reduction methods are popular techniques that reduce dataset size, possibly improving classification accuracy. We present a method, which we call IPRed, that reduces the size of the dataset based on the percentile of the dataset partitions. We evaluate our method and other popular instance reduction methods from a classification perspective by...
Dimensionality reduction procedures such as principal component analysis and the maximum margin criterion discriminant are special cases of a weighted maximum variance (WMV) approach. We present a simple two-parameter version of WMV that we call 2P-WMV. We study the classification error given by the 1-nearest neighbor algorithm on features extracted by our...
Network inference through link prediction is an important data mining problem that finds many applications in computational social science and biomedicine. For example, by predicting links, i.e., regulatory relationships, between genes to infer gene regulatory networks (GRNs), computational biologists gain a better understanding of the functional elements...
Ensemble methods such as AdaBoost are popular machine learning methods that create a highly accurate classifier by combining the predictions from several classifiers. We present a parametrized version of AdaBoost that we call Top-k Parametrized Boost. We evaluate our method and other popular ensemble methods from a classification perspective on several real datasets.
Supervised methods for inferring gene regulatory networks (GRNs) perform well with good training data. However, when training data is absent, these methods are not applicable. Unsupervised methods do not need training data but their accuracy is low. In this paper, we combine supervised and unsupervised methods to infer GRNs using time-series gene expression...
Predicting drug response in cancer is an important problem in modern clinical oncology that has recently attracted increasing attention from various domains such as computational biology, machine learning, and data mining. Cancer patients respond differently to each cancer therapy owing to disease diversity, genetic factors, and environmental causes.
Programs based on hash tables and the Burrows-Wheeler transform are very fast for mapping short reads to genomes but have low accuracy in the presence of mismatches and gaps. Such reads can be aligned accurately with the Smith-Waterman algorithm, but it can take hours or days to map millions of reads, even for bacterial genomes. We introduce a GPU program called MaxSSmap...