Alireza Fazelpour

  • Citations Per Year
Learn More
Dimensionality reduction techniques have become a required step when working with bioinformatics datasets. Techniques such as feature selection have been known to not only improve computation time, but to improve the results of experiments by removing the redundant and irrelevant features or genes from consideration in subsequent analysis. Univariate(More)
With the proliferation of high-dimensional datasets across many application domains in recent years, feature selection has become an important data mining task due to its capability to improve both performance and computational efficiencies. The chosen feature subset is important not only due to its ability to improve classification performance, but also(More)
Bioinformatics datasets pose two major challenges to researchers and data-mining practitioners: class imbalance and high dimensionality. Class imbalance occurs when instances of one class vastly outnumber instances of the other class(es), and high dimensionality occurs when a dataset has many independent features (genes). Data sampling is often used to(More)
Many bioinformatics datasets share certain problems: they have class imbalance (one class with many more instances than the remaining class(es)), or are difficult to learn from (build accurate models with). Much research has investigated these two problems, or even considered both at once. However, hidden dependencies can exist between these two problems:(More)
One major challenge posed by bioinformatics datasets is class imbalance which occurs when one class has many more instances than the other class(es). Its undesirable effect on the classification performance is compounded with the fact that, in general, the class with fewer instances is the class of interest. Bagging has been utilized by practitioners in the(More)
Class imbalance is a significant challenge that practitioners in the field of bioinformatics are faced with on a daily basis. It is a phenomenon that occurs when number of instances of one class is much greater than number of instances of the other class(es) and it has adverse effects on the performance of classification models built on this skewed data.(More)
Choosing an appropriate cancer treatment is potentially the most important task in the treatment of a cancer patient. If it were possible to identify the best option for a patient (or at minimum to remove options that will not help the patient), then the general prognosis of the patient improves. However, this task becomes much more subtle due to(More)
Bioinformatics datasets contain a number of characteristics, such as noisy data and difficult to learn class boundaries, which make it challenge to build effective predictive models. One option for improving results is the use of ensemble learning methods, which involve combining the results of multiple predictive models into a single decision. Since we do(More)
The ever increasing growth of databases in the real time application is a major issue for the handling of large data. The data mining of the same is also a tedious task. The feature subset selection is a process for finding the irrelevant and redundant data and handling them. The proposed algorithm IFSSImproved Feature Subset Selection works in 2 major(More)