Sotiris B. Kotsiantis

Learn More
Supervised machine learning is the search for algorithms that reason from externally supplied instances to produce general hypotheses, which then make predictions about future instances. In other words, the goal of supervised learning is to build a concise model of the distribution of class labels in terms of predictor features. The resulting classifier is(More)
Many real-world data sets exhibit skewed class distributions in which almost all cases are allotted to a class and far fewer cases to a smaller, usually more interesting class. A classifier induced from an imbalanced data set has, typically, a low error rate for the majority class and an unacceptable error rate for the minority class. This paper firstly(More)
Supervised classification is one of the tasks most frequently carried out by so-called Intelligent Systems. Thus, a large number of techniques have been developed based on Artificial Intelligence (Logic-based techniques, Perceptron-based techniques) and Statistics (Bayesian Networks, Instance-based techniques). The goal of supervised learning is to build a(More)
Many factors affect the success of Machine Learning (ML) on a given task. The representation and quality of the instance data is first and foremost. If there is much irrelevant and redundant information present or noisy and unreliable data, then knowledge discovery during the training phase is more difficult. It is well known that data preparation and(More)
We develop a three-step fuzzy logic-based algorithm for clustering categorical attributes, and we apply it to analyze cultural data. In the first step the algorithm employs an entropy-based clustering scheme, which initializes the cluster centers. In the second step we apply the fuzzy c-modes algorithm to obtain a fuzzy partition of the data set, and the(More)
The ability to predict a student’s performance could be useful in a great number of different ways associated with university-level distance learning. Students’ key demographic characteristics and their marks on a few written assignments can constitute the training set for a supervised machine learning algorithm. The learning algorithm could then be able to(More)
Student dropout occurs quite often in universities providing distance education. The scope of this research is to study whether the usage of machine learning techniques can be useful in dealing with this problem. Subsequently, an attempt was made to identifying the most appropriate learning algorithm for the prediction of students' dropout. A number of(More)
Simple Bayes algorithm captures the assumption that every feature is independent from the rest of the features, given the state of the class feature. The fact that the assumption of independence is clearly almost always wrong has led to a general rejection of the crude independence model in favor of more complicated alternatives, at least by researchers(More)