George Tzanis

The prediction of the Translation Initiation Site (TIS) in a genomic sequence is an important issue in biological research. Although several methods have been proposed to deal with this problem, there is great potential for improving the accuracy of these methods. Due to various reasons, including noise in the data as well as biological reasons, ...
Association rule mining is a popular task that involves the discovery of co-occurrences of items in transaction databases. Several extensions of the traditional association rule mining model have been proposed so far; however, the problem of mining for mutually exclusive items has not yet been directly tackled. Such information could be useful in ...
The prediction of the translation initiation site (TIS) in a genomic sequence is an important issue in biological research. Several methods have been proposed to deal with it; however, it is still an open problem. In this paper, we follow a multi-step approach to increase TIS prediction accuracy. First, all the sequences are ...
In an mRNA sequence, the prediction of the exact codon where the process of translation starts (Translation Initiation Site – TIS) is a particularly important problem. So far it has been tackled by several researchers who apply various statistical and machine learning techniques, achieving high accuracy levels, often over 90%. In this paper we propose a ...
This paper studies the problem of predicting future values for a number of water quality variables, based on measurements from underwater sensors. It performs both exploratory and automatic analysis of the collected data with a variety of linear and nonlinear modeling methods. The paper investigates issues such as the ability to predict future values for a ...
This paper discusses the concept of big data mining in the domain of biology and medicine. Biological and medical data are increasing at very rapid rates, which in many cases outpace even Moore's law. This is the result of recent technological development, as well as the exploratory attitude of human beings, which prompts scientists to answer more questions ...
In this paper we present a method for accurately classifying SAGE (Serial Analysis of Gene Expression) data. The high dimensionality of the data, namely the large number of features, in combination with the small number of samples, poses a great challenge and demands more accurate and robust algorithms for classification. The prediction accuracy of the up to ...
This paper presents a study on polyadenylation site prediction, which is a very important problem in bioinformatics and medicine, promising to give many answers, especially in cancer research. We describe a method, called PolyA-iEP, that we developed for predicting polyadenylation sites, and we present a systematic study of the problem of recognizing mRNA ...