Joaquim Assunção

  • Citations Per Year
Learn More
Some top data mining algorithms, as ensemble classifiers, may be inefficient to very large data set. This paper makes an initial proposal of a distributed ensemble classifier algorithm based on the popular Random Forests for Big Data. The proposed algorithm aims to improve the efficiency of the algorithm by a distributed processing model called MapReduce.(More)
This article presents a novel Stochastic Automata Networks (SAN) model to estimate the behavior of sediment strata formation in continental margins resulting from interaction between sea level, sediment input and subsidence over the last 130 million years. The model result is a set of probabilities that can be compared with geological facts and hypothesis;(More)
The use of stochastic formalisms, such as Stochastic Automata Networks (SAN), can be very useful for statistical prediction and behavior analysis. Once well fitted, such formalisms can generate probabilities about a target reality. These probabilities can be seen as a statistical approach of knowledge discovery. However, the building process of models for(More)
This paper describes a dimensionality reduction process to forecast time series events using stochastic models. As well as the KDD process defines a sequence of common steps to achieve useful information through data mining techniques, we propose a sequence of steps in order to estimate the probability of future events through stochastic modeling. Our(More)
Abstract So far, large stochastic models require considerable amounts of time to be created. In fact, to simulate systems or events, there is a constant need to perform an analysis of the system and its variables. In this paper we propose a method to automatically generate Stochastic Automata Networks (SAN) models for geological events. Based on(More)
  • 1