Joaquim Assunção

Learn More
Some top data mining algorithms, as ensemble classifiers, may be inefficient to very large data set. This paper makes an initial proposal of a distributed ensemble classifier algorithm based on the popular Random Forests for Big Data. The proposed algorithm aims to improve the efficiency of the algorithm by a distributed processing model called MapReduce.(More)
This article presents a novel Stochastic Automata Networks (SAN) model to estimate the behavior of sediment strata formation in continental margins resulting from interaction between sea level, sediment input and subsidence over the last 130 million years. The model result is a set of probabilities that can be compared with geological facts and hypothesis;(More)
So far, large stochastic models require considerable amounts of time to be created. In fact, to simulate systems or events, there is a constant need to perform an analysis of the system and its variables. In this paper we propose a method to automatically generate Stochastic Automata Networks (SAN) models for geological events. Based on user-defined input(More)
This paper describes a dimensionality reduction process to forecast time series events using stochastic models. As well as the KDD process defines a sequence of common steps to achieve useful information through data mining techniques, we propose a sequence of steps in order to estimate the probability of future events through stochastic modeling. Our(More)
area da qual anteriormente não havia informação geológica. Os algoritmos desen-volvidos compõem a ferramenta que serve para extração, transformação e carga de dados (ETL). Estes dados compõem um banco de dados paleoge-ográficos de grande volume. Esse banco visa agregar informações existentes com informações estimadas para se obter novas informações via(More)
  • 1