Sergio Rodrigues de Morais

This paper introduces a novel conservative feature subset selection method with incomplete data sets. The method is conservative in the sense that it selects the minimal subset of features that renders the rest of the features independent of the target (the class variable) without making any assumption about the missing data mechanism. This is achieved in ...
OBJECTIVES: We propose a new graphical framework for extracting the relevant dietary, social and environmental risk factors that are associated with an increased risk of nasopharyngeal carcinoma (NPC) in a case-control epidemiologic study that consists of 1289 subjects and 150 risk factors. METHODS: This framework builds on the use of Bayesian networks ...
The aim of this study was to provide a framework for the analysis of visceral obesity and its determinants in women, where complex inter-relationships are observed among lifestyle, nutritional and metabolic predictors. Thirty-four predictors related to lifestyle, adiposity, body fat distribution, blood lipids and adipocyte sizes have been considered as ...
In this paper, we discuss efforts to apply a novel Bayesian network (BN) structure learning algorithm to a real-world epidemiological problem, namely nasopharyngeal carcinoma (NPC). Our specific aims are: (1) to provide a statistical profile of the recruited population, (2) to help identify the important environmental risk factors involved in NPC, and ...
Learning the structure of a Bayesian network from a data set is NP-hard. In this paper, we discuss a novel heuristic called Polynomial Max-Min Skeleton (PMMS) developed by Tsamardinos et al. in 2005. PMMS was shown by extensive empirical simulations to be an excellent trade-off between time and quality of reconstruction compared to all constraint-based ...
In this study, we discuss and apply a novel and efficient algorithm for learning a local Bayesian network model in the vicinity of the ZNF217 oncogene from breast cancer microarray data without having to decide in advance which genes have to be included in the learning process. ZNF217 is a candidate oncogene located at 20q13, a chromosomal region frequently ...
In this paper, we discuss simple methods for the identification and handling of almost-deterministic relationships (ADR) in automatic constraint-based Bayesian network structure discovery. The problem with ADR is that conditional independence tests become unreliable when the conditioning set almost determines one of the variables in the test. Such errors have ...
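As a rough illustration of what "almost determines" can mean for discrete data, the sketch below measures how much of a variable is explained by the most frequent value within each configuration of a conditioning set. This is not the method from the paper above, whose abstract is truncated here. The function names, the ratio itself and the 0.95 threshold are all assumptions chosen for illustration.

```python
from collections import Counter, defaultdict


def determination_ratio(cond_rows, target):
    """Fraction of samples whose target value equals the most frequent
    target value within their conditioning-set configuration.

    A ratio of 1.0 means the conditioning set fully determines the target
    in this sample. (Hypothetical helper, not taken from the paper.)
    """
    if len(cond_rows) != len(target) or not target:
        raise ValueError("inputs must be non-empty and of equal length")
    groups = defaultdict(Counter)
    for z, x in zip(cond_rows, target):
        groups[tuple(z)][x] += 1
    explained = sum(max(counts.values()) for counts in groups.values())
    return explained / len(target)


def is_almost_deterministic(cond_rows, target, threshold=0.95):
    """Flag the pair when the ratio reaches an assumed threshold."""
    return determination_ratio(cond_rows, target) >= threshold


# Toy data: Z predicts X correctly in 9 of 10 samples.
z = [(0,), (0,), (0,), (0,), (1,), (1,), (1,), (1,), (1,), (1,)]
x = [0, 0, 0, 0, 1, 1, 1, 1, 1, 0]
ratio = determination_ratio(z, x)
```

A structure learner could, under these assumptions, call `is_almost_deterministic` before running a conditional independence test and treat a flagged test as unreliable rather than trusting its outcome.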