José M. Peña

We propose algorithms for learning Markov boundaries from data without having to learn a Bayesian network first. We study their correctness, scalability and data efficiency. The last two properties are important because we aim to apply the algorithms to identify the minimal set of features that is needed for probabilistic classification in databases with…
We analyze two different feature selection problems: finding a minimal feature set optimal for classification (MINIMAL-OPTIMAL) vs. finding all features relevant to the target variable (ALL-RELEVANT). The latter problem is motivated by recent applications within bioinformatics, particularly gene expression analysis. For both problems, we identify classes of…
BACKGROUND: Treatment of multidrug-resistant tuberculosis (MDR-TB) is lengthy, toxic, expensive, and has generally poor outcomes. We undertook an individual patient data meta-analysis to assess the impact on outcomes of the type, number, and duration of drugs used to treat MDR-TB. METHODS AND FINDINGS: Three recent systematic reviews were used to identify…
We propose an algorithm for learning the Markov boundary of a random variable from data without having to learn a complete Bayesian network. The algorithm is correct under the faithfulness assumption, scalable and data efficient. The last two properties are important because we aim to apply the algorithm to identify the minimal set of random variables that…
We studied the effects of transcranial magnetic stimulation (TMS, 60 Hz and 0.7 mT for 4 h/day for 14 days) on oxidative and cell damage caused by olfactory bulbectomy (OBX) in Wistar rats. The levels of lipid peroxidation products and caspase-3 were enhanced by OBX, whereas OBX prompted a reduction in reduced glutathione (GSH) content and antioxidative…
This paper shows how the Bayesian network paradigm can be used to solve combinatorial optimization problems. To this end, some methods of structure learning from data and simulation of Bayesian networks are inserted inside Estimation of Distribution Algorithms (EDA). EDA are a new tool for evolutionary computation in which populations of…
In many cases what matters is not whether a false discovery is made or not but the expected proportion of false discoveries among all the discoveries made, i.e. the so-called false discovery rate (FDR). We present an algorithm aiming at controlling the FDR of edges when learning Gaussian graphical models (GGMs). The algorithm is particularly suitable when…
We apply MCMC to approximately calculate (i) the ratio of directed acyclic graph (DAG) models to DAGs for up to 20 nodes, and (ii) the fraction of chain graph (CG) models that are neither undirected graph (UG) models nor DAG models for up to 13 nodes. Our results suggest that, for the numbers of nodes considered, (i) the ratio of DAG models to DAGs is not…