## Learning probabilistic networks

- Paul J. Krause
- Knowledge Eng. Review
- 1999

- Published 1998

The use of acyclic, directed graphs (often called 'DAG's) to simultaneously represent causal hypotheses and to encode independence and conditional independence constraints associated with those hypotheses has proved fruitful in the construction of expert systems, in the development of efficient updating algorithms (Pearl, 1988; Lauritzen et al. 1988), and in inferring causal structure (Pearl and Verma, 1991; Cooper and Herskovits 1992; Spirtes, Glymour and Scheines, 1993). In section 1 I will survey a number of extensions of the DAG framework based on directed graphs and chain graphs (Lauritzen and Wermuth 1989; Frydenberg 1990; Koster 1996; Andersson, Madigan and Perlman 1996). Those based on directed graphs include models based on directed cyclic and acyclic graphs, possibly including latent variables and/or selection bias (Pearl, 1988; Spirtes, Glymour and Scheines 1993; Spirtes 1995; Spirtes, Meek, and Richardson 1995; Richardson 1996a, 1996b; Koster 1996; Pearl and Dechter 1996; Cox and Wermuth, 1996).

In section 2 I state two properties, motivated by causal and spatial intuitions, that the set of conditional independencies entailed by a graphical model might satisfy. I proceed to show that the sets of independencies entailed by (i) an undirected graph via separation, and (ii) a (cyclic or acyclic) directed graph (possibly with latent and/or selection variables) via d-separation, satisfy both properties. By contrast neither of these properties, in general, will hold in a chain graph under the Lauritzen-Wermuth-Frydenberg (LWF) interpretation. One property holds for chain graphs under the Andersson-Madigan-Perlman (AMP) interpretation, the other does not. The examination of these properties and others like them may provide insight into the current vigorous debate concerning the applicability of chain graphs under different global Markov properties.

### 1. Graphs and Probability Distributions

An *undirected* graph $UG$ is an ordered pair $(\mathbf{V}, \mathbf{U})$, where $\mathbf{V}$ is a set of vertices and $\mathbf{U}$ is a set of undirected edges $X - Y$ between vertices.² Similarly, a *directed* graph $DG$ is an ordered pair $(\mathbf{V}, \mathbf{D})$ where $\mathbf{D}$ is a set of directed edges $X \to Y$ between vertices in $\mathbf{V}$. A *directed cycle* consists of a sequence of edges $X_1 \to X_2 \to \dots \to X_n \to X_1$ ($n \ge 2$). If a directed graph $DG$ contains no directed cycles it is said to be *acyclic*, otherwise it is *cyclic*. An edge $X \to Y$ is said to be *out of* $X$ and *into* $Y$; $X$ and $Y$ are the *endpoints* of the edge. Note that if cycles are permitted there may be more than one edge between a given pair of vertices, e.g. $X \to Y$ and $X \leftarrow Y$.

¹ I thank P. Spirtes, C. Glymour, D. Madigan, M. Perlman and J. Besag for helpful conversations. Research for this paper was supported by the Office of Naval Research through contract number N00014-93-1-0568.

² Bold face ($\mathbf{X}$) indicates sets; plain face ($X$) indicates individual elements; italics ($U$) indicates a graph or a path.

I will consider directed graphs (cyclic or acyclic) in which $\mathbf{V}$ is partitioned into three disjoint sets $\mathbf{O}$ (Observed), $\mathbf{S}$ (Selection) and $\mathbf{L}$ (Latent), written $DG(\mathbf{O}, \mathbf{S}, \mathbf{L})$. The interpretation of this definition is that $DG$ represents a causal mechanism; $\mathbf{O}$ represents the subset of the variables that are observed; $\mathbf{S}$ represents a set of variables which, due to the nature of the mechanism selecting the sample, are conditioned on in the subpopulation from which the sample is drawn; the variables $\mathbf{L}$ are not observed and for this reason are called 'latent'.³

A *mixed* graph contains both directed and undirected edges. A *partially directed cycle* in a mixed graph $G$ is a sequence of $n$ distinct vertices $X_1, \dots, X_n$ ($n \ge 3$), and $X_{n+1} \equiv X_1$, such that (a) $\forall i$ ($1 \le i \le n$) either $X_i - X_{i+1}$ or $X_i \to X_{i+1}$, and (b) $\exists j$ ($1 \le j \le n$) such that $X_j \to X_{j+1}$. A *chain graph* $CG$ is a mixed graph in which there are no partially directed cycles.
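The chain graph condition above can be checked mechanically. The following is a minimal Python sketch, not taken from the paper: the function names and the edge-list representation are assumptions made for illustration, and it further assumes a simple mixed graph (at most one edge between any pair of vertices, no self-loops). Under that assumption, a partially directed cycle exists exactly when some directed edge $X \to Y$ can be closed by travelling from $Y$ back to $X$ along undirected edges (in either direction) and directed edges (forwards only).

```python
from collections import defaultdict, deque


def has_partially_directed_cycle(directed, undirected):
    """Return True if the mixed graph contains a partially directed cycle.

    directed:   iterable of (x, y) pairs, each meaning x -> y
    undirected: iterable of (x, y) pairs, each meaning x - y
    Assumes (hypothetically) at most one edge per vertex pair and no self-loops.
    """
    directed = list(directed)
    undirected = list(undirected)

    # step[v] = vertices reachable from v in one move that respects orientation.
    step = defaultdict(set)
    for x, y in directed:
        step[x].add(y)          # directed edges may only be followed forwards
    for x, y in undirected:
        step[x].add(y)          # undirected edges may be followed both ways
        step[y].add(x)

    # For each directed edge x -> y, search for a way back from y to x.
    for x, y in directed:
        seen = {y}
        queue = deque([y])
        while queue:
            v = queue.popleft()
            for w in step[v]:
                if w == x:
                    return True
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
    return False


def is_chain_graph(directed, undirected):
    """A mixed graph is a chain graph iff it has no partially directed cycle."""
    return not has_partially_directed_cycle(directed, undirected)
```

For example, `is_chain_graph([("A", "B"), ("A", "C")], [("B", "C")])` holds, whereas the graph with $A \to B$, $B - C$, $C - A$ is rejected because the single directed edge lies on a cycle. Purely undirected graphs and acyclic directed graphs both pass, consistent with their being special cases of chain graphs.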
Koster (1996) considers a class of reciprocal graphs containing directed and undirected edges in which partially directed cycles are allowed. I do not consider such graphs separately here, though many of the comments which apply to LWF chain graphs apply also to reciprocal graphs since the former are a subclass of the latter. To make clear which kind of graph is being referred to I will use $UG$ for undirected graphs, $DG$ for directed graphs, $AG$ for acyclic directed graphs, $CG$ for chain graphs, and $G$ to denote a graph which may be any one of these.

A *path between* $X$ *and* $Y$ in graph $G$ (of whatever type) consists of a sequence of edges $\langle E_1, \dots, E_n \rangle$ such that there exists a sequence of distinct vertices $\langle X \equiv X_1, \dots, X_{n+1} \equiv Y \rangle$ where $E_i$ has endpoints $X_i$ and $X_{i+1}$ ($1 \le i \le n$), i.e. $E_i$ is $X_i - X_{i+1}$, $X_i \to X_{i+1}$, or $X_i \leftarrow X_{i+1}$ ($1 \le i \le n$). A *directed* path from $X$ to $Y$ is a path of the form $X \to \dots \to Y$.⁴

#### 1.2 Global Markov Properties Associated with Graphs

A Global Markov Property associates a set of conditional independence relations with a graph $G$.⁵ In an undirected graph $UG$, for disjoint sets of vertices $\mathbf{X}$, $\mathbf{Y}$ and $\mathbf{Z}$ ($\mathbf{Z}$ may be empty), if there is no path from a variable $X \in \mathbf{X}$ to a variable $Y \in \mathbf{Y}$ that does not include some variable in $\mathbf{Z}$, then $\mathbf{X}$ and $\mathbf{Y}$ are said to be *separated* by $\mathbf{Z}$.

*Undirected Global Markov Property* ($\models_S$): $UG \models_S \mathbf{X} \perp\!\!\!\perp \mathbf{Y} \mid \mathbf{Z}$ if $\mathbf{X}$ and $\mathbf{Y}$ are separated by $\mathbf{Z}$ in $UG$.⁶

In a graph $G$, $X$ is a *parent* of $Y$ (and $Y$ is a *child* of $X$) if there is an edge $X \to Y$ in $G$. $X$ is an *ancestor* of $Y$ (and $Y$ is a *descendant* of $X$) if $X = Y$, or there is a directed path $X \to \dots \to Y$ from $X$ to $Y$.

³ Note that we use the terms 'variable' and 'vertex' interchangeably.

⁴ Path is defined here as a sequence of edges, rather than vertices; in a cyclic graph a sequence of vertices does not in general define a unique path, since there may be more than one edge between a given pair of vertices.
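
⁵ Often global Markov conditions are introduced as a means for deriving the consequences of a set of local Markov conditions. Here I merely define the Global property in terms of the relevant graphical criterion.

⁶ '$\mathbf{X} \perp\!\!\!\perp \mathbf{Y} \mid \mathbf{Z}$' means that '$\mathbf{X}$ is independent of $\mathbf{Y}$ given $\mathbf{Z}$'; if $\mathbf{Z} = \emptyset$, the abbreviation $\mathbf{X} \perp\!\!\!\perp \mathbf{Y}$ is used; if $\mathbf{X}$, $\mathbf{Y}$ and/or $\mathbf{Z}$ are singleton sets $\{V\}$, then brackets are omitted, e.g. $V \perp\!\!\!\perp \mathbf{Y} \mid \mathbf{Z}$ instead of $\{V\} \perp\!\!\!\perp \mathbf{Y} \mid \mathbf{Z}$.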
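
The separation criterion behind the Undirected Global Markov Property amounts to a reachability test: delete the vertices in $\mathbf{Z}$ and ask whether any vertex of $\mathbf{Y}$ can still be reached from $\mathbf{X}$. The Python sketch below illustrates this under stated assumptions; the function name `is_separated` and the edge-list input format are hypothetical choices for this example, not notation from the paper.

```python
from collections import defaultdict, deque


def is_separated(edges, xs, ys, zs):
    """Return True if xs and ys are separated by zs in an undirected graph.

    edges: iterable of (a, b) pairs, each an undirected edge a - b
    xs, ys, zs: pairwise disjoint collections of vertices (zs may be empty)
    """
    xs, ys, zs = set(xs), set(ys), set(zs)
    if xs & ys or xs & zs or ys & zs:
        raise ValueError("X, Y and Z must be pairwise disjoint")

    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)

    # Breadth-first search from X that never enters a vertex in Z.
    seen = set(xs)
    queue = deque(xs)
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w in zs or w in seen:
                continue
            if w in ys:
                return False    # found a path from X to Y that avoids Z
            seen.add(w)
            queue.append(w)
    return True
```

On the chain $A - B - C$, `is_separated(edges, {"A"}, {"C"}, {"B"})` holds, so the property yields $A \perp\!\!\!\perp C \mid B$; with $\mathbf{Z}$ empty the two ends are connected and no independence is entailed.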

@inproceedings{Richardson1998ExtensionsOU,
title={Extensions of Undirected and Acyclic, Directed Graphical Models},
author={Thomas O Richardson and Peter Spirtes and Clark Glymour and David Madigan and Michael D. Perlman},
year={1998}
}