Structure Learning in Conditional Probability Models via an Entropic Prior and Parameter Extinction

Abstract

We introduce an entropic prior for multinomial parameter estimation problems and solve for its maximum a posteriori (MAP) estimator. The prior is a bias for maximally structured and minimally ambiguous models. In conditional probability models with hidden state, iterative MAP estimation drives weakly supported parameters toward extinction, effectively turning them off. Thus structure discovery is folded into parameter estimation. We then establish criteria for simplifying a probabilistic model’s graphical structure by trimming parameters and states, with a guarantee that any such deletion will increase the posterior probability of the model. Trimming accelerates learning by sparsifying the model. All operations monotonically and maximally increase the posterior probability, yielding structure-learning algorithms only slightly slower than parameter estimation via expectation-maximization (EM), and orders of magnitude faster than search-based structure induction. When applied to hidden Markov model (HMM) training, the resulting models show superior generalization to held-out test data. In many cases the resulting models are so sparse and concise that they are interpretable, with hidden states that strongly correlate with meaningful categories.
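
To give a concrete sense of the estimator described in the abstract: the entropic prior over a multinomial theta = (theta_1, ..., theta_n) is P_e(theta) proportional to exp(-H(theta)) = prod_i theta_i^theta_i, and combining it with evidence (expected counts) omega_i gives a MAP fixed point expressible with the Lambert W function, theta_i = -omega_i / W(-omega_i * exp(1 + lambda)), with lambda chosen so the theta_i sum to 1. The sketch below is illustrative only, not the author's code: it solves the normalization constraint by bisection using SciPy's lambertw, and the choice of the W_{-1} branch, the function name entropic_map, and the assumption that the omega_i are nonnegative expected counts summing to at least 1 are ours.

    # Illustrative sketch (not the paper's code): entropic-MAP estimate of a
    # multinomial from evidence counts omega, via the fixed point
    #   theta_i = -omega_i / W(-omega_i * exp(1 + lambda)),
    # with lambda chosen so that sum_i theta_i = 1.
    # Assumptions: omega_i >= 0 expected counts, sum(omega) >= 1, W_{-1} branch.
    import numpy as np
    from scipy.special import lambertw

    def entropic_map(omega, tol=1e-10):
        omega = np.asarray(omega, dtype=float)
        pos = omega > 0                      # zero-evidence parameters go extinct
        theta = np.zeros_like(omega)

        def theta_of(beta):                  # beta plays the role of exp(1 + lambda)
            w = lambertw(-omega[pos] * beta, k=-1).real
            return -omega[pos] / w

        # Real solutions on the W_{-1} branch require beta <= 1 / (e * max(omega)).
        lo, hi = 1e-300, 1.0 / (np.e * omega[pos].max())
        for _ in range(200):                 # bisect on the normalization constraint
            mid = 0.5 * (lo + hi)
            if theta_of(mid).sum() > 1.0:
                hi = mid
            else:
                lo = mid
            if hi - lo < tol * hi:
                break
        theta[pos] = theta_of(0.5 * (lo + hi))
        return theta / theta.sum()           # clean up residual normalization error

    # Example: evidence [3, 1] yields roughly [0.81, 0.19] -- sharper (lower
    # entropy) than the maximum-likelihood estimate [0.75, 0.25], illustrating
    # the prior's bias toward minimally ambiguous models.
    print(entropic_map([3.0, 1.0]))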

DOI: 10.1162/089976699300016395


Cite this paper

@article{Brand1999StructureLI,
  title   = {Structure Learning in Conditional Probability Models via an Entropic Prior and Parameter Extinction},
  author  = {Matthew Brand},
  journal = {Neural Computation},
  year    = {1999},
  volume  = {11},
  pages   = {1155-1182}
}