Graphical Models: Foundations of Neural Computation

  • Michael I. Jordan, Terrence J. Sejnowski
  • Pattern Anal. Appl.
From the Publisher: Graphical models use graphs to represent and manipulate joint probability distributions. They have their roots in artificial intelligence, statistics, and neural networks. The clean mathematical formalism of the graphical models framework makes it possible to understand a wide variety of network-based approaches to computation, and in particular to understand many neural network algorithms and architectures as instances of a broader probabilistic methodology. It also makes… 

Bayesian Inference in Nonlinear and Relational Latent Variable Models

The thesis introduces the first graphical model for analysing nonlinear dependencies in relational data and develops a new algorithm for efficient and reliable inference in nonlinear state-space models.

Modeling language and cognition with deep unsupervised learning: a tutorial overview

It is argued that the focus on deep architectures and generative (rather than discriminative) learning represents a crucial step forward for the connectionist modeling enterprise, because it offers a more plausible model of cortical learning as well as a way to bridge the gap between emergentist connectionist models and structured Bayesian models of cognition.

Deep learning systems as complex networks

This article proposes to study deep belief networks using techniques commonly employed in the study of complex networks, in order to gain some insights into the structural and functional properties of the computational graph resulting from the learning process.

Towards Comprehensive Foundations of Computational Intelligence

  • Wlodzislaw Duch
  • Computer Science
    Challenges for Computational Intelligence
  • 2007
Heterogeneous adaptive systems are presented as a particular example of transformation-based systems, and the goal of learning is redefined to facilitate the creation of simpler data models.

Learning Orthographic Structure With Sequential Generative Neural Networks

This work investigates a sequential version of the restricted Boltzmann machine (RBM), a stochastic recurrent neural network that extracts high-order structure from sensory data through unsupervised generative learning and can encode contextual information in the form of internal, distributed representations.

Generalized Statistical Methods for Mixed Exponential Families, Part I: Theoretical Foundations

This work studies in detail the extreme case corresponding to exponential family Principal Component Analysis and solves problems related to fitting the generative model and making decisions in a data-driven manner.

Draft: Deep Learning in Neural Networks: An Overview

This historical survey compactly summarises relevant work, much of it from the previous millennium, on deep supervised learning, unsupervised learning, reinforcement learning & evolutionary computation, and indirect search for short programs encoding deep and large networks.

Generalized Statistical Methods for Mixed Exponential Families, Part II: Applications

This work demonstrates the ability to learn a GLS generative model in a controlled environment using synthetic data of mixed types and illustrates the benefits of making decisions in parameter space, with examples of supervised and unsupervised text categorization on categorical data and of classification and clustering on mixed data types.

Large-Scale Computational Modeling of Genetic Regulatory Networks

Recent advances in massive gene expression measurements by DNA microarrays are summarized; these measurements form the most powerful data basis to date for models of genetic networks.



Connectionist Learning of Belief Networks

Exploiting Tractable Substructures in Intractable Networks

A refined mean field approximation for inference and learning in probabilistic neural networks is developed, and it is shown how to incorporate weak higher order interactions into a first-order hidden Markov model.

A Unifying Review of Linear Gaussian Models

A new model for static data, known as sensible principal component analysis, is introduced along with a novel concept of spatially adaptive observation noise, and independent component analysis is shown to be a variation of the same basic generative model.

Statistical Physics Algorithms That Converge

Close connections are demonstrated between mean field theory methods and other approaches, in particular, barrier function and interior point methods, for obtaining approximate solutions to optimization problems.

Modeling the manifolds of images of handwritten digits

Two new methods for modeling the manifolds of digitized images of handwritten digits, using principal components analysis and factor analysis, are described; both are based on locally linear low-dimensional approximations to the underlying data manifold.

A View of the EM Algorithm that Justifies Incremental, Sparse, and other Variants

An incremental variant of the EM algorithm in which the distribution for only one of the unobserved variables is recalculated in each E step is shown empirically to give faster convergence in a mixture estimation problem.

Neurons with graded response have collective computational properties like those of two-state neurons.

  • J. Hopfield
  • Biology
    Proceedings of the National Academy of Sciences of the United States of America
  • 1984
A model for a large network of "neurons" with a graded response (or sigmoid input-output relation) is studied and shown to have collective properties in very close correspondence with the earlier stochastic model based on McCulloch-Pitts neurons.

Maximum Likelihood Competitive Learning

This work proposes to view competitive adaptation as attempting to fit a blend of simple probability generators to a set of data points. It investigates one application of the soft competitive model, the placement of radial basis function centers for function interpolation, and shows that the soft model can give better performance at little additional computational cost.

EM Algorithms for PCA and SPCA

An expectation-maximization (EM) algorithm for principal component analysis (PCA) is presented that allows a few eigenvectors and eigenvalues to be extracted from large collections of high-dimensional data; a variant, sensible PCA (SPCA), defines a proper density model in the data space.

Independent component analysis, A new concept?

  • P. Comon
  • Computer Science
    Signal Process.
  • 1994