We consider a class of nonlinear models based on mixtures of local autoregressive time series. At any given time point, we have a certain number of linear models, denoted as experts, where the vector of covariates may include lags of the dependent variable. Additionally, we assume the existence of a latent multinomial variable, whose distribution depends on…

We investigate a class of hierarchical mixtures-of-experts (HME) models where generalized linear models with nonlinear mean functions of the form $\psi(\alpha + x^T \beta)$ are mixed. Here $\psi(\cdot)$ is the inverse link function. It is shown that mixtures of such mean functions can approximate a class of smooth functions of the form $\psi(h(x))$, where $h(\cdot) \in$…

In the class of hierarchical mixtures-of-experts (HME) models, "experts" in the exponential family with generalized linear mean functions of the form $\psi(\alpha + x^T \beta)$ are mixed, according to a set of local weights called the "gating functions" depending on the predictor $x$. Here $\psi(\cdot)$ is the inverse link function. We provide regularity conditions on the experts and on…

We discuss a class of nonlinear models based on mixtures-of-experts of regressions of exponential family time series models, where the covariates include functions of lags of the dependent variable as well as external covariates. The discussion covers results on model identifiability, stochastic stability, parameter estimation via maximum likelihood…

In mixtures-of-experts (ME) models, "experts" of generalized linear models are combined, according to a set of local weights called the "gating function". The invariant transformations of the ME probability density functions include the permutations of the expert labels and the translations of the parameters in the gating functions. Under certain…

Previous researchers developed new learning architectures for sequential data by extending conventional hidden Markov models through the use of distributed state representations. Although exact inference and parameter estimation in these architectures are computationally intractable, Ghahramani and Jordan (1997) showed that approximate inference and…

In this paper, we summarize some recent results in Li et al. (2012), which can be used to extend an important PAC-Bayesian approach, namely the Gibbs posterior, to study the nonadditive ranking risk. The methodology is based on assumption-free risk bounds and nonasymptotic oracle inequalities, which lead to nearly optimal convergence rates and optimal…