Corpus ID: 34533242

Grammar induction for mildly context sensitive languages using variational Bayesian inference

Eva Portelance, Chris Bruno, Daniel Harasim, Leon Bergen, Timothy J. O'Donnell
This technical report presents a formal approach to probabilistic minimalist grammar induction. We describe a formalization of a minimalist grammar. Based on this grammar, we define a generative model for minimalist derivations. We then present a generalized algorithm for applying variational Bayesian inference to lexicalized grammars for mildly context-sensitive languages, and in this report we apply it to the minimalist grammar defined earlier.



An Application of the Variational Bayesian Approach to Probabilistic Context-Free Grammars

An efficient learning algorithm for probabilistic context-free grammars based on the variational Bayesian approach is presented, and its computational complexity is shown to equal that of the Inside-Outside algorithm.

Unsupervised Induction of Tree Substitution Grammars for Dependency Parsing

This paper defines a hierarchical non-parametric Pitman-Yor Process prior that biases towards a small grammar with simple productions and significantly improves on the state of the art when measured by head attachment accuracy.

Simple Robust Grammar Induction with Combinatory Categorial Grammars

A simple EM-based grammar induction algorithm for Combinatory Categorial Grammar (CCG) that achieves state-of-the-art performance by relying on a minimal number of very general linguistic principles, and discovers all categories automatically.

Covariance in Unsupervised Learning of Probabilistic Grammars

An alternative to the Dirichlet prior is suggested: a family of logistic normal distributions that permits soft parameter tying within grammars and across grammars for text in different languages. Empirical gains are shown in a novel learning setting using bilingual, non-parallel data.

String Adjunct Grammars

Several subclasses of AGs motivated by strong linguistic considerations are studied and compared with PSGs, and the linguistic relevance of these grammars is discussed.

An HDP Model for Inducing Combinatory Categorial Grammars

We introduce a novel nonparametric Bayesian model for the induction of Combinatory Categorial Grammars from POS-tagged text. It achieves state of the art performance on a number of languages, and

Linguistically Motivated Combinatory Categorial Grammar Induction

A CCG grammar induction scheme for semantic parsing is presented in which the grammar is restricted by modeling a wide range of linguistic constructions. A new lexical generalization model is then introduced that abstracts over systematic morphological, syntactic, and semantic variations in languages.

Derivational Minimalism

A simple grammar formalism with these properties is presented and briefly explored; it can define languages that are not in the class of languages definable by tree adjoining grammars.

Multiple Context-Free Grammars

It is outlined that the expressivity of m-MCFGs increases with the parameter m and that the class of tree-adjoining languages is properly included in the class of 2-multiple context-free languages.

Evidence against the context-freeness of natural language

In searching for universal constraints on the class of natural languages, linguists have investigated a number of formal properties, including context-freeness, interpreted both strongly (as a way of characterizing structure sets) and weakly (as a way of characterizing string sets).