Probabilistic Typology: Deep Generative Models of Vowel Inventories
@article{Cotterell2017ProbabilisticTD,
  title={Probabilistic Typology: Deep Generative Models of Vowel Inventories},
  author={Ryan Cotterell and Jason Eisner},
  journal={ArXiv},
  year={2017},
  volume={abs/1705.01684}
}
Linguistic typology studies the range of structures present in human language. The main goal of the field is to discover which sets of possible phenomena are universal, and which are merely frequent. For example, all languages have vowels, while most---but not all---languages have an /u/ sound. In this paper we present the first probabilistic treatment of a basic question in phonological typology: What makes a natural vowel inventory? We introduce a series of deep stochastic point processes…
26 Citations
A Deep Generative Model of Vowel Formant Typology
- Linguistics, Computer Science
- NAACL
- 2018
This work tackles the problem of vowel system typology, i.e., building a generative probability model of which vowels a language contains, and develops a novel generative probability model that works directly with the acoustic information.
A Probabilistic Generative Model of Linguistic Typology
- Linguistics, Computer Science
- NAACL
- 2019
This work develops a generative model of language based on exponential-family matrix factorisation and shows how structural similarities between languages can be exploited to predict typological features with near-perfect accuracy, outperforming several baselines on the task of predicting held-out features.
From Phonology to Syntax: Unsupervised Linguistic Typology at Different Levels with Language Embeddings
- Linguistics
- NAACL-HLT
- 2018
A core part of linguistic typology is the classification of languages according to linguistic properties, such as those detailed in the World Atlas of Language Structures (WALS). Doing this manually…
Tracking Typological Traits of Uralic Languages in Distributed Language Representations
- Linguistics, Computer Science
- ArXiv
- 2017
This paper investigates which typological features are encoded in distributed representations of language by attempting to predict features in the World Atlas of Language Structures, and finds that some typological traits can be automatically inferred with accuracies well above a strong baseline.
On the Relation between Linguistic Typology and (Limitations of) Multilingual Language Modeling
- Linguistics, Computer Science
- EMNLP
- 2018
Fine-grained typological features such as exponence, flexivity, fusion, and inflectional synthesis are borne out to be responsible for the proliferation of low-frequency phenomena which are organically difficult to model by statistical architectures, or for the meaning ambiguity of character n-grams.
Uncovering Probabilistic Implications in Typological Knowledge Bases
- Linguistics, Computer Science
- ACL
- 2019
A computational model is presented which successfully identifies known universals, including Greenberg universals, and also uncovers new ones worthy of further linguistic investigation. It outperforms baselines previously used for this problem, as well as a strong baseline from knowledge base population.
Consonant co-occurrence classes and the feature-economy principle
- Linguistics
- Phonology
- 2020
The feature-economy principle is one of the key theoretical notions which have been postulated to account for the structure of phoneme inventories in the world's languages. In this paper, we test the…
Phonotactic Complexity and Its Trade-offs
- Computer Science, Linguistics
- TACL
- 2020
Methods for calculating a measure of phonotactic complexity (bits per phoneme) that permits a straightforward cross-linguistic comparison are presented, giving insight into how complex a language’s phonotactics is.
Modeling Language Variation and Universals: A Survey on Typological Linguistics for Natural Language Processing
- Computer Science, Linguistics
- Computational Linguistics
- 2018
It is shown that to date, the use of information in existing typological databases has resulted in consistent but modest improvements in system performance, due to both intrinsic limitations of databases and under-employment of the typological features included in them.
Modeling Language Variation and Universals: A Survey on Typological Linguistics for Natural Language Processing
- Computer Science, Linguistics
- Computational Linguistics
- 2019
It is suggested that a new approach that adapts the broad and discrete nature of typological categories to the contextual and continuous nature of machine learning algorithms used in contemporary NLP could be facilitated by recent developments in data-driven induction of typological knowledge.
References
An Introduction to Linguistic Typology
- Linguistics
- 2012
This clear and accessible introduction to linguistic typology covers all linguistic domains from phonology and morphology over parts-of-speech, the NP and the VP, to simple and complex clauses,…
What is Phonological Typology?
- Linguistics
- 2014
UC Berkeley Phonology Lab Annual Report (2014). What is Phonological Typology? Larry M. Hyman, University of California, Berkeley. Paper presented at the Workshop on Phonological Typology, University of…
Improved Lexical Acquisition through DPP-based Verb Clustering
- Computer Science
- ACL
- 2013
This work presents the first unified framework for unsupervised learning of subcategorization frames, selectional preferences and verb classes, and shows how to utilize Determinantal Point Processes, elegant probabilistic models that are defined over the possible subsets of a given dataset and give higher probability mass to high quality and diverse subsets, for clustering.
The Dispersion-Focalization Theory of vowel systems
- Physics
- 1997
The Dispersion-Focalization Theory (DFT) attempts to predict vowel systems based on the minimization of an energy function summing two perceptual components: global dispersion, which is based on…
The sounds of the world's languages
- Linguistics, Art
- 1996
List of Figures. List of Tables. Acknowledgments. 1. The Sounds of the World's Languages. 2. Places of Articulation. 3. Stops. 4. Nasals and Nasalized Consonants. 5. Fricatives. 6. Laterals. 7.…
A course in phonetics
- Linguistics
- 1975
Part I Introductory concepts: articulatory phonetics; phonology and phonetic transcription. Part II English phonetics: the consonants of English; English vowels; English words and sentences. Part III…
Toward a universal law of generalization for psychological science.
- Computer Science
- Science
- 1987
A psychological space is established for any set of stimuli by determining metric distances between the stimuli such that the probability that a response learned to any stimulus will generalize to…
Determinantal Point Processes for Machine Learning
- Computer Science
- Found. Trends Mach. Learn.
- 2012
Determinantal Point Processes for Machine Learning provides a comprehensible introduction to DPPs, focusing on the intuitions, algorithms, and extensions that are most relevant to the machine learning community, and shows how they can be applied to real-world applications.
Learning Determinantal Point Processes
- Computer Science
- UAI
- 2011
This thesis shows how determinantal point processes can be used as probabilistic models for binary structured problems characterized by global, negative interactions, and demonstrates experimentally that the techniques introduced allow DPPs to be used for real-world tasks like document summarization, multiple human pose estimation, search diversification, and the threading of large document collections.
A Learning Algorithm for Boltzmann Machines
- Computer Science
- Cogn. Sci.
- 1985
A general parallel search method is described, based on statistical mechanics, and it is shown how it leads to a general learning rule for modifying the connection strengths so as to incorporate knowledge about a task domain in an efficient way.