Meaning to Form: Measuring Systematicity as Information

@inproceedings{Pimentel2019MeaningTF,
  title={Meaning to Form: Measuring Systematicity as Information},
  author={Tiago Pimentel and Arya D. McCarthy and Dami{\'a}n E. Blasi and Brian Roark and Ryan Cotterell},
  booktitle={ACL},
  year={2019}
}
A longstanding debate in semiotics centers on the relationship between linguistic signs and their corresponding semantics: is there an arbitrary relationship between a word form and its meaning, or does some systematic phenomenon pervade? For instance, does the character bigram ‘gl’ have any systematic relationship to the meaning of words like ‘glisten’, ‘gleam’ and ‘glow’? In this work, we offer a holistic quantification of the systematicity of the sign using mutual information and recurrent… 

Finding Concept-specific Biases in Form–Meaning Associations
TLDR
New methods to detect cross-linguistic associations at scale are provided, and it is found that there is a significant effect of non-arbitrariness, but it is unsurprisingly small.
Not just form, not just meaning: Words with consistent form-meaning mappings are learned earlier
TLDR
A robust, unique negative effect of systematicity on Age of Acquisition (AoA) is found, such that more systematic words tend to be learned earlier, ultimately speeding up word learning.
What Meaning-Form Correlation Has to Compose With: A Study of MFC on Artificial and Natural Language
TLDR
It is found that linguistic phenomena such as synonymy and ungrounded stop-words weigh on MFC measurements, and that straightforward methods to mitigate their effects have widely varying results depending on the dataset they are applied to.
Predicting Declension Class from Form and Meaning
TLDR
This study introduces a new method that provides additional quantitative support for a classic linguistic finding that form and meaning are relevant for the classification of nouns into declensions, and shows not only that individual declension classes vary in the strength of their clues within a language, but also that these variations themselves vary across languages.
On Homophony and Rényi Entropy
TLDR
A new information-theoretic quantification of a language’s homophony is proposed, the sample Rényi entropy, and this quantification is used to revisit Trott and Bergen's claims.
An Information-Theoretic Characterization of Morphological Fusion
TLDR
An information-theoretic measure is presented to quantify the degree of fusion of a given set of morphological features in a surface form, which naturally provides such a graded scale.
How (Non-)Optimal is the Lexicon?
TLDR
It is found that (compositional) morphology and graphotactics can sufficiently account for most of the complexity of natural codes—as measured by code length.
Modeling the Unigram Distribution
TLDR
This work presents a novel model for estimating the unigram distribution in a language (a neuralization of Goldwater et al.
Word formation supports efficient communication: The case of compounds
Compounding is a common type of word formation extensively studied in linguistics and cognitive psychology. A growing line of research suggests that the lexicon supports efficient communication by

References

Finding Non-Arbitrary Form-Meaning Systematicity Using String-Metric Learning for Kernel Regression
TLDR
The results suggest that the English lexicon exhibits far more global form-meaning systematicity than previously discovered, and that much of this systematicity is focused in localized form-meaning patterns.
Systematicity and Natural Language Syntax
A lengthy debate in the philosophy of the cognitive sciences has turned on whether the phenomenon known as 'systematicity' of language and thought shows that connectionist explanatory aspirations are
The Systematicity of the Sign: Modeling Activation of Semantic Attributes from Nonwords
TLDR
This study tested the extent to which similarities among the sounds of words were sufficient to drive sound-symbolic effects, and whether a computational model that learned to map between form and meaning of English words better accounted for the observed behavior.
The arbitrariness of the sign: learning advantages from the structure of the vocabulary.
TLDR
This work found that the optimal structure of the vocabulary for learning incorporated a division of labor between 2 different language learning functions: arbitrariness facilitates learning specific word meanings and systematicity facilitates learning to group words into categories.
Wordform Similarity Increases With Semantic Similarity: An Analysis of 100 Languages
TLDR
Evidence is shown in 100 languages from a diverse array of language families that more semantically similar word pairs are also more phonologically similar, which suggests that there is an important statistical trend for lexicons to have semantically similar words be phonologically similar as well, possibly for functional reasons associated with language learning.
Automatic Labeling of Phonesthemic Senses
TLDR
This study attempts to advance corpus-based exploration of sound iconicity, i.e. the existence of a non-arbitrary relationship between forms and meanings in language, by examining a number of phonesthemes, phonetic groupings proposed to be meaningful in the literature, with the aim of developing ways to validate their existence and their semantic content.
Advances in the Cross-Linguistic Study of Ideophones
TLDR
This review surveys recent developments in ideophone research and reveals new insights about their interactional uses and about their relation to other linguistic devices like reported speech and grammatical evidentials.
Questioning Arbitrariness in Language: a Data-Driven Study of Conventional Iconicity
TLDR
This paper employs NLP techniques to address two main questions: How can the existence of phonesthemes be tested at a large scale with quantitative methods and how can the meaning arguably carried by a phonestheme be induced automatically from word embeddings.
Filled pauses and their status in the mental lexicon
TLDR
A study of the relationship between form and meaning in the most frequent monosyllabic words in the lexicon of English finds that the words which appear towards the top of the ranking are the communicatively important words.