Investigation of Word Senses over Time Using Linguistic Corpora
@inproceedings{Plitz2015InvestigationOW, title={Investigation of Word Senses over Time Using Linguistic Corpora}, author={Christian P{\"o}litz and Thomas Bartz and Katharina Morik and Angelika Storrer}, booktitle={TSD}, year={2015} }
Word sense induction is an important method to identify possible meanings of words. Word co-occurrences can be used to group word contexts into semantically related topics. Besides the words themselves, temporal information provides another dimension for investigating the development of word meanings over time. Large digital corpora of written language, such as those held by the CLARIN-D centers, provide excellent possibilities for this kind of linguistic research on authentic language data. In…
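As a rough illustration of the idea in the abstract, that co-occurring words characterize a word's contexts and that timestamps add a second dimension, the following minimal sketch counts the words found near a target word separately for each time period. It is not the authors' method: the function name `cooccurrence_by_period`, the `window` parameter, the whitespace tokenization, and the toy sentences are all assumptions made for this example.

```python
from collections import Counter, defaultdict


def cooccurrence_by_period(docs, target, window=2):
    """Count words occurring near `target`, separately for each time period.

    docs: iterable of (period, text) pairs.
    Tokenization is naive lowercasing plus whitespace splitting (an assumption).
    """
    counts = defaultdict(Counter)
    for period, text in docs:
        tokens = text.lower().split()
        for i, tok in enumerate(tokens):
            if tok != target:
                continue
            lo = max(0, i - window)
            # Up to `window` tokens on each side of the target occurrence.
            context = tokens[lo:i] + tokens[i + 1:i + 1 + window]
            counts[period].update(context)
    return dict(counts)


# Toy, made-up snippets; they do not come from any CLARIN-D corpus.
docs = [
    (1900, "the mouse ran under the barn"),
    (1900, "a cat chased the mouse away"),
    (2000, "click the mouse button twice"),
    (2000, "plug the mouse into the computer"),
]
profiles = cooccurrence_by_period(docs, "mouse")
for period in sorted(profiles):
    print(period, profiles[period].most_common(3))
```

Comparing the per-period profiles gives a crude view of how the contexts of a word shift over time. A topic model over the same contexts would group them into senses rather than only counting neighbors.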
4 Citations
Word embeddings: reliability & semantic change
- Computer Science
- 2019
The JeSemE website was created to make word-embedding-based diachronic research more accessible, and the applicability of these methods is explored by investigating the historical understanding of electricity as well as words connected to Romanticism.
On the Linearity of Semantic Change: Investigating Meaning Variation via Dynamic Graph Models
- Computer Science
- ACL
- 2016
It is found that semantic change is linear in two senses: today’s embedding vector (= meaning) of words can be derived as linear combinations of embedding vectors of their neighbors in previous time periods.
LL(O)D and NLP perspectives on semantic change for humanities research
- Computer Science
- Semantic Web
- 2022
The aim is to provide the starting point for the construction of a workflow and set of multilingual diachronic ontologies within the humanities use case of the COST Action Nexus Linguarum, European network for Web-centred linguistic data science.
Topic Modeling Genre: An Exploration of French Classical and Enlightenment Drama
- Linguistics
- 2021
The concept of literary genre is a highly complex one: not only are different genres frequently defined on several, but not necessarily the same levels of description, but consideration of genres as…
References
Word-Sense Disambiguation Using Statistical Methods
- Computer Science
- ACL
- 1991
A statistical technique for assigning senses to words is described; when it was incorporated into the statistical machine translation system, the error rate of the system decreased by thirteen percent.
Bayesian Word Sense Induction
- Computer Science
- EACL
- 2009
This work places sense induction in a Bayesian context by modeling the contexts of the ambiguous word as samples from a multinomial distribution over senses which are in turn characterized as distributions over words.
Inducing Word Senses to Improve Web Search Result Clustering
- Computer Science
- EMNLP
- 2010
This work first acquires the senses of a query by means of a graph-based clustering algorithm that exploits cycles in the co-occurrence graph of the query, then clusters the search results based on their semantic similarity to the induced word senses.
Word sense disambiguation: A survey
- Computer Science, Psychology
- CSUR
- 2009
This work introduces the reader to the motivations for solving the ambiguity of words and provides a description of the task, and overviews supervised, unsupervised, and knowledge-based approaches.
Towards Tracking Semantic Change by Visual Analytics
- Computer Science
- ACL
- 2011
The aim of this study is to offer a new instrument for investigating the diachronic development of word senses in a way that allows for a better understanding of the nature of semantic change in general.
Tony McEnery, Richard Xiao & Yukio Tono, Corpus-based language studies: An advanced resource book. London and New York: Routledge, 2006. Pp. xix, 386. Pb $33.95.
- Linguistics
- Language in Society
- 2008
Originally associated mainly with work in lexicography and grammar, corpus linguistics has more recently established its relevance for a wide range of linguistic endeavors, including research into…
Topics over time: a non-Markov continuous-time model of topical trends
- Computer Science
- KDD '06
- 2006
An LDA-style topic model is presented that captures not only the low-dimensional structure of data, but also how the structure changes over time, showing improved topics, better timestamp prediction, and interpretable trends.
Finding scientific topics
- Computer Science
- Proceedings of the National Academy of Sciences of the United States of America
- 2004
A generative model for documents is described, introduced by Blei, Ng, and Jordan, and a Markov chain Monte Carlo algorithm is presented for inference in this model, which is used to analyze abstracts from PNAS by using Bayesian model selection to establish the number of topics.
Das Digitale Wörterbuch der Deutschen Sprache (DWDS)
- Art
- 2010
It was completed, and yet it was not, for when the final installment of the Deutsches Wörterbuch appeared in 1960, it had long been clear that large parts of this monumental work…