Vector-based models of lexical semantics retrieve semantically related words automatically from large corpora by exploiting the property that words with a similar meaning tend to occur in similar contexts. Despite their increasing popularity, it is unclear which kind of semantic similarity they actually capture and for which kind of words. In this paper, we ...
In statistical NLP, Semantic Vector Spaces (SVS) are the standard technique for the automatic modeling of lexical semantics. However, it is largely unclear how these black-box techniques exactly capture word meaning. To explore the way an SVS structures the individual occurrences of words, we use a non-parametric MDS solution of a token-by-token similarity ...
The language of IRC – Internet Relay Chat – is in many respects an example of "spoken language in written form": although produced in a written medium, it shares with spoken language a dialogical immediacy that ordinary written text usually lacks, as a result of which it tends to appear highly informal, even to the untrained observer. Linguists should ...
This paper reports on the ways in which new entities are introduced into discourse. First, we present the evidence in support of a model of indefinite reference processing based on three principles: the listener's ability to make predictive inferences in order to decrease the unexpectedness of upcoming words, the availability to the speaker of grammatical ...
Over the last decade, the Leuven Research Unit of Quantitative Lexicology and Variational Linguistics has developed a corpus-based method for the investigation of region and register variation in and between the national variants of Dutch, viz. Belgian Dutch and Netherlandic Dutch. The basic characteristics of the methodology are the following. Geeraerts, ...