Sabine Schulte im Walde

Learn More
This article presents clustering experiments on German verbs: A statistical grammar model for German serves as the source for a distributional verb description at the lexical syntax–semantics interface, and the unsupervised clustering algorithm k-means uses the empirical verb properties to perform an automatic induction of verb classes. Various evaluation(More)
Verbs were clustered semantically on the basis of their alternation behaviour, as characterised by their syntactic subcategorisation frames extracted from maximum probability parses of a robust statistical parser, and completed by assigning WordNet classes as selectional preferences to the frame arguments. The clustering was achieved a iteratively by(More)
We present a study on the automatic acquisition of semantic classes for Catalan adjectives from distributional and morphological information, with particular emphasis on polysemous adjectives. The aim is to distinguish and characterize broad classes, such as qualitative (gran 'big') and relational (pulmonar 'pulmonary') adjectives, as well as to identify(More)
The paper presents a large-scale computational subcategorisation lexicon for several thousand German verbs. The lexical entries were obtained by unsupervised learning in a statistical grammar framework: a German context-free grammar containing frame-predicting grammar rules and information about lexical heads was trained on 18.7 million words of a large(More)
This paper presents an innovative, complex approach to semantic verb classification that relies on selectional preferences as verb properties. The probabilistic verb class model underlying the semantic classes is trained by a combination of the EM algorithm and the MDL principle, providing soft clusters with two dimensions (verb senses and(More)
This paper explores two hypotheses regarding vector space models that predict the compo-sitionality of German noun-noun compounds: (1) Against our intuition, we demonstrate that window-based rather than syntax-based distri-butional features perform better predictions, and that not adjectives or verbs but nouns represent the most salient part-of-speech. Our(More)
The lack of adequate bases of commonsense or even lexical knowledge is perhaps the main obstacle to the development of high-performance, robust tools for semantic interpretation. It is also generally accepted that, notwithstanding the increasing availability in recent years of substantial hand-coded lexical resources such as WordNet and EuroWordNet,(More)
The paper describes the application of k-Means, a standard clustering technique, to the task of inducing semantic classes for German verbs. Using probability distributions over verb subcategorisation frames, we obtained an intuitively plausible clustering of 57 verbs into 14 classes. The automatic clustering was evaluated against independently motivated,(More)
In this paper, we introduce SLQS, a new entropy-based measure for the unsupervised identification of hypernymy and its directionality in Distributional Semantic Models (DSMs). SLQS is assessed through two tasks: (i.) identifying the hypernym in hyponym-hypernym pairs, and (ii.) discriminating hypernymy among various semantic relations. In both tasks, SLQS(More)