Sabine Schulte im Walde

This article presents clustering experiments on German verbs: A statistical grammar model for German serves as the source for a distributional verb description at the lexical syntax–semantics interface, and the unsupervised clustering algorithm k-means uses the empirical verb properties to perform an automatic induction of verb classes. Various evaluation …
Verbs were clustered semantically on the basis of their alternation behaviour, as characterised by their syntactic subcategorisation frames extracted from maximum probability parses of a robust statistical parser, and completed by assigning WordNet classes as selectional preferences to the frame arguments. The clustering was achieved iteratively by …
The paper presents a large-scale computational subcategorisation lexicon for several thousand German verbs. The lexical entries were obtained by unsupervised learning in a statistical grammar framework: a German context-free grammar containing frame-predicting grammar rules and information about lexical heads was trained on 18.7 million words of a large …
This paper presents an innovative, complex approach to semantic verb classification that relies on selectional preferences as verb properties. The probabilistic verb class model underlying the semantic classes is trained by a combination of the EM algorithm and the MDL principle, providing soft clusters with two dimensions (verb senses and …
This paper explores two hypotheses regarding vector space models that predict the compositionality of German noun-noun compounds: (1) Against our intuition, we demonstrate that window-based rather than syntax-based distributional features perform better predictions, and that not adjectives or verbs but nouns represent the most salient part-of-speech. Our …
We present a study on the automatic acquisition of semantic classes for Catalan adjectives from distributional and morphological information, with particular emphasis on polysemous adjectives. The aim is to distinguish and characterize broad classes, such as qualitative (gran 'big') and relational (pulmonar 'pulmonary') adjectives, as well as to identify …
"While continuous word vector representations enjoy increasing popularity, it is still poorly understood (i) how reliable they are for other languages than English, and (ii) to what extent they encode deep semantic relatedness such as paradigmatic relations. In this talk I will present experiments with continuous word vectors for English and German."
There is considerable evidence showing that the human sentence processor is guided by lexical preferences in resolving syntactic ambiguities. Several types of preferences have been identified, including morphological, syntactic, and semantic ones. However, the literature fails to provide a uniform account of what lexical preferences are and how they should …
The lack of adequate bases of commonsense or even lexical knowledge is perhaps the main obstacle to the development of high-performance, robust tools for semantic interpretation. It is also generally accepted that, notwithstanding the increasing availability in recent years of substantial hand-coded lexical resources such as WordNet and EuroWordNet, …