Patrick Ziering

We address the task of bootstrapping a semantic lexicon from a list of seed terms and a large corpus. By restricting to a small subset of semantically strong patterns, i.e., coordinations, we improve results significantly. We show that the restriction to coordinations has several additional benefits, such as improved extraction of multiword expressions, and …
In this paper, we address the task of language-independent, knowledge-lean and unsupervised compound splitting, which is an essential component for many natural language processing tasks such as machine translation. Previous methods on statistical compound splitting either include language-specific knowledge (e.g., linking elements) or rely on parallel data, …
Finding a definition of compoundhood that is cross-lingually valid is a non-trivial task as shown by linguistic literature. We present an iterative method for defining and extracting English noun compounds in a multilingual setting. We show how linguistic criteria can be used to extract compounds automatically and vice versa how the results of this …
We address the task of improving the quality of lexicon bootstrapping, i.e., of expanding a semantic lexicon on a given corpus. A main problem of iterative bootstrapping techniques is the fact that lexicon quality degrades gradually as more and more false terms are added. We propose to exploit linguistic variation between languages to reduce this problem of …
We present a cross-lingual method for determining NP structures. More specifically, we try to determine whether the semantics of tripartite noun compounds in context requires a left or right branching interpretation. The system exploits the difference in word position between languages as found in parallel corpora. We achieve a bracketing accuracy of 94.6%, …
This diploma thesis concerns link feature engineering based on linguistic analysis for the coreference resolution component of the SUCRE system. The architecture of SUCRE’s coreference resolution is divided into two steps: classification and clustering. The feature research provided in this thesis modifies the input for the classifier (a decision tree …
We address the task of parsing semantically indeterminate expressions, for which several correct structures exist that do not lead to differences in meaning. We present a novel non-deterministic structure transfer method that accumulates all structural information based on cross-lingual word distance derived from parallel corpora. Our system’s output is a …
Traditionally, compound splitters are evaluated intrinsically on gold-standard data or extrinsically on the task of statistical machine translation. We explore a novel way for the extrinsic evaluation of compound splitters, namely recognizing textual entailment. Compound splitting has great potential for this novel task that is both transparent and …