A Measure of Syntactic Flexibility for Automatically Identifying Multiword Expressions in Corpora

  • Colin J. Bannard
  • Published 2007
  • Computer Science
  • Natural languages contain many multi-word sequences that do not display the variety of syntactic processes we would expect given their phrase type, and consequently must be included in the lexicon as multiword units. This paper describes a method for identifying such items in corpora, focussing on English verb-noun combinations. In an evaluation using a set of dictionary-published MWEs we show that our method achieves greater accuracy than existing MWE extraction methods based on lexical…
