Corpus ID: 15628229

A Rule based Approach to Word Lemmatization

@inproceedings{Plisson2004ARB,
  title={A Rule based Approach to Word Lemmatization},
  author={J. Plisson and N. Lavrac and D. Mladenic},
  year={2004}
}
  • J. Plisson, N. Lavrac, D. Mladenic
  • Published 2004
  • Computer Science
  • Lemmatization is the process of finding the normalized form of a word. [...] Key result: when learning from a corpus of lemmatized Slovene words, the RDR (Ripple Down Rules) approach yields easy-to-understand rules with improved classification accuracy compared to the rule learning results achieved in previous work.

    A Comparative Study of Stemming Algorithms
    • 236
    Preprocessing Techniques for Text Mining - An Overview
    • 152
    Developing Text Resources for Ten South African Languages
    • 36
    Design of a Rule Based Hindi Lemmatizer
    • 10
    Structures and distributions in morphology learning
    • 40

    References

    Publications referenced by this paper.
    An algorithm for suffix stripping
    • 5,660
    Machine Learning of Morphosyntactic Structure: Lemmatizing Unknown Slovene Words
    • 79
    Learning Decision Lists
    • 378
    The CN2 Induction Algorithm
    • 1,070
    The MULTEXT-East Slovene Lexicon
    • 12
    A Sequential Model for Multi-Class Classification
    • 64
    A Sequential Model for Multiclass Classification
    • 2001
    An Evaluation of Ripple Down Rules
    • 6