Corpus ID: 11865767

A Resource for Natural Language Processing of Swiss German Dialects

  • Nora Hollenstein, Noëmi Aepli
  • Published in GSCL 2015
  • History, Computer Science
  • Since only a few resources exist for Swiss German dialects, we compiled a corpus of 115,000 tokens, manually annotated with PoS tags. The goal is to provide a basic data set for developing NLP applications for Swiss German. We extended the original corpus and improved its annotation consistency. Furthermore, we trained dialect-specific PoS-tagging models and implemented a baseline system for dialect identification.
    Findings of the VarDial Evaluation Campaign 2017
    • 115
    Parsing Approaches for Swiss German
    German Dialect Identification in Interview Transcriptions
    • 18
    SB-CH: A Swiss German Corpus with Sentiment Annotations
    • 3
    Digitising Swiss German: how to process and study a polycentric spoken language
    • 3
    HeLI-based Experiments in Swiss German Dialect Identification
    • 13
    German Dialect Identification Using Classifier Ensembles
    • 5


    Publications referenced by this paper.
    Compilation of a Swiss German Dialect Corpus and its Application to PoS Tagging
    • 20
    Lemmatisation as a Tagging Task
    • 35
    Detecting Errors in Part-of-Speech Annotation
    • 128
    Guidelines für das Tagging deutscher Textcorpora mit STTS
    • 321