Sequence Tagging with Contextual and Non-Contextual Subword Representations: A Multilingual Evaluation

@inproceedings{Heinzerling2019SequenceTW,
  title={Sequence Tagging with Contextual and Non-Contextual Subword Representations: A Multilingual Evaluation},
  author={Benjamin Heinzerling and Michael Strube},
  booktitle={ACL},
  year={2019}
}
  • Benjamin Heinzerling, Michael Strube
  • Published in ACL 2019
  • Computer Science
  • Pretrained contextual and non-contextual subword embeddings have become available in over 250 languages, allowing massively multilingual NLP. [...] Key Result: A more detailed analysis reveals different strengths and weaknesses: Multilingual BERT performs well in medium- to high-resource languages, but is outperformed by non-contextual subword embeddings in a low-resource setting.
    15 Citations
    On the Choice of Auxiliary Languages for Improved Sequence Tagging
    • 1
    Give your Text Representation Models some Love: the Case for Basque
    • 10
    • Highly Influenced
    Are All Languages Created Equal in Multilingual BERT?
    • 18
    Substructure Substitution: Structured Data Augmentation for NLP
    Projecting Heterogeneous Annotations for Named Entity Recognition
    • 4
