Corpus ID: 233004302

Canonical and Surface Morphological Segmentation for Nguni Languages

@article{Moeng2021CanonicalAS,
  title={Canonical and Surface Morphological Segmentation for Nguni Languages},
  author={Tumi Moeng and Sheldon Reay and Aaron Daniels and Jan Buys},
  journal={ArXiv},
  year={2021},
  volume={abs/2104.00767}
}
Morphological segmentation involves decomposing words into morphemes, the smallest meaning-bearing units of language. This is an important NLP task for morphologically rich agglutinative languages such as the Southern African Nguni language group. In this paper, we investigate supervised and unsupervised models for two variants of morphological segmentation: canonical and surface segmentation. We train sequence-to-sequence models for canonical segmentation, where the underlying morphemes may…
