Structure-Level Knowledge Distillation For Multilingual Sequence Labeling

@inproceedings{Wang2020StructureLevelKD,
  title={Structure-Level Knowledge Distillation For Multilingual Sequence Labeling},
  author={Xinyu Wang and Yong Jiang and Nguyen Bach and Tao Wang and Fei Huang and K. Tu},
  booktitle={ACL},
  year={2020}
}
Multilingual sequence labeling is the task of predicting label sequences for multiple languages with a single unified model. Compared with maintaining multiple monolingual models, a multilingual model has the benefits of a smaller model size, easier online serving, and generalizability to low-resource languages. However, current multilingual models still underperform individual monolingual models significantly due to model capacity limitations. In this paper, we propose to reduce the gap…
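The paper's structure-level objective distills information about whole label sequences from monolingual teachers into a multilingual student. As a rough illustration of the general idea (not the paper's method), below is a minimal sketch of the simpler token-level knowledge distillation baseline, assuming PyTorch; the teacher/student tensors and the helper name token_kd_loss are illustrative assumptions, not from the paper.

# Token-level KD baseline for sequence labeling: the student matches the
# teacher's per-token label distributions via KL divergence. The paper goes
# further and distills structure-level (sequence-level) information.
import torch
import torch.nn.functional as F

def token_kd_loss(student_logits, teacher_logits, mask, temperature=1.0):
    """KL(teacher || student) averaged over unmasked tokens.

    student_logits, teacher_logits: [batch, seq_len, num_labels]
    mask: [batch, seq_len], 1 for real tokens, 0 for padding
    """
    t = temperature
    mask = mask.float()
    teacher_probs = F.softmax(teacher_logits / t, dim=-1)          # soft targets
    student_log_probs = F.log_softmax(student_logits / t, dim=-1)
    # Per-token KL divergence, summed over the label dimension.
    kl = (teacher_probs
          * (teacher_probs.clamp_min(1e-12).log() - student_log_probs)).sum(-1)
    return (kl * mask).sum() / mask.sum()

# Typical usage: combine with the gold-label cross-entropy, e.g.
#   loss = ce_loss + lambda_kd * token_kd_loss(student_out, teacher_out.detach(), mask)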
4 Citations
Structural Knowledge Distillation
Automated Concatenation of Embeddings for Structured Prediction
Knowledge Distillation Techniques for Biomedical Named Entity Recognition
