Corpus ID: 225061936

ST-BERT: Cross-modal Language Model Pre-training For End-to-end Spoken Language Understanding

@article{Kim2020STBERTCL,
  title={ST-BERT: Cross-modal Language Model Pre-training For End-to-end Spoken Language Understanding},
  author={Minjeong Kim and Gyuwan Kim and Sang-Woo Lee and Jungwoo Ha},
  journal={ArXiv},
  year={2020},
  volume={abs/2010.12283}
}
Language model pre-training has shown promising results in various downstream tasks. In this context, we introduce a cross-modal pre-trained language model, called Speech-Text BERT (ST-BERT), to tackle end-to-end spoken language understanding (E2E SLU) tasks. Taking phoneme posterior and subword-level text as an input, ST-BERT learns a contextualized cross-modal alignment via our two proposed pre-training tasks: Cross-modal Masked Language Modeling (CM-MLM) and Cross-modal Conditioned Language…
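The abstract names the inputs (frame-level phoneme posteriors and subword-level text) and the CM-MLM pre-training task but not the implementation. Below is a minimal PyTorch sketch, not the authors' code, of how such a cross-modal masked LM input could be wired up: all class and parameter names, dimensions, and the single shared-encoder layout are illustrative assumptions, and positional embeddings plus the acoustic model that produces the phoneme posteriors are omitted.

```python
import torch
import torch.nn as nn


class CrossModalMLMSketch(nn.Module):
    """Toy cross-modal masked LM: phoneme posteriors and subword tokens in one encoder."""

    def __init__(self, n_phonemes=70, vocab_size=30522, d_model=768, mask_id=103):
        super().__init__()
        self.speech_proj = nn.Linear(n_phonemes, d_model)  # frame-level phoneme posteriors -> model dim
        self.text_emb = nn.Embedding(vocab_size, d_model)  # subword-level text embeddings
        self.segment_emb = nn.Embedding(2, d_model)        # 0 = speech segment, 1 = text segment
        layer = nn.TransformerEncoderLayer(d_model, nhead=12, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=12)
        self.mlm_head = nn.Linear(d_model, vocab_size)     # predicts the masked subwords
        self.mask_id = mask_id

    def forward(self, phoneme_posteriors, text_ids, mlm_mask):
        # phoneme_posteriors: (B, T_speech, n_phonemes); text_ids, mlm_mask (bool): (B, T_text)
        speech = self.speech_proj(phoneme_posteriors)
        speech = speech + self.segment_emb(speech.new_zeros(speech.shape[:2], dtype=torch.long))
        masked_ids = text_ids.masked_fill(mlm_mask, self.mask_id)   # swap masked positions for [MASK]
        text = self.text_emb(masked_ids) + self.segment_emb(torch.ones_like(masked_ids))
        hidden = self.encoder(torch.cat([speech, text], dim=1))     # joint speech-text context
        return self.mlm_head(hidden[:, speech.size(1):])            # MLM loss on text positions only
```

In this sketch, masked subwords can attend to the speech frames as well as the surrounding text, which is one plausible way a "contextualized cross-modal alignment" could be learned; the actual ST-BERT training objectives and architecture details are given in the paper itself.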
1 Citation

Integration of Pre-trained Networks with Continuous Token Interface for End-to-End Spoken Language Understanding
