Corpus ID: 212725842

TRANS-BLSTM: Transformer with Bidirectional LSTM for Language Understanding

@article{Huang2020TRANSBLSTMTW,
  title={TRANS-BLSTM: Transformer with Bidirectional LSTM for Language Understanding},
  author={Zhiheng Huang and Peng Xu and Davis Liang and Ajay K. Mishra and Bing Xiang},
  journal={ArXiv},
  year={2020},
  volume={abs/2003.07000}
}
Bidirectional Encoder Representations from Transformers (BERT) has recently achieved state-of-the-art performance on a broad range of NLP tasks, including sentence classification, machine translation, and question answering. The BERT model architecture is derived primarily from the transformer. Prior to the transformer era, the bidirectional Long Short-Term Memory (BLSTM) was the dominant modeling architecture for neural machine translation and question answering. In this paper, we investigate…
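
The truncated abstract does not spell out how the BLSTM is integrated into the transformer. As a rough illustration only, below is a minimal PyTorch sketch of one plausible combination: a standard transformer encoder block with a bidirectional LSTM branch whose output is summed with the block output. The class name `TransBLSTMBlock` and all hyperparameters here are hypothetical, chosen to mirror BERT-base defaults, and are not taken from the paper itself.

```python
import torch
import torch.nn as nn

class TransBLSTMBlock(nn.Module):
    """Hypothetical sketch: transformer encoder block plus a BLSTM branch.

    One possible reading of "a BLSTM layer integrated to each transformer
    block"; the paper may use a different placement.
    """

    def __init__(self, d_model=768, n_heads=12, d_ff=3072, dropout=0.1):
        super().__init__()
        self.attn = nn.MultiheadAttention(
            d_model, n_heads, dropout=dropout, batch_first=True
        )
        self.ff = nn.Sequential(
            nn.Linear(d_model, d_ff),
            nn.GELU(),
            nn.Linear(d_ff, d_model),
        )
        # Bidirectional LSTM with hidden size d_model // 2 per direction, so
        # the concatenated forward/backward states match d_model.
        self.blstm = nn.LSTM(
            d_model, d_model // 2, bidirectional=True, batch_first=True
        )
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        self.dropout = nn.Dropout(dropout)

    def forward(self, x):
        # Standard post-norm transformer sublayers: self-attention, then FFN.
        a, _ = self.attn(x, x, x)
        x = self.norm1(x + self.dropout(a))
        h = self.norm2(x + self.dropout(self.ff(x)))
        # BLSTM branch summed with the transformer output (one assumed variant).
        b, _ = self.blstm(h)
        return h + b


block = TransBLSTMBlock()
tokens = torch.randn(2, 16, 768)  # (batch, seq_len, hidden)
print(block(tokens).shape)        # torch.Size([2, 16, 768])
```

The summation keeps the input and output dimensions identical, so such blocks could be stacked like ordinary transformer layers; whether the paper sums, concatenates, or replaces a sublayer cannot be determined from the excerpt above.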
