Accelerating Natural Language Understanding in Task-Oriented Dialog

@article{Ahuja2020AcceleratingNL,
  title={Accelerating Natural Language Understanding in Task-Oriented Dialog},
  author={Ojas Ahuja and Shrey Desai},
  journal={ArXiv},
  year={2020},
  volume={abs/2006.03701}
}
Abstract

Task-oriented dialog models typically leverage complex neural architectures and large-scale, pre-trained Transformers to achieve state-of-the-art performance on popular natural language understanding benchmarks. However, these models frequently have in excess of tens of millions of parameters, making them impractical to deploy on-device, where resource efficiency is a major concern. In this work, we show that a simple convolutional model compressed with structured pruning achieves largely…
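To make the idea in the abstract concrete, the following is a minimal sketch (PyTorch, not the authors' released code) of structured pruning applied to a small convolutional NLU model with a shared encoder, an intent head, and a per-token slot head. The model dimensions, the `keep_ratio` value, and helper names such as `ConvNLU` and `prune_conv_filters` are illustrative assumptions; the point is only that whole filters with the smallest L1 norms are removed and the downstream layers are rebuilt at the reduced width.

```python
import torch
import torch.nn as nn


class ConvNLU(nn.Module):
    """Tiny word-CNN: shared conv encoder, an intent head, and a per-token slot head."""

    def __init__(self, vocab_size=1000, embed_dim=64, num_filters=128,
                 num_intents=7, num_slots=20):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.conv = nn.Conv1d(embed_dim, num_filters, kernel_size=3, padding=1)
        self.intent_head = nn.Linear(num_filters, num_intents)
        self.slot_head = nn.Linear(num_filters, num_slots)

    def forward(self, token_ids):
        x = self.embed(token_ids).transpose(1, 2)                 # (B, E, T)
        h = torch.relu(self.conv(x))                              # (B, F, T)
        intent_logits = self.intent_head(h.max(dim=2).values)     # (B, num_intents)
        slot_logits = self.slot_head(h.transpose(1, 2))           # (B, T, num_slots)
        return intent_logits, slot_logits


def prune_conv_filters(model, keep_ratio=0.5):
    """Structured pruning: drop whole conv filters with the smallest L1 norms,
    then rebuild the conv layer and both heads with the reduced width."""
    conv = model.conv
    norms = conv.weight.detach().abs().sum(dim=(1, 2))   # one score per output filter
    keep = max(1, int(conv.out_channels * keep_ratio))
    kept_idx = torch.topk(norms, keep).indices.sort().values

    new_conv = nn.Conv1d(conv.in_channels, keep, conv.kernel_size[0],
                         padding=conv.padding[0])
    new_conv.weight.data = conv.weight.data[kept_idx].clone()
    new_conv.bias.data = conv.bias.data[kept_idx].clone()
    model.conv = new_conv

    # Both heads consume the conv output, so their input columns shrink to match.
    for name in ("intent_head", "slot_head"):
        old = getattr(model, name)
        new = nn.Linear(keep, old.out_features)
        new.weight.data = old.weight.data[:, kept_idx].clone()
        new.bias.data = old.bias.data.clone()
        setattr(model, name, new)
    return model


if __name__ == "__main__":
    model = ConvNLU()
    before = sum(p.numel() for p in model.parameters())
    model = prune_conv_filters(model, keep_ratio=0.25)    # keep 25% of filters
    after = sum(p.numel() for p in model.parameters())
    tokens = torch.randint(0, 1000, (2, 12))              # batch of 2 short utterances
    intent_logits, slot_logits = model(tokens)
    print(f"params: {before} -> {after}", intent_logits.shape, slot_logits.shape)
```

In this simplified variant the pruned model keeps working without any retraining because entire output channels are removed and every consumer of those channels is resized consistently; in practice, pruning is typically interleaved with fine-tuning to recover accuracy.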
