The Right Tool for the Job: Matching Model and Instance Complexities

@inproceedings{Schwartz2020TheRT,
  title={The Right Tool for the Job: Matching Model and Instance Complexities},
  author={Roy Schwartz and Gabriel Stanovsky and Swabha Swayamdipta and Jesse Dodge and Noah A. Smith},
  booktitle={ACL},
  year={2020}
}
As NLP models become larger, executing a trained model requires significant computational resources, incurring monetary and environmental costs. To better respect a given inference budget, we propose a modification to contextual representation fine-tuning which, during inference, allows for an early (and fast) "exit" from neural network calculations for simple instances, and a late (and accurate) exit for hard instances. To achieve this, we add classifiers to different layers of BERT and use their calibrated confidence scores to make early exit decisions.
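The mechanism the abstract describes can be illustrated with a short sketch. Below is a minimal, hedged PyTorch example of confidence-based early exit at inference time: lightweight classifier heads sit on top of a stack of encoder layers, their softmax confidences are calibrated with a per-exit temperature, and computation stops at the first layer whose confidence clears a user-set threshold. The class name EarlyExitEncoder, the stand-in transformer layers, the temperature values, and the threshold knob are illustrative assumptions, not the authors' released implementation.

# Minimal sketch of confidence-based early exit (assumed names and shapes,
# not the paper's released code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class EarlyExitEncoder(nn.Module):
    def __init__(self, d_model=128, n_layers=4, n_classes=2):
        super().__init__()
        # Stand-ins for BERT's transformer layers.
        self.layers = nn.ModuleList(
            [nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
             for _ in range(n_layers)]
        )
        # One lightweight classifier ("exit" head) per layer.
        self.exits = nn.ModuleList(
            [nn.Linear(d_model, n_classes) for _ in range(n_layers)]
        )
        # Per-exit temperature for confidence calibration (assumed to be
        # fit on held-out data after fine-tuning).
        self.temperature = nn.Parameter(torch.ones(n_layers), requires_grad=False)

    @torch.no_grad()
    def infer(self, x, threshold=0.9):
        # Run layers sequentially; stop at the first exit whose calibrated
        # confidence (max softmax probability) reaches `threshold`.
        h = x
        for i, (layer, exit_head) in enumerate(zip(self.layers, self.exits)):
            h = layer(h)
            pooled = h[:, 0]                        # [CLS]-style pooling
            logits = exit_head(pooled) / self.temperature[i]
            probs = F.softmax(logits, dim=-1)
            conf, pred = probs.max(dim=-1)
            if bool((conf >= threshold).all()) or i == len(self.layers) - 1:
                return pred, i                      # predicted label, exit layer

In this sketch, lowering the threshold trades accuracy for speed with a single trained model, mirroring the single inference-time knob described in the abstract; for example, with already-embedded inputs of shape (batch, seq_len, d_model), one would call model.infer(embeddings, threshold=0.9).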
Citations

Elbert: Fast Albert with Confidence-Window Based Early Exit
Accelerating Pre-trained Language Models via Calibrated Cascade
FastFormers: Highly Efficient Transformer Models for Natural Language Understanding
