Corpus ID: 233033712

Compressing Visual-linguistic Model via Knowledge Distillation

@article{Fang2021CompressingVM,
  title={Compressing Visual-linguistic Model via Knowledge Distillation},
  author={Zhiyuan Fang and Jianfeng Wang and Xiaowei Hu and Lijuan Wang and Yezhou Yang and Zicheng Liu},
  journal={ArXiv},
  year={2021},
  volume={abs/2104.02096}
}
Despite exciting progress in pre-training for visual-linguistic (VL) representations, very few aspire to a small VL model. In this paper, we study knowledge distillation (KD) to effectively compress a transformer-based large VL model into a small VL model. The major challenge arises from the inconsistent regional visual tokens extracted from different detectors of Teacher and Student, resulting in the misalignment of hidden representations and attention distributions. To address the problem, we…
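For context, a generic transformer distillation objective aligns the student's hidden states and attention maps with the teacher's, token by token; it is exactly this token-wise alignment that the abstract says breaks down when Teacher and Student consume visual tokens from different detectors. The sketch below shows only such a generic objective (TinyBERT-style layer distillation, not the authors' method); the function name, tensor shapes, and loss choices are assumptions for illustration.

```python
# Illustrative sketch only: generic hidden-state and attention distillation
# losses for a transformer student/teacher pair. This is NOT the paper's
# exact method; shapes and loss weights are assumptions for illustration.
import torch
import torch.nn.functional as F


def distillation_losses(student_hidden, teacher_hidden,
                        student_attn, teacher_attn, proj):
    """Compute hidden-state and attention-map distillation losses.

    student_hidden: (batch, seq, d_s) student hidden states
    teacher_hidden: (batch, seq, d_t) corresponding teacher hidden states
    student_attn:   (batch, heads, seq, seq) student attention probabilities
    teacher_attn:   (batch, heads, seq, seq) teacher attention probabilities
    proj:           nn.Linear(d_s, d_t) mapping student features to teacher width
    """
    # Align hidden representations with an MSE loss after projecting the
    # (usually narrower) student features into the teacher's embedding space.
    hidden_loss = F.mse_loss(proj(student_hidden), teacher_hidden)

    # Align attention distributions with a KL divergence over key positions.
    attn_loss = F.kl_div(torch.log(student_attn + 1e-8),
                         teacher_attn, reduction="batchmean")

    return hidden_loss, attn_loss


if __name__ == "__main__":
    batch, seq, heads, d_s, d_t = 2, 16, 12, 384, 768
    proj = torch.nn.Linear(d_s, d_t)
    s_h = torch.randn(batch, seq, d_s)
    t_h = torch.randn(batch, seq, d_t)
    s_a = torch.softmax(torch.randn(batch, heads, seq, seq), dim=-1)
    t_a = torch.softmax(torch.randn(batch, heads, seq, seq), dim=-1)
    h_loss, a_loss = distillation_losses(s_h, t_h, s_a, t_a, proj)
    print(h_loss.item(), a_loss.item())
```

Note that this sketch assumes the teacher and student see the same token sequence; when their detectors produce different regional tokens, this one-to-one correspondence no longer holds, which is the misalignment problem the paper targets.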
