Corpus ID: 236469329

Exceeding the Limits of Visual-Linguistic Multi-Task Learning

Cameron R. Wolfe and Keld T. Lundgaard
By leveraging large amounts of product data collected across hundreds of live e-commerce websites, we construct 1000 unique classification tasks that share similarly-structured input data, comprised of both text and images. These classification tasks focus on learning the product hierarchy of different e-commerce websites, causing many of them to be correlated. Adopting a multi-modal transformer model, we solve these tasks in unison using multi-task learning (MTL). Extensive experiments are…
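
The setup described in the abstract (many classification tasks over the same text-plus-image input, solved by one shared model) can be illustrated with a minimal sketch of hard parameter sharing. This is not the authors' implementation: the single tanh layer merely stands in for the multi-modal transformer, fusing modalities by concatenation is an assumption, and the class name `MultiTaskClassifier`, the task names `site_a` and `site_b`, and all dimensions are hypothetical.

```python
import math
import random


class MultiTaskClassifier:
    """Toy multi-task model: one shared encoder, one linear head per task."""

    def __init__(self, text_dim, image_dim, hidden_dim, classes_per_task, seed=0):
        rng = random.Random(seed)
        in_dim = text_dim + image_dim
        # Shared weights used by every task (stand-in for the transformer).
        self.shared = [[rng.gauss(0, 0.1) for _ in range(in_dim)]
                       for _ in range(hidden_dim)]
        # One output head per task; each task has its own number of classes.
        self.heads = {
            task: [[rng.gauss(0, 0.1) for _ in range(hidden_dim)]
                   for _ in range(n_classes)]
            for task, n_classes in classes_per_task.items()
        }

    def encode(self, text_vec, image_vec):
        # Assumption: modalities are fused by simple concatenation.
        x = list(text_vec) + list(image_vec)
        return [math.tanh(sum(w * v for w, v in zip(row, x)))
                for row in self.shared]

    def predict_proba(self, task, text_vec, image_vec):
        # Route the shared representation through the head of the given task.
        h = self.encode(text_vec, image_vec)
        logits = [sum(w * v for w, v in zip(row, h)) for row in self.heads[task]]
        peak = max(logits)  # subtract the max for a numerically stable softmax
        exps = [math.exp(z - peak) for z in logits]
        total = sum(exps)
        return [e / total for e in exps]


# Hypothetical usage: two tasks with different label sets share one encoder.
model = MultiTaskClassifier(text_dim=4, image_dim=3, hidden_dim=8,
                            classes_per_task={"site_a": 5, "site_b": 2})
text, image = [0.1, 0.2, 0.3, 0.4], [1.0, 0.0, 0.5]
probs_a = model.predict_proba("site_a", text, image)
probs_b = model.predict_proba("site_b", text, image)
```

Both calls reuse the same encoder output and differ only in the head, which is the property that lets correlated tasks share what they learn.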

12-in-1: Multi-Task Vision and Language Representation Learning
This work develops a large-scale, multi-task model that culminates in a single model on 12 datasets from four broad categories of task including visual question answering, caption-based image retrieval, grounding referring expressions, and multimodal verification and shows that finetuning task-specific models from this model can lead to further improvements, achieving performance at or above the state-of-the-art.
A Hierarchical Multi-task Approach for Learning Embeddings from Semantic Tasks
A hierarchical model trained in a multi-task learning setup on a set of carefully selected semantic tasks achieves state-of-the-art results on a number of tasks, namely Named Entity Recognition, Entity Mention Detection and Relation Extraction without hand-engineered features or external NLP tools like syntactic parsers.
BERT and PALs: Projected Attention Layers for Efficient Adaptation in Multi-Task Learning
Using new adaptation modules called PALs ("projected attention layers"), this work matches the performance of separately fine-tuned models on the GLUE benchmark with roughly 7 times fewer parameters, and obtains state-of-the-art results on the Recognizing Textual Entailment dataset.
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
A benchmark of nine diverse NLU tasks, an auxiliary dataset for probing models for understanding of specific linguistic phenomena, and an online platform for evaluating and comparing models, which favors models that can represent linguistic knowledge in a way that facilitates sample-efficient learning and effective knowledge-transfer across tasks.
Learning General Purpose Distributed Sentence Representations via Large Scale Multi-task Learning
This work presents a simple, effective multi-task learning framework for sentence representations that combines the inductive biases of diverse training objectives in a single model and demonstrates that sharing a single recurrent sentence encoder across weakly related tasks leads to consistent improvements over previous methods.
Multi-Task Learning of Hierarchical Vision-Language Representation
This work proposes a multi-task learning approach that learns a vision-language representation shared by many tasks from their diverse datasets, and that consistently outperforms previous single-task-learning methods on image caption retrieval, visual question answering, and visual grounding.
Multi-Task Deep Neural Networks for Natural Language Understanding
A Multi-Task Deep Neural Network (MT-DNN) for learning representations across multiple natural language understanding (NLU) tasks that allows domain adaptation with substantially fewer in-domain labels than the pre-trained BERT representations.
End-To-End Multi-Task Learning With Attention
The proposed Multi-Task Attention Network (MTAN) consists of a single shared network containing a global feature pool, together with a soft-attention module for each task, which allows learning of task-specific feature-level attention.
Parameter-Efficient Transfer Learning for NLP
To demonstrate the effectiveness of adapters, the recently proposed BERT Transformer model is transferred to 26 diverse text classification tasks, including the GLUE benchmark, and adapters attain near state-of-the-art performance while adding only a few parameters per task.
ERNIE: Enhanced Language Representation with Informative Entities
This paper utilizes both large-scale textual corpora and KGs to train an enhanced language representation model (ERNIE) which can take full advantage of lexical, syntactic, and knowledge information simultaneously, and is comparable with the state-of-the-art model BERT on other common NLP tasks.