Corpus ID: 214667372

Heavy-tailed Representations, Text Polarity Classification & Data Augmentation

@article{Jalalzai2020HeavytailedRT,
  title={Heavy-tailed Representations, Text Polarity Classification \& Data Augmentation},
  author={Hamid Jalalzai and Pierre Colombo and C. Clavel and {\'E}ric Gaussier and Giovanna Varni and Emmanuel Vignon and A. Sabourin},
  journal={ArXiv},
  year={2020},
  volume={abs/2003.11593}
}
The dominant approaches to text representation in natural language processing rely on learning embeddings on massive corpora; these embeddings have convenient properties such as compositionality and distance preservation. In this paper, we develop a novel method to learn a heavy-tailed embedding with desirable regularity properties regarding the distributional tails, which makes it possible to analyze the points far away from the distribution bulk using the framework of multivariate extreme value theory. In particular, a…
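
The following is a minimal sketch, not the authors' implementation, of the extreme-value viewpoint described in the abstract: assuming the learned representation Z is heavy tailed, points whose norm exceeds a high empirical quantile are treated as extremes and classified from their angular component Z/||Z|| alone. The names (quantile_level, the logistic classifier) are illustrative assumptions.

import numpy as np
from sklearn.linear_model import LogisticRegression

def split_extremes(Z, quantile_level=0.95):
    # Points whose embedding norm exceeds the empirical quantile are treated as "extreme".
    norms = np.linalg.norm(Z, axis=1)
    threshold = np.quantile(norms, quantile_level)
    return norms >= threshold

def fit_angular_classifier(Z, y, quantile_level=0.95):
    # Polarity is predicted from the angle Z / ||Z|| of the extreme points only,
    # in line with the multivariate extreme value framework sketched above.
    extreme = split_extremes(Z, quantile_level)
    angles = Z[extreme] / np.linalg.norm(Z[extreme], axis=1, keepdims=True)
    return LogisticRegression(max_iter=1000).fit(angles, y[extreme])
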
Citations

Graph Neural Networks with Extreme Nodes Discrimination
Graph neural networks (GNNs) are a successful example of leveraging the underlying structure between samples to perform efficient semi-supervised learning. Though the spatial correlation of the nodes…
Hierarchical Pre-training for Sequence Labelling in Spoken Dialog
This work proposes a new approach to learning generic representations adapted to spoken dialog, using a hierarchical encoder based on transformer architectures, and demonstrates that hierarchical encoders achieve competitive results with consistently fewer parameters than state-of-the-art models.
Informative Clusters for Multivariate Extremes
A novel optimization-based approach called MEXICO, standing for Multivariate EXtreme Informative Clustering by Optimization, aims at exhibiting a sparsity pattern within the dependence structure of extremes.
A Novel Estimator of Mutual Information for Learning to Disentangle Textual Representations
A novel variational upper bound on the mutual information between an attribute and the latent code of an encoder is introduced, leading to better disentangled representations and, in particular, more precise control of the desired degree of disentanglement than state-of-the-art methods proposed for textual data.
Feature Clustering for Support Identification in Extreme Regions
A novel optimization-based approach to assess the dependence structure of extremes by estimating clusters of features which best capture the support of extremes using the angular measure.
Improving Multimodal fusion via Mutual Dependency Maximisation
Multimodal sentiment analysis is a trending area of research, and multimodal fusion is one of its most active topics. Acknowledging that humans communicate through a variety of channels (i.e. visual,…

References

Showing 1-10 of 68 references
Contextual Augmentation: Data Augmentation by Words with Paradigmatic Relations
This work retrofits a language model with a label-conditional architecture, which allows the model to augment sentences without breaking label compatibility and improves classifiers based on convolutional or recurrent neural networks.
Deep Contextualized Word Representations
A new type of deep contextualized word representation is introduced that models both complex characteristics of word use and how these uses vary across linguistic contexts, allowing downstream models to mix different types of semi-supervision signals.
The Curious Case of Neural Text Degeneration
By sampling text from the dynamic nucleus of the probability distribution, which allows for diversity while effectively truncating the less reliable tail of the distribution, the generated text better matches the quality of human text, yielding enhanced diversity without sacrificing fluency and coherence.
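
As a hedged illustration of the nucleus sampling procedure summarized above (not the paper's code), the snippet below keeps the smallest set of tokens whose cumulative probability exceeds a threshold, renormalizes it, and samples from that set; the parameter name top_p and the numpy-based formulation are assumptions.

import numpy as np

def nucleus_sample(probs, top_p=0.9, rng=None):
    # probs: 1-D array of next-token probabilities summing to 1.
    rng = rng or np.random.default_rng()
    order = np.argsort(probs)[::-1]                        # tokens by decreasing probability
    cumulative = np.cumsum(probs[order])
    cutoff = int(np.searchsorted(cumulative, top_p)) + 1   # smallest prefix covering top_p mass
    nucleus = order[:cutoff]                               # the "dynamic nucleus"
    renormalized = probs[nucleus] / probs[nucleus].sum()
    return rng.choice(nucleus, p=renormalized)
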
From Group to Individual Labels Using Deep Features
This paper proposes a new objective function that encourages smoothness of inferred instance-level labels based on instance-level similarity, while at the same time respecting group-level label constraints, and applies this approach to the problem of predicting labels for sentences given labels for reviews, using a convolutional neural network to infer sentence similarity.
EDA: Easy Data Augmentation Techniques for Boosting Performance on Text Classification Tasks
EDA consists of four simple but powerful operations: synonym replacement, random insertion, random swap, and random deletion; experiments show that EDA improves performance for both convolutional and recurrent neural networks.
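
A minimal sketch of the four EDA operations named above, not the authors' reference implementation: the toy synonym table is an assumption (the paper draws synonyms from WordNet), and each function returns a perturbed copy of the token list.

import random

SYNONYMS = {"good": ["great", "fine"], "movie": ["film"], "bad": ["poor", "awful"]}  # toy table

def synonym_replacement(words, n=1):
    out = words[:]
    candidates = [i for i, w in enumerate(out) if w in SYNONYMS]
    for i in random.sample(candidates, min(n, len(candidates))):
        out[i] = random.choice(SYNONYMS[out[i]])
    return out

def random_insertion(words, n=1):
    out = words[:]
    for _ in range(n):
        donors = [w for w in out if w in SYNONYMS]
        if donors:
            synonym = random.choice(SYNONYMS[random.choice(donors)])
            out.insert(random.randrange(len(out) + 1), synonym)
    return out

def random_swap(words, n=1):
    out = words[:]
    for _ in range(n):
        if len(out) < 2:
            break
        i, j = random.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out

def random_deletion(words, p=0.1):
    kept = [w for w in words if random.random() > p]
    return kept or [random.choice(words)]  # never return an empty sentence
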
Improving Language Understanding by Generative Pre-Training
The general task-agnostic model outperforms discriminatively trained models that use architectures specifically crafted for each task, significantly improving upon the state of the art in 9 out of the 12 tasks studied.
Mining Quality Phrases from Massive Text Corpora
A new framework that extracts quality phrases from text corpora integrated with phrasal segmentation is proposed; it requires only limited training, yet the quality of the phrases it generates is close to human judgment.
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
A new language representation model, BERT, designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers, which can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.
Learning to Compose Domain-Specific Transformations for Data Augmentation
The proposed method can make use of arbitrary, non-deterministic transformation functions, is robust to misspecified user input, and is trained on unlabeled data, and it can be used to perform data augmentation for any end discriminative model.
Sequence to Sequence Learning with Neural Networks
This paper presents a general end-to-end approach to sequence learning that makes minimal assumptions on the sequence structure, and finds that reversing the order of the words in all source sentences improved the LSTM's performance markedly, because doing so introduced many short-term dependencies between the source and the target sentence which made the optimization problem easier.
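
A hedged one-function sketch of the source-reversal trick mentioned above (illustrative names, not the paper's code): only the source side of each training pair is reversed before being fed to the encoder-decoder model.

def reverse_source(pairs):
    # pairs: list of (source_tokens, target_tokens); only the source order is reversed.
    return [(list(reversed(source)), target) for source, target in pairs]

print(reverse_source([(["je", "suis", "etudiant"], ["i", "am", "a", "student"])]))
# [(['etudiant', 'suis', 'je'], ['i', 'am', 'a', 'student'])]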