Cross-lingual Distillation for Text Classification

Ruochen Xu, Yiming Yang
Cross-lingual text classification (CLTC) is the task of classifying documents written in different languages into the same taxonomy of categories. This paper presents a novel approach to CLTC that builds on model distillation, adapting and extending a framework originally proposed for model compression. Using soft probabilistic predictions for the documents in a label-rich language as the (induced) supervisory labels in a parallel corpus of documents, we train classifiers successfully for new…
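The distillation step the abstract describes — training a target-language student against the teacher's soft probability outputs on a parallel corpus — can be sketched roughly as below. This is a minimal illustration, not the paper's actual implementation: the temperature value, function names, and toy logits are all assumptions.

```python
import numpy as np

def softmax(logits, T=1.0):
    """Temperature-scaled softmax; a higher T yields softer distributions."""
    z = logits / T
    z = z - z.max(axis=-1, keepdims=True)  # for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_probs, T=2.0):
    """Cross-entropy of the student's tempered predictions against the
    teacher's soft labels, averaged over the parallel documents."""
    p = softmax(student_logits, T)
    return -np.mean(np.sum(teacher_probs * np.log(p + 1e-12), axis=-1))

# Teacher scores the source-language side of a parallel pair; the student
# is then trained on the aligned target-language side to minimize this loss.
teacher_probs = softmax(np.array([[2.0, 0.0]]), T=2.0)   # soft supervisory label
loss = distillation_loss(np.array([[1.5, 0.2]]), teacher_probs, T=2.0)
```

The loss is minimized when the student's tempered distribution matches the teacher's, so gradient descent on it pushes the target-language classifier toward the teacher's (soft) decisions rather than hard one-hot labels.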


A Robust Self-Learning Framework for Cross-Lingual Text Classification
This paper presents an elegantly simple robust self-learning framework to include unlabeled non-English samples in the fine-tuning process of pretrained multilingual representation models and observes significant gains in effectiveness on document and sentiment classification for a range of diverse languages.
Cross-Lingual Text Classification with Minimal Resources by Transferring a Sparse Teacher
This work proposes a cross-lingual teacher-student method, CLTS, that generates “weak” supervision in the target language using minimal cross-lingual resources, in the form of a small number of word translations.
Cross-lingual Text Classification with Heterogeneous Graph Neural Network
This paper proposes a simple yet effective method to incorporate heterogeneous information within and across languages for cross-lingual text classification using graph convolutional networks (GCN), which significantly outperforms state-of-the-art models on all tasks.
NLPDove at SemEval-2020 Task 12: Improving Offensive Language Detection with Cross-lingual Transfer
A new metric, Translation Embedding Distance, is proposed to measure the transferability of instances for cross-lingual data selection and various preprocessing steps tailored for social media text along with methods to fine-tune the pre-trained multilingual BERT (mBERT) for offensive language identification.
Improving Cross-lingual Text Classification with Zero-shot Instance-Weighting
This paper proposes zero-shot instance-weighting, a general model-agnostic zero-shot learning framework for improving CLTC by leveraging source instance weighting, which adds a module on top of pre-trained language models for similarity computation of instance weights, thus aligning each source instance to the target language.
Bridging the domain gap in cross-lingual document classification
It is shown that addressing the domain gap is crucial in XLU, and state-of-the-art cross-lingual methods are combined with recently proposed methods for weakly supervised learning, such as unsupervised pre-training and unsupervised data augmentation, to simultaneously close both the language gap and the domain gap.
Cross-Lingual Unsupervised Sentiment Classification with Multi-View Transfer Learning
This paper proposes an unsupervised cross-lingual sentiment classification model named multi-view encoder-classifier (MVEC) that leverages an unsupervised machine translation (UMT) system and a language discriminator, and significantly outperforms other models on 8 of 11 sentiment classification tasks.
A Comparative Analysis of Unsupervised Language Adaptation Methods
It is shown that adversarial training methods are more suitable when the source and target language datasets contain other variations in content besides the language shift, and sentence encoder alignment methods are very effective and can yield scores on the target language that are close to the source language scores.
Cross-Lingual Text Classification with Multilingual Distillation and Zero-Shot-Aware Training
This paper proposes MBLM (multi-branch multilingual language model), a model built on MPLMs with multiple language branches, which improves both the supervised and zero-shot performance of MPLMs.
Reinforced Iterative Knowledge Distillation for Cross-Lingual Named Entity Recognition
The NER techniques reported in this paper are on their way to become a fundamental component for Web ranking, Entity Pane, Answers Triggering, and Question Answering in the Microsoft Bing search engine.


Transductive Representation Learning for Cross-Lingual Text Classification
  Yuhong Guo, Min Xiao · 2012 IEEE 12th International Conference on Data Mining · 2012
A transductive subspace representation learning method to address domain adaptation for cross-lingual text classification, formulated as a nonnegative matrix factorization problem and solved with an iterative optimization procedure.
Cross Language Text Classification via Subspace Co-regularized Multi-view Learning
A novel subspace co-regularized multi-view learning method built on parallel corpora produced by machine translation that jointly minimizes the training error of each classifier in each language while penalizing the distance between the subspace representations of parallel documents.
Cross-lingual Text Classification via Model Translation with Limited Dictionaries
Two new approaches are proposed that combine unsupervised word embedding in different languages, supervised mapping of embedded words across languages, and probabilistic translation of classification models, showing significant performance improvement in CLTC.
Cross-Language Text Classification Using Structural Correspondence Learning
We present a new approach to cross-language text classification that builds on structural correspondence learning, a recently proposed theory for domain adaptation. The approach uses unlabeled…
Cross-Lingual Mixture Model for Sentiment Classification
This paper proposes a generative cross-lingual mixture model (CLMM) to leverage unlabeled bilingual parallel data; it learns previously unseen sentiment words from the large bilingual parallel data and improves vocabulary coverage significantly.
Is Machine Translation Ripe for Cross-Lingual Sentiment Classification?
It is argued that the cross-lingual adaptation problem is qualitatively different from other (monolingual) adaptation problems in NLP; thus new adaptation algorithms ought to be considered.
Adversarial Deep Averaging Networks for Cross-Lingual Sentiment Classification
An Adversarial Deep Averaging Network (ADAN) is proposed to transfer the knowledge learned from labeled data on a resource-rich source language to low-resource languages where only unlabeled data exist.
Recurrent Convolutional Neural Networks for Text Classification
A recurrent convolutional neural network is introduced for text classification without human-designed features to capture contextual information as far as possible when learning word representations, which may introduce considerably less noise compared to traditional window-based neural networks.
Attention-based LSTM Network for Cross-Lingual Sentiment Classification
An attention-based bilingual representation learning model which learns the distributed semantics of the documents in both the source and the target languages and proposes a hierarchical attention mechanism for the bilingual LSTM network.
Cross-Lingual Sentiment Classification with Bilingual Document Representation Learning
This study proposes a representation learning approach, BiDRL, which simultaneously learns vector representations for the texts in both the source and the target languages, and shows that BiDRL outperforms the state-of-the-art methods on all the target languages.