Corpus ID: 237433834

IndicBART: A Pre-trained Model for Natural Language Generation of Indic Languages

Raj Dabre, Himani Shrotriya, Anoop Kunchukuttan, Ratish Puduppully, Mitesh M. Khapra, Pratyush Kumar
In this paper we present IndicBART, a multilingual sequence-to-sequence pre-trained model covering 11 Indic languages and English. Unlike existing pre-trained models, IndicBART utilizes the orthographic similarity between Indic scripts to improve transfer learning between related Indic languages. We evaluate IndicBART on two NLG tasks: Neural Machine Translation (NMT) and extreme summarization. Our experiments on NMT for 12 language pairs and extreme summarization for 7 languages…
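The script-unification idea behind exploiting orthographic similarity can be sketched in a few lines: the major Indic scripts occupy Unicode blocks that are mutually aligned at fixed offsets, so mapping text into a single script (e.g. Devanagari) is a per-character offset shift. The block start values below are real Unicode values; the function name and the pass-through handling of non-script characters are illustrative assumptions, not IndicBART's actual preprocessing pipeline.

```python
# Sketch: unify Indic scripts into Devanagari via Unicode offset mapping.
# Major Indic scripts occupy aligned 128-codepoint Unicode blocks, so a
# character maps to its Devanagari counterpart by a fixed offset shift.

BLOCK_START = {
    "devanagari": 0x0900,
    "bengali":    0x0980,
    "gurmukhi":   0x0A00,
    "gujarati":   0x0A80,
    "oriya":      0x0B00,
    "tamil":      0x0B80,
    "telugu":     0x0C00,
    "kannada":    0x0C80,
    "malayalam":  0x0D00,
}

def to_devanagari(text: str, source_script: str) -> str:
    """Map each character of `text` from `source_script` into Devanagari."""
    src = BLOCK_START[source_script]
    dst = BLOCK_START["devanagari"]
    out = []
    for ch in text:
        cp = ord(ch)
        if src <= cp < src + 0x80:   # character lies in the source block
            out.append(chr(cp - src + dst))
        else:                        # spaces, digits, punctuation pass through
            out.append(ch)
    return "".join(out)
```

For example, Kannada ಕ (U+0C95) maps to Devanagari क (U+0915), letting a model trained on one script see tokens from a related language as familiar.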



Improving Neural Machine Translation Models with Monolingual Data
This work pairs monolingual training data with automatic back-translations so that it can be treated as additional parallel training data, obtaining substantial improvements on the WMT 15 English↔German task and the low-resource IWSLT 14 Turkish→English task.
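The back-translation recipe above can be sketched as follows; `reverse_translate` is a hypothetical placeholder standing in for a trained target→source NMT model, not part of the original work.

```python
# Illustrative sketch of back-translation data augmentation: monolingual
# target-side sentences are translated into the source language by a
# reverse-direction model, and the resulting (synthetic source, real target)
# pairs are mixed into the genuine parallel data.

def reverse_translate(sentence: str) -> str:
    # Placeholder for a trained target->source model.
    return "<synthetic> " + sentence

def back_translate(monolingual_target, parallel_pairs):
    """Return parallel data augmented with synthetic pairs."""
    synthetic = [(reverse_translate(t), t) for t in monolingual_target]
    return list(parallel_pairs) + synthetic

corpus = back_translate(["guten Tag"], [("hello", "hallo")])
```

The key property is that the target side of every synthetic pair is fluent human text, so the decoder trains on clean targets even when the synthetic sources are noisy.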
Multilingual Neural Machine Translation with Knowledge Distillation
This distillation-based approach boosts the accuracy of multilingual machine translation, showing that one model is enough to handle multiple languages, with accuracy comparable to or even better than individual per-language models.
Exploiting Language Relatedness for Low Web-Resource Language Model Adaptation: An Indic Languages Study
This work argues that relatedness among languages in a language family can be exploited to overcome some of the corpus limitations of low web-resource languages (LRLs), and proposes RelateLM, which uses transliteration to convert the unseen script of limited LRL text into the script of a Related Prominent Language (RPL), Hindi in this case.
Leveraging Orthographic Similarity for Multilingual Neural Transliteration
This work proposes a modified neural encoder-decoder model that maximizes parameter sharing across language pairs in order to effectively leverage orthographic similarity for transliteration between related languages.
SentencePiece: A simple and language independent subword tokenizer and detokenizer for Neural Text Processing
SentencePiece is a language-independent subword tokenizer and detokenizer designed for neural text processing; the authors show that it achieves accuracy comparable to subword models trained directly from raw sentences.
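To illustrate what subword segmentation produces, here is a toy greedy longest-match segmenter over a hand-written vocabulary. SentencePiece itself learns its vocabulary (via BPE or a unigram language model) and treats whitespace as an ordinary symbol; this sketch covers only the segmentation step, and the vocabulary is an assumption for the example.

```python
# Toy subword segmentation by greedy longest-match against a fixed
# vocabulary, falling back to single characters when nothing matches.

def segment(word: str, vocab: set) -> list:
    pieces, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):    # try longest candidate first
            if word[i:j] in vocab or j == i + 1:
                pieces.append(word[i:j])     # single-char fallback at j == i+1
                i = j
                break
    return pieces

print(segment("translation", {"trans", "lation", "la", "tion"}))
# → ['trans', 'lation']
```

Splitting rare words into frequent subword units like this keeps the vocabulary small while leaving no word out-of-vocabulary.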
Investigating Multilingual NMT Representations at Scale
This work attempts to understand massively multilingual NMT representations using Singular Value Canonical Correlation Analysis (SVCCA), a representation similarity framework that allows representations to be compared across languages, layers, and models.
Unsupervised Cross-lingual Representation Learning at Scale
This work shows that pretraining multilingual language models at scale leads to significant performance gains on a wide range of cross-lingual transfer tasks, and demonstrates for the first time the possibility of multilingual modeling without sacrificing per-language performance.
Unsupervised Neural Machine Translation
This work proposes a novel method to train an NMT system in a completely unsupervised manner, relying on nothing but monolingual corpora; it consists of a slightly modified attentional encoder-decoder model that can be trained on monolingual corpora alone using a combination of denoising and back-translation.
NICT’s Participation in WAT 2018: Approaches Using Multilingualism and Recurrently Stacked Layers
This paper describes the NMT systems for the translation tasks the authors participated in, noting that a single multilingual/bidirectional model (without ensembling) has the potential to achieve (near) state-of-the-art results for all the language pairs.
Transfer Learning across Low-Resource, Related Languages for Neural Machine Translation
The experiments show that transfer learning helps word-based translation only slightly, but when used on top of a much stronger BPE baseline it yields larger improvements of up to 4.3 BLEU.