• Computer Science
  • Published in arXiv 2019

CamemBERT: a Tasty French Language Model

@article{Martin2019CamemBERTAT,
  title={CamemBERT: a Tasty French Language Model},
  author={Louis Martin and Benjamin Muller and Pedro Javier Ortiz Su{\'a}rez and Yoann Dupont and Laurent Romary and {\'E}ric Villemonte de la Clergerie and Djam{\'e} Seddah and Beno{\^i}t Sagot},
  journal={ArXiv},
  year={2019},
  volume={abs/1911.03894}
}
Pretrained language models are now ubiquitous in Natural Language Processing. Despite their success, most available models have either been trained on English data or on the concatenation of data in multiple languages. This makes practical use of such models --in all languages except English-- very limited. Aiming to address this issue for French, we release CamemBERT, a French version of the Bi-directional Encoders for Transformers (BERT). We measure the performance of CamemBERT compared to…
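
The abstract describes CamemBERT as a BERT-style masked language model for French. The sketch below shows how such a checkpoint could be queried for masked-token prediction; it assumes the model is published on the Hugging Face Hub under the identifier camembert-base and that a transformers version providing CamembertForMaskedLM is installed. Neither detail is stated on this page.

# Minimal sketch: masked-token prediction with CamemBERT.
# Assumes the checkpoint identifier "camembert-base" and a modern
# transformers release; both are assumptions, not taken from this page.
import torch
from transformers import CamembertTokenizer, CamembertForMaskedLM

tokenizer = CamembertTokenizer.from_pretrained("camembert-base")
model = CamembertForMaskedLM.from_pretrained("camembert-base")
model.eval()

# CamemBERT follows BERT's masked-language-modelling objective:
# hide a token, then ask the model to fill it in.
text = "Le camembert est <mask> !"
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Locate the <mask> position and decode the five highest-scoring fillers.
mask_positions = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]
top_ids = logits[0, mask_positions[0]].topk(5).indices
print([tokenizer.decode(int(i)) for i in top_ids])

The top-ranked token ids for the <mask> position are decoded back to surface forms; the same encoder can be fine-tuned for downstream tasks by adding a task-specific head, following the standard BERT recipe.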

Citations

Publications citing this paper.
SHOWING 1-2 OF 3 CITATIONS

FlauBERT: Unsupervised Language Model Pre-training for French

VIEW 7 EXCERPTS
CITES RESULTS, BACKGROUND & METHODS
HIGHLY INFLUENCED

FQuAD: French Question Answering Dataset

VIEW 7 EXCERPTS
CITES BACKGROUND & METHODS
HIGHLY INFLUENCED

References

Publications referenced by this paper.
SHOWING 1-6 OF 46 REFERENCES

RoBERTa: A Robustly Optimized BERT Pretraining Approach

  • 2019
VIEW 8 EXCERPTS
HIGHLY INFLUENTIAL

75 Languages, 1 Model: Parsing Universal Dependencies Universally

VIEW 5 EXCERPTS
HIGHLY INFLUENTIAL

Contextual String Embeddings for Sequence Labeling

VIEW 4 EXCERPTS
HIGHLY INFLUENTIAL

Adam: A Method for Stochastic Optimization

VIEW 3 EXCERPTS
HIGHLY INFLUENTIAL

Bidirectional LSTM-CRF Models for Sequence Tagging

SpanBERT: Improving Pre-training by Representing and Predicting Spans

  • M. Joshi, D. Chen, +3 authors O. Levy
  • 2019
VIEW 1 EXCERPT