• Publications
  • Influence
How Good is Your Tokenizer? On the Monolingual Performance of Multilingual Language Models
It is found that replacing the original multilingual tokenizer with the specialized monolingual tokenizer improves the downstream performance of the multilingual model for almost every task and language. Expand
PuzzLing Machines: A Challenge on Learning From Small Data
This work introduces a challenge on learning from small data, PuzzLing Machines, which consists of Rosetta Stone puzzles from Linguistic Olympiads for high school students, and shows that both simple statistical algorithms and state-of-the-art deep neural models perform inadequately on this challenge. Expand