Publications
Federated Learning for Emoji Prediction in a Mobile Keyboard
TLDR
We show that a word-level recurrent neural network can predict emoji from text typed on a mobile keyboard using a distributed on-device learning framework called federated learning.
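As a rough illustration of the training setup (not this paper's implementation), a single Federated Averaging round can be sketched as follows; the placeholder model, client data, and hyperparameters are hypothetical.

```python
# Minimal sketch of one Federated Averaging (FedAvg) round, assuming each
# client holds (typed_text, emoji) pairs. local_train() is a placeholder for
# on-device training of the word-level RNN; it is not the paper's code.
import numpy as np

def local_train(global_weights, client_examples, lr=1e-3, epochs=1):
    """Placeholder for on-device training; perturbs weights so the sketch runs."""
    weights = {k: v.copy() for k, v in global_weights.items()}
    for _ in range(epochs):
        for _example in client_examples:
            for k in weights:
                weights[k] -= lr * np.random.randn(*weights[k].shape)
    return weights, len(client_examples)

def fedavg_round(global_weights, clients):
    """Average client models, weighted by the number of local examples."""
    totals = {k: np.zeros_like(v) for k, v in global_weights.items()}
    n_total = 0
    for client_examples in clients:
        local_weights, n = local_train(global_weights, client_examples)
        n_total += n
        for k in totals:
            totals[k] += n * local_weights[k]
    return {k: v / n_total for k, v in totals.items()}

# Toy usage: three clients with different amounts of local keyboard data.
global_weights = {"embedding": np.zeros((5, 4)), "output": np.zeros((4, 3))}
clients = [[("good morning", "☀️")] * n for n in (2, 5, 3)]
global_weights = fedavg_round(global_weights, clients)
```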
Federated Evaluation of On-device Personalization
TLDR
We extend the federated learning framework to evaluate strategies for personalization of global models.
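A minimal sketch of the evaluation idea, assuming each client splits its local data into a fine-tuning portion and a held-out portion and reports only aggregate metric deltas; fine_tune() and accuracy() are hypothetical stand-ins, not the paper's API.

```python
# Sketch: evaluate a personalization strategy federatedly. Raw examples stay
# on-device; only the aggregated accuracy change is returned to the server.
def evaluate_personalization(global_model, clients, fine_tune, accuracy):
    deltas = []
    for client_data in clients:
        split = int(0.8 * len(client_data))
        train_split, eval_split = client_data[:split], client_data[split:]
        if not eval_split:
            continue
        baseline = accuracy(global_model, eval_split)
        personalized = fine_tune(global_model, train_split)
        deltas.append(accuracy(personalized, eval_split) - baseline)
    return sum(deltas) / len(deltas) if deltas else 0.0
```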
Federated Learning of N-gram Language Models
TLDR
We propose algorithms to train production-quality n-gram language models using federated learning, which allows training models without user-typed text ever leaving devices.
An Investigation Into On-device Personalization of End-to-end Automatic Speech Recognition Models
TLDR
We investigate the idea of securely training personalized end-to-end speech recognition models on mobile devices so that user data never leave the device and are never stored on a server.
Personalization of End-to-End Speech Recognition on Mobile Devices for Named Entities
TLDR
We study the effectiveness of several techniques to personalize end-to-end speech models and improve the recognition of proper names relevant to the user.
Writing Across the World's Languages: Deep Internationalization for Gboard, the Google Keyboard
TLDR
We describe how and why we have been adding support for hundreds of language varieties from around the globe, and the trends we observe along the way.
Training Production Language Models without Memorizing User Data
TLDR
This paper presents the first consumer-scale next-word prediction (NWP) model trained with Federated Learning (FL) while leveraging the Differentially Private Federated Averaging (DP-FedAvg) technique.
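The core DP-FedAvg step can be sketched as below: clip each client's update to a fixed L2 norm, average, and add Gaussian noise scaled to that clip norm. The clip norm and noise multiplier shown are illustrative, not the values used in the paper.

```python
# Hedged sketch of DP-FedAvg aggregation (illustrative hyperparameters).
import numpy as np

def dp_fedavg_aggregate(client_updates, clip_norm=1.0, noise_multiplier=1.0):
    clipped = []
    for update in client_updates:            # each update: flat np.ndarray
        norm = np.linalg.norm(update)
        clipped.append(update * min(1.0, clip_norm / max(norm, 1e-12)))
    mean_update = np.mean(clipped, axis=0)
    # Noise on the mean: (z * S) / n, where z is the noise multiplier,
    # S the clip norm, and n the number of participating clients.
    noise_std = noise_multiplier * clip_norm / len(clipped)
    return mean_update + np.random.normal(0.0, noise_std, mean_update.shape)
```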
Understanding Unintended Memorization in Federated Learning
TLDR
We present a formal study of the effect of different components of canonical FL on unintended memorization in trained models, comparing with the central learning setting.
Low-Rank Gradient Approximation for Memory-Efficient on-Device Training of Deep Neural Network
TLDR
We propose approximating the gradient matrices of deep neural networks using a low-rank parameterization as an avenue to save training memory.
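The memory argument can be made concrete with a small sketch: a rank-r factorization of an m-by-n gradient matrix stores r * (m + n) numbers instead of m * n. Truncated SVD is used here purely for illustration; the paper's actual parameterization may differ.

```python
# Illustration of low-rank gradient storage savings (not the paper's method).
import numpy as np

def low_rank_approx(grad, rank):
    u, s, vt = np.linalg.svd(grad, full_matrices=False)
    return u[:, :rank] * s[:rank], vt[:rank, :]   # (m x r), (r x n)

m, n, r = 512, 1024, 8
# A gradient that is approximately low-rank, so the approximation is faithful.
grad = np.random.randn(m, r) @ np.random.randn(r, n) + 0.01 * np.random.randn(m, n)
left, right = low_rank_approx(grad, r)

print("full storage:", m * n, "floats; low-rank storage:", r * (m + n), "floats")
print("relative error:", np.linalg.norm(grad - left @ right) / np.linalg.norm(grad))
```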
A Method to Reveal Speaker Identity in Distributed ASR Training, and How to Counter It
TLDR
We propose Hessian-Free Gradients Matching, an input reconstruction technique that operates without second derivatives of the loss function (required in prior works), which can be expensive to compute.
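For intuition about gradient matching in general, here is a toy reconstruction on a two-parameter linear model: the attacker searches for the candidate input whose per-example gradient is closest to the one observed from a client. The brute-force search below forms no derivatives of the matching loss at all; it only illustrates the matching objective and is not the paper's Hessian-Free Gradients Matching algorithm.

```python
# Toy gradient-matching reconstruction on a linear model (illustrative only).
import itertools
import numpy as np

def grad_wrt_weights(w, x, y):
    """Per-example gradient of the squared error (w.x - y)**2 w.r.t. w."""
    return 2.0 * (np.dot(w, x) - y) * x

w = np.array([1.0, 0.5])                     # model weights known to the attacker
secret_x, y = np.array([0.8, 0.6]), 1.0      # the client's private example
observed = grad_wrt_weights(w, secret_x, y)  # what the server receives

# Search for the input whose gradient best matches the observed gradient.
grid = np.arange(-2.0, 2.05, 0.1)
best_x, best_loss = None, np.inf
for cand in itertools.product(grid, repeat=2):
    cand = np.asarray(cand)
    loss = np.sum((grad_wrt_weights(w, cand, y) - observed) ** 2)
    if loss < best_loss:
        best_x, best_loss = cand, loss

print("secret input:    ", secret_x)
print("best match found:", best_x, "matching loss:", best_loss)
```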