A Discriminative Entity-Aware Language Model for Virtual Assistants

Mandana Saebi, Ernest Pusateri, Aaksha Meghawat, Christophe Van Gysel
High-quality automatic speech recognition (ASR) is essential for virtual assistants (VAs) to work well. However, ASR often performs poorly on VA requests containing named entities. In this work, we start from the observation that many ASR errors on named entities are inconsistent with real-world knowledge. We extend previous discriminative n-gram language modeling approaches to incorporate real-world knowledge from a Knowledge Graph (KG), using features that capture entity type-entity and… 
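The abstract describes rescoring ASR hypotheses with features that check entity mentions against a Knowledge Graph. As a minimal sketch of that idea (the toy KG, the feature definition, and the weights below are illustrative assumptions, not the paper's actual model or data):

```python
# Toy knowledge graph: entity surface form -> set of plausible types.
KG = {
    "adele": {"musical_artist"},
    "hello": {"song"},
}

def kg_feature(hypothesis_entities):
    """Count (entity, type) pairs consistent with the KG;
    pairs absent from the KG score negatively."""
    score = 0
    for entity, entity_type in hypothesis_entities:
        if entity_type in KG.get(entity, set()):
            score += 1
        else:
            score -= 1
    return score

def rescore(nbest, weight=1.0):
    """Pick the hypothesis maximizing ASR score + weighted KG feature."""
    return max(
        nbest,
        key=lambda h: h["asr_score"] + weight * kg_feature(h["entities"]),
    )

# Two hypotheses for the same utterance; the misrecognition
# "a dell" is not a known musical artist in the toy KG.
nbest = [
    {"text": "play hello by a dell", "asr_score": -1.0,
     "entities": [("a dell", "musical_artist")]},
    {"text": "play hello by adele", "asr_score": -1.2,
     "entities": [("hello", "song"), ("adele", "musical_artist")]},
]
best = rescore(nbest)
```

Even though the correct hypothesis has a slightly worse acoustic/LM score, the KG-consistency feature flips the ranking.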


Space-Efficient Representation of Entity-centric Query Language Models

This work introduces a deterministic approximation to probabilistic grammars that avoids the explicit expansion of non-terminals at model creation time, integrates directly with the FST framework, and is complementary to n-gram models.

Listen, Know and Spell: Knowledge-Infused Subword Modeling for Improving ASR Performance of OOV Named Entities

The Knowledge-Infused Subword Model (KISM) is proposed, a novel technique for incorporating semantic context from KGs into the ASR pipeline for improving the performance of OOV named entities.

Training Large-Vocabulary Neural Language Models by Private Federated Learning for Resource-Constrained Devices

Partial Embedding Updates (PEU) is proposed, a novel technique to decrease noise by decreasing payload size; Low-Rank Adaptation and Noise-Contrastive Estimation are adopted to reduce the memory demands of large models on compute-constrained devices.



Predicting Entity Popularity to Improve Spoken Entity Recognition by Virtual Assistants

A method is introduced that uses historical user interactions to forecast which entities will gain in popularity and become trending, and that subsequently integrates the predictions within the Automatic Speech Recognition (ASR) component of the VA.

Composition-based on-the-fly rescoring for salient n-gram biasing

A technique is introduced for dynamically applying contextually derived language models to a state-of-the-art speech recognition system, together with a construction algorithm that takes a trie representing the contextual n-grams and produces a weighted finite-state automaton more compact than a standard n-gram machine.
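The trie side of this construction can be sketched as follows (the boost value, phrases, and scoring function are illustrative assumptions; the paper's automaton determinization and FST composition details are omitted):

```python
# Contextual phrases share prefixes in a trie; a decoder applies a
# boost whenever a path through the trie completes a phrase.

class TrieNode:
    def __init__(self):
        self.children = {}
        self.boost = None  # set at nodes that end a contextual phrase

def build_trie(phrases, boost=-0.5):
    """Insert each word sequence; shared prefixes share nodes."""
    root = TrieNode()
    for phrase in phrases:
        node = root
        for word in phrase.split():
            node = node.children.setdefault(word, TrieNode())
        node.boost = boost
    return root

def bias_score(trie, words):
    """Sum boosts for every contextual phrase occurring in `words`."""
    total = 0.0
    for start in range(len(words)):
        node = trie
        for word in words[start:]:
            if word not in node.children:
                break
            node = node.children[word]
            if node.boost is not None:
                total += node.boost
    return total

trie = build_trie(["call mom", "call grandma"])
```

Here "call mom" and "call grandma" share the "call" node, which is the compactness the summary refers to: common prefixes are stored once rather than per phrase.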

Semantic Lattice Processing in Contextual Automatic Speech Recognition for Google Assistant

This paper uses Named Entity Recognition (NER) to identify and boost contextually relevant paths in order to improve speech recognition accuracy, and uses broad semantic classes comprising millions of entities, such as songs and musical artists, to tag relevant semantic entities in the lattice.

Barack’s Wife Hillary: Using Knowledge Graphs for Fact-Aware Language Modeling

This work introduces the knowledge graph language model (KGLM), a neural language model with mechanisms for selecting and copying facts from a knowledge graph that are relevant to the context, enabling the model to render information it has never seen before and to generate out-of-vocabulary tokens.

Bringing contextual information to google speech recognition

This paper utilizes an on-the-fly rescoring mechanism to adjust the LM weights of a small set of n-grams relevant to the particular context during speech decoding, which handles out of vocabulary words.

Unsupervised language model adaptation

  • M. Bacchiani, Brian Roark
  • Computer Science
    2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03).
  • 2003
Unsupervised language model adaptation from ASR transcripts shows an absolute error rate reduction of 3.9% over the unadapted baseline, from 28% to 24.1%, using 17 hours of unsupervised adaptation material.
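One common form of such adaptation is count merging: mixing statistics from first-pass ASR transcripts into a background model. A minimal unigram sketch (the interpolation weight and toy data are assumptions; the paper works with full n-gram models, not unigrams):

```python
from collections import Counter

def adapt(background, transcripts, alpha=0.5):
    """Interpolate background and in-domain relative frequencies."""
    in_domain = Counter(word for line in transcripts for word in line.split())
    bg_total = sum(background.values())
    id_total = sum(in_domain.values()) or 1
    vocab = set(background) | set(in_domain)
    return {
        w: (1 - alpha) * background[w] / bg_total
           + alpha * in_domain[w] / id_total
        for w in vocab
    }

# Background counts vs. counts from (possibly errorful) ASR output.
background = Counter({"the": 8, "call": 1, "weather": 1})
transcripts = ["call mom", "call the office"]
model = adapt(background, transcripts)
```

The adapted model shifts probability mass toward words frequent in the recognized in-domain data ("call") while still smoothing with the background distribution.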

Voice search language model adaptation using contextual information

The main objective is to automatically, in real time, take advantage of all available sources of contextual information to improve ASR quality.

Discriminative Language Modeling with Conditional Random Fields and the Perceptron Algorithm

This paper compares two parameter estimation methods: the perceptron algorithm, which has the benefit of automatically selecting a relatively small feature set in just a couple of passes over the training data, and a method based on conditional random fields (CRFs).
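The perceptron variant of this idea can be sketched as a structured perceptron update over an n-best list (the toy feature maps below are assumptions; the paper uses n-gram features over word lattices):

```python
def score(weights, features):
    """Linear score: dot product of weight vector and feature map."""
    return sum(weights.get(f, 0.0) * v for f, v in features.items())

def perceptron_update(weights, nbest, oracle_index):
    """One perceptron step: move weights toward the oracle (lowest-error)
    hypothesis and away from the currently best-scoring one."""
    predicted = max(range(len(nbest)), key=lambda i: score(weights, nbest[i]))
    if predicted != oracle_index:
        for f, v in nbest[oracle_index].items():
            weights[f] = weights.get(f, 0.0) + v
        for f, v in nbest[predicted].items():
            weights[f] = weights.get(f, 0.0) - v
    return weights

# Two hypotheses as bag-of-bigram features; index 1 is the oracle.
nbest = [{"play hallo": 1.0}, {"play hello": 1.0}]
weights = perceptron_update({}, nbest, oracle_index=1)
```

After one update the oracle hypothesis outscores the misrecognition, which is the sense in which the perceptron selects only the features that distinguish errorful from correct hypotheses.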

Latent Relation Language Models

A class of language models is proposed that parameterizes the joint distribution over the words in a document and the entities that occur therein via knowledge graph relations; the model can annotate the posterior probability of entity spans for a given text through relations.

Scalable Language Model Adaptation for Spoken Dialogue Systems

This paper proposes estimating n-gram counts directly from the hand-written grammar for training LMs, and uses constrained optimization to tune system parameters for future use cases without degrading performance on past usage.
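The counting step can be sketched with a deliberately tiny grammar: slot-filled templates are expanded exhaustively and n-gram counts accumulated over all expansions (the template syntax and slot names are assumptions; real grammars are recursive and weighted, which this sketch ignores):

```python
from collections import Counter
from itertools import product

def expand(template, slots):
    """Yield all sentences from a template containing $SLOT placeholders."""
    words = template.split()
    slot_positions = [i for i, w in enumerate(words) if w in slots]
    for combo in product(*(slots[words[i]] for i in slot_positions)):
        out = list(words)
        for pos, value in zip(slot_positions, combo):
            out[pos] = value
        yield " ".join(out)

def bigram_counts(templates, slots):
    """Accumulate bigram counts over every grammar expansion."""
    counts = Counter()
    for template in templates:
        for sentence in expand(template, slots):
            tokens = sentence.split()
            counts.update(zip(tokens, tokens[1:]))
    return counts

slots = {"$CITY": ["boston", "paris"]}
counts = bigram_counts(["weather in $CITY"], slots)
```

The resulting counts (e.g. "weather in" occurring once per expansion) can then feed a standard n-gram LM estimator, giving the grammar-derived training data the summary describes.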