We propose improved Deep Neural Network (DNN) training loss functions for more accurate single keyword spotting on resource-constrained embedded devices. The loss function modifications consist of a combination of multi-task training and weighted cross entropy. In the multi-task architecture, the keyword DNN acoustic model is trained with two tasks in…
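As a rough illustration of the two ingredients named in this abstract, the sketch below shows a per-class weighted cross entropy and a weighted sum of two task losses. It is not the authors' implementation: the abstract is truncated before it names the two tasks, so the function names, the `class_weights` values, and the `aux_weight` mixing factor are all hypothetical choices made for this example.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_cross_entropy(logits, target, class_weights):
    """Cross entropy for one example, scaled by the weight of its true class.

    class_weights is a hypothetical per-class weighting; a larger weight
    makes errors on that class cost more during training.
    """
    probs = softmax(logits)
    return -class_weights[target] * math.log(probs[target])


def multitask_loss(primary_loss, auxiliary_loss, aux_weight=0.5):
    """Combine two task losses as a weighted sum.

    aux_weight is an assumed mixing factor, not a value from the paper.
    """
    return primary_loss + aux_weight * auxiliary_loss


# Example: two classes with equal scores; class 1 is weighted twice as heavily.
loss_a = weighted_cross_entropy([0.0, 0.0], target=0, class_weights=[1.0, 2.0])
loss_b = weighted_cross_entropy([0.0, 0.0], target=1, class_weights=[1.0, 2.0])
combined = multitask_loss(loss_a, loss_b, aux_weight=0.5)
```

With equal scores each class has probability 0.5, so the second loss is exactly twice the first, which shows how the class weight shifts the penalty toward one class.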
and many other staff at Jobs for Youth/Chicago, all of whom provided essential assistance in the assembly of data for this research. The views expressed in this paper are those of the authors and do not necessarily reflect those who provided funding or assistance to this project. IRP publications (discussion papers, special reports, and the newsletter Focus)…
Historically, few programs have provided job training services with a family orientation, even with the more recent welfare reform law's clear goals of encouraging marriage and two-parent families. We present results of a…
This paper presents Cisco's speaker segmentation and recognition (SSR) system, which is part of a commercial product. Cisco SSR uses speaker segmentation and speaker recognition algorithms with a crowdsourcing approach to create speaker metadata. The speaker metadata makes enterprise videos more accessible and more navigable by itself, and by its…