This paper presents a new method for automatically generating abbreviations for Chinese organization names. Abbreviations are commonly used in spoken Chinese, especially for organization names. Generating Chinese abbreviations is much more complex than generating English abbreviations, most of which are acronyms and truncations. The abbreviation generation ...
Position Specific Posterior Lattices (PSPL) have recently been proposed as very powerful, compact structures for indexing speech. In this paper, we take PSPL one step further to Subword-based Position Specific Posterior Lattices (S-PSPL). As with PSPL, we include posterior probabilities and proximity information, but we base this information on subword ...
In this paper we analytically compare the two widely accepted approaches to spoken document indexing, Position Specific Posterior Lattices (PSPL) and Confusion Networks (CN), in terms of retrieval accuracy and index size. The fundamental distinctions between these two approaches in terms of construction units, posterior probabilities, number of clusters, ...
This paper describes our system for the "NEWS 2009 Machine Transliteration Shared Task" (NEWS 2009). We only participated in the standard run, which is a direct orthographical mapping (DOP) between two languages without using any intermediate phonemic mapping. We propose a new two-step conditional random field (CRF) model for DOP machine transliteration, in ...
Word-based consensus networks have been verified to be very useful in minimizing word error rates (WER) for large vocabulary continuous speech recognition for Western languages. By considering the special structure of the Chinese language, this paper points out that character-based rather than word-based consensus networks should work better for Chinese ...
In this paper, we present a new formulation and a new framework for a new type of dialogue system, referred to as type-II dialogue systems in this paper. The distinct feature of such dialogue systems is their task of accessing information from unstructured knowledge sources, or the lack of a well-organized back-end database offering the information for ...
We present a contextual spoken language understanding (contextual SLU) method using Recurrent Neural Networks (RNNs). Previous work has shown that context information, specifically the previously estimated domain assignment, is helpful for domain identification. We further show that other context information such as the previously ...
This paper presents a new approach to latent semantic retrieval of spoken documents over Position Specific Posterior Lattices (PSPL). This approach performs concept matching instead of literal term matching during retrieval, based on Probabilistic Latent Semantic Analysis (PLSA), so as to solve the problem of term mismatch between the query and the ...