Corpus ID: 13859651

Analysis and Word Pronunciation in Text-to-speech Synthesis

@inproceedings{Liberman2013AnalysisAW,
  title={Analysis and Word Pronunciation in Text-to-speech Synthesis},
  author={Mark Y. Liberman and Kenneth Ward Church},
  year={2013}
}
Text analysis includes such things as dividing the text into words and sentences, assigning syntactic categories to words, grouping the words within a sentence into phrases, identifying and expanding abbreviations, recognizing and analyzing expressions such as dates, fractions, and amounts of money, and so on. Word pronunciation is the problem of translating orthographic words, that is, words in ordinary spelling, into phonological words, that is, words whose sound is expressed in a sort of rationalized spelling…
1 Citation

The Development of Pashto Speech Synthesis System

TLDR
A modular concatenative TTS system has been developed for the Pashto language based on data driven techniques such as Classification and Regression Tree (CART), Bigrams, and Non Uniform Units (NUUs).

References

From text to speech: the MITalk system

This book describes the most comprehensive system yet developed for the automatic conversion of English text to intelligible and natural sounding synthetic speech. It offers detailed accounts of the

Collins COBUILD English Language Dictionary

TLDR
This is a dictionary of English as it is actually used and is also written and presented in plain English, enabling easier and earlier use of a monolingual dictionary.

Stress Assignment in Letter to Sound Rules for Speech Synthesis

TLDR
This paper will discuss how to determine word stress from spelling; stressing Italian names with the Latin pattern yields amusing results, as will be demonstrated.

The Automatic Grammatical Tagging of the LOB Corpus

TLDR
An account of the automatic grammatical tagging of the LOB (Lancaster-Oslo/Bergen) Corpus of British English, with special reference to the methods of tagging the authors have adopted.

Three Models for the Description of Language

The grammar of a language is a device that describes the structure of that language. The grammar is comprised of a set of rules whose goal is twofold: first these rules can be used to create

Parallel Networks that Learn to Pronounce English Text

TLDR
Hierarchical clustering techniques applied to NETtalk reveal that these different networks have similar internal representations of letter-to-sound correspondences within groups of processing units, which suggests that invariant internal representations may be found in assemblies of neurons intermediate in size between highly localized and completely distributed representations.

Grammatical Category Disambiguation by Statistical Optimization

TLDR
An algorithm for disambiguation that is similar to CLAWS but that operates in linear rather than in exponential time and space, and which minimizes the unsystematic augments, is presented.

On the Recognition of Printed Characters of Any Font and Size

We describe the current state of a system that recognizes printed text of various fonts and sizes for the Roman alphabet. The system combines several techniques in order to improve the overall

Estimation of probabilities from sparse data for the language model component of a speech recognizer

S. Katz, Computer Science, IEEE Trans. Acoust. Speech Signal Process., 1987
TLDR
The model offers, via a nonlinear recursive procedure, a computation and space efficient solution to the problem of estimating probabilities from sparse data, and compares favorably to other proposed methods.

A theory of syntactic recognition for natural language

TLDR
It will be shown that this 'determinism' hypothesis, explored within the context of the grammar of English, leads to a simple mechanism, a grammar interpreter.