We present "Transcriber", a tool for assisting in the creation of speech corpora, and describe some aspects of its development and use. Transcriber was designed for the manual segmentation and transcription of long-duration broadcast news recordings, including annotation of speech turns, topics and acoustic conditions. It is highly portable, relying on the ...
Acknowledgments. I owe my thanks to a number of people, each of whom contributed in their own way towards this research and in the preparation of this document. First of all, I thank Prof. Aravind Joshi for his continued support during the period of this research. I have benefited significantly from his deep insights and his passion for subtle details which ...
"Linguistic annotation" covers any descriptive or analytic notations applied to raw language data. The basic data may be in the form of time functions – audio, video and/or physiological recordings – or it may be textual. The added notations may include transcriptions of all sorts (from phonetic features to discourse structures), part-of-speech and sense ...
The problem of quantitatively comparing the performance of different broad-coverage grammars of English has to date resisted solution. Prima facie, known English grammars appear to disagree strongly with each other as to the elements of even the simplest sentences. For instance, the grammars of Steve Abney [...], Don Hindle (AT&T), Bob Ingria (BBN), and ...
We describe a formal model for annotating linguistic artifacts, from which we derive an application programming interface (API) to a suite of tools for manipulating these annotations. The abstract logical model provides for a range of storage formats and promotes the reuse of tools that interact through this API. We focus first on "Annotation Graphs," a ...
Acknowledgments. I am deeply indebted to my advisor, Mark Steedman, for motivating this dissertation, and for his patient guidance and thoughtful advice during my years as a graduate student. By his example, Mark taught me what it means to conduct scientific research and provided me with the skills to think like a computer scientist, a linguist and a ...
Speech activity detection (SAD) is an important first step in speech processing. Commonly used methods (e.g., frame-level classification using Gaussian mixture models (GMMs)) work well under stationary noise conditions, but do not generalize well to domains such as YouTube, where videos may exhibit a diverse range of environmental conditions. One solution ...
... perception of the sounds that convey phonetic structure – one finds two very different views of its relation to language. The more conventional holds that speech is merely a vehicle, bearing no organic relationship to the linguistic baggage it carries. On that view, speech is produced and perceived by processes that are not specialized for language but ...