The Fisher Corpus: a Resource for the Next Generations of Speech-to-Text
@inproceedings{Cieri2004TheFC,
  title={The Fisher Corpus: a Resource for the Next Generations of Speech-to-Text},
  author={Christopher Cieri and David Miller and Kevin Walker},
  booktitle={International Conference on Language Resources and Evaluation},
  year={2004}
}
This paper describes, within the context of the DARPA EARS program, the design and implementation of the Fisher protocol for collecting conversational telephone speech which has yielded more than 16,000 English conversations. It also discusses the Quick Transcription specification that allowed 2000 hours of Fisher audio to be transcribed in less than one year. Fisher data is already in use within the DARPA EARS programs and will be published via the Linguistic Data Consortium for general use…
520 Citations
Development of a speech-to-text transcription system for Finnish
- Computer Science, SLTU
- 2010
This paper describes the development of a speech-to-text transcription system for the Finnish language, carried out without any detailed manual transcriptions, relying instead on several sources of audio and textual data found on the web.
IMS-Speech: A Speech to Text Tool
- Computer Science, ArXiv
- 2019
IMS-Speech is a web-based tool for German and English speech transcription that aims to facilitate research in disciplines requiring access to lexical information in spoken language materials; it is freely available to academic researchers.
Techniques for rapid and robust topic identification of conversational telephone speech
- Computer Science, INTERSPEECH
- 2009
A modified TF-IDF feature weighting calculation is presented that provides significant robustness under various recognition error conditions; classifiers incorporating confidence information are observed to be significantly more robust to errors than those treating output as unweighted text.
Transcription of Russian conversational speech
- Computer Science, SLTU
- 2012
Initial work on transcribing Russian conversational telephone speech using acoustic seed models derived from other languages achieves results comparable to those obtained with models trained on a small conversational telephone speech corpus.
Semi-Supervised Model Training for Unbounded Conversational Speech Recognition
- Computer Science, ArXiv
- 2017
This work proposes a technique to construct a modern, high quality conversational speech training corpus on the order of hundreds of millions of utterances (or tens of thousands of hours) for both acoustic and language model training.
Development of a Korean speech recognition system with little annotated data
- Linguistics, SLTU
- 2014
This paper investigates the development of a speech-to-text transcription system for the Korean language in the context of the DGA RAPID Rapmat project, assessing the influence of the vocabulary size, the type of language model, the acoustic unit, and incremental batch vs. iterative decoding of the untranscribed audio corpus.
The 2007 AMI(DA) System for Meeting Transcription
- Computer Science, CLEAR
- 2007
This paper describes the development and system architecture of the 2007 AMIDA meeting transcription system, the third such system developed in a collaboration of six research sites, which showed very competitive performance.
Generative Spoken Dialogue Language Modeling
- Computer Science, ArXiv
- 2022
dGSLM is introduced, the first “textless” model able to generate audio samples of naturalistic spoken dialogues; it reproduces more naturalistic and fluid turn taking compared to a text-based cascaded model.
Adapting Lexical and Language Models for Transcription of Highly Spontaneous Spoken Czech
- Computer Science, TSD
- 2010
Transitions between the most frequent colloquial words and their counterparts in formal Czech are introduced to solve the data sparsity problem when computing a probabilistic language model.
Statistical parametric speech synthesis using conversational data and phenomena
- Computer Science
- 2017
Filled pause synthesis is investigated through specific phonetic modelling of filled pauses and through techniques for mixing standard prompts with spontaneous utterances, in order to retain the higher quality of voices based on standard speech while still utilising the spontaneous speech for filled pause modelling.
References
From switchboard to fisher: telephone collection protocols, their uses and yields
- INTERSPEECH
- 2003
Phonological Atlas of North America, http://www.ling.upenn.edu/phono_atlas/home.html
- 2004
Linguistic Data Consortium Catalog
- 2004
Catalog National Institute of Standards and Technologies
- Telephone Collection Protocols, their Uses and Yields, Proceedings of EuroSpeech
- 2003