Fearless Steps: Apollo-11 Corpus Advancements for Speech Technologies from Earth to the Moon

@inproceedings{Hansen2018FearlessSA,
  title={Fearless Steps: Apollo-11 Corpus Advancements for Speech Technologies from Earth to the Moon},
  author={John H. L. Hansen and Abhijeet Sangwan and Aditya Joglekar and Ahmet Emin Bulut and Lakshmish Kaushik and Chengzhu Yu},
  booktitle={INTERSPEECH},
  year={2018}
}
The Apollo Program is one of the most significant benchmarks for technology and innovation in human history. The previously introduced UTD-CRSS Fearless Steps initiative resulted in the digitization of the original analog audio tapes recorded during the Apollo Space Missions. The entire speech data for the Apollo 11 Mission is now being made publicly available with the release of the Fearless Steps Corpus. This corpus consists of a cumulative 19,000 hours of conversational speech spanning over…
Citations

The 2019 Inaugural Fearless Steps Challenge: A Giant Leap for Naturalistic Audio
The Fearless Steps Corpus, Challenge Tasks, their associated baseline systems, and results are described, and insights gained by the CRSS-UTDallas team during the inaugural Fearless Steps Challenge are provided.
FEARLESS STEPS Challenge (FS-2): Supervised Learning with Massive Naturalistic Apollo Data
This paper presents an overview of the challenge sub-tasks, data, performance metrics, and lessons learned from Phase-2 of the Fearless Steps Challenge (FS-2), and describes advancements made in FS-2 through extensive community outreach and feedback.
Scenario Aware Speech Recognition: Advancements for Apollo Fearless Steps & CHiME-4 Corpora
  • Szu-Jui Chen, Wei Xia, John H. L. Hansen
  • Computer Science, Engineering
  • ArXiv
  • 2021
It is demonstrated that triplet-loss-based embeddings perform better than i-Vectors in acoustic modeling, confirming that the triplet-loss embedding is a more effective speaker feature for ASR.
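As a rough illustration of the training objective behind such speaker embeddings (not the authors' implementation), the sketch below pairs a small feed-forward encoder with PyTorch's triplet margin loss; the encoder layout, dimensions, and margin value are illustrative assumptions.

import torch
import torch.nn as nn

# Minimal sketch: an encoder maps utterance-level acoustic features to a
# fixed-dimensional embedding, and a triplet margin loss pulls same-speaker
# embeddings together while pushing different-speaker embeddings apart.
class SpeakerEncoder(nn.Module):
    def __init__(self, feat_dim=40, emb_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim, 256), nn.ReLU(),
            nn.Linear(256, emb_dim),
        )

    def forward(self, x):                     # x: (batch, feat_dim)
        return nn.functional.normalize(self.net(x), dim=-1)

encoder = SpeakerEncoder()
triplet_loss = nn.TripletMarginLoss(margin=0.3)

# anchor/positive come from the same speaker, negative from a different one
anchor   = torch.randn(8, 40)
positive = torch.randn(8, 40)
negative = torch.randn(8, 40)

loss = triplet_loss(encoder(anchor), encoder(positive), encoder(negative))
loss.backward()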
Fearless Steps Challenge Phase-3 (FSC P3): Advancing SLT for Unseen Channel and Mission Data Across NASA Apollo Audio
The Fearless Steps Challenge (FSC) initiative was designed to host a series of progressively complex tasks to promote advanced speech research across naturalistic “Big Data” corpora. The Center for…
Speaker ID on Apollo 11 corpus: A Study using different Machine Learning Models
The main goal here is to match a voice sample from an unknown speaker to one of several labeled speaker models, since speech is easily produced. For feature extraction, Mel Frequency Cepstrum…
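Since the excerpt cuts off at the feature-extraction step, the following is only a generic sketch of MFCC extraction; librosa and all parameter values are assumptions, not the paper's actual pipeline.

import numpy as np
import librosa

sr = 8000                                        # Apollo channel audio is narrowband
y = np.random.randn(3 * sr).astype(np.float32)   # stand-in for a real utterance

# 13 Mel-frequency cepstral coefficients per 25 ms frame with a 10 ms hop
mfcc = librosa.feature.mfcc(
    y=y, sr=sr, n_mfcc=13,
    n_fft=int(0.025 * sr), hop_length=int(0.010 * sr),
)
print(mfcc.shape)  # (13, n_frames) -- one feature vector per frame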
The DKU Speech Activity Detection and Speaker Identification Systems for Fearless Steps Challenge Phase-02
This paper describes the systems developed by the DKU team for the Fearless Steps Challenge Phase-02 competition, which employ the Deep ResNet vector system: it receives a variable-length feature sequence and directly generates speaker posteriors.
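The sketch below is a simplified stand-in for that idea, not the DKU system: a shallow 1-D convolutional front-end plus statistics pooling accepts a variable-length feature sequence and emits a posterior over an assumed set of enrolled speakers; a real Deep ResNet would replace the toy front-end, and the speaker count here is arbitrary.

import torch
import torch.nn as nn

class SpeakerPosteriorNet(nn.Module):
    def __init__(self, feat_dim=40, n_speakers=200):   # n_speakers is illustrative
        super().__init__()
        self.frontend = nn.Sequential(
            nn.Conv1d(feat_dim, 256, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(256, 256, kernel_size=3, padding=1), nn.ReLU(),
        )
        self.classifier = nn.Linear(2 * 256, n_speakers)

    def forward(self, x):                               # x: (batch, feat_dim, n_frames)
        h = self.frontend(x)                            # (batch, 256, n_frames)
        # statistics pooling collapses the variable time axis to a fixed vector
        stats = torch.cat([h.mean(dim=2), h.std(dim=2)], dim=1)
        return torch.log_softmax(self.classifier(stats), dim=-1)

net = SpeakerPosteriorNet()
utt_short = torch.randn(1, 40, 180)                     # different utterance lengths
utt_long  = torch.randn(1, 40, 950)
print(net(utt_short).shape, net(utt_long).shape)        # both (1, 200)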
Convolutional Recurrent Neural Networks for Speech Activity Detection in Naturalistic Audio from Apollo Missions
A SAD solution based on Convolutional Recurrent Neural Networks (CRNN), presented as the ViVoLab submission to the 2020 Fearless Steps Challenge, is described; it features a novel architecture that fuses two convolutional feature maps, combining information captured with 1D and 2D filters.
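A compact sketch of that fusion idea follows (it is not the ViVoLab architecture): a 1-D branch convolves along time, a 2-D branch convolves across the time-frequency plane, and their frame-aligned maps are concatenated before a bidirectional GRU that outputs a per-frame speech probability. All layer sizes are assumptions.

import torch
import torch.nn as nn

class CRNNSad(nn.Module):
    def __init__(self, n_mels=64):
        super().__init__()
        self.branch1d = nn.Sequential(
            nn.Conv1d(n_mels, 64, kernel_size=5, padding=2), nn.ReLU(),
        )
        self.branch2d = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d((8, None)),     # squeeze frequency, keep time
        )
        self.rnn = nn.GRU(64 + 16 * 8, 64, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * 64, 1)

    def forward(self, spec):                     # spec: (batch, n_mels, n_frames)
        f1 = self.branch1d(spec)                 # (batch, 64, T)   -- 1-D filters
        f2 = self.branch2d(spec.unsqueeze(1))    # (batch, 16, 8, T) -- 2-D filters
        f2 = f2.flatten(1, 2)                    # (batch, 128, T)
        fused = torch.cat([f1, f2], dim=1).transpose(1, 2)   # (batch, T, 192)
        h, _ = self.rnn(fused)
        return torch.sigmoid(self.out(h)).squeeze(-1)        # per-frame speech prob.

sad = CRNNSad()
print(sad(torch.randn(2, 64, 500)).shape)        # torch.Size([2, 500])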
"This is Houston. Say again, please". The Behavox system for the Apollo-11 Fearless Steps Challenge (phase II)
Speech activity detection, speaker diarization, and automatic speech recognition experiments conducted by the Behavox team for the Interspeech 2020 Fearless Steps Challenge achieved first-place rankings for SD and ASR and fourth place for SAD in the challenge.
EML Online Speech Activity Detection for the Fearless Steps Challenge Phase-III
The proposed EML online algorithm can be trained in either a supervised or an unsupervised manner, assigns speech and non-speech labels at runtime approximately every 0.1 s, and shows competitive accuracy on both development and evaluation datasets with a real-time factor of about 0.002.
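To make the decision rate concrete, here is a toy streaming loop that labels each 0.1 s chunk with a simple adaptive energy threshold; it only illustrates the online, fixed-interval labelling pattern and is not the EML model, which is a trained classifier.

import numpy as np

SR = 8000
CHUNK = int(0.1 * SR)                # one decision every 0.1 s

def online_sad(stream, alpha=0.05, margin=6.0):
    noise_floor = None
    for chunk in stream:
        energy_db = 10 * np.log10(np.mean(chunk ** 2) + 1e-12)
        if noise_floor is None:
            noise_floor = energy_db
        is_speech = energy_db > noise_floor + margin
        if not is_speech:            # track the noise floor on non-speech chunks
            noise_floor = (1 - alpha) * noise_floor + alpha * energy_db
        yield is_speech

audio = np.random.randn(SR * 5) * 0.01       # 5 s of stand-in "channel noise"
audio[2 * SR:3 * SR] += np.random.randn(SR)  # a louder 1 s segment
chunks = (audio[i:i + CHUNK] for i in range(0, len(audio) - CHUNK + 1, CHUNK))
labels = list(online_sad(chunks))
print(sum(labels), "of", len(labels), "chunks labelled speech")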
SkipConvNet: Skip Convolutional Neural Network for Speech Dereverberation using Optimally Smoothed Spectral Mapping
This study proposes 'SkipConvNet', in which each skip connection is replaced with multiple convolutional modules so that the decoder receives refined feature maps rather than the encoder's raw output, improving the learning capacity of the network.
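Below is a minimal encoder-decoder sketch of that design choice only, far smaller than the published SkipConvNet: each skip path runs through an extra convolutional block before being concatenated into the decoder, instead of passing the encoder output straight across.

import torch
import torch.nn as nn

def conv_block(c_in, c_out):
    return nn.Sequential(nn.Conv2d(c_in, c_out, 3, padding=1),
                         nn.BatchNorm2d(c_out), nn.ReLU())

class SkipConvSketch(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc1, self.enc2 = conv_block(1, 16), conv_block(16, 32)
        self.pool = nn.MaxPool2d(2)
        self.bottleneck = conv_block(32, 32)
        # the "skip conv" modules that refine encoder features before fusion
        self.skip1, self.skip2 = conv_block(16, 16), conv_block(32, 32)
        self.up2 = nn.ConvTranspose2d(32, 32, 2, stride=2)
        self.dec2 = conv_block(64, 16)
        self.up1 = nn.ConvTranspose2d(16, 16, 2, stride=2)
        self.dec1 = nn.Sequential(conv_block(32, 16), nn.Conv2d(16, 1, 3, padding=1))

    def forward(self, x):                        # x: (batch, 1, freq, time)
        e1 = self.enc1(x)
        e2 = self.enc2(self.pool(e1))
        b = self.bottleneck(self.pool(e2))
        d2 = self.dec2(torch.cat([self.up2(b), self.skip2(e2)], dim=1))
        return self.dec1(torch.cat([self.up1(d2), self.skip1(e1)], dim=1))

net = SkipConvSketch()
print(net(torch.randn(1, 1, 64, 128)).shape)     # torch.Size([1, 1, 64, 128])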

References

Showing 1-10 of 22 references
Multi-Channel Apollo Mission Speech Transcripts Calibration
A novel method is proposed to improve transcript quality by using the signal-to-noise ratio of channels and N-gram sentence-similarity metrics across data channels; it shows significant improvement in the transcript quality of noisy channels.
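As a purely hypothetical sketch of how per-channel SNR and N-gram agreement across parallel channels could be combined (the paper's actual calibration procedure is not reproduced here), the helper names and the weighting below are invented for illustration.

from collections import Counter

def ngrams(text, n=3):
    words = text.lower().split()
    return Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))

def ngram_similarity(a, b, n=3):
    ga, gb = ngrams(a, n), ngrams(b, n)
    overlap = sum((ga & gb).values())
    return overlap / max(sum(ga.values()), sum(gb.values()), 1)

def best_transcript(channels, snr_weight=0.05):
    # channels: list of (transcript, snr_db) for the same utterance on parallel channels
    scores = []
    for i, (text, snr) in enumerate(channels):
        others = [t for j, (t, _) in enumerate(channels) if j != i]
        agreement = sum(ngram_similarity(text, o) for o in others) / max(len(others), 1)
        scores.append(agreement + snr_weight * snr)   # cross-channel agreement + SNR bonus
    return channels[scores.index(max(scores))][0]

channels = [
    ("this is houston say again please", 18.0),
    ("this is houston say again eagle please", 15.0),
    ("his is used and say again lease", 4.0),
]
print(best_transcript(channels))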
Speech activity detection for NASA apollo space missions: challenges and solutions
This paper proposes a simple yet efficient technique which builds an alternative model of speech using data from a separate corpus and embeds this new information within the Combo-SAD framework; the proposed approach is shown to have a major impact on SAD performance, especially in audio segments that contain sparse or no speech.
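The snippet below illustrates the general idea in a generic way, not the Combo-SAD algorithm itself: a speech model is estimated from a separate, speech-rich corpus and compared frame-by-frame against a background model fitted on the target channel, so a decision remains possible even when the channel contains almost no speech. The features, model sizes, and threshold are assumptions.

import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
external_speech_feats = rng.normal(loc=2.0, size=(5000, 13))   # stand-in MFCCs, separate corpus
target_channel_feats  = rng.normal(loc=0.0, size=(3000, 13))   # target channel, mostly non-speech

speech_gmm = GaussianMixture(n_components=8, random_state=0).fit(external_speech_feats)
background_gmm = GaussianMixture(n_components=8, random_state=0).fit(target_channel_feats)

# frame-wise log-likelihood ratio: speech model vs. channel background model
llr = speech_gmm.score_samples(target_channel_feats) \
    - background_gmm.score_samples(target_channel_feats)
speech_frames = llr > 0.0
print(f"{speech_frames.mean():.1%} of frames flagged as speech")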
A study of voice production characteristics of astronaut speech during Apollo 11 for speaker modeling in space.
  • Chengzhu Yu, J. Hansen
  • Computer Science, Medicine
  • The Journal of the Acoustical Society of America
  • 2017
It is suggested that successful applications of speaker recognition during extended space missions require robust speaker modeling techniques that could effectively adapt to voice production variation caused by diverse space conditions.
'Houston, we have a solution': using the NASA Apollo program to advance speech and language processing technology
The speech and text aspects of the Apollo data are described while pointing out their applicability to several classical speech processing and natural language processing problems, such as audio processing, speech and speaker recognition, information retrieval, document linking, and a range of other processing tasks which enable knowledge search, retrieval, and understanding.
'Houston, we have a solution': a case study of the analysis of astronaut speech during NASA Apollo 11 for long-term speaker modeling
This study investigates variations in astronauts' voice characteristics during the NASA Apollo 11 mission as a first step toward automating large components of space missions with speech and language technology.
Sentiment extraction from natural audio streams
The results show that it is possible to perform sentiment analysis on natural spontaneous speech data despite poor word error rates (WER), and a tuning technique is proposed which dramatically reduces the number of model parameters in the maximum entropy (ME) model while retaining classification capability.
Automatic sentiment extraction from YouTube videos
This study proposes several enhancements, including a better text-based sentiment model trained on a larger and more diverse dataset, an iterative scheme to reduce sentiment model complexity with minimal impact on performance accuracy, and better speech recognition due to superior acoustic modeling and focused (domain-dependent) vocabulary/language models.
Automatic Sentiment Detection in Naturalistic Audio
The experimental results show that the proposed KWS-based system significantly outperforms the traditional ASR architecture in detecting sentiment for challenging practical tasks.
Curriculum Learning Based Approaches for Noise Robust Speaker Recognition
This study introduces a novel class of curriculum learning (CL) based algorithms for noise-robust speaker recognition, applied at two stages within a state-of-the-art speaker verification system: at i-Vector extractor estimation and at the probabilistic linear discriminant analysis (PLDA) back-end.
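As a schematic of what a curriculum over noisy data can look like (not the paper's specific CL algorithms, which operate inside i-Vector and PLDA estimation), the sketch below simply ranks utterances by SNR and widens the training pool from easy to hard across stages; all names and the stage count are illustrative.

import numpy as np

def curriculum_batches(utterances, snrs, n_stages=4):
    order = np.argsort(snrs)[::-1]              # easiest (highest SNR) first
    for stage in range(1, n_stages + 1):
        cutoff = int(len(order) * stage / n_stages)
        subset = order[:cutoff]                 # the "easy" pool grows each stage
        yield stage, [utterances[i] for i in np.random.permutation(subset)]

utterances = [f"utt_{i:03d}" for i in range(12)]
snrs = np.random.default_rng(1).uniform(-5, 20, size=len(utterances))
for stage, batch in curriculum_batches(utterances, snrs):
    print(f"stage {stage}: {len(batch)} utterances")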
Toward Access to Multi-Perspective Archival Spoken Word Content
Initial experiments with the integration of multi-channel audio into a mission reconstruction system are described, along with work in progress on the development of more advanced user-experience designs.