Self-Supervised Acquisition of Vowels in American English

Michael H. Coen
This paper presents a self-supervised framework for perceptual learning based upon correlations in different sensory modalities. We demonstrate this with a system that has learned the vowel structure of American English – i.e., the number of vowels and their phonetic descriptions – by simultaneously watching and listening to someone speak. It is highly non-parametric, knowing neither the number of vowels nor their input distributions in advance, and it has no prior linguistic knowledge.