To acquire one's native phonological system, language-specific phonological categories and relationships must be extracted from the input. The acquisition of categories and the acquisition of relationships have each, in their own right, been the focus of intense research. However, it is remarkable that research on the acquisition of categories and the relations between them(More)
We summarize the accomplishments of a multi-disciplinary workshop exploring the computational and scientific issues surrounding zero resource (unsupervised) speech technologies and related models of early language acquisition. Centered around the tasks of phonetic and lexical discovery, we consider unified evaluation metrics, present two new approaches for(More)
We report on an architecture for the unsupervised discovery of talker-invariant subword embeddings. It is made out of two components: a dynamic time warping-based spoken term discovery (STD) system and a Siamese deep neural network (DNN). The STD system clusters word-sized repeated fragments in the acoustic streams while the DNN is trained to minimize the(More)
The basic representational hypothesis in phonology is that segments are coded using a universal set of discrete features. We propose a method for quantitatively measuring how well such features align with arbitrary segment representations. We assess articulatory, spectral, and phonotactic representations of English consonants. Our procedure constructs a(More)
In this paper we introduce an inductive bias for language acquisition under a view where learning of the various levels of linguistic structure takes place interactively. The bias encourages the learner to choose sound systems that lead to more "semantically coherent" lexicons. We quantify this coherence using an intrinsic and unsupervised measure of(More)
Modals express two broad types of meanings: root meanings like abilities (1a), obligations (1b) or intentions (1c), and epistemic meanings like educated guesses (2a) or evidence-based inferences (2b). (1) a. Olivia can speak Spanish. b. Erin has to be home before 11pm. c. I'm going to go to the shops before working. (2) a. Terence might speak French,(More)
We investigate the idea that the languages of the world have developed coherent sound systems in which having one sound increases or decreases the chances of having certain other sounds, depending on shared properties of those sounds. We investigate the geometries of sound systems that are defined by the inherent properties of sounds. We document three(More)
Recent works have explored deep architectures for learning multimodal speech representation (e.g. audio and images, articulation and audio) in a supervised way. Here we investigate the role of combining different speech modalities, i.e. audio and visual information representing the lips' movements, in a weakly supervised way using Siamese networks and(More)
This paper reports on an automated and openly available tool for automatic acoustic analysis and transcription of primate calls, which takes raw field recordings and outputs call labels time-aligned with the audio. The system's output predicts a majority of the start times of calls accurately within 200 milliseconds. The tools do not require any manual(More)