Spatial Concept Acquisition for a Mobile Robot That Integrates Self-Localization and Unsupervised Word Discovery From Spoken Sentences

@article{Taniguchi2016SpatialCA,
  title={Spatial Concept Acquisition for a Mobile Robot That Integrates Self-Localization and Unsupervised Word Discovery From Spoken Sentences},
  author={Akira Taniguchi and Tadahiro Taniguchi and Tetsunari Inamura},
  journal={IEEE Transactions on Cognitive and Developmental Systems},
  year={2016},
  volume={8},
  pages={285-297}
}
In this paper, we propose a novel unsupervised learning method for the lexical acquisition of words related to places visited by robots from continuous human speech signals. We address the problem of a robot learning novel words with no prior knowledge of them except for a primitive acoustic model. Furthermore, we propose a method that allows a robot to effectively use the learned words and their meanings for self-localization tasks. The proposed method is nonparametric Bayesian…
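
As a rough illustration of how learned spatial concepts can feed back into self-localization, the following minimal Python sketch (a toy construction, not the paper's implementation; the class name, parameters, and two-concept data are assumptions) pairs each place concept with a Gaussian distribution over positions and a categorical distribution over place names, so that a heard word acts as an extra observation up-weighting positions belonging to the matching concept.

    import numpy as np
    from scipy.stats import multivariate_normal

    # Hypothetical sketch: each spatial concept couples a Gaussian over
    # 2-D positions with a categorical distribution over place names.
    class SpatialConcept:
        def __init__(self, mean, cov, word_probs):
            self.pos_dist = multivariate_normal(mean, cov)
            self.word_probs = word_probs  # e.g. {"kitchen": 0.9, "table": 0.1}

        def p_word(self, word):
            return self.word_probs.get(word, 1e-6)

    # Two toy concepts standing in for a learned nonparametric mixture.
    concepts = [
        SpatialConcept([0.0, 0.0], np.eye(2) * 0.5, {"kitchen": 0.9, "table": 0.1}),
        SpatialConcept([5.0, 5.0], np.eye(2) * 0.5, {"bedroom": 0.8, "bed": 0.2}),
    ]
    priors = np.array([0.5, 0.5])

    def likelihood_of_position(word, position):
        """p(position, word) up to a constant: sum over concepts of
        p(word | concept) * p(position | concept) * p(concept)."""
        return sum(p * c.p_word(word) * c.pos_dist.pdf(position)
                   for c, p in zip(concepts, priors))

    # A heard place name acts as an extra observation for localization:
    # particles near the kitchen get up-weighted when "kitchen" is heard.
    print(likelihood_of_position("kitchen", [0.2, -0.1]))  # high
    print(likelihood_of_position("kitchen", [5.0, 5.0]))   # low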
Spatial concept-based navigation with human speech instructions via probabilistic inference on Bayesian generative model
The aim of this study is to enable a mobile robot to perform navigational tasks with human speech instructions via probabilistic inference on a Bayesian generative model using spatial concepts, based on a control-as-inference framework.
Online spatial concept and lexical acquisition with simultaneous localization and mapping
An online learning algorithm for spatial concept acquisition and mapping based on a Rao-Blackwellized particle filter, and a novel method that integrates SpCoA and FastSLAM within the theoretical framework of the Bayesian generative model.
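
For intuition about the Rao-Blackwellized particle filtering idea mentioned above, here is a schematic skeleton (a generic sketch under assumed simplifications, not the actual algorithm of this paper): each particle samples a pose trajectory, while the word statistics conditioned on that trajectory are maintained as counts with closed-form predictive updates.

    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical RBPF skeleton: particles sample the robot pose, while the
    # place-name statistics conditioned on each trajectory are kept as counts
    # updated analytically (the "Rao-Blackwellized" part).
    N = 100
    particles = [{"pose": np.zeros(2), "word_counts": {}} for _ in range(N)]
    weights = np.ones(N) / N

    def motion_model(pose):
        return pose + rng.normal(0.0, 0.1, size=2)  # toy odometry noise

    def word_likelihood(p, word):
        # Dirichlet-multinomial predictive from the particle's counts
        # (pseudo-count 1, assumed vocabulary size 10).
        counts = p["word_counts"]
        return (counts.get(word, 0) + 1.0) / (sum(counts.values()) + 10.0)

    def step(word):
        global weights
        for i, p in enumerate(particles):
            p["pose"] = motion_model(p["pose"])       # sample pose
            weights[i] *= word_likelihood(p, word)    # weight by utterance
            p["word_counts"][word] = p["word_counts"].get(word, 0) + 1
        weights /= weights.sum()
        # Resample when the effective sample size collapses.
        if 1.0 / np.sum(weights**2) < N / 2:
            idx = rng.choice(N, size=N, p=weights)
            particles[:] = [dict(pose=particles[j]["pose"].copy(),
                                 word_counts=dict(particles[j]["word_counts"]))
                            for j in idx]
            weights = np.ones(N) / N

    step("kitchen")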
Semantic Mapping Based on Spatial Concepts for Grounding Words Related to Places in Daily Environments
A novel statistical semantic mapping method called SpCoMapping is proposed, which integrates probabilistic spatial concept acquisition based on multimodal sensor information with a Markov random field for learning the arbitrary shapes of places on a map.
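
To give a flavor of how a Markov random field can carve out arbitrary place shapes on a grid map, the sketch below runs a generic iterated-conditional-modes smoothing pass (a stand-in under assumed potentials and weights, not SpCoMapping itself): each cell balances its own evidence against agreement with its four neighbors.

    import numpy as np

    rng = np.random.default_rng(1)

    # Generic MRF label smoothing on a grid map via iterated conditional
    # modes: each cell prefers the label supported by its local evidence,
    # penalized for disagreeing with its 4-neighborhood. Illustrative only.
    H, W, K = 20, 20, 3            # grid size and number of place labels
    unary = rng.random((H, W, K))  # stand-in per-cell evidence scores
    labels = unary.argmax(axis=2)
    beta = 0.7                     # smoothness weight (assumed value)

    for _ in range(5):             # a few ICM sweeps
        for y in range(H):
            for x in range(W):
                neigh = [labels[ny, nx]
                         for ny, nx in ((y-1, x), (y+1, x), (y, x-1), (y, x+1))
                         if 0 <= ny < H and 0 <= nx < W]
                scores = unary[y, x].copy()
                for k in range(K):
                    scores[k] += beta * sum(n == k for n in neigh)
                labels[y, x] = scores.argmax()

    print(labels)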
Cross-Situational Learning with Bayesian Generative Models for Multimodal Category and Word Learning in Robots
A Bayesian generative model that can form multiple categories based on each sensory channel and can associate words with any of the four sensory channels; experiments show that the robot could successfully use the word meanings learned with the proposed method.
Hierarchical Spatial Concept Formation Based on Multimodal Information for Human Support Robots
Results verify that, relative to comparable baseline methods, the proposed method enables a robot to predict location names and position categories closer to those made by humans.
Hierarchical Bayesian model for the transfer of knowledge on spatial concepts based on multimodal information
Experimental results demonstrated that the proposed hierarchical Bayesian model, which enables a robot to transfer knowledge of places from experienced environments to a new environment, achieves higher prediction accuracy for location names and positions than the conventional method.
Unsupervised learning for spoken word production based on simultaneous word and phoneme discovery without transcribed data
An unsupervised machine learning method for spoken word production that uses no transcribed data is proposed; it can produce many spoken words that are recognizable as the original words provided by the human speakers.
Learning Relationships Between Objects and Places by Multimodal Spatial Concept with Bag of Objects
A Bayesian probabilistic model is proposed that can automatically model and estimate the probability of objects existing in each place, using a multimodal spatial concept based on the co-occurrence of objects.
From Continuous Observations to Symbolic Concepts: A Discrimination-Based Strategy for Grounded Concept Learning
This paper introduces a novel methodology for grounded concept learning that allows for incremental learning and needs few data points; the resulting concepts are general enough to be applied to previously unseen objects and can be combined compositionally.
...

References

Learning Place-Names from Spoken Utterances and Localization Results by Mobile Robot
Improvements are described to a method for learning both the phoneme sequence of each word and a distribution over the objects that the word represents; the method can learn words that represent a single object but cannot learn words that represent classes of objects.
Multimodal concept and word learning using phoneme sequences with errors
It is experimentally demonstrated that the proposed method for concept formation and word acquisition by robots can improve the accuracy of word segmentation and reduce phoneme recognition errors, and that the obtained words enhance categorization accuracy.
Online learning of concepts and words using multimodal LDA and hierarchical Pitman-Yor Language Model
A particle filter that significantly improves the performance of online MLDA is introduced, along with an unsupervised word segmentation method based on the hierarchical Pitman-Yor language model (HPYLM).
Bayesian Learning of a Language Model from Continuous Speech
Experimental results on natural, adult-directed speech demonstrate that LMs built using only continuous speech are able to significantly reduce ASR phoneme error rates, and the proposed technique of joint Bayesian learning of lexical units and an LM over lattices is shown to significantly contribute to this improvement.
Multimodal categorization by hierarchical dirichlet process
A nonparametric Bayesian framework is proposed for robots to categorize multimodal sensory signals such as audio, visual, and haptic information; it provides a probabilistic basis for inferring object properties from limited observations.
Mutual learning of an object concept and language model based on MLDA and NPYLM
This work proposes a stochastic model of language and concepts in which knowledge is learned by estimating the model parameters using a nested Pitman-Yor language model and multimodal latent Dirichlet allocation.
Iterative Bayesian word segmentation for unsupervised vocabulary discovery from phoneme lattices
This paper proposes a computationally efficient iterative approach that alternates between two steps: first, the most probable string is extracted from the lattice using a phoneme LM learned on the segmentation result of the previous iteration; second, word segmentation is performed on the extracted string using a word and phoneme LM learned alongside the new segmentation.
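
To make the segmentation step concrete, here is a minimal sketch of Bayesian word segmentation by Gibbs sampling over boundary variables with a Chinese-restaurant-process unigram lexicon (a simplified Goldwater-style model on a toy string; the hyperparameters and base measure are assumptions, and the lattice handling of the paper above is omitted).

    import numpy as np

    rng = np.random.default_rng(2)

    # Sketch of unigram Bayesian word segmentation by Gibbs-sampling boundary
    # variables. The toy "phoneme" string and hyperparameters are assumed.
    text = "thekitchenisthekitchen"
    alpha, p_phone = 1.0, 1.0 / 26       # CRP concentration, base measure

    def words(bounds):
        cuts = [0] + [i + 1 for i, b in enumerate(bounds) if b] + [len(text)]
        return [text[a:b] for a, b in zip(cuts, cuts[1:])]

    def log_prob(bounds):
        # Chinese-restaurant-process unigram model over the word sequence.
        counts, lp = {}, 0.0
        for i, w in enumerate(words(bounds)):
            base = p_phone ** len(w)     # crude base probability of spelling w
            lp += np.log((counts.get(w, 0) + alpha * base) / (i + alpha))
            counts[w] = counts.get(w, 0) + 1
        return lp

    bounds = rng.random(len(text) - 1) < 0.3  # random initial boundaries
    for _ in range(200):                      # Gibbs sweeps over boundaries
        for j in range(len(bounds)):
            lp = np.empty(2)
            for v in (0, 1):
                bounds[j] = v
                lp[v] = log_prob(bounds)
            p1 = 1.0 / (1.0 + np.exp(lp[0] - lp[1]))
            bounds[j] = rng.random() < p1

    print(words(bounds))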
Learning words from sights and sounds: a computational model
Learning lexicons from spoken utterances based on statistical model selection
Experimental results show that lexicons can be learned in an unsupervised manner from pairs of a spoken utterance and an object as its meaning, without any a priori linguistic knowledge other than a phoneme acoustic model.
...