Language Bootstrapping: Learning Word Meanings From Perception–Action Association

@article{Salvi2012LanguageBL,
  title={Language Bootstrapping: Learning Word Meanings From Perception–Action Association},
  author={Giampiero Salvi and Luis Montesano and Alexandre Bernardino and Jos{\'e} Santos-Victor},
  journal={IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics)},
  year={2012},
  volume={42},
  pages={660--671}
}
We address the problem of bootstrapping language acquisition for an artificial system, similar to what is observed in experiments with human infants. Our method works by associating meanings with words in manipulation tasks, as a robot interacts with objects and listens to verbal descriptions of the interactions. The model is based on an affordance network, i.e., a mapping between robot actions, robot perceptions, and the perceived effects of these actions upon objects. We extend the affordance…
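To make the association mechanism concrete, the following is a minimal sketch in Python under assumptions of our own: the toy trials, the variable names (action/shape/effect), and the plain co-occurrence statistic are illustrative, not the paper's actual affordance-network model. Each trial pairs the bag of words heard during an interaction with the affordance variables the robot observed; a word's meaning is then estimated from how consistently it co-occurs with each variable across situations.

from collections import Counter, defaultdict

# Toy trials (hypothetical): each pairs the words heard during an
# interaction with the affordance variables the robot observed
# (action performed, object property, perceived effect).
trials = [
    ({"robot", "taps", "ball"},   {"action=tap",  "shape=round",  "effect=rolls"}),
    ({"robot", "pushes", "box"},  {"action=push", "shape=square", "effect=slides"}),
    ({"robot", "pushes", "ball"}, {"action=push", "shape=round",  "effect=rolls"}),
]

word_count = Counter()
cooc = defaultdict(Counter)
for words, percepts in trials:
    for w in words:
        word_count[w] += 1
        for p in percepts:
            cooc[w][p] += 1

def association(word):
    # Fraction of the word's occurrences in which each affordance
    # variable was also observed; consistent co-occurrence across
    # varying situations is taken as evidence of meaning.
    return {p: n / word_count[word] for p, n in cooc[word].items()}

print(association("ball"))    # shape=round and effect=rolls score 1.0
print(association("pushes"))  # action=push scores 1.0

With enough varied situations, spurious pairings (e.g., "ball" with one particular action) wash out while stable ones persist; the paper's affordance network plays this role while additionally capturing the dependencies among actions, perceptions, and effects themselves.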
Citations

Grounding speech utterances in robotics affordances: An embodied statistical language model
We propose an embodied statistical language model for learning the meaning of word sequences, incorporating action words (e.g., push, pull, tap). The model attempts to unify embodied…
A joint model of word segmentation and meaning acquisition through cross-situational learning
It is argued that cross-situational learning (XSL) is not just a mechanism for word-to-meaning mapping but also provides strong cues for proto-lexical word segmentation. Simulation results show that the model not only replicates behavioral data on word learning in artificial languages but also learns word segments and their meanings effectively from continuous speech.
A computational model of early language acquisition from audiovisual experiences of young infants
Describes a neural network model that learns word segments and their meanings from referentially ambiguous acoustic input, showing that the beginnings of lexical knowledge may indeed emerge from individually ambiguous learning scenarios.
Towards Understanding Object-Directed Actions: A Generative Model for Grounding Syntactic Categories of Speech Through Visual Perception
A. Aly, T. Taniguchi. 2018 IEEE International Conference on Robotics and Automation (ICRA), 2018.
The proposed probabilistic framework investigates unsupervised part-of-speech (POS) tagging to determine the syntactic categories of words so as to infer the grammatical structure of language, and is successfully evaluated through interaction experiments between a human user and a Toyota HSR robot.
Beyond the Self: Using Grounded Affordances to Interpret and Describe Others’ Actions
Proposes a developmental approach that allows a robot to interpret and describe the actions of human agents by reusing its previous experience, a step toward providing robots with the fundamental skills to engage in social collaboration with humans.
Language is Not About Language: Towards Formalizing the Role of Extra-Linguistic Factors in Human and Machine Language Acquisition and Communication
Argues that a major obstacle to a more comprehensive picture of language acquisition is the lack of a unified conceptual framework capturing the full range of factors critical to language learning in real-world contexts, and that such a framework should be pursued so that individual behavioral studies and computational models can be placed in a mutually compatible context.
Grounding Language by Continuous Observation of Instruction Following
Explores learning word and utterance meanings by continuous observation of the actions of an instruction follower, showing that semantics useful for incremental application, as required in natural dialogue, may also be better acquired in incremental settings.
Disentanglement in conceptual space during sensorimotor interaction
The results show that variational autoencoder (VAE) models are a promising way to learn such concepts, and thereby the causal structure of sensorimotor interaction, in the affordance-learning setting.
Robot Learning of Gestures, Language and Affordances
A growing field in robotics and artificial intelligence (AI) research is human–robot collaboration, whose target is to enable effective teamwork between humans and robots. However, in many situations…
The learning of adjectives and nouns from affordance and appearance features
Evaluates three different models for learning adjectives and nouns using features obtained from the appearance and affordances of an object, through cross-validated training as well as testing on novel objects; the results indicate that shape-related adjectives are best learned using affordance-related features, whereas nouns are best learned using appearance features.

References

Showing 1–10 of 33 references
Affordance based word-to-meaning association
It is shown that the robot is able to form useful word-to-meaning associations, even without considering grammatical structure in the learning process and in the presence of recognition errors, and that these associations can be used to incorporate context into the speech recognition task.
Developmental Word Acquisition and Grammar Learning by Humanoid Robots Through a Self-Organizing Incremental Neural Network
Presents a new approach for online incremental word acquisition and grammar learning by humanoid robots that uses no dataset provided in advance and grounds language in a physical context, as mediated by the robot's perceptual capacities.
A multimodal learning interface for grounding spoken language in sensory perceptions
Describes a multimodal interface that learns to associate spoken language with perceptual features by being situated in users' everyday environments and sharing user-centric multisensory information.
Grounded Situation Models for Robots: Where words and percepts meet
N. Mavridis, D. Roy. 2006 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2006.
A novel contribution of the approach is the robot's ability to seamlessly integrate both language- and sensor-derived information about the situation, allowing bidirectional translation between sensory-derived data/expectations and linguistic descriptions.
A unified model of early word learning: Integrating statistical and social cues
Argues that statistical and social cues can be seamlessly integrated to facilitate early word learning, and presents a unified model that makes use of different kinds of embodied social cues within a statistical learning framework.
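As a rough illustration of how a social cue might enter such a statistical learner (the gaze-weighting scheme below is a toy choice of ours, not the model from this paper): co-occurrence evidence can be scaled by the strength of a social cue, so that socially salient referents accumulate evidence faster than accidental pairings.

from collections import defaultdict

# Hypothetical observations: (word, candidate referent, cue strength),
# where the cue strength in [0, 1] encodes a social signal such as
# whether the speaker's gaze was on the referent during the utterance.
observations = [
    ("ball", "ball_object", 0.9),
    ("ball", "box_object",  0.1),
    ("box",  "box_object",  0.8),
    ("box",  "ball_object", 0.2),
]

weighted = defaultdict(float)
for word, referent, gaze in observations:
    # The social cue scales the co-occurrence evidence, instead of
    # every word-referent pairing counting equally.
    weighted[(word, referent)] += gaze

def best_referent(word, referents):
    return max(referents, key=lambda r: weighted[(word, r)])

print(best_referent("ball", ["ball_object", "box_object"]))  # ball_object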
How many words can my robot learn? An approach and experiments with one-class learning
The results indicate that the robot's representations are capable of incrementally evolving by correcting class descriptions based on instructor feedback to classification results, with performance comparable to that obtained by other authors.
A computational study of cross-situational techniques for learning word-to-meaning mappings
Presents a computational study of part of the lexical-acquisition task faced by children, namely the acquisition of word-to-meaning mappings, and describes an implemented algorithm for solving this problem, illustrating its operation on a small example.
Object schemas for grounding language in a responsive robot
An approach is introduced for physically grounded natural-language interpretation by robots that reacts appropriately to unanticipated physical changes in the environment and dynamically assimilates…
A Bayesian Framework for Cross-Situational Word-Learning
A Bayesian model of cross-situational word learning is presented, along with an extension that also learns which social cues are relevant to determining reference; both are found to perform better than competing models.
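A minimal sketch of the Bayesian flavor of cross-situational learning, under assumptions of our own (a symmetric Dirichlet prior over referents and toy data; the paper's model is richer, jointly inferring a lexicon and speakers' intended referents): prior pseudo-counts keep the learner from over-committing to a referent on little evidence.

from collections import Counter, defaultdict

ALPHA = 1.0  # symmetric Dirichlet prior: pseudo-count per referent

referents = ["ball_object", "box_object", "cup_object"]
counts = defaultdict(Counter)  # counts[word][referent]

# Toy co-occurrence observations (hypothetical).
for word, ref in [("ball", "ball_object"), ("ball", "ball_object"),
                  ("ball", "cup_object"), ("cup", "cup_object")]:
    counts[word][ref] += 1

def posterior(word):
    # Posterior predictive p(referent | word) under the Dirichlet prior:
    # (count + ALPHA) / (total + ALPHA * number_of_referents).
    total = sum(counts[word].values())
    denom = total + ALPHA * len(referents)
    return {r: (counts[word][r] + ALPHA) / denom for r in referents}

print(posterior("ball"))  # ball_object most probable (0.5), but not certain
print(posterior("cup"))   # a single observation yields only a mild preference

With more data the counts dominate the prior; with little data the posterior stays close to uniform, which is the behavior that distinguishes the Bayesian treatment from raw co-occurrence counting.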
Lexicon acquisition based on object-oriented behavior learning
Presents a system for lexicon acquisition through behavior learning, based on a modified multi-module reinforcement learning system, that can automatically associate words with objects of various visual features based on similarities in affordances or in functions.