This paper presents Recurrent Policy Gradients, a model-free reinforcement learning (RL) method creating limited-memory stochastic policies for partially observable Markov decision problems (POMDPs) that require long-term memories of past observations. The approach involves approximating a policy gradient for a Recurrent Neural Network (RNN) by …
Automatically classifying terrain such as rocks, sand and gravel from images is a challenging machine vision problem. In addition to human-designed approaches, a great deal of progress has been made using machine learning techniques to perform classification from images. In this work, we demonstrate the first known use of Cartesian Genetic Programming (CGP) …
Reinforcement learning for partially observable Markov decision problems (POMDPs) is a challenge as it requires policies with an internal state. Traditional approaches suffer significantly from this shortcoming and usually make strong assumptions on the problem domain such as perfect system models, state-estimators and a Markovian hidden system. Recurrent …
We present curiosity-driven, autonomous acquisition of tactile exploratory skills on a biomimetic robot finger equipped with an array of microelectromechanical touch sensors. Instead of building tailored algorithms for solving a specific tactile task, we employ a more general curiosity-driven reinforcement learning approach that autonomously learns a set of …
We use a Katana robotic arm to teach an iCub humanoid robot how to perceive the location of the objects it sees. To do this, the Katana positions an object within the shared workspace and tells the iCub where it has placed it. While the iCub moves, it observes the object, and a neural network then learns how to relate its pose and visual inputs to the …
We present a combined machine learning and computer vision approach for robots to localize objects. It allows our iCub humanoid to quickly learn to provide accurate 3D position estimates (in the centimetre range) of objects …
We describe a new algorithm for robot localization, efficient both in terms of memory and processing time. It transforms a stream of laser range sensor data into a probabilistic calculation of the robot's position, using a bidirectional Long Short-Term Memory (LSTM) recurrent neural network (RNN) to learn the structure of the environment and to answer …
In this paper, we propose a novel architecture for wireless sensor network testbeds, called MOTEL. The main novelty compared to existing architectures is the possibility to include mobile sensor nodes. To support mobility, we deal with two main challenges: controlled mobility of sensor nodes, and the need to operate sensor nodes in the absence of a …
Swarm robotics systems are characterized by decentralized control, limited communication between robots, use of local information, and emergence of global behavior. Such systems have shown their potential for flexibility and robustness [1]-[3]. However, existing swarm robotics systems are by and large still limited to displaying simple proof-of-concept …
Business-driven development favors the construction of process models at different abstraction levels and by different people. As a consequence, there is a demand for consolidating different versions of process models by detecting and resolving differences. Existing approaches rely on the existence of a change log that records the changes made when changing a …