• Corpus ID: 202750398

Learning to Seek: Autonomous Source Seeking with Deep Reinforcement Learning Onboard a Nano Drone Microcontroller

@article{LearningToSeek,
  title={Learning to Seek: Autonomous Source Seeking with Deep Reinforcement Learning Onboard a Nano Drone Microcontroller},
  author={Bardienus Pieter Duisterhof and Srivatsan Krishnan and Jonathan J. Cruz and Colby R. Banbury and William Fu and Aleksandra Faust and Guido C.H.E. de Croon and Vijay Janapa Reddi},
}
Fully autonomous navigation using nano drones has numerous applications in the real world, ranging from search and rescue to source seeking. Nano drones are well-suited for source seeking because of their agility, low price, and ubiquitous character. Unfortunately, their constrained form factor limits flight time, sensor payload, and compute capability. These challenges are a crucial limitation for the use of source-seeking nano drones in GPS-denied and highly cluttered environments. Hereby, we… 
Resource Efficient Deep Reinforcement Learning for Acutely Constrained TinyML Devices
  • 2020
The use of Deep Reinforcement Learning (Deep RL) in many resource constrained mobile systems has been limited in scope due to the severe resource consumption (e.g., memory, computation, energy) such
Quantifying the design-space tradeoffs in autonomous drones
This work formalizes the fundamental drone subsystems, presents a design-space exploration of autonomous drone systems, and quantifies how computation impacts this design space and how to arrive at productive solutions.
Learning-Based Bias Correction for Time Difference of Arrival Ultra-Wideband Localization of Resource-Constrained Mobile Robots
A robust UWB TDOA localization framework is proposed, comprising (i) learning-based bias correction and (ii) M-estimation-based robust filtering to handle outliers; it is computationally efficient enough to run on resource-constrained hardware.
Flocking Towards the Source: Indoor Experiments with Quadrotors
An algorithm and experimental results are presented to address the source-seeking problem in which a group of small quadrotors should cooperatively find and flock towards a source of an underlying spatial scalar field without any centralized information.
Continuous Control in Bearing-only Localization
In this work we model localization in the context of source seeking with drones. Specifically, we show how an autonomous drone fitted with radio antennas can minimize uncertainty over belief of a
Analyzing and Improving Fault Tolerance of Learning-Based Navigation Systems
This paper experimentally evaluates the resilience of navigation systems with respect to algorithms, fault models, and data types across both RL training and inference, and proposes two efficient fault-mitigation techniques that improve success rate and quality of flight.
Stopping criteria for ending autonomous, single detector radiological source searches
Two stopping criteria are investigated in this work for a machine learning navigated system: one based upon Bayesian and maximum likelihood estimation (MLE) strategies commonly used in source localization, and a second providing the navigational machine learning network with a “stop search” action.
TCN Mapping Optimization for Ultra-Low Power Time-Series Edge Inference
An automated exploration approach and a library of optimized kernels map TCNs onto Parallel Ultra-Low Power (PULP) microcontrollers, minimizing latency and energy with a layer-tiling optimizer that jointly finds the tiling dimensions and selects among alternative implementations of the causal and dilated 1D-convolution operations at the core of TCNs.


Ultra Low Power Deep-Learning-powered Autonomous Nano Drones
This work presents the first vertically integrated system for fully autonomous deep neural network-based navigation on nano-size UAVs, based on GAP8, a novel parallel ultra-low-power computing platform, and deployed on a 27 g commercial, open-source CrazyFlie 2.0 nano-quadrotor.
Air Learning: An AI Research Platform for Algorithm-Hardware Benchmarking of Autonomous Aerial Robots
This work introduces Air Learning, an AI research platform for benchmarking algorithm-hardware performance and energy efficiency trade-offs in autonomous unmanned aerial vehicles (UAVs), and focuses in particular on deep reinforcement learning (RL) interactions in UAVs.
Low-Level Control of a Quadrotor With Deep Model-Based Reinforcement Learning
This is the first use of MBRL for controlled hover of a quadrotor using only on-board sensors, direct motor input signals, and no initial dynamics knowledge; the controller leverages rapid simulation of a neural-network forward dynamics model on a graphics processing unit (GPU)-enabled base station.
Generalization through Simulation: Integrating Simulated and Real Data into Deep Reinforcement Learning for Vision-Based Autonomous Flight
This work investigates how data from both simulation and the real world can be combined in a hybrid deep reinforcement learning algorithm, using real-world data to learn about the dynamics of the system and simulated data to learn a generalizable perception system that enables the robot to avoid collisions using only a monocular camera.
Soft Actor-Critic Algorithms and Applications
Soft Actor-Critic (SAC), the recently introduced off-policy actor-critic algorithm based on the maximum entropy RL framework, achieves state-of-the-art performance, outperforming prior on-policy and off-policy methods in sample efficiency and asymptotic performance.
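A small numeric sketch of the maximum-entropy ingredient that sets SAC apart from standard actor-critic methods: the value bootstrap subtracts α·log π, so the soft value of a state equals α·logsumexp(Q/α) under the soft (Boltzmann) policy rather than max Q. All numbers below are toy assumptions, not taken from the paper.

```python
import numpy as np

alpha, gamma = 0.2, 0.99                     # entropy temperature, discount
q_next = np.array([1.0, 0.5, 0.2])           # toy Q(s', a') for 3 actions
r = 0.3                                      # toy reward

# Soft policy: pi(a|s') proportional to exp(Q/alpha)
logits = q_next / alpha
pi = np.exp(logits - logits.max())
pi /= pi.sum()

# Soft value: E_pi[Q - alpha * log pi] = alpha * logsumexp(Q / alpha)
soft_v = np.sum(pi * (q_next - alpha * np.log(pi)))

# Entropy-regularized TD target used for the Q-function update
td_target = r + gamma * soft_v
```

The entropy bonus makes soft_v exceed max(q_next), which is what keeps the learned policy stochastic and, per the paper, improves sample efficiency.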
Reinforcement Learning for UAV Attitude Control
This work developed an open-source, high-fidelity simulation environment to train a flight controller for attitude control of a quadrotor through RL, and used this environment to compare RL performance against a PID controller to identify whether RL is appropriate for high-precision, time-critical flight control.
PRM-RL: Long-range Robotic Navigation Tasks by Combining Reinforcement Learning and Sampling-Based Planning
This work presents PRM-RL, a hierarchical method for long-range navigation task completion that combines sampling-based path planning with reinforcement learning (RL), and evaluates it on two navigation tasks with non-trivial robot dynamics.
QT-Opt: Scalable Deep Reinforcement Learning for Vision-Based Robotic Manipulation
QT-Opt is introduced, a scalable self-supervised vision-based reinforcement learning framework that leverages over 580k real-world grasp attempts to train a deep neural network Q-function with over 1.2M parameters, performing closed-loop, real-world grasping with 96% grasp success on unseen objects.
Combining Q-Learning with Artificial Neural Networks in an Adaptive Light Seeking Robot
This paper investigates an alternative implementation of Q-learning in which an artificial neural network is used as a function approximator, eliminating the need for an explicit lookup table.
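A minimal sketch of that idea: replace the Q-table with a small two-layer network and take one gradient step on the squared TD error per transition. The environment dimensions, network size, and hyperparameters are illustrative assumptions, not the paper's robot setup.

```python
import numpy as np

rng = np.random.default_rng(0)

N_STATES, N_ACTIONS, HIDDEN = 4, 2, 16
W1 = rng.normal(0, 0.1, (N_STATES, HIDDEN))   # input -> hidden weights
W2 = rng.normal(0, 0.1, (HIDDEN, N_ACTIONS))  # hidden -> Q-value weights

def q_values(s_onehot):
    """Return hidden activations and Q(s, .) for a one-hot state."""
    h = np.tanh(s_onehot @ W1)
    return h, h @ W2

def td_update(s, a, r, s_next, alpha=0.1, gamma=0.9):
    """One semi-gradient step on the squared TD error for action a."""
    global W1, W2
    h, q = q_values(s)
    _, q_next = q_values(s_next)
    target = r + gamma * q_next.max()          # bootstrap target
    err = q[a] - target                        # TD error
    # Backpropagate through the two-layer net, output unit a only
    dW2 = np.outer(h, np.eye(N_ACTIONS)[a]) * err
    dh = W2[:, a] * err
    dW1 = np.outer(s, dh * (1 - h ** 2))       # tanh derivative
    W2 -= alpha * dW2
    W1 -= alpha * dW1
    return err
```

Repeated calls on the same transition shrink the TD error, mirroring how the table entry would converge in tabular Q-learning, but with the network generalizing across nearby states.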
Particle Swarm Optimization-Based Source Seeking
A planner is developed for a swarm of mobile agents that try to locate an unknown electromagnetic source, and a complete solution is proposed to keep PSO effective in complex environments where collisions may occur.
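A hedged sketch of the core PSO source-seeking loop on a smooth 2-D scalar field (a Gaussian "signal" peak standing in for the electromagnetic source). The field, swarm size, and PSO coefficients are illustrative assumptions; the paper's collision handling is omitted.

```python
import numpy as np

rng = np.random.default_rng(1)
SOURCE = np.array([3.0, -2.0])                # unknown to the swarm

def signal(p):
    """Scalar field strength: larger closer to the source."""
    return np.exp(-np.sum((p - SOURCE) ** 2, axis=-1))

n, w, c1, c2 = 12, 0.7, 1.5, 1.5              # swarm size, inertia, pulls
pos = rng.uniform(-5, 5, (n, 2))              # agent positions
vel = np.zeros((n, 2))
pbest = pos.copy()                            # per-agent best position
pbest_val = signal(pos)

for _ in range(200):
    gbest = pbest[np.argmax(pbest_val)]       # swarm-wide best
    r1, r2 = rng.random((n, 1)), rng.random((n, 1))
    vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
    pos = pos + vel
    vals = signal(pos)
    improved = vals > pbest_val
    pbest[improved] = pos[improved]
    pbest_val[improved] = vals[improved]
```

After the loop, the swarm's best-known position has drifted toward the peak; on a real platform each "particle" is a vehicle, which is why the paper must add collision avoidance on top of this update rule.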