Corpus ID: 85481287

Vehicle Community Strategies

@article{Resnick2018VehicleCS,
  title={Vehicle Community Strategies},
  author={Cinjon Resnick and Ilya Kulikov and Kyunghyun Cho and Jason Weston},
  journal={ArXiv},
  year={2018},
  volume={abs/1804.07178}
}
Interest in emergent communication has recently surged in Machine Learning. The focus of this interest has largely been either on investigating the properties of the learned protocol or on utilizing emergent communication to better solve problems that already have a viable solution. Here, we consider self-driving cars coordinating with each other and focus on how communication influences the agents' collective behavior. Our main result is that communication helps (most) with adverse conditions. 
Learning Existing Social Conventions via Observationally Augmented Self-Play
TLDR
It is observed that augmenting MARL with a small amount of imitation learning greatly increases the probability that the strategy found by MARL fits well with the existing social convention, even in an environment where standard training methods very rarely find the true convention of the agent's partners.
Pommerman: A Multi-Agent Playground
TLDR
Pommerman, a multi-agent environment based on the classic console game Bomberman, consists of a set of scenarios, each having at least four players and containing both cooperative and competitive aspects.
Maximum Entropy Population Based Training for Zero-Shot Human-AI Coordination
TLDR
This work considers the problem of training a Reinforcement Learning agent without using any human data, i.e., in a zero-shot setting, to make it capable of collaborating with humans, and derives a centralized population entropy objective to facilitate learning of a diverse population of agents.
Learning Social Conventions in Markov Games
TLDR
It is shown that adding a small amount of imitation learning during self-play training greatly increases the probability that the strategy found by self-play fits well with the social convention the agent will face at test time, even in an environment where standard independent multi-agent RL very rarely finds the correct test-time equilibrium.
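The recurring idea in the summaries above — augmenting a self-play RL objective with a small imitation term on observed trajectories of the existing convention — can be sketched as a weighted sum of a policy-gradient loss and a behavioral-cloning loss. This is a minimal illustration under stated assumptions, not the papers' exact formulation; the weight `lam` and the log-probability inputs are placeholders.

```python
import numpy as np

def augmented_loss(logp_actions, advantages, logp_demo, lam=0.1):
    # Self-play policy-gradient loss (REINFORCE-style) plus a small
    # behavioral-cloning term: negative log-likelihood of actions taken
    # by observed partners following the existing convention.
    rl = -np.mean(logp_actions * advantages)
    bc = -np.mean(logp_demo)
    return rl + lam * bc
```

With `lam = 0`, this reduces to plain self-play; a small positive `lam` biases training toward strategies consistent with the observed convention.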

References

Showing 1-10 of 17 references
Learning to Communicate to Solve Riddles with Deep Distributed Recurrent Q-Networks
TLDR
Empirical results on two multi-agent learning problems based on well-known riddles are presented, demonstrating that DDRQN can successfully solve such tasks and discover elegant communication protocols to do so, the first time deep reinforcement learning has succeeded in learning communication protocols.
Learning to Communicate with Deep Multi-Agent Reinforcement Learning
TLDR
By embracing deep neural networks, this work is able to demonstrate end-to-end learning of protocols in complex environments inspired by communication riddles and multi-agent computer vision problems with partial observability.
Learning Multiagent Communication with Backpropagation
TLDR
A simple neural model is explored, called CommNet, that uses continuous communication for fully cooperative tasks and the ability of the agents to learn to communicate amongst themselves is demonstrated, yielding improved performance over non-communicative agents and baselines.
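CommNet's continuous communication can be sketched as a single step in which each agent's next hidden state combines its own hidden state with the mean of the other agents' hidden states. The weight matrices and `tanh` nonlinearity below are simplifying assumptions for illustration, not the paper's exact architecture.

```python
import numpy as np

def commnet_step(h, W_h, W_c):
    # h: (n_agents, d) hidden states. Each agent's communication input is
    # the mean of the *other* agents' hidden states, a continuous vector
    # through which gradients flow during fully cooperative training.
    n = h.shape[0]
    c = (h.sum(axis=0, keepdims=True) - h) / (n - 1)  # mean of the others
    return np.tanh(h @ W_h + c @ W_c)
```

Because the communication channel is continuous, the whole multi-agent model is differentiable end to end, which is what lets the agents learn to communicate by backpropagation.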
Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments
TLDR
An adaptation of actor-critic methods that considers action policies of other agents and is able to successfully learn policies that require complex multi-agent coordination is presented.
Safe, Multi-Agent, Reinforcement Learning for Autonomous Driving
TLDR
This paper applies deep reinforcement learning to the problem of forming long term driving strategies and shows how policy gradient iterations can be used without Markovian assumptions, and decomposes the problem into a composition of a Policy for Desires and trajectory planning with hard constraints.
MultiNet: Multi-Modal Multi-Task Learning for Autonomous Driving
TLDR
This paper describes a technique for learning multiple distinct behavioral modes in a single deep neural network through the use of multi-modal multi-task learning, denoted MultiNet, and studies the effectiveness of this approach using self-driving model cars for driving in unstructured environments such as sidewalks and unpaved roads.
Query-Efficient Imitation Learning for End-to-End Autonomous Driving
TLDR
An extension of DAgger, called SafeDAgger, is proposed that is query-efficient and more suitable for end-to-end autonomous driving; a significant speed-up in convergence is observed, which is conjectured to be due to the effect of automated curriculum learning.
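The query-efficiency idea behind SafeDAgger — consult the expensive reference (expert) policy only when a learned safety estimate suggests the primary policy is likely to deviate — can be sketched as a thresholded query rule. The function and parameter names here (`safety_net`, `tau`) are hypothetical placeholders, not the paper's API.

```python
def safedagger_action(policy_action, expert_policy, obs, safety_net, tau=0.5):
    # safety_net(obs, action) is assumed to return an estimated probability
    # that the primary policy's action is safe in this state. Below the
    # threshold tau, fall back to (and query) the expert; otherwise act
    # without querying. Returns (action, queried_expert).
    if safety_net(obs, policy_action) < tau:
        return expert_policy(obs), True
    return policy_action, False
```

Querying only in predicted-unsafe states is what reduces the number of expert labels needed compared to vanilla DAgger, which queries on every visited state.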
Proximal Policy Optimization Algorithms
We propose a new family of policy gradient methods for reinforcement learning, which alternate between sampling data through interaction with the environment, and optimizing a "surrogate" objective function using stochastic gradient ascent.
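The "surrogate" objective mentioned above is, in PPO's clipped form, the per-sample quantity min(r·A, clip(r, 1−ε, 1+ε)·A), where r is the probability ratio between the new and old policies and A is the advantage estimate. A minimal sketch:

```python
import numpy as np

def ppo_clip_objective(ratio, advantage, eps=0.2):
    # Clipped surrogate objective: take the minimum of the unclipped and
    # clipped terms, so moving the policy ratio outside [1-eps, 1+eps]
    # yields no additional reward (a pessimistic bound on the update).
    return np.minimum(ratio * advantage,
                      np.clip(ratio, 1.0 - eps, 1.0 + eps) * advantage)
```

The `min` makes the objective a lower bound: for a positive advantage the gain is capped once the ratio exceeds 1+ε, and for a negative advantage the penalty is not reduced by clipping, which discourages destructively large policy updates.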
Emergent Communication in a Multi-Modal, Multi-Step Referential Game
TLDR
A novel multi-modal, multi-step referential game, where the sender and receiver have access to distinct modalities of an object, and their information exchange is bidirectional and of arbitrary duration is proposed.
End to End Learning for Self-Driving Cars
TLDR
A convolutional neural network is trained to map raw pixels from a single front-facing camera directly to steering commands and it is argued that this will eventually lead to better performance and smaller systems.