Daniel Szer

We present multi-agent A* (MAA*), the first complete and optimal heuristic search algorithm for solving decentralized partially-observable Markov decision problems (DEC-POMDPs) with finite horizon. The algorithm is suitable for computing optimal plans for a cooperative group of agents that operate in a stochastic environment such as multi-robot…
In the domain of decentralized Markov decision processes, we develop the first complete and optimal algorithm that is able to extract deterministic policy vectors based on finite state controllers for a cooperative team of agents. Our algorithm applies to the discounted infinite horizon case and extends best-first search methods to the domain of…
Self-organizing maps (SOMs) have become popular for tasks in data visualization, pattern classification, and natural language processing, and can be seen as one of the major concepts in artificial neural networks today. Their general idea is to approximate a high-dimensional and previously unknown input distribution by a lower-dimensional neural network…
We present a new algorithm for cooperative reinforcement learning in multiagent systems. Our main concern is the correct coordination between the members of the team: we seek to obtain an optimal solution for the team as a whole while keeping the learning as decentralized as possible. We consider autonomous and independently learning agents that do not…
In this paper, we present a new algorithm for cooperative reinforcement learning in multi-agent systems. We consider autonomous and independently learning agents, and we seek to obtain an optimal solution for the team as a whole while keeping the learning as decentralized as possible. Coordination between agents occurs through communication,…
We present a novel planning algorithm for building reactive and situated multi-agent systems based on the theory of decentralized Markov decision processes (DEC-POMDPs). The algorithm is a synthesis of multi-agent dynamic programming for partially observable stochastic games (POSGs) and point-based approximations for single-agent POMDPs. We are able to…