In reinforcement learning, an agent interacts with its environment by taking actions and receiving rewards for those actions. A good example of such a task is a robot trying to clean up a park. The agent has to interact with multiple different objects and other agents in the park. To learn a behaviour in such a task, it needs to be able to represent the state of its surroundings based on the distribution of objects it sees. Similar challenges can be found in arcade games, where agents have to interact with and avoid objects in their environment. The goal of this thesis is therefore to learn the behaviour of a game agent. The agent is presented with a view of the world consisting of a number of coloured points in a 2D plane. Interactions such as slaying enemies and collecting gold result in rewards for the agent. The agent then has to learn a policy based on the distributions of the different object types in its surroundings. To learn such a policy, we use fitted Q-iteration. The Q-function is approximated with a variant of random trees, modified into a representation that captures the key elements and conditions for action selection. We evaluate the parametrization of the approach and achieve better results than a standard grid-based state representation. We also explore and evaluate different representations for providing the agent with important global information, e.g. the location of a treasure in the game.
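To illustrate the core loop of fitted Q-iteration mentioned above, the following is a minimal sketch on a hypothetical toy chain world. It substitutes a simple lookup-table regressor for the tree-based approximator used in the thesis; the environment, state count, and discount factor are illustrative assumptions, not taken from the thesis itself.

```python
import random

# Hypothetical toy chain MDP: states 0..4, actions -1/+1,
# reward 1 whenever the agent lands on the rightmost state.
N_STATES, GAMMA = 5, 0.9

def step(s, a):
    s2 = max(0, min(N_STATES - 1, s + a))
    return s2, (1.0 if s2 == N_STATES - 1 else 0.0)

# Collect a batch of random transitions (s, a, r, s') once, offline.
random.seed(0)
batch = []
for _ in range(500):
    s = random.randrange(N_STATES)
    a = random.choice([-1, 1])
    s2, r = step(s, a)
    batch.append((s, a, r, s2))

# Fitted Q-iteration: repeatedly regress Q onto bootstrapped targets
# r + gamma * max_a' Q(s', a'). A per-(s, a) average stands in here
# for the tree-based regressor described in the abstract.
Q = {}
for _ in range(50):
    targets = {}
    for s, a, r, s2 in batch:
        best_next = max(Q.get((s2, b), 0.0) for b in (-1, 1))
        targets.setdefault((s, a), []).append(r + GAMMA * best_next)
    Q = {sa: sum(ys) / len(ys) for sa, ys in targets.items()}

# Greedy policy derived from the fitted Q-function.
policy = {s: max((-1, 1), key=lambda a: Q.get((s, a), 0.0))
          for s in range(N_STATES)}
print(policy)  # every state should prefer moving right, toward the goal
```

The key property of fitted Q-iteration, visible in the sketch, is that it learns from a fixed batch of transitions rather than from online interaction: each sweep refits the regressor to updated targets computed from the previous Q-estimate.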