In Long Term Evolution (LTE) networks, one of the main focuses is automating network optimization. This is done through so-called Self Organizing Network (SON) functions such as Mobility Load Balancing (MLB), Mobility Robustness Optimization (MRO) and others. A SON instance is a realization of a SON function that governs (optimizes) one eNB or a cluster of eNBs. SON functions are built in a standalone manner, i.e. without considering the existence of other SON instances. They therefore do not necessarily operate in a coordinated fashion, especially in a network where different SON instances may come from different vendors. This creates a risk of conflicts and instabilities in the network, which raises the need for a SON COordinator (SONCO). The SONCO, built from an operator point of view, sees the SON instances as black boxes and has very limited information about them. Under these conditions the SONCO has to resolve conflicts and improve network stability. In this paper we propose a Reinforcement Learning (RL) based solution for coordinating SON instances that run independently on neighboring eNBs, and we provide results for a case study with MLB instances. We analyze the scalability of our solution and provide numerical results showing how improvements in network stability can be obtained.