The usage of heterogeneous devices presents two main problems. One is their complex programming, a problem that grows when multiple devices are used. The second issue is that even if the codes for these devices can be portable on top of OpenCL, they lack performance portability, effectively requiring specialized implementations for each device to get good performance. In this paper we extend the Heterogeneous Programming Library (HPL), which improves the usability of heterogeneous systems on top of OpenCL, to better handle both issues. First, we provide HPL with mechanisms to support the implementation of any multi-device application that requires arbitrary patterns of communication between several devices and a host memory. In a second stage HPL is improved with an adaptive scheme to optimize communications between devices depending on the execution environment. An evaluation using benchmarks with very different nature shows that HPL reduces the SLOCs and programming effort of OpenCL applications by 27 and 43 %, respectively, while improving the performance of applications that exchange data between devices by 28 % on average.