A novel merging network architecture is proposed for a handshake join operator in order to achieve much higher data throughput than previously reported. Handshake join is a highly parallelized algorithm for window-based stream joins. Result collection, which is performed by a merging network, is a significant design issue for the handshake join operator because the merging network becomes the dominant bottleneck that limits scalable performance. To address this issue, an adaptive merging network is proposed for hardware implementation of the algorithm. The proposed architecture is implemented on an FPGA and evaluated in terms of hardware resource usage, maximum clock frequency, and performance. Experimental results demonstrate up to 16.3 times higher throughput than a nested-loops-style join implementation, without dropping any tuples. To the best of our knowledge, this is the best performance reported for a handshake join operator implemented on an FPGA.
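For readers unfamiliar with the baseline, the following is a minimal software sketch of a nested-loops-style window-based stream join, the kind of operator the throughput comparison refers to. It is an illustration only, not the hardware design proposed here: the function name `window_join`, the count-based sliding windows, and the stream labels `"R"` and `"S"` are all assumptions made for the example.

```python
from collections import deque


def window_join(events, window_size, predicate):
    """Nested-loops-style join over two count-based sliding windows.

    events: iterable of (stream, value) pairs, stream being "R" or "S"
            (hypothetical labels chosen for this sketch).
    window_size: number of most recent tuples kept per stream.
    predicate: function (r, s) -> bool deciding whether a pair joins.
    """
    # deque(maxlen=...) silently evicts the oldest tuple when full,
    # which models window expiry.
    windows = {"R": deque(maxlen=window_size), "S": deque(maxlen=window_size)}
    results = []
    for stream, value in events:
        other = "S" if stream == "R" else "R"
        # Scan the whole opposite window for every arriving tuple.
        for candidate in windows[other]:
            pair = (value, candidate) if stream == "R" else (candidate, value)
            if predicate(*pair):
                results.append(pair)
        windows[stream].append(value)
    return results


if __name__ == "__main__":
    arrivals = [("R", 1), ("S", 1), ("S", 2), ("R", 2), ("R", 1)]
    print(window_join(arrivals, 2, lambda r, s: r == s))
    # prints [(1, 1), (2, 2), (1, 1)]
```

Every arriving tuple is compared sequentially against the full opposite window, so the work per tuple grows with the window size. Handshake join parallelizes these comparisons, which in turn makes collecting the results from the parallel units a design concern.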