Performance comparison of full-batch and mini-batch BP algorithms on the Spark framework
The traditional BP neural network training method processes the training dataset serially on a single machine, so its efficiency is low. The massive volumes of data that need to be explored pose a great challenge for BP neural networks: the traditional serial training method runs into problems such as excessive training time and insufficient memory to complete training. To solve these problems, this paper proposes a new parallel training method based on MapReduce and a genetic algorithm, called MR-GAIBP (MapReduce based Genetic Algorithm Improved Back Propagation). MR-GAIBP consists of two parts: a MapReduce-based BP algorithm and a MapReduce-based genetic algorithm. The genetic algorithm is first iterated for a few generations to find appropriate initial weights for the BP neural network; the BP algorithm is then used to refine the weights until they meet the accuracy requirement. In the BP phase, local iteration is used to speed up convergence. Experimental results demonstrate that MR-GAIBP converges faster and achieves higher accuracy than previous MapReduce-based algorithms proposed in other papers.
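The two-phase idea, a genetic algorithm searching for good initial weights followed by back-propagation refinement, can be illustrated with a minimal single-machine sketch. This is not the paper's MR-GAIBP implementation (it omits MapReduce parallelism and local iteration entirely); the network size, GA population, mutation scale, and learning rate below are all illustrative assumptions, shown on the XOR toy problem:

```python
import numpy as np

# Hypothetical sketch of GA-seeded BP (not the paper's MR-GAIBP code):
# a genetic algorithm evolves candidate initial weight vectors for a
# small 2-2-1 sigmoid network on XOR, then plain back-propagation
# refines the best candidate it found.

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def unpack(w):
    # 9 parameters: 2x2 hidden weights, 2 hidden biases,
    # 2x1 output weights, 1 output bias
    W1 = w[:4].reshape(2, 2); b1 = w[4:6]
    W2 = w[6:8].reshape(2, 1); b2 = w[8:9]
    return W1, b1, W2, b2

def loss(w):
    W1, b1, W2, b2 = unpack(w)
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    return float(np.mean((out - y) ** 2))

# --- GA phase: a few generations of truncation selection + mutation ---
pop = rng.normal(0.0, 1.0, size=(20, 9))
for _ in range(10):
    fitness = np.array([loss(w) for w in pop])
    parents = pop[np.argsort(fitness)[:10]]                   # keep best half
    children = parents + rng.normal(0.0, 0.3, parents.shape)  # mutate copies
    pop = np.vstack([parents, children])
w = pop[np.argmin([loss(c) for c in pop])].copy()
ga_loss = loss(w)

# --- BP phase: gradient descent starting from the GA-chosen weights ---
lr = 2.0
for _ in range(2000):
    W1, b1, W2, b2 = unpack(w)
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    d_out = (out - y) * out * (1 - out)   # error signal at output layer
    d_h = (d_out @ W2.T) * h * (1 - h)    # error signal at hidden layer
    grad = np.concatenate([
        (X.T @ d_h).ravel(), d_h.sum(0),
        (h.T @ d_out).ravel(), d_out.sum(0),
    ])
    w -= lr * grad / len(X)
bp_loss = loss(w)
print(ga_loss, bp_loss)
```

In the paper's distributed setting, the GA fitness evaluations and the per-batch gradient computations would be the map-side work, with reducers aggregating fitness values or gradients; the sketch above only shows why seeding BP with GA-selected weights can help it start from a better region of the error surface.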