FPGA-based convolutional neural network accelerator design using high-level synthesis
Deep convolutional neural networks (CNNs) are highly effective in image recognition tasks such as MNIST digit recognition. FPGA-based accelerators have been proposed because general-purpose processors deliver poor performance on recognition tasks. Recently, an optimized FPGA-based accelerator design (work 1) was proposed that claims the best performance among existing implementations. However, as the author acknowledged, performance could be better still if fixed-point representation and fixed-point computation elements had been used. Inspired by its methodology for implementing the AlexNet convolutional neural network, we implement a 5-layer accelerator for the MNIST digit recognition task using the same Vivado HLS tool, but with 11-bit fixed-point precision, on a Virtex-7 FPGA. We compare the performance of the FPGA implementation with that of the target CNN running on a MATLAB/CPU platform and obtain a speedup of 16.42x. Our implementation runs at 150 MHz and reaches a peak performance of 16.58 GMACS. Since our target CNN is simpler, we use far fewer resources than work 1.
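To make the fixed-point idea concrete, the following is a minimal sketch of how an 11-bit signed fixed-point value and a multiply-accumulate (MAC) operation could be emulated in software. It is not the accelerator's implementation. The split into 8 fractional bits, the rounding and saturation behavior, and the names `to_fixed`, `to_float`, and `fixed_mac` are all assumptions made for illustration; only the total width of 11 bits comes from the text above.

```python
# Illustrative emulation of 11-bit signed fixed-point arithmetic.
# ASSUMPTION: 8 fractional bits; the source only states a total width of 11 bits.

def to_fixed(x, total_bits=11, frac_bits=8):
    """Quantize a real number to a signed fixed-point integer code.

    Rounds to the nearest code (Python's round(), ties to even) and
    saturates at the limits of the two's-complement range.
    """
    scale = 1 << frac_bits
    lo = -(1 << (total_bits - 1))      # most negative code, e.g. -1024
    hi = (1 << (total_bits - 1)) - 1   # most positive code, e.g. 1023
    code = int(round(x * scale))
    return max(lo, min(hi, code))


def to_float(code, frac_bits=8):
    """Convert a fixed-point integer code back to a real number."""
    return code / (1 << frac_bits)


def fixed_mac(weights, inputs, total_bits=11, frac_bits=8):
    """Multiply-accumulate over quantized operands.

    Each product carries 2 * frac_bits fractional bits; the accumulator is
    an unbounded Python int here, standing in for a wide hardware accumulator.
    """
    acc = 0
    for w, x in zip(weights, inputs):
        acc += to_fixed(w, total_bits, frac_bits) * to_fixed(x, total_bits, frac_bits)
    return acc / (1 << (2 * frac_bits))


if __name__ == "__main__":
    print(to_fixed(0.5))                          # integer code for 0.5
    print(fixed_mac([0.5, 0.25], [1.0, 2.0]))     # 0.5*1.0 + 0.25*2.0
```

Under these assumptions the representable range is roughly -4 to +4 with a resolution of 1/256, and values outside the range are clamped instead of wrapping around. A different integer/fraction split trades range against resolution.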