Jan-Matthis Niermann
Resource-efficient Training of Convolutional Neural Networks on FPGAs

Abstract
Recent work has focused on utilizing FPGAs to accelerate Deep Neural Networks (DNNs). While most approaches target inference, only a few also implement training, and those that do are usually not open source. STANN does implement training, but only for dense layers, which makes training Convolutional Neural Networks (CNNs) impossible. Therefore, an implementation of convolutional layers is needed as an enhancement to the STANN library. The implementation aims to build a foundation for a synthesizable design suitable for high-level synthesis (HLS). It must provide the functions needed for backpropagation as well as for the updates of filters and biases. The proposed implementation is able to perform the training of a convolutional layer. Evaluation shows that the results are comparable to those of a PyTorch implementation, with an average deviation of less than 0.00005 % per element.