Acceso abierto

An Efficient Technique for Size Reduction of Convolutional Neural Networks after Transfer Learning for Scene Recognition Tasks


A complex classification task as scene recognition is considered in the present research. Scene recognition tasks are successfully solved by the paradigm of transfer learning from pretrained convolutional neural networks, but a problem is that the eventual size of the network is huge despite a common scene recognition task has up to a few tens of scene categories. Thus, the goal is to ascertain possibility of a size reduction. The modelling recognition task is a small dataset of 4485 grayscale images broken into 15 image categories. The pretrained network is AlexNet dealing with much simpler image categories whose number is 1000, though. This network has two fully connected layers, which can be potentially reduced or deleted. A regular transfer learning network occupies about 202.6 MB performing at up to 92 % accuracy rate for the scene recognition. It is revealed that deleting the layers is not reasonable. The network size is reduced by setting a fewer number of filters in the 17th and 20th layers of the AlexNet-based networks using a dichotomy principle or similar. The best truncated network with 384 and 192 filters in those layers performs at 93.3 % accuracy rate, and its size is 21.63 MB.