Model compression methods for convolutional neural networks

Research output: Master's thesis

Abstract

Deep learning has been found to be an effective solution to many problems in the field of computer vision, and convolutional neural networks have been a particularly successful model. A convolutional neural network extracts feature maps from an image and then uses them to determine which of the preset categories the image belongs to. Such a network can be trained on a powerful machine and then deployed onto a target device for inference. Computing inference has become feasible on mobile phones and IoT edge devices, but these devices come with constraints such as reduced processing resources, smaller memory caches, and lower memory bandwidth. To make inference practical on these devices, the effectiveness of various model compression methods is evaluated quantitatively. The methods are evaluated by applying them to a simple convolutional neural network for optical vehicle classification. Convolutional layers are separated into component vectors to reduce inference time on CPU, GPU, and an embedded target. Fully connected layers are pruned and retuned in combination with regularization and dropout, and the pruned layers are compressed using a sparse matrix format. All optimizations are tested on three platforms with varying capabilities. Separating the convolutional layers improves the latency of the whole model by 3.00× on a CPU platform. Using a sparse format on a pruned model with a large fully connected layer improves the latency of the whole model by 2.01× on a desktop with a GPU and by 1.82× on the embedded platform. On average, pruning the model allows a 39.1× reduction in total model size at the cost of a 1.67 percentage-point reduction in accuracy when dropout is used to control overfitting. This allows a vehicle classifier to fit in 180 kB of memory with a reasonable reduction in accuracy.
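The two optimizations summarized in the abstract can be illustrated with a short sketch. The example below is not the thesis code; it is a minimal illustration in Python with NumPy and SciPy, and the kernel size, layer shape, and pruning threshold are assumptions chosen for demonstration only. It shows a rank-1 SVD separation of one convolution kernel into a vertical and a horizontal component vector, and magnitude pruning of a fully connected weight matrix stored in a compressed sparse row (CSR) format.

import numpy as np
from scipy import sparse

# --- 1. Separate a k x k convolution kernel into two 1-D component vectors ---
# A rank-1 approximation via SVD replaces one k x k filter with a k x 1 column
# vector followed by a 1 x k row vector, cutting multiplications from k*k to 2k.
kernel = np.random.randn(5, 5)                      # one spatial filter (assumed 5x5)
u, s, vt = np.linalg.svd(kernel)
col = u[:, :1] * np.sqrt(s[0])                      # k x 1 vertical component
row = vt[:1, :] * np.sqrt(s[0])                     # 1 x k horizontal component
approx = col @ row                                  # rank-1 reconstruction
print("rank-1 approximation error:", np.linalg.norm(kernel - approx))

# --- 2. Prune a fully connected layer and store it in a sparse format ---
# Magnitude pruning zeroes small weights; CSR keeps only the survivors, which
# shrinks both the stored model and the matrix-vector product at inference.
fc_weights = np.random.randn(256, 1024)             # assumed FC layer shape
threshold = np.percentile(np.abs(fc_weights), 95)   # keep roughly 5% of the weights
pruned = np.where(np.abs(fc_weights) >= threshold, fc_weights, 0.0)
fc_sparse = sparse.csr_matrix(pruned)

x = np.random.randn(1024)
y_dense = pruned @ x                                # dense inference
y_sparse = fc_sparse @ x                            # sparse inference, same result
assert np.allclose(y_dense, y_sparse)
print("nonzero weights kept:", fc_sparse.nnz, "of", fc_weights.size)

In practice the pruned layer would be retuned (retrained) after thresholding, as the abstract describes, and the separated convolution would be applied as two successive 1-D convolutions rather than reconstructed into a dense kernel.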
Original language: English
Publisher: Tampere University
Number of pages: 61
Status: Published - 2019
Published externally: Yes
OKM publication type: G2 Master's thesis, polytechnic Master's thesis
