Comparing Optimization Methods of Neural Networks for Real-time Inference

Mir Khan, Henri Lunnikivi, Heikki Huttunen, Jani Boutellier

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › peer-review

1 Citation (Scopus)

Abstract

This paper compares three different optimization approaches for accelerating the inference of convolutional neural networks (CNNs). We compare the techniques of separable convolution, weight pruning, and binarization. Each method is implemented and empirically compared in three aspects: preservation of accuracy, storage requirements, and achieved speed-up. Experiments are performed both on a desktop computer and on a mobile platform using a CNN model for vehicle type classification. Our experiments show that the largest speed-up is achieved by binarization, whereas pruning achieves the largest reduction in storage requirements. Both of these approaches largely preserve the accuracy of the original network.
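One of the three techniques compared, separable convolution, saves computation by factoring a standard convolution into a depthwise and a pointwise step. As a hedged illustration (not code from the paper, whose exact network is not given here), the parameter counts for a single layer can be computed as follows, where a standard K×K convolution costs K·K·C_in·C_out parameters and its depthwise-separable counterpart costs K·K·C_in + C_in·C_out:

```python
def conv_params(k: int, c_in: int, c_out: int) -> int:
    """Parameters of a standard k x k convolution (biases ignored)."""
    return k * k * c_in * c_out

def separable_conv_params(k: int, c_in: int, c_out: int) -> int:
    """Parameters of a depthwise-separable convolution:
    a k x k depthwise filter per input channel, followed by
    a 1 x 1 pointwise convolution mixing channels."""
    return k * k * c_in + c_in * c_out

# Example layer sizes (illustrative, not taken from the paper):
standard = conv_params(3, 128, 128)        # 147456
separable = separable_conv_params(3, 128, 128)  # 17536
print(f"reduction factor: {standard / separable:.1f}x")
```

For these example sizes the separable factorization needs roughly 8x fewer parameters, which is the kind of saving that makes the technique attractive for real-time inference on mobile platforms.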
Original language: English
Title of host publication: 2019 27th European Signal Processing Conference (EUSIPCO)
Publisher: IEEE
Number of pages: 5
ISBN (Electronic): 978-9-0827-9703-9
ISBN (Print): 978-1-5386-7300-3
Publication status: Published - 3 Sept 2019
Publication type: A4 Article in conference proceedings
Event: European Signal Processing Conference

Publication series

Name: European Signal Processing Conference
ISSN (Print): 2219-5491
ISSN (Electronic): 2076-1465


Publication forum classification

  • Publication forum level 1
