## Abstract

Over the last decade, artificial neural networks, especially deep neural networks, have emerged as the main modeling tool in machine learning, allowing us to tackle an increasing number of real-world problems in various fields, most notably computer vision, natural language processing, and biomedical and financial analysis. The success of deep neural networks can be attributed to many factors, namely the increasing amount of available data, the development of dedicated hardware, advances in optimization techniques, and especially the invention of novel neural network architectures. Nowadays, state-of-the-art neural networks that achieve the best performance in a given field usually consist of several layers comprising millions, or even billions, of parameters. Despite their spectacular performance, optimizing a single state-of-the-art neural network often requires a tremendous amount of computation, which can take several days on high-end hardware. More importantly, it took the community several years of experimentation to gradually discover effective neural network architectures, moving from AlexNet and VGGNet to ResNet and then DenseNet. In addition to the expensive and time-consuming experimentation process, deep neural networks require powerful processors during the deployment phase and therefore cannot be easily deployed to mobile or embedded devices. For these reasons, improving the design, training, and deployment of deep neural networks has become an important area of research in machine learning.

This thesis makes several contributions to this research area, which can be grouped into two main categories. The first category consists of works that focus on designing neural network architectures that are efficient in terms of both accuracy and computational complexity. In the first contribution under this category, computational efficiency is addressed at the filter level through a handcrafted design for convolutional neural networks (CNNs), which are the basis of most deep neural networks. More specifically, the multilinear convolution filter is proposed as a replacement for the linear convolution filter, a fundamental element of a CNN. The new filter design not only better captures the multidimensional structures inherent in CNNs but also requires far fewer parameters to be estimated. While tackling the design problem with efficient algebraic transforms and approximation techniques can significantly reduce the memory and computational footprint of neural network models, this approach requires a lot of trial and error. In addition, the simple neuron model used in most neural networks today, which only performs a linear transformation followed by a nonlinear activation, cannot effectively mimic the diverse activities of biological neurons. For this reason, the second and third contributions move from a handcrafted, manual design approach to an algorithmic approach in which the type of transformation performed by each neuron, as well as the network topology, is optimized in a systematic and completely data-dependent manner. As a result, the algorithms proposed in the second and third contributions can design highly accurate and compact neural networks while requiring minimal human effort or intervention in the design process.
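To make the parameter saving concrete, the following sketch compares the parameter count of a dense convolution filter with that of a filter stored as a sum of rank-one terms. The abstract does not state the exact factorization, so the rank-R sum-of-outer-products form, the function names, and the example sizes are assumptions made purely for illustration.

```python
def dense_filter_params(height: int, width: int, channels: int) -> int:
    """Parameters of an ordinary (linear) convolution filter of size H x W x C."""
    return height * width * channels


def multilinear_filter_params(height: int, width: int, channels: int, rank: int) -> int:
    """Parameters if the filter is stored as a sum of `rank` rank-one terms.

    Assumption (not stated in the abstract): each term is the outer product of
    one vector per mode, so a single term costs H + W + C parameters.
    """
    return rank * (height + width + channels)


if __name__ == "__main__":
    # Hypothetical sizes: a 3 x 3 filter over 256 input channels, rank 4.
    h, w, c, r = 3, 3, 256, 4
    dense = dense_filter_params(h, w, c)
    multilinear = multilinear_filter_params(h, w, c, r)
    print(f"dense filter: {dense} parameters")
    print(f"rank-{r} multilinear filter: {multilinear} parameters")
```

Under these assumed sizes the factorized filter stores fewer than half the parameters of the dense one, and the gap widens as the number of channels grows relative to the rank.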

Although significant progress has been made in reducing the runtime complexity of neural network models on embedded devices, most of these methods have been demonstrated on powerful embedded devices, which are costly in applications that require large-scale deployment, such as surveillance systems. In these scenarios, complete on-device processing can be infeasible. In contrast, hybrid solutions, where some preprocessing steps are conducted on the client side while the heavy computation takes place on the server side, are more practical. The second category of contributions in this thesis focuses on efficient learning methodologies for hybrid solutions that take into account both the signal acquisition and inference steps. More concretely, the first contribution under this category is the formulation of the Multilinear Compressive Learning framework, in which multidimensional signals are compressively acquired and inference is made directly on the compressed signals, bypassing the signal reconstruction step. In the second contribution, the relationships between the input signal resolution, the compression rate, and the learning performance of Multilinear Compressive Learning systems are systematically analyzed through experiments, leading to the discovery of a surrogate performance indicator that can be used to approximately rank the learning performance of different sensor configurations without conducting the entire optimization process. Many communication protocols now support adaptive data transmission to maximize data throughput and minimize energy consumption depending on the network's strength. The last contribution of this thesis extends the Multilinear Compressive Learning framework with an adaptive compression capability, which makes it possible to exploit the adaptive rate transmission feature of existing communication protocols to maximize the informational content throughput of the whole system.
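As a rough illustration of compressive acquisition that keeps the multidimensional structure of a signal, the sketch below compresses a 2-D signal with one small sensing matrix per mode instead of vectorizing it first. The separable form `phi_rows @ signal @ phi_cols^T`, the function names, and the hand-picked matrices are assumptions for this example; the abstract does not specify the sensing operator, and a real system would typically use random or learned matrices.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def separable_sense(signal, phi_rows, phi_cols):
    """Compress a 2-D signal mode by mode: phi_rows @ signal @ phi_cols^T.

    Assumption: one sensing matrix per mode (hypothetical names), so the
    measurement keeps the 2-D layout of the input.
    """
    return matmul(matmul(phi_rows, signal), transpose(phi_cols))


if __name__ == "__main__":
    # Hypothetical 4 x 4 signal compressed to 2 x 2 measurements.
    signal = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
    phi_rows = [[1, 0, 0, 0], [0, 0, 1, 0]]  # hand-picked for readability
    phi_cols = [[0, 1, 0, 0], [0, 0, 0, 1]]  # hand-picked for readability
    measurements = separable_sense(signal, phi_rows, phi_cols)
    print(measurements)

    # Size of the sensing operator: per-mode matrices versus one matrix
    # acting on the vectorized 16-element signal to give 4 measurements.
    separable_entries = 2 * 4 + 2 * 4
    vectorized_entries = 4 * 16
    print(separable_entries, vectorized_entries)
```

In a compressive learning setting, the downstream model would take `measurements` as its input directly, without first reconstructing the original signal.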

Finally, all methodological contributions of this thesis are accompanied by extensive empirical analyses demonstrating their performance and computational advantages over existing methods in different computer vision applications such as object recognition, face verification, human activity classification, and visual information retrieval.

Original language | English |
---|---|
Place of publication | Tampere |
Publisher | Tampere University |
ISBN (electronic) | 978-952-03-2295-3 |
ISBN (print) | 978-952-03-2294-6 |
Status | Published - 2022 |
Ministry of Education and Culture publication type | G5 Doctoral dissertation (articles) |

### Publication series

Name | Tampere University Dissertations - Tampereen yliopiston väitöskirjat |
---|---|
Volume | 553 |
ISSN (print) | 2489-9860 |
ISSN (electronic) | 2490-0028 |