Feature Diversity in Neural Networks: Theory and Algorithms

Firas Laakom

Research output: Doctoral thesis, Collection of Articles

Abstract

The main strength of neural networks lies in their ability to generalize to unseen data. ‘Why and when do they generalize well?’ are two extremely important questions for fully understanding this phenomenon and for developing better and more robust models. Several studies have explored these questions from different perspectives and proposed multiple measures and bounds that correlate well with generalization. This dissertation proposes a new perspective by focusing on the ‘feature diversity’ within the hidden layers. From this standpoint, a neural network is seen as a two-stage process: feature (representation) learning through the intermediate layers, followed by a final prediction layer. Empirically, it has been observed that learning a rich and diverse set of features is critical for achieving top performance, yet no theoretical justification exists. In this dissertation, we tackle this problem by theoretically analyzing the effect of feature diversity on generalization performance. Specifically, we derive several rigorous Rademacher-based bounds for neural networks in different contexts, and we demonstrate that having more diverse features does indeed correlate well with better generalization performance. Moreover, inspired by these theoretical findings, we propose a new set of data-dependent diversity-inducing regularizers, and we present an extensive empirical study confirming that the proposed regularizers enhance the performance of several state-of-the-art neural network models in multiple tasks. Beyond standard neural networks, we also explore diversity-promoting strategies in other contexts, e.g., Energy-Based Models, autoencoders, and bag-of-features pooling layers, and we show that learning diverse features helps consistently.
Original language: English
Place of publication: Tampere
Publisher: Tampere University
ISBN (electronic): 978-952-03-3355-3
ISBN (print): 978-952-03-3354-6
Status: Published - 2024
OKM publication type: G5 Doctoral dissertation (article-based)

Publication series

Name: Tampere University Dissertations - Tampereen yliopiston väitöskirjat
Volume: 984
ISSN (print): 2489-9860
ISSN (electronic): 2490-0028
