Feature Diversity in Neural Networks: Theory and Algorithms

Firas Laakom

Research output: Book/Report › Doctoral thesis › Collection of Articles

Abstract

The main strength of neural networks lies in their ability to generalize to unseen data. Why and when they generalize well are two central questions for fully understanding this phenomenon and for developing better and more robust models. Several studies have explored these questions from different perspectives and proposed multiple measures and bounds that correlate well with generalization. This dissertation proposes a new perspective by focusing on the ‘feature diversity’ within the hidden layers. From this standpoint, a neural network is seen as a two-stage process: feature (representation) learning through the intermediate layers, followed by a final prediction layer. Empirically, it has been observed that learning a rich and diverse set of features is critical for achieving top performance, yet no theoretical justification exists. In this dissertation, we tackle this problem by theoretically analyzing the effect of feature diversity on generalization performance. Specifically, we derive several rigorous Rademacher complexity-based bounds for neural networks in different contexts, and we demonstrate that having more diverse features indeed correlates well with better generalization performance. Moreover, inspired by these theoretical findings, we propose a new set of data-dependent diversity-inducing regularizers, and we present an extensive empirical study confirming that the proposed regularizers enhance the performance of several state-of-the-art neural network models in multiple tasks. Beyond standard neural networks, we also explore diversity-promoting strategies in other contexts, e.g., Energy-Based Models, autoencoders, and bag-of-features pooling layers, and we show that learning diverse features helps consistently.
Original language: English
Place of Publication: Tampere
ISBN (Electronic): 978-952-03-3355-3
Publication status: Published - 2024
Publication type: G5 Doctoral dissertation (articles)

Publication series

Name: Tampere University Dissertations - Tampereen yliopiston väitöskirjat
Volume: 984
ISSN (Print): 2489-9860
ISSN (Electronic): 2490-0028
