Class-wise Generalization Error: an Information-Theoretic analysis

Research output: Article › Scientific › peer-reviewed

Abstract

Existing generalization theories for supervised learning typically take a holistic approach and provide bounds for the expected generalization error over the whole data distribution, which implicitly assumes that the model generalizes similarly across all classes. In practice, however, generalization performance varies significantly among classes, and existing generalization bounds cannot capture this variation. In this work, we tackle this problem by theoretically studying the class-generalization error, which quantifies the generalization performance of the model for each individual class. We derive a novel information-theoretic bound for the class-generalization error using the KL divergence, and we further obtain several tighter bounds using recent advances in conditional mutual information bounds, which enable practical evaluation. We empirically validate our proposed bounds on various neural networks and show that they accurately capture the complex class-generalization behavior. Moreover, we demonstrate that the theoretical tools developed in this work can be applied in several other applications.

Original language: English
Pages: 1-36
Journal: Transactions on Machine Learning Research
Volume: 2025
Issue number: 7
Publication status: Published - 2025
OKM (Ministry of Education and Culture) publication type: A1 Original article in a scientific journal

Publication Forum (Jufo) level

  • Jufo level 1

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Vision and Pattern Recognition
