TY - JOUR
T1 - Feature transforms for image data augmentation
AU - Nanni, Loris
AU - Paci, Michelangelo
AU - Brahnam, Sheryl
AU - Lumini, Alessandra
N1 - Funding Information:
Open access funding provided by Università degli Studi di Padova within the CRUI-CARE Agreement.
Funding Information:
The authors wish to acknowledge the NVIDIA Corporation for supporting this research with the donation of a Titan Xp GPU and the TCSC–Tampere Center for Scientific Computing for generous computational resources.
Publisher Copyright:
© 2022, The Author(s).
PY - 2022
Y1 - 2022
N2 - A problem with convolutional neural networks (CNNs) is that they require large datasets to obtain adequate robustness; on small datasets, they are prone to overfitting. Many methods have been proposed to overcome this shortcoming with CNNs. In cases where additional samples cannot easily be collected, a common approach is to generate more data points from existing data using an augmentation technique. In image classification, many augmentation approaches utilize simple image manipulation algorithms. In this work, we propose some new methods for data augmentation based on several image transformations: the Fourier transform (FT), the Radon transform (RT), and the discrete cosine transform (DCT). These and other data augmentation methods are considered in order to quantify their effectiveness in creating ensembles of neural networks. The novelty of this research is to consider different strategies for data augmentation to generate training sets from which to train several classifiers which are combined into an ensemble. Specifically, the idea is to create an ensemble based on a kind of bagging of the training set, where each model is trained on a different training set obtained by augmenting the original training set with different approaches. We build ensembles on the data level by adding images generated by combining fourteen augmentation approaches, with three based on FT, RT, and DCT, proposed here for the first time. Pretrained ResNet50 networks are finetuned on training sets that include images derived from each augmentation method. These networks and several fusions are evaluated and compared across eleven benchmarks. Results show that building ensembles on the data level by combining different data augmentation methods produce classifiers that not only compete competitively against the state-of-the-art but often surpass the best approaches reported in the literature.
AB - A problem with convolutional neural networks (CNNs) is that they require large datasets to obtain adequate robustness; on small datasets, they are prone to overfitting. Many methods have been proposed to overcome this shortcoming with CNNs. In cases where additional samples cannot easily be collected, a common approach is to generate more data points from existing data using an augmentation technique. In image classification, many augmentation approaches utilize simple image manipulation algorithms. In this work, we propose some new methods for data augmentation based on several image transformations: the Fourier transform (FT), the Radon transform (RT), and the discrete cosine transform (DCT). These and other data augmentation methods are considered in order to quantify their effectiveness in creating ensembles of neural networks. The novelty of this research is to consider different strategies for data augmentation to generate training sets from which to train several classifiers which are combined into an ensemble. Specifically, the idea is to create an ensemble based on a kind of bagging of the training set, where each model is trained on a different training set obtained by augmenting the original training set with different approaches. We build ensembles on the data level by adding images generated by combining fourteen augmentation approaches, with three based on FT, RT, and DCT, proposed here for the first time. Pretrained ResNet50 networks are finetuned on training sets that include images derived from each augmentation method. These networks and several fusions are evaluated and compared across eleven benchmarks. Results show that building ensembles on the data level by combining different data augmentation methods produce classifiers that not only compete competitively against the state-of-the-art but often surpass the best approaches reported in the literature.
KW - Convolutional neural networks
KW - Data augmentation
KW - Deep learning
KW - Ensemble
U2 - 10.1007/s00521-022-07645-z
DO - 10.1007/s00521-022-07645-z
M3 - Article
AN - SCOPUS:85136590045
SN - 0941-0643
VL - 34
SP - 22345
EP - 22356
JO - Neural Computing and Applications
JF - Neural Computing and Applications
IS - 24
ER -