TY - GEN
T1 - Deep Local Feature Matching Image Anomaly Detection with Patch Adaptive Average Pooling Technique
AU - Dini, Afshin
AU - Rahtu, Esa
PY - 2025/2
Y1 - 2025/2
N2 - We present a new visual defect detection approach based on a deep feature-matching model and a patch-adaptive technique. The main idea is to utilize a pre-trained feature-matching model to identify the training sample(s) most similar to each test sample. By applying patch-adaptive average pooling to the extracted features and defining an anomaly map via a pixel-wise Mahalanobis distance between the normal and test features, anomalies can be detected effectively. Evaluating our method on the MVTec dataset shows that it has several advantages over similar techniques: (1) it skips the training phase and the difficulties of fine-tuning model parameters that may vary from one dataset to another, (2) it performs well on datasets with only a few training samples, reducing the cost of collecting large training datasets in real-world applications, (3) it automatically adjusts itself without compromising performance under data-domain shifts, and (4) its performance surpasses that of similar state-of-the-art methods.
AB - We present a new visual defect detection approach based on a deep feature-matching model and a patch-adaptive technique. The main idea is to utilize a pre-trained feature-matching model to identify the training sample(s) most similar to each test sample. By applying patch-adaptive average pooling to the extracted features and defining an anomaly map via a pixel-wise Mahalanobis distance between the normal and test features, anomalies can be detected effectively. Evaluating our method on the MVTec dataset shows that it has several advantages over similar techniques: (1) it skips the training phase and the difficulties of fine-tuning model parameters that may vary from one dataset to another, (2) it performs well on datasets with only a few training samples, reducing the cost of collecting large training datasets in real-world applications, (3) it automatically adjusts itself without compromising performance under data-domain shifts, and (4) its performance surpasses that of similar state-of-the-art methods.
U2 - 10.5220/0013136400003912
DO - 10.5220/0013136400003912
M3 - Conference contribution
VL - 2
T3 - VISIGRAPP
SP - 332
EP - 339
BT - Proceedings of the 20th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications
PB - SCITEPRESS
T2 - International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications
Y2 - 26 February 2025 through 28 February 2025
ER -