TY - JOUR
T1 - Physical Color Calibration of Digital Pathology Scanners for Robust Artificial Intelligence–Assisted Cancer Diagnosis
AU - Ji, Xiaoyi
AU - Salmon, Richard
AU - Mulliqi, Nita
AU - Khan, Umair
AU - Wang, Yinxi
AU - Blilie, Anders
AU - Olsson, Henrik
AU - Pedersen, Bodil Ginnerup
AU - Sørensen, Karina Dalsgaard
AU - Ulhøi, Benedicte Parm
AU - Kjosavik, Svein R.
AU - Janssen, Emilius A.M.
AU - Rantalainen, Mattias
AU - Egevad, Lars
AU - Ruusuvuori, Pekka
AU - Eklund, Martin
AU - Kartasalo, Kimmo
N1 - Publisher Copyright: © 2025 The Authors
PY - 2025/5
Y1 - 2025/5
N2 - The potential of artificial intelligence (AI) in digital pathology is limited by technical inconsistencies in the production of whole slide images (WSIs). This causes degraded AI performance and poses a challenge for widespread clinical application, as fine-tuning algorithms for each site is impractical. Changes in the imaging workflow can also compromise diagnostic accuracy and patient safety. Physical color calibration of scanners, relying on a biomaterial-based calibrant slide and a spectrophotometric reference measurement, has been proposed for standardizing WSI appearance, but its impact on AI performance has not been investigated. We evaluated whether physical color calibration can enable robust AI performance. We trained fully supervised and foundation model–based AI systems for detecting and Gleason grading prostate cancer using WSIs of prostate biopsies from the STHLM3 clinical trial (n = 3651) and evaluated their performance in 3 external cohorts (n = 1161) with and without calibration. With physical color calibration, the fully supervised system's concordance with pathologists’ grading (Cohen linearly weighted κ) improved from 0.439 to 0.619 in the Stavanger University Hospital cohort (n = 860), from 0.354 to 0.738 in the Karolinska University Hospital cohort (n = 229), and from 0.423 to 0.452 in the Aarhus University Hospital cohort (n = 72). The foundation model's concordance improved as follows: from 0.739 to 0.760 (Karolinska), from 0.424 to 0.459 (Aarhus), and from 0.547 to 0.670 (Stavanger). This study demonstrated that physical color calibration provides a potential solution to the variation introduced by different scanners, making AI-based cancer diagnostics more reliable and applicable in diverse clinical settings.
KW - artificial intelligence
KW - color calibration
KW - computational pathology
KW - foundation model
KW - prostate cancer
KW - whole slide scanning
U2 - 10.1016/j.modpat.2025.100715
DO - 10.1016/j.modpat.2025.100715
M3 - Article
C2 - 39826798
AN - SCOPUS:85216885872
SN - 0893-3952
VL - 38
JO - Modern Pathology
JF - Modern Pathology
IS - 5
M1 - 100715
ER -