TY - GEN
T1 - When an Explanation is not Enough
T2 - Mediterranean Conference on Medical and Biological Engineering and Computing and International Conference on Medical and Biological Engineering
AU - Pietilä, Essi
AU - Moreno-Sánchez, Pedro A.
PY - 2024
Y1 - 2024
AB - Despite the promising advantages in diagnostics and treatment that Artificial Intelligence (AI) and Machine Learning (ML) can bring to the healthcare domain, the complexity and black-box behavior of AI/ML algorithms hinder their adoption by healthcare professionals and patients due to issues regarding the explainability and trustworthiness of the results. Explainable AI (XAI) has emerged to support the need for understanding the outputs of AI/ML models and is expected to have substantial relevance to the success of these models within the healthcare domain. Nevertheless, the information provided by XAI systems might not be enough to generate the required trustworthiness in the models. Thus, tools and metrics that allow domain experts and stakeholders to evaluate the explanations arise as a needed solution. At the moment, there is an obvious lack of standardization and validation of metrics, and researchers require studies that compile the metrics together to establish what should be measured, and how and why. This paper aims to provide an overview of the current metrics for evaluating XAI systems, with a particular view on the healthcare domain. From the metrics identified and reviewed by following the PRISMA methodology, we present a taxonomy in which certain aspects are considered, such as the domain (general or healthcare) of the metric, as well as whether the expert is included in the validation process (human-in-the-loop). From our results, we observed that many metrics developed in the general domain are being used for clinical XAI models. Nevertheless, it is essential to evaluate XAI models in a more domain-specific manner, particularly because medical experts have valuable specialist information about the use cases that computer scientists might lack.
DO - 10.1007/978-3-031-49062-0_60
M3 - Conference contribution
SN - 978-3-031-49061-3
VL - 1
T3 - IFMBE Proceedings
SP - 573
EP - 584
BT - MEDICON’23 and CMBEBIH’23 - Proceedings of the Mediterranean Conference on Medical and Biological Engineering and Computing MEDICON and International Conference on Medical and Biological Engineering CMBEBIH—Volume 1
A2 - Badnjević, Almir
A2 - Gurbeta Pokvić, Lejla
PB - Springer
CY - Cham
Y2 - 14 September 2023 through 16 September 2023
ER -