TY - GEN
T1 - Interpretable classifiers for tabular data via feature selection and discretization
AU - Jaakkola, Reijo
AU - Janhunen, Tomi
AU - Kuusisto, Antti
AU - Feyzbakhsh Rankooh, Masood
AU - Vilander, Miikka
N1 - Publisher Copyright: © 2024 Copyright for this paper by its authors.
PY - 2024
Y1 - 2024
AB - We introduce a method for computing immediately human interpretable yet accurate classifiers from tabular data. The classifiers obtained are short Boolean formulas, computed via first discretizing the original data and then using feature selection coupled with a very fast algorithm for producing the best possible Boolean classifier for the setting. We demonstrate the approach via 12 experiments, obtaining results with accuracies comparable to ones obtained via random forests, XGBoost, and existing results for the same datasets in the literature. In most cases, the accuracy of our method is in fact similar to that of the reference methods, even though the main objective of our study is the immediate interpretability of our classifiers. We also prove a new result on the probability that the classifier we obtain from real-life data corresponds to the ideally best classifier with respect to the background distribution the data comes from.
KW - Boolean logic
KW - Interpretable AI
KW - Overfitting
M3 - Conference contribution
AN - SCOPUS:85211135851
T3 - CEUR Workshop Proceedings
BT - DAO-XAI 2024: Data meets Ontologies in Explainable AI 2024
PB - CEUR-WS
T2 - International Workshop on Data meets Ontologies in Explainable AI
Y2 - 19 October 2024 through 19 October 2024
ER -