Machine learning-based test smell detection

  • Valeria Pontillo*
  • Dario Amoroso d’Aragona
  • Fabiano Pecorelli
  • Dario Di Nucci
  • Filomena Ferrucci
  • Fabio Palomba

*Corresponding author for this work

Research output: Contribution to journal, Article, Scientific, peer-review

13 Citations (Scopus)
12 Downloads (Pure)

Abstract

Test smells are symptoms of sub-optimal design choices adopted when developing test cases. Previous studies have shown that they harm test code maintainability and effectiveness. Therefore, researchers have been proposing automated, heuristic-based techniques to detect them. However, the performance of these detectors is still limited and depends on tunable thresholds. We design and experiment with a novel test smell detection approach based on machine learning to detect four test smells. First, we develop the largest dataset of manually validated test smells to enable experimentation. Afterward, we train six machine learners and assess their capabilities in within- and cross-project scenarios. Finally, we compare the ML-based approach with state-of-the-art heuristic-based techniques. The key findings of the study report a negative result: the machine learning-based detector performs significantly better than heuristic-based techniques, but none of the learners was able to exceed an average F-Measure of 51%. We further elaborate on the reasons behind this negative result through a qualitative investigation into the current issues and challenges that prevent the appropriate detection of test smells, which allowed us to catalog the next steps that the research community may pursue to improve test smell detection techniques.
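
The abstract reports results as an average F-Measure and distinguishes within-project from cross-project evaluation. As a rough illustration of these two notions, the Python sketch below computes the F-Measure for a binary "smelly / not smelly" prediction and builds cross-project splits by holding out one project at a time. It is not the authors' pipeline: the function names, the (project, features, label) data layout, and the leave-one-project-out scheme are assumptions made for this example.

```python
from collections import defaultdict


def f_measure(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive (smelly) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)        # smelly, flagged
    fp = sum(1 for t, p in pairs if not t and p)    # clean, flagged
    fn = sum(1 for t, p in pairs if t and not p)    # smelly, missed
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def leave_one_project_out(samples):
    """Yield (held_out_project, train, test) splits.

    `samples` is a list of (project, features, label) tuples (hypothetical
    layout). Each project is used once as the test set while all remaining
    projects form the training set.
    """
    by_project = defaultdict(list)
    for sample in samples:
        by_project[sample[0]].append(sample)
    for project in sorted(by_project):
        test = by_project[project]
        train = [s for name, group in by_project.items()
                 if name != project for s in group]
        yield project, train, test
```

A within-project evaluation would instead split the test cases of a single project, for example with k-fold cross-validation, so that training and test data come from the same code base.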

Original language: English
Article number: 55
Journal: Empirical Software Engineering
Volume: 29
Issue number: 2
Publication status: Published - Mar 2024
Publication type: A1 Journal article-refereed

Funding

Fabio gratefully acknowledges the support of the Swiss National Science Foundation through the SNF project No. PZ00P2_186090. In addition, the work has been partially supported by the EMELIOT national research project, which the MUR has funded under the PRIN 2020 program (Contract 2020W3A5FY). Open access funding provided by Università degli Studi di Salerno within the CRUI-CARE Agreement.

Funders and funder numbers

  • Schweizerischer Nationalfonds zur Förderung der Wissenschaftlichen Forschung: 2020W3A5FY, PZ00P2_186090
  • Università degli Studi di Salerno

Keywords

  • Empirical software engineering
  • Machine learning
  • Test code quality
  • Test smells

Publication forum classification

  • Publication forum level 3

ASJC Scopus subject areas

  • Software
