Flow Cytometry-Based Classification in Cancer Research: A View on Feature Selection

    Research output: Contribution to journal › Article › Scientific › peer-review


    Abstract

    In this paper, we study the problem of feature selection in cancer-related machine learning tasks. In particular, we study the accuracy and stability of different feature selection approaches within simplistic machine learning pipelines. Earlier studies have shown that, in certain cases, detection accuracy can easily reach 100% given enough training data. Here, however, we concentrate on simplifying the classification models and seek feature selection approaches that remain reliable even with extremely small sample sizes. We show that as much as 50% of the features can be discarded without compromising prediction accuracy. Moreover, we study the model selection problem along the ℓ₁ regularization path of logistic regression classifiers. To this end, we compare a more traditional cross-validation approach with a recently proposed Bayesian error estimator.
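
    As a rough illustration of the model selection problem described in the abstract, the sketch below fits ℓ₁-regularized logistic regression at several penalty strengths, counts the features that survive at each point on the path, and picks a penalty by plain k-fold cross-validation. Everything in it is an assumption made for illustration and is not taken from the paper: the synthetic data (one informative feature among six), the proximal-gradient solver with a fixed step size and no intercept, the penalty grid, and all function names. The Bayesian error estimator mentioned in the abstract is not sketched here.

```python
import math
import random


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(v, t):
    # Proximal operator of t * |v|; this is what produces exact zeros.
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def dot(w, x):
    return sum(wj * xj for wj, xj in zip(w, x))


def fit_l1_logreg(X, y, lam, step=0.1, n_iter=200):
    # Proximal gradient descent on mean logistic loss + lam * ||w||_1.
    # Hypothetical helper: no intercept, fixed step size, fixed iteration count.
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(n_iter):
        grad = [0.0] * d
        for xi, yi in zip(X, y):
            r = sigmoid(dot(w, xi)) - yi
            for j in range(d):
                grad[j] += r * xi[j] / n
        w = [soft_threshold(wj - step * gj, step * lam) for wj, gj in zip(w, grad)]
    return w


def accuracy(w, X, y):
    # Predict class 1 when the linear score is positive.
    hits = sum((dot(w, xi) > 0) == (yi == 1) for xi, yi in zip(X, y))
    return hits / len(X)


def cv_accuracy(X, y, lam, k=5):
    # Plain k-fold cross-validation with interleaved folds.
    scores = []
    for fold in range(k):
        train = [i for i in range(len(X)) if i % k != fold]
        test = [i for i in range(len(X)) if i % k == fold]
        w = fit_l1_logreg([X[i] for i in train], [y[i] for i in train], lam)
        scores.append(accuracy(w, [X[i] for i in test], [y[i] for i in test]))
    return sum(scores) / k


# Synthetic small-sample data (an assumption): feature 0 carries the class
# signal, the remaining features are pure noise.
rng = random.Random(0)
n_samples, n_features = 40, 6
y = [i % 2 for i in range(n_samples)]
X = [[rng.gauss(0.0, 1.0) + (1.5 * (2 * yi - 1) if j == 0 else 0.0)
      for j in range(n_features)] for yi in y]

# Regularization path from strong to weak penalty (grid chosen arbitrarily).
lambdas = [2.0, 0.3, 0.05, 0.005]
path = {lam: fit_l1_logreg(X, y, lam) for lam in lambdas}
n_selected = {lam: sum(1 for wj in w if wj != 0.0) for lam, w in path.items()}

# Model selection along the path by cross-validated accuracy.
cv_scores = {lam: cv_accuracy(X, y, lam) for lam in lambdas}
best_lam = max(lambdas, key=lambda lam: cv_scores[lam])

for lam in lambdas:
    print(lam, n_selected[lam], round(cv_scores[lam], 3))
print("selected penalty:", best_lam)
```

    The soft-thresholding step is what ties the penalty strength to the number of retained features: a stronger penalty zeroes more weights, so each point on the path corresponds to a candidate feature subset. With very few samples the cross-validated scores themselves become noisy, which is the setting where the abstract compares cross-validation against an alternative error estimator.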
    Original language: English
    Pages (from-to): 75-85
    Journal: Cancer Informatics
    Volume: 2015
    Issue number: Suppl. 5
    Publication status: Published - 2016
    Publication type: A1 Journal article-refereed

    Publication forum classification

    • Publication forum level 1
