TY - GEN
T1 - Sepsis Prediction in Intensive Care Unit Using Ensemble of XGboost Models
AU - Zabihi, Morteza
AU - Kiranyaz, Serkan
AU - Gabbouj, Moncef
N1 - EXT="Kiranyaz, Serkan"
PY - 2019/9/1
Y1 - 2019/9/1
N2 - Sepsis is caused by a dysregulated host response to infection and is potentially the main cause of 6 million deaths annually. It is a highly dynamic syndrome, and therefore early prediction of sepsis plays a key role in reducing its high associated mortality. However, this is a challenging task because there is no specific and accurate test or scoring system for early prediction. In this paper, we present a systematic approach for sepsis prediction. We also propose a new set of features to model the missingness in clinical data. The pipeline of the proposed method comprises three major components: feature extraction, feature selection, and classification. In total, 407 features are extracted from the clinical data. Then, five different sets of features are selected using a wrapper feature selection algorithm based on XGboost. The selected features are extracted from both valid and missing clinical data. Afterwards, an ensemble model consisting of five XGboost models is used for sepsis prediction. The proposed algorithm officially ranked third in the PhysioNet/Computing in Cardiology Challenge 2019, with an overall utility score of 0.339 on the unseen test dataset (our team name: Separatrix).
AB - Sepsis is caused by a dysregulated host response to infection and is potentially the main cause of 6 million deaths annually. It is a highly dynamic syndrome, and therefore early prediction of sepsis plays a key role in reducing its high associated mortality. However, this is a challenging task because there is no specific and accurate test or scoring system for early prediction. In this paper, we present a systematic approach for sepsis prediction. We also propose a new set of features to model the missingness in clinical data. The pipeline of the proposed method comprises three major components: feature extraction, feature selection, and classification. In total, 407 features are extracted from the clinical data. Then, five different sets of features are selected using a wrapper feature selection algorithm based on XGboost. The selected features are extracted from both valid and missing clinical data. Afterwards, an ensemble model consisting of five XGboost models is used for sepsis prediction. The proposed algorithm officially ranked third in the PhysioNet/Computing in Cardiology Challenge 2019, with an overall utility score of 0.339 on the unseen test dataset (our team name: Separatrix).
U2 - 10.23919/CinC49843.2019.9005564
DO - 10.23919/CinC49843.2019.9005564
M3 - Conference contribution
AN - SCOPUS:85081132099
T3 - Computing in Cardiology
BT - 2019 Computing in Cardiology, CinC 2019
PB - IEEE Computer Society
T2 - Computing in Cardiology
Y2 - 8 September 2019 through 11 September 2019
ER -