Table 3

Performance metrics for each machine learning approach with versus without the Synthetic Minority Oversampling Technique (SMOTE)

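| Machine learning approach | F1 (No SMOTE) | F1 (SMOTE) | Accuracy (No SMOTE) | Accuracy (SMOTE) | Precision (No SMOTE) | Precision (SMOTE) | Recall (No SMOTE) | Recall (SMOTE) | AUC (No SMOTE) | AUC (SMOTE) |
|---|---|---|---|---|---|---|---|---|---|---|
| Logistic regression | 0.473 | 0.542 | 0.806 | 0.749 | 0.643 | 0.473 | 0.379 | 0.644 | 0.794 | 0.766 |
| Balanced random forest classifier | 0.747 | 0.847 | 0.863 | 0.933 | 0.656 | 0.891 | 0.874 | 0.813 | 0.936 | 0.959 |
| Balanced bagging classifier | 0.803 | 0.841 | 0.901 | 0.931 | 0.752 | 0.887 | 0.869 | 0.806 | 0.942 | 0.959 |
| Random forest classifier | 0.797 | 0.847 | 0.919 | 0.932 | 0.934 | 0.884 | 0.701 | 0.818 | 0.957 | 0.959 |
| Multilayer perceptron classifier | 0.399 | 0.505 | 0.802 | 0.712 | 0.690 | 0.440 | 0.301 | 0.638 | 0.777 | 0.759 |
| Support vector classifier | 0.475 | 0.449 | 0.724 | 0.653 | 0.436 | 0.603 | 0.531 | 0.603 | 0.727 | 0.707 |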
  • Values in green font signify an improvement in the given metric when SMOTE is used. Values in red font signify a decrease in the given metric when SMOTE is used.

  • AUC, area under the curve; SMOTE, Synthetic Minority Oversampling Technique.
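
The improvement and decrease labels described in the table notes can be recomputed directly from the reported values. The following is a minimal Python sketch, not the authors' code: the names `METRICS`, `TABLE_3`, `smote_effect`, and `summarize` are hypothetical, and the only data used are the numbers in Table 3.

```python
# Table 3 values as (no SMOTE, SMOTE) pairs, ordered as in METRICS.
# All names here are illustrative assumptions, not taken from the study.
METRICS = ("F1", "Accuracy", "Precision", "Recall", "AUC")

TABLE_3 = {
    "Logistic regression": [
        (0.473, 0.542), (0.806, 0.749), (0.643, 0.473), (0.379, 0.644), (0.794, 0.766)],
    "Balanced random forest classifier": [
        (0.747, 0.847), (0.863, 0.933), (0.656, 0.891), (0.874, 0.813), (0.936, 0.959)],
    "Balanced bagging classifier": [
        (0.803, 0.841), (0.901, 0.931), (0.752, 0.887), (0.869, 0.806), (0.942, 0.959)],
    "Random forest classifier": [
        (0.797, 0.847), (0.919, 0.932), (0.934, 0.884), (0.701, 0.818), (0.957, 0.959)],
    "Multilayer perceptron classifier": [
        (0.399, 0.505), (0.802, 0.712), (0.690, 0.440), (0.301, 0.638), (0.777, 0.759)],
    "Support vector classifier": [
        (0.475, 0.449), (0.724, 0.653), (0.436, 0.603), (0.531, 0.603), (0.727, 0.707)],
}


def smote_effect(pair):
    """Label the change in one metric when SMOTE is applied."""
    no_smote, smote = pair
    # Round to the table's three decimals to avoid floating-point noise.
    delta = round(smote - no_smote, 3)
    if delta > 0:
        return "improved"
    if delta < 0:
        return "decreased"
    return "unchanged"


def summarize(table):
    """Map each model to a {metric: effect} dictionary."""
    return {
        model: {metric: smote_effect(pair) for metric, pair in zip(METRICS, pairs)}
        for model, pairs in table.items()
    }


if __name__ == "__main__":
    for model, effects in summarize(TABLE_3).items():
        print(model, effects)
```

Run as a script, this prints one line per model with the direction of change for each metric, which corresponds to the green ("improved") and red ("decreased") coding described in the notes.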