Evaluation metrics

We evaluate the performance of the models using the following metrics: Accuracy (Acc), Sensitivity (Sens), Specificity (Spec), F1 score (F1), and Area Under the Curve (AUC).
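As a concrete illustration, the sketch below (an assumption on our part, not the benchmark's released evaluation code) computes these metrics with scikit-learn from binary ground-truth labels and predicted malignancy probabilities; Acc, Sens, Spec, and AUC are scaled to percentages and F1 is kept on a 0–1 scale, matching the tables below.

```python
# Minimal sketch, assuming a binary (benign = 0 / malignant = 1) classification task,
# ground-truth labels, predicted probabilities, and scikit-learn. Illustrative only,
# not the benchmark's own evaluation script.
import numpy as np
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             recall_score, roc_auc_score)

def evaluate(y_true, y_prob, threshold=0.5):
    """Return Acc, Sens, Spec, AUC (in %) and F1 (0-1) for one model."""
    y_true = np.asarray(y_true)
    y_pred = (np.asarray(y_prob) >= threshold).astype(int)
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    return {
        "Acc":  100 * accuracy_score(y_true, y_pred),   # (TP + TN) / all cases
        "Sens": 100 * recall_score(y_true, y_pred),     # TP / (TP + FN)
        "Spec": 100 * tn / (tn + fp),                   # TN / (TN + FP)
        "F1":   f1_score(y_true, y_pred),               # harmonic mean of precision and recall
        "AUC":  100 * roc_auc_score(y_true, y_prob),    # area under the ROC curve
    }

# Example usage with dummy predictions:
# evaluate(y_true=[0, 1, 1, 0, 1], y_prob=[0.2, 0.9, 0.6, 0.4, 0.3])
```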

Ranking of single-task approaches

The single-task approaches are trained on the benchmark dataset of 3,641 BUS images.

Rank  Approach         Acc (%)  Sens (%)  Spec (%)  F1    AUC (%)
1     VGG16            74.5     86.7      62.6      0.77  74.7
2     MobileNet        74.0     87.4      61.3      0.77  74.4
3     Xception         73.7     88.5      59.6      0.77  74.0
4     EfficientNetB0   73.8     86.8      61.2      0.77  74.0
5     InceptionV3      73.0     88.4      57.6      0.77  73.0
6     DenseNet121      72.7     90.1      55.7      0.77  72.9
7     Tanaka           77.8     74.6      81.2      0.76  77.9
8     ResNet50         72.6     86.2      59.4      0.76  72.8
9     Shia             74.6     75.5      74.0      0.75  74.7
10    Xie              62.2     48.6      75.8      0.55  62.2

Ranking of multi-task approaches

The multi-task approaches are trained on the combined BUSI and BUSIS datasets, totaling 1,209 BUS images.

Rank  Approach         Acc (%)  Sens (%)  Spec (%)  F1    AUC (%)
1     MT-ESTAN (ours)  90.0     90.4      89.8      0.88  90.1
2     MobileNet        87.0     81.1      91.0      0.83  86.1
3     VGG16            87.1     81.3      90.9      0.83  86.1
4     EfficientNetB0   87.5     81.0      91.2      0.83  86.1
5     Zhang            87.4     81.4      91.4      0.83  86.4
6     ResNet50         86.1     80.9      89.2      0.81  85.0
7     DenseNet121      85.0     79.1      88.9      0.80  84.0
8     Shi              83.9     87.3      81.7      0.80  84.5
9     Vakanski         83.6     77.4      87.8      0.78  82.6