Table 2 Comparison on the CamVid dataset [16] using 11 road scene categories (in percent)

From: Deep residual coalesced convolutional network for efficient semantic road segmentation

| Method | Sky | Building | Road | Sidewalk | Car | Pedestrian | Bicyclist | Tree | Fence | Column-pole | Sign-symbol | Class avg. | Class IoU |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Local label descriptor [1] | 88.8 | 80.7 | 98 | 12.4 | 16.4 | 1.09 | 0.07 | 61.5 | 0.05 | 4.13 | n/a | 36.3 | n/a |
| Boosting+pairwise CRF [2] | 94.7 | 70.7 | 94.1 | 79.3 | 74.4 | 45.7 | 23.1 | 70.8 | 37.2 | 13 | 55.9 | 59.9 | n/a |
| Boosting+detection+CRF [3] | 96.2 | 81.5 | 93.9 | 81.5 | 78.7 | 43 | 33.9 | 76.6 | 47.6 | 14.3 | 40.2 | 62.5 | n/a |
| Dense depth map [4] | 95.4 | 85.3 | **98.5** | 38.1 | 69.2 | 23.8 | 28.7 | 57.3 | 44.3 | 22 | 46.5 | 55.4 | n/a |
| Super parsing [5] | **96.9** | 87 | 95.9 | 70 | 62.7 | 14.7 | 19.4 | 67.1 | 17.9 | 1.7 | 30.1 | 51.2 | n/a |
| SegNet-basic [8] | 91.2 | 75 | 93.3 | 74.1 | **82.7** | 55 | 16 | 84.6 | 47.5 | 44.8 | 36.9 | 62 | 47.7 |
| SegNet [8] | 92.4 | **88.8** | 97.2 | 84.4 | 82.1 | 57.1 | 30.7 | **87.3** | 49.3 | 27.5 | 20.5 | 65.2 | 55.6 |
| ENet [9] | 95.1 | 74.7 | 95.1 | 86.7 | 82.4 | 67.2 | 34.1 | 77.8 | **51.7** | 35.4 | 51 | 68.3 | 51.3 |
| RCC-Net (sum) | 95.2 | 70.1 | 94.1 | 90.1 | 82.6 | **70.6** | 45.7 | 81.2 | 51 | 52.3 | 35.4 | 69.8 | 52.6 |
| RCC-Net (concatenated) | 94.3 | 71.8 | 92.6 | **92.7** | 79.3 | 57.7 | **65.6** | 80.5 | 35.7 | **57.4** | **59.4** | 71.5 | 53.3 |
Note: The bold values show the highest accuracy for each category.
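
The "Class avg." column can be cross-checked against the per-class figures. The sketch below assumes that "Class avg." is the unweighted mean of the 11 per-class accuracies; the table does not state this explicitly, but the two RCC-Net rows are consistent with it. The names `CLASSES`, `ROWS`, `class_average`, and `best_method` are illustrative only and do not come from the article.

```python
# Per-class accuracies (percent) copied from the two RCC-Net rows of Table 2.
CLASSES = ["Sky", "Building", "Road", "Sidewalk", "Car", "Pedestrian",
           "Bicyclist", "Tree", "Fence", "Column-pole", "Sign-symbol"]

ROWS = {
    "RCC-Net (sum)":
        [95.2, 70.1, 94.1, 90.1, 82.6, 70.6, 45.7, 81.2, 51.0, 52.3, 35.4],
    "RCC-Net (concatenated)":
        [94.3, 71.8, 92.6, 92.7, 79.3, 57.7, 65.6, 80.5, 35.7, 57.4, 59.4],
}


def class_average(per_class):
    """Unweighted mean of per-class accuracies, rounded to one decimal.

    Assumption: this is how the 'Class avg.' column is defined.
    """
    return round(sum(per_class) / len(per_class), 1)


def best_method(class_name):
    """Return the method in ROWS with the highest accuracy for one class."""
    idx = CLASSES.index(class_name)
    return max(ROWS, key=lambda method: ROWS[method][idx])


if __name__ == "__main__":
    for method, values in ROWS.items():
        print(method, class_average(values))
    print("Best on Bicyclist:", best_method("Bicyclist"))
```

Because the mean is unweighted, rare classes such as Bicyclist or Column-pole count as much as Road or Sky, which is why a method can lead on class average without leading on the most frequent categories.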