Method Details

Details for method 'Dilation10'

Method overview

name Dilation10
challenge pixel-level semantic labeling
details Dilation10 is a convolutional network that consists of a front-end prediction module and a context aggregation module. Both are described in the paper. The combined network was trained jointly. The context module consists of 10 layers, each of which has C=19 feature maps. The larger number of layers in the context module (10 for Cityscapes versus 8 for Pascal VOC) is due to the high input resolution. The Dilation10 model is a pure convolutional network: there is no CRF and no structured prediction. Dilation10 can therefore be used as the baseline input for structured prediction models. Note that the reported results were produced by training on the training set only; the network was not retrained on train+val.
publication Multi-Scale Context Aggregation by Dilated Convolutions. Fisher Yu and Vladlen Koltun. ICLR 2016. http://arxiv.org/abs/1511.07122
project page / code https://github.com/fyu/dilation
used Cityscapes data fine annotations
used external data ImageNet
runtime 4 s (Titan X)
subsampling no
submission date April, 2016
previous submissions
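
The details field above describes a context module built from dilated convolutions. As a rough illustration of the general operation only (this is not the authors' code, and the function names, kernel, and dilation values are hypothetical choices for the example), a 1-D dilated convolution and the receptive field of a stack of such layers might be sketched in Python as:

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """Valid-mode 1-D dilated convolution (cross-correlation form, no padding).

    Kernel taps are spaced `dilation` samples apart, so the same number of
    weights covers a wider stretch of the input.
    """
    span = (len(kernel) - 1) * dilation  # distance between first and last tap
    out = []
    for start in range(len(signal) - span):
        out.append(sum(w * signal[start + k * dilation]
                       for k, w in enumerate(kernel)))
    return out


def receptive_field(kernel_size, dilations):
    """Receptive field (in samples) of stacked stride-1 dilated convolutions."""
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf


# Illustrative values only; these are not the dilation rates of Dilation10.
print(dilated_conv1d([1, 2, 3, 4, 5], [1, 1, 1], dilation=1))  # [6, 9, 12]
print(dilated_conv1d([1, 2, 3, 4, 5], [1, 1, 1], dilation=2))  # [9]
print(receptive_field(3, [1, 2, 4]))                           # 15
```

The sketch shows why stacking layers with growing dilation aggregates context cheaply: each layer adds (kernel_size - 1) * dilation samples to the receptive field without adding weights. The actual layer configuration of Dilation10 is given in the paper and the linked code.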

 

Average results

Metric Value
IoU Classes 67.1216
iIoU Classes 41.9734
IoU Categories 86.5058
iIoU Categories 71.1055

 

Class results

Class IoU iIoU
road 97.5824 -
sidewalk 79.2017 -
building 89.8566 -
wall 37.274 -
fence 47.6238 -
pole 53.1702 -
traffic light 58.5565 -
traffic sign 65.2286 -
vegetation 91.8275 -
terrain 69.3912 -
sky 93.652 -
person 78.9032 56.2692
rider 54.9755 34.5291
car 93.3365 85.7596
truck 45.4812 21.8373
bus 53.3869 32.7484
train 47.6778 27.5686
motorcycle 52.1536 27.9548
bicycle 66.0307 49.1203

 

Category results

Category IoU iIoU
flat 98.2618 -
nature 91.4171 -
object 60.4657 -
sky 93.652 -
construction 90.1582 -
human 79.75 58.3451
vehicle 91.8355 83.8659
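
The values under "Average results" appear to be plain unweighted means of the per-class and per-category scores listed above, with iIoU averaged only over the entries that report it. This is an observation from the numbers on this page, not a statement about the official evaluation code. A quick check in Python, with hypothetical variable names:

```python
# Per-class and per-category scores copied from the tables on this page.
class_iou = {
    "road": 97.5824, "sidewalk": 79.2017, "building": 89.8566,
    "wall": 37.274, "fence": 47.6238, "pole": 53.1702,
    "traffic light": 58.5565, "traffic sign": 65.2286,
    "vegetation": 91.8275, "terrain": 69.3912, "sky": 93.652,
    "person": 78.9032, "rider": 54.9755, "car": 93.3365,
    "truck": 45.4812, "bus": 53.3869, "train": 47.6778,
    "motorcycle": 52.1536, "bicycle": 66.0307,
}
# iIoU is reported only for the human and vehicle classes.
class_iiou = {
    "person": 56.2692, "rider": 34.5291, "car": 85.7596,
    "truck": 21.8373, "bus": 32.7484, "train": 27.5686,
    "motorcycle": 27.9548, "bicycle": 49.1203,
}
category_iou = {
    "flat": 98.2618, "nature": 91.4171, "object": 60.4657,
    "sky": 93.652, "construction": 90.1582, "human": 79.75,
    "vehicle": 91.8355,
}
category_iiou = {"human": 58.3451, "vehicle": 83.8659}


def mean(values):
    """Unweighted arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


averages = {
    "IoU Classes": mean(class_iou.values()),
    "iIoU Classes": mean(class_iiou.values()),
    "IoU Categories": mean(category_iou.values()),
    "iIoU Categories": mean(category_iiou.values()),
}

for name, value in averages.items():
    print(f"{name}: {value:.4f}")
# IoU Classes: 67.1216
# iIoU Classes: 41.9734
# IoU Categories: 86.5058
# iIoU Categories: 71.1055
```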

 
