Method Details

Details for method 'ESPNet'


Method overview

name ESPNet
challenge pixel-level semantic labeling
details We introduce a fast and efficient convolutional neural network, ESPNet, for semantic segmentation of high-resolution images under resource constraints. ESPNet is based on a new convolutional module, efficient spatial pyramid (ESP), which is efficient in terms of computation, memory, and power. ESPNet is 22 times faster (on a standard GPU) and 180 times smaller than the state-of-the-art semantic segmentation network PSPNet, while its category-wise accuracy is only 8% lower. We evaluated ESPNet on a variety of semantic segmentation datasets, including Cityscapes, PASCAL VOC, and a breast biopsy whole slide image dataset. Under the same constraints on memory and computation, ESPNet outperforms all the current efficient CNNs, such as MobileNet, ShuffleNet, and ENet, on both standard metrics and our newly introduced performance metrics that measure efficiency on edge devices. Our network can process high-resolution images at a rate of 112 and 9 frames per second on a standard GPU and an edge device, respectively.
publication ESPNet: Efficient Spatial Pyramid of Dilated Convolutions for Semantic Segmentation
Sachin Mehta, Mohammad Rastegari, Anat Caspi, Linda Shapiro, and Hannaneh Hajishirzi
project page / code
used Cityscapes data fine annotations
used external data
runtime 0.0089 s
subsampling 2
submission date January, 2018
previous submissions
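
The reported runtime can be related to the frame rate quoted in the method description. The snippet below is a minimal sketch of that arithmetic, assuming the 0.0089 s figure is the per-image inference time on the GPU (this page does not state the hardware). The helper name `frames_per_second` is introduced here for illustration only.

```python
# Convert the reported per-image runtime into frames per second.
# Assumption: "runtime 0.0089 s" is the time to process one image.
runtime_s = 0.0089


def frames_per_second(seconds_per_frame: float) -> float:
    """Return the throughput implied by a per-frame processing time."""
    if seconds_per_frame <= 0:
        raise ValueError("seconds_per_frame must be positive")
    return 1.0 / seconds_per_frame


fps = frames_per_second(runtime_s)
print(f"{fps:.1f} frames per second")
```

Under that assumption the result rounds to the 112 frames per second quoted for a standard GPU in the description.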


Average results

Metric           Value
IoU Classes      60.336
iIoU Classes     31.82
IoU Categories   82.178
iIoU Categories  63.0655


Class results

Class          IoU      iIoU
road           95.6812  -
sidewalk       73.2892  -
building       86.6022  -
wall           32.7898  -
fence          36.4273  -
pole           47.0647  -
traffic light  46.9215  -
traffic sign   55.4068  -
vegetation     89.8251  -
terrain        65.9625  -
sky            92.465   -
person         68.4789  45.8096
rider          45.8364  19.1643
car            89.9046  81.6755
truck          40.0044  15.1705
bus            47.7336  24.346
train          40.6992  16.7607
motorcycle     36.4026  16.1571
bicycle        54.8881  35.4765
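
The per-class values appear to average to the "IoU Classes" and "iIoU Classes" figures in the average results. The following sketch checks this, assuming both averages are plain unweighted means and that classes marked "-" are simply excluded from the iIoU mean; the variable names are illustrative and not part of the benchmark tooling.

```python
from statistics import mean

# (class, IoU, iIoU) copied from the class results table.
# None stands for the "-" entries, i.e. classes without an iIoU score.
class_results = [
    ("road", 95.6812, None),
    ("sidewalk", 73.2892, None),
    ("building", 86.6022, None),
    ("wall", 32.7898, None),
    ("fence", 36.4273, None),
    ("pole", 47.0647, None),
    ("traffic light", 46.9215, None),
    ("traffic sign", 55.4068, None),
    ("vegetation", 89.8251, None),
    ("terrain", 65.9625, None),
    ("sky", 92.465, None),
    ("person", 68.4789, 45.8096),
    ("rider", 45.8364, 19.1643),
    ("car", 89.9046, 81.6755),
    ("truck", 40.0044, 15.1705),
    ("bus", 47.7336, 24.346),
    ("train", 40.6992, 16.7607),
    ("motorcycle", 36.4026, 16.1571),
    ("bicycle", 54.8881, 35.4765),
]

# Unweighted mean over all 19 classes.
mean_iou = mean(iou for _, iou, _ in class_results)

# Unweighted mean over the 8 classes that report an iIoU value.
mean_iiou = mean(iiou for _, _, iiou in class_results if iiou is not None)

print(f"IoU Classes:  {mean_iou:.3f}")
print(f"iIoU Classes: {mean_iiou:.2f}")
```

With these inputs the two means agree with the reported 60.336 and 31.82 after rounding.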


Category results

Category      IoU      iIoU
flat          95.4936  -
nature        89.4648  -
object        52.9433  -
sky           92.465   -
construction  86.6703  -
human         69.763   47.0905
vehicle       88.446   79.0405
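
The category-level averages can be cross-checked against the rows above. This is a small sketch under the assumption that "IoU Categories" and "iIoU Categories" are unweighted means over the categories that report a value; the dictionary names are hypothetical.

```python
# Category scores copied from the category results table.
category_iou = {
    "flat": 95.4936,
    "nature": 89.4648,
    "object": 52.9433,
    "sky": 92.465,
    "construction": 86.6703,
    "human": 69.763,
    "vehicle": 88.446,
}

# Only "human" and "vehicle" report an iIoU value; the rest are "-".
category_iiou = {
    "human": 47.0905,
    "vehicle": 79.0405,
}

avg_category_iou = sum(category_iou.values()) / len(category_iou)
avg_category_iiou = sum(category_iiou.values()) / len(category_iiou)

print(f"IoU Categories:  {avg_category_iou:.3f}")
print(f"iIoU Categories: {avg_category_iiou:.4f}")
```

Both values match the reported 82.178 and 63.0655 to the precision shown.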


