Method Details

Details for method 'DSANet: Dilated Spatial Attention for Real-time Semantic Segmentation in Urban Street Scenes'


Method overview

name DSANet: Dilated Spatial Attention for Real-time Semantic Segmentation in Urban Street Scenes
challenge pixel-level semantic labeling
details We present a computationally efficient network named DSANet, which follows a two-branch strategy to tackle real-time semantic segmentation in urban scenes. We first design a Context branch, which employs Depth-wise Asymmetric ShuffleNet (DAS) as its main building block to acquire sufficient receptive fields. In addition, we propose a dual attention module consisting of dilated spatial attention and channel attention to make full use of the multi-level feature maps simultaneously, which helps predict the pixel-wise labels in each stage. Meanwhile, a Spatial Encoding Network is used to enhance semantic information by preserving the spatial details. Finally, to better combine context information and spatial information, we introduce a Simple Feature Fusion Module that fuses the features from the two branches.
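
The description mentions dilation and "sufficient receptive fields" without giving kernel sizes or dilation rates. As a rough illustration only, the sketch below uses the standard receptive-field arithmetic for stride-1 convolutions to show why dilation helps. The function names and the rates [1, 2, 4] are hypothetical and are not taken from DSANet.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span (in input pixels) covered by one dilated kernel along one axis."""
    return (kernel_size - 1) * dilation + 1


def stacked_receptive_field(kernel_size: int, dilations: list[int]) -> int:
    """Receptive field of a stack of stride-1 convolutions along one axis.

    Each layer widens the field by (kernel_size - 1) * dilation.
    """
    field = 1
    for dilation in dilations:
        field += (kernel_size - 1) * dilation
    return field


# Hypothetical dilation rates, chosen only for illustration.
HYPOTHETICAL_DILATIONS = [1, 2, 4]

plain = stacked_receptive_field(3, [1, 1, 1])
dilated = stacked_receptive_field(3, HYPOTHETICAL_DILATIONS)
print(f"three plain 3x3 layers:   {plain} pixels")
print(f"three dilated 3x3 layers: {dilated} pixels")
```

Under these assumed rates the dilated stack sees a wider context with the same number of weights, which is the usual motivation for dilation in real-time segmentation networks.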
publication Anonymous
project page / code
used Cityscapes data fine annotations
used external data
runtime n/a
subsampling no
submission date February, 2021
previous submissions


Average results

Metric Value
IoU Classes 71.3938
iIoU Classes 42.9331
IoU Categories 87.9738
iIoU Categories 72.4669
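
For readers unfamiliar with the metric, IoU for a class is commonly defined as TP / (TP + FP + FN), accumulated over all evaluated pixels. The sketch below computes it from a confusion matrix. The helper name and the toy 2-class matrix are made up for illustration; this is not the benchmark's evaluation script, and the instance-weighted iIoU variant is not covered.

```python
def per_class_iou(confusion: list[list[int]]) -> list[float]:
    """IoU per class from a square confusion matrix.

    Assumption: confusion[gt][pred] counts pixels whose ground-truth
    label is `gt` and whose predicted label is `pred`.
    """
    num_classes = len(confusion)
    ious = []
    for c in range(num_classes):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp  # ground truth c, predicted otherwise
        fp = sum(confusion[r][c] for r in range(num_classes)) - tp  # predicted c, truth otherwise
        denom = tp + fp + fn
        ious.append(tp / denom if denom else float("nan"))
    return ious


# Toy pixel counts, not benchmark data.
toy_confusion = [
    [50, 10],
    [5, 35],
]
toy_ious = per_class_iou(toy_confusion)
print([round(100 * v, 2) for v in toy_ious])
```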


Class results

Class IoU iIoU
road 96.8368 -
sidewalk 78.5272 -
building 91.2178 -
wall 50.5392 -
fence 50.836 -
pole 59.3746 -
traffic light 64.0446 -
traffic sign 71.7082 -
vegetation 92.6324 -
terrain 69.9703 -
sky 94.5284 -
person 81.3252 61.1793
rider 61.838 34.6265
car 92.8713 85.7322
truck 56.1423 23.8347
bus 75.616 39.198
train 50.6185 24.4622
motorcycle 50.9864 27.5764
bicycle 66.8683 46.8559


Category results

Category IoU iIoU
flat 97.9754 -
nature 92.2958 -
object 65.5684 -
sky 94.5284 -
construction 91.7907 -
human 81.6466 62.196
vehicle 92.0111 82.7377
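
As a consistency check, the values under "Average results" appear to be unweighted means of the per-class and per-category rows listed above. The short script below recomputes them from the numbers copied from the tables; the variable names are illustrative only.

```python
from statistics import mean

# Per-class IoU, in table order (road ... bicycle).
class_iou = [
    96.8368, 78.5272, 91.2178, 50.5392, 50.836, 59.3746, 64.0446,
    71.7082, 92.6324, 69.9703, 94.5284, 81.3252, 61.838, 92.8713,
    56.1423, 75.616, 50.6185, 50.9864, 66.8683,
]
# Per-class iIoU, reported only for person ... bicycle.
class_iiou = [
    61.1793, 34.6265, 85.7322, 23.8347, 39.198, 24.4622, 27.5764, 46.8559,
]
# Per-category IoU (flat ... vehicle) and iIoU (human, vehicle).
category_iou = [97.9754, 92.2958, 65.5684, 94.5284, 91.7907, 81.6466, 92.0111]
category_iiou = [62.196, 82.7377]

averages = {
    "IoU Classes": mean(class_iou),
    "iIoU Classes": mean(class_iiou),
    "IoU Categories": mean(category_iou),
    "iIoU Categories": mean(category_iiou),
}
for name, value in averages.items():
    print(f"{name}: {value:.4f}")
```

The recomputed means agree with the reported averages to within rounding in the last printed digit.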


