Method Details

Details for method 'Improving Semantic Segmentation via Video Propagation and Label Relaxation'


Method overview

name Improving Semantic Segmentation via Video Propagation and Label Relaxation
challenge pixel-level semantic labeling
details Semantic segmentation requires large amounts of pixel-wise annotations to learn accurate models. In this paper, we present a video prediction-based methodology to scale up training sets by synthesizing new training samples in order to improve the accuracy of semantic segmentation networks. We exploit video prediction models' ability to predict future frames in order to also predict future labels. A joint propagation strategy is also proposed to alleviate misalignments in synthesized samples. We demonstrate that training segmentation models on datasets augmented by the synthesized samples leads to significant improvements in accuracy. Furthermore, we introduce a novel boundary label relaxation technique that makes training robust to annotation noise and propagation artifacts along object boundaries. Our proposed methods achieve state-of-the-art mIoUs of 83.5% on Cityscapes and 82.9% on CamVid. Our single model, without model ensembles, achieves 72.8% mIoU on the KITTI semantic segmentation test set, which surpasses the winning entry of the ROB challenge 2018.
publication Improving Semantic Segmentation via Video Propagation and Label Relaxation
Yi Zhu, Karan Sapra, Fitsum A. Reda, Kevin J. Shih, Shawn Newsam, Andrew Tao, Bryan Catanzaro
CVPR 2019
project page / code
used Cityscapes data fine annotations, coarse annotations, video
used external data ImageNet, Mapillary Vistas
runtime n/a
subsampling no
submission date October, 2018
previous submissions
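
The boundary label relaxation mentioned in the details above can be illustrated with a small sketch. It assumes one plausible formulation that is not spelled out on this page: at each pixel, the loss is the negative log of the total predicted probability over every class that appears in a small window around that pixel, so interior pixels fall back to ordinary cross-entropy and only boundary pixels are relaxed. The function name `relaxed_boundary_loss`, the 3x3 default window, and the NumPy implementation are illustrative assumptions, not the authors' code.

```python
import numpy as np


def relaxed_boundary_loss(probs, labels, window=3):
    """Mean relaxed loss (assumed formulation, see lead-in).

    probs:  (C, H, W) array of per-pixel class probabilities (softmax output).
    labels: (H, W) integer array of ground-truth class ids in [0, C).
    """
    num_classes, height, width = probs.shape
    pad = window // 2

    # One-hot encode the labels: shape (H, W, C), boolean.
    one_hot = np.eye(num_classes, dtype=bool)[labels]

    # Replicate edge pixels so windows at the image border stay in bounds.
    padded = np.pad(one_hot, ((pad, pad), (pad, pad), (0, 0)), mode="edge")

    # allowed[y, x, c] is True if class c occurs anywhere in the window at (y, x).
    allowed = np.zeros_like(one_hot)
    for dy in range(window):
        for dx in range(window):
            allowed |= padded[dy:dy + height, dx:dx + width]

    # Probability mass assigned to the allowed classes at each pixel.
    mass = (probs.transpose(1, 2, 0) * allowed).sum(axis=-1)

    # Clip to avoid log(0), then average the negative log-likelihood.
    return float(-np.log(np.clip(mass, 1e-12, None)).mean())


if __name__ == "__main__":
    # Toy example: uniform predictions over 3 classes, labels split left/right.
    toy_probs = np.full((3, 4, 4), 1.0 / 3.0)
    toy_labels = np.zeros((4, 4), dtype=int)
    toy_labels[:, 2:] = 1
    print(relaxed_boundary_loss(toy_probs, toy_labels))
```

In this sketch a pixel next to a class boundary is not penalized for predicting either neighboring class, which is one way to tolerate the annotation noise and propagation artifacts the description refers to.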


Average results

Metric            Value (%)
IoU Classes       83.454
iIoU Classes      64.3881
IoU Categories    92.2272
iIoU Categories   82.0328
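
As a reminder of what the IoU rows measure, the sketch below uses the standard intersection-over-union definition, TP / (TP + FP + FN), expressed as a percentage. The function name `iou_percent` and the pixel counts are hypothetical and chosen only for illustration. The instance-weighted iIoU variant is not shown.

```python
def iou_percent(true_pos, false_pos, false_neg):
    """Intersection-over-union in percent from pixel counts for one class."""
    denominator = true_pos + false_pos + false_neg
    if denominator == 0:
        # The class appears in neither the prediction nor the ground truth.
        return float("nan")
    return 100.0 * true_pos / denominator


if __name__ == "__main__":
    # Hypothetical counts: 80 correct pixels, 10 false alarms, 10 misses.
    print(iou_percent(80, 10, 10))
```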


Class results

Class           IoU (%)   iIoU (%)
road            98.7887   -
sidewalk        87.8196   -
building        94.1836   -
wall            64.0654   -
fence           65.0306   -
pole            72.4168   -
traffic light   79.0441   -
traffic sign    82.7999   -
vegetation      94.1813   -
terrain         73.9964   -
sky             96.1408   -
person          88.2181   72.885
rider           75.3782   55.2296
car             96.4535   92.1972
truck           78.8045   50.1483
bus             94.0174   60.9603
train           91.5797   63.0359
motorcycle      73.7247   53.8147
bicycle         78.9823   66.834


Category results

Category       IoU (%)   iIoU (%)
flat           98.8087   -
nature         93.9264   -
object         77.992    -
sky            96.1408   -
construction   94.3756   -
human          88.2101   73.5848
vehicle        96.1364   90.4808
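
The figures under "Average results" are consistent with plain arithmetic means of the per-class and per-category rows. The sketch below recomputes them from the values listed on this page. The variable names are illustrative, and iIoU is averaged only over the rows that report it.

```python
from statistics import mean

# Per-class IoU (%) as listed under "Class results".
class_iou = {
    "road": 98.7887, "sidewalk": 87.8196, "building": 94.1836,
    "wall": 64.0654, "fence": 65.0306, "pole": 72.4168,
    "traffic light": 79.0441, "traffic sign": 82.7999,
    "vegetation": 94.1813, "terrain": 73.9964, "sky": 96.1408,
    "person": 88.2181, "rider": 75.3782, "car": 96.4535,
    "truck": 78.8045, "bus": 94.0174, "train": 91.5797,
    "motorcycle": 73.7247, "bicycle": 78.9823,
}

# Per-class iIoU (%), reported only for classes with instance annotations.
class_iiou = {
    "person": 72.885, "rider": 55.2296, "car": 92.1972, "truck": 50.1483,
    "bus": 60.9603, "train": 63.0359, "motorcycle": 53.8147,
    "bicycle": 66.834,
}

# Per-category IoU and iIoU (%) as listed under "Category results".
category_iou = {
    "flat": 98.8087, "nature": 93.9264, "object": 77.992, "sky": 96.1408,
    "construction": 94.3756, "human": 88.2101, "vehicle": 96.1364,
}
category_iiou = {"human": 73.5848, "vehicle": 90.4808}

averages = {
    "IoU Classes": mean(class_iou.values()),
    "iIoU Classes": mean(class_iiou.values()),
    "IoU Categories": mean(category_iou.values()),
    "iIoU Categories": mean(category_iiou.values()),
}

if __name__ == "__main__":
    for name, value in averages.items():
        print(f"{name}: {value:.4f}")
```

Small differences in the last decimal place can arise because the rows shown here are already rounded.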


