Semantic Understanding of Urban Street Scenes

Method Details

Details for method 'Improving Semantic Segmentation via Video Propagation and Label Relaxation'

Method overview

name	Improving Semantic Segmentation via Video Propagation and Label Relaxation
challenge	pixel-level semantic labeling
details	Semantic segmentation requires large amounts of pixel-wise annotations to learn accurate models. In this paper, we present a video prediction-based methodology to scale up training sets by synthesizing new training samples in order to improve the accuracy of semantic segmentation networks. We exploit video prediction models' ability to predict future frames in order to also predict future labels. A joint propagation strategy is also proposed to alleviate misalignments in synthesized samples. We demonstrate that training segmentation models on datasets augmented by the synthesized samples leads to significant improvements in accuracy. Furthermore, we introduce a novel boundary label relaxation technique that makes training robust to annotation noise and propagation artifacts along object boundaries. Our proposed methods achieve state-of-the-art mIoUs of 83.5% on Cityscapes and 82.9% on CamVid. Our single model, without model ensembles, achieves 72.8% mIoU on the KITTI semantic segmentation test set, which surpasses the winning entry of the ROB challenge 2018.
publication	Improving Semantic Segmentation via Video Propagation and Label Relaxation. Yi Zhu, Karan Sapra, Fitsum A. Reda, Kevin J. Shih, Shawn Newsam, Andrew Tao, Bryan Catanzaro. CVPR 2019. https://arxiv.org/abs/1812.01593
project page / code	https://nv-adlr.github.io/publication/2018-Segmentation
used Cityscapes data	fine annotations, coarse annotations, video
used external data	ImageNet, Mapillary Vistas
runtime	n/a
subsampling	no
submission date	October 2018
previous submissions
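
The boundary label relaxation mentioned in the method details can be illustrated with a small sketch. It assumes the formulation given in the cited paper, where a boundary pixel is allowed to match any class present in its local neighborhood and the loss is the negative log of the summed probability of those classes. The function names and the toy logits below are hypothetical, and this is a single-pixel illustration in plain Python, not the authors' implementation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of per-class logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def relaxed_pixel_loss(logits, allowed_classes):
    """Negative log of the total probability given to any allowed class.

    With a single allowed class this reduces to ordinary cross-entropy.
    With several (for example, the classes found in a small window around
    a boundary pixel) the prediction is not penalized for choosing any
    one of them.
    """
    probs = softmax(logits)
    union_prob = sum(probs[c] for c in set(allowed_classes))
    return -math.log(union_prob)


# Toy boundary pixel: the network is torn between class 0 and class 1.
logits = [2.0, 1.5, -1.0]
strict_loss = relaxed_pixel_loss(logits, [0])      # only the annotated class
relaxed_loss = relaxed_pixel_loss(logits, [0, 1])  # both neighboring classes
print(f"strict: {strict_loss:.4f}  relaxed: {relaxed_loss:.4f}")
```

In this sketch the relaxed loss is never larger than the strict one, which is the sense in which an ambiguous or slightly misaligned boundary label contributes less to the training signal.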

Average results

Metric	Value
IoU Classes	83.454
iIoU Classes	64.3881
IoU Categories	92.2272
iIoU Categories	82.0328

Class results

Class	IoU	iIoU
road	98.7887	-
sidewalk	87.8196	-
building	94.1836	-
wall	64.0654	-
fence	65.0306	-
pole	72.4168	-
traffic light	79.0441	-
traffic sign	82.7999	-
vegetation	94.1813	-
terrain	73.9964	-
sky	96.1408	-
person	88.2181	72.885
rider	75.3782	55.2296
car	96.4535	92.1972
truck	78.8045	50.1483
bus	94.0174	60.9603
train	91.5797	63.0359
motorcycle	73.7247	53.8147
bicycle	78.9823	66.834

Category results

Category	IoU	iIoU
flat	98.8087	-
nature	93.9264	-
object	77.992	-
sky	96.1408	-
construction	94.3756	-
human	88.2101	73.5848
vehicle	96.1364	90.4808
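
The figures under "Average results" appear to be unweighted means of the per-class and per-category rows above, with iIoU averaged only over the rows that report it. The following short Python check, using the values copied from the tables, is one way to confirm this. The variable names are illustrative.

```python
# Per-class IoU for the 19 evaluated classes, in table order.
class_iou = [
    98.7887, 87.8196, 94.1836, 64.0654, 65.0306, 72.4168, 79.0441,
    82.7999, 94.1813, 73.9964, 96.1408, 88.2181, 75.3782, 96.4535,
    78.8045, 94.0174, 91.5797, 73.7247, 78.9823,
]
# iIoU is reported only for the 8 instance classes (person ... bicycle).
class_iiou = [
    72.885, 55.2296, 92.1972, 50.1483, 60.9603, 63.0359, 53.8147, 66.834,
]
# Per-category IoU for the 7 categories, in table order.
category_iou = [98.8087, 93.9264, 77.992, 96.1408, 94.3756, 88.2101, 96.1364]
# iIoU is reported only for the human and vehicle categories.
category_iiou = [73.5848, 90.4808]


def unweighted_mean(values):
    """Plain arithmetic mean, with every row counted equally."""
    return sum(values) / len(values)


averages = {
    "IoU Classes": unweighted_mean(class_iou),
    "iIoU Classes": unweighted_mean(class_iiou),
    "IoU Categories": unweighted_mean(category_iou),
    "iIoU Categories": unweighted_mean(category_iiou),
}

for metric, value in averages.items():
    print(f"{metric}\t{value:.4f}")
```

The computed means agree with the reported averages to within the rounding of the displayed four-decimal values.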
