Method Details

Details for method 'PAC: Perspective-adaptive Convolutions'


Method overview

name PAC: Perspective-adaptive Convolutions
challenge pixel-level semantic labeling
details Many existing scene parsing methods adopt Convolutional Neural Networks with receptive fields of fixed sizes and shapes, which frequently leads to inconsistent predictions on large objects and to small objects being missed. To tackle this issue, we propose perspective-adaptive convolutions, which acquire receptive fields of flexible sizes and shapes during scene parsing. By adding a new perspective regression layer, we dynamically infer position-adaptive perspective coefficient vectors that are used to reshape the convolutional patches. Consequently, the receptive fields are adjusted automatically according to the various sizes and perspective deformations of the objects in scene images. The proposed convolutions are differentiable, so the convolutional parameters and perspective coefficients can be learned end to end without any extra training supervision of object sizes. Furthermore, since standard convolutions lack contextual information and spatial dependencies, we propose a context adaptive bias that captures both local and global contextual information through average pooling on the local feature patches and the global feature maps, followed by flexible attentive summing with the convolutional results. The attentive weights are position-adaptive and context-aware, and are learned by adding a context regression layer. Experiments on the Cityscapes and ADE20K datasets demonstrate the effectiveness of the proposed methods.
publication Perspective-adaptive Convolutions for Scene Parsing
Rui Zhang, Sheng Tang, Yongdong Zhang, Jintao Li, and Shuicheng Yan
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)
project page / code
used Cityscapes data fine annotations
used external data ImageNet
runtime n/a
subsampling no
submission date March, 2018
previous submissions
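
The context adaptive bias described under "details" can be illustrated with a rough NumPy sketch. This is not the authors' implementation: the function names, the 3x3 pooling window, the edge padding, and the shape of the attentive weights are all assumptions made for illustration, and the weights are passed in directly rather than produced by a context regression layer.

```python
import numpy as np

def local_average(features, k=3):
    """Average-pool each channel over a k x k window (stride 1, edge padding).

    features: array of shape (C, H, W); k is assumed to be odd.
    """
    _, h, w = features.shape
    pad = k // 2
    padded = np.pad(features, ((0, 0), (pad, pad), (pad, pad)), mode="edge")
    out = np.zeros(features.shape, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += padded[:, dy:dy + h, dx:dx + w]
    return out / (k * k)

def context_adaptive_bias(conv_out, features, w_local, w_global, k=3):
    """Add weighted local and global context to a convolution result.

    conv_out, features: arrays of shape (C, H, W).
    w_local, w_global: hypothetical per-position attentive weights of
    shape (1, H, W); in the method they would be regressed by a learned
    layer, here they are simply given.
    """
    local_ctx = local_average(features, k)                  # (C, H, W)
    global_ctx = features.mean(axis=(1, 2), keepdims=True)  # (C, 1, 1)
    return conv_out + w_local * local_ctx + w_global * global_ctx

# Toy usage with made-up shapes and uniform weights.
features = np.random.rand(4, 8, 8)
conv_out = np.zeros_like(features)
weights = np.full((1, 8, 8), 0.5)
result = context_adaptive_bias(conv_out, features, weights, weights)
print(result.shape)
```

Because the weights vary per position, each pixel can mix local and global context differently, which is the property the description calls position-adaptive.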


Average results

Metric Value
IoU Classes 78.8983
iIoU Classes 55.6839
IoU Categories 90.6883
iIoU Categories 78.3441
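
The page itself does not define its metrics. Assuming they follow the usual Cityscapes benchmark definitions, IoU is TP / (TP + FP + FN) over pixels, and iIoU uses the same ratio with true positives and false negatives weighted per instance. A minimal sketch, with hypothetical function names and made-up counts:

```python
def iou(tp, fp, fn):
    """Intersection-over-union in percent from pixel counts."""
    denom = tp + fp + fn
    return 100.0 * tp / denom if denom else float("nan")

def instance_weight(avg_instance_size, instance_size):
    """Per-pixel weight used for iIoU (assumed Cityscapes convention).

    Pixels of a ground-truth instance are weighted by the ratio of the
    class's average instance size to that instance's size, so small
    instances count as much as large ones.
    """
    return avg_instance_size / instance_size

# Made-up example: 80 true-positive, 10 false-positive, 10 false-negative pixels.
print(iou(80, 10, 10))

# iIoU applies the same ratio to weighted counts iTP and iFN; FP stays unweighted.
itp = 50 * instance_weight(100, 50) + 30 * instance_weight(100, 200)
ifn = 10 * instance_weight(100, 50)
print(iou(itp, 10, ifn))
```

This weighting only makes sense for classes with countable instances, which is consistent with iIoU being reported only for the human and vehicle classes in the tables.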


Class results

Class IoU iIoU
road 98.7114 -
sidewalk 86.9318 -
building 93.3459 -
wall 58.8669 -
fence 60.3572 -
pole 65.7715 -
traffic light 73.0131 -
traffic sign 78.335 -
vegetation 93.5518 -
terrain 72.8317 -
sky 95.6082 -
person 85.9924 67.1502
rider 71.2982 48.5209
car 95.977 90.3585
truck 73.396 42.328
bus 82.3653 51.6824
train 69.5079 42.1912
motorcycle 67.2574 43.236
bicycle 75.9494 60.0042


Category results

Category IoU iIoU
flat 98.7076 -
nature 93.2267 -
object 72.2191 -
sky 95.6082 -
construction 93.5416 -
human 86.1319 68.3796
vehicle 95.3833 88.3086
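
The figures under "Average results" appear to be plain unweighted means of the per-class and per-category rows, with iIoU averaged only over rows that report it. The following check, using the values copied from the tables above, reproduces all four averages:

```python
# Values copied from the class and category tables on this page.
class_iou = [98.7114, 86.9318, 93.3459, 58.8669, 60.3572, 65.7715, 73.0131,
             78.335, 93.5518, 72.8317, 95.6082, 85.9924, 71.2982, 95.977,
             73.396, 82.3653, 69.5079, 67.2574, 75.9494]
class_iiou = [67.1502, 48.5209, 90.3585, 42.328, 51.6824, 42.1912, 43.236,
              60.0042]
category_iou = [98.7076, 93.2267, 72.2191, 95.6082, 93.5416, 86.1319, 95.3833]
category_iiou = [68.3796, 88.3086]

def mean(values):
    """Unweighted arithmetic mean."""
    return sum(values) / len(values)

print(round(mean(class_iou), 4))      # 78.8983
print(round(mean(class_iiou), 4))     # 55.6839
print(round(mean(category_iou), 4))   # 90.6883
print(round(mean(category_iiou), 4))  # 78.3441
```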


