The 3D semantic instance prediction task involves detecting and segmenting the objects in a 3D scan mesh.

Evaluation and metrics

Our evaluation ranks all methods according to the average precision for each class. We report the mean average precision (AP) at overlap 0.25 (AP 25%), at overlap 0.5 (AP 50%), and averaged over overlaps in the range [0.5:0.95:0.05] (AP). Note that multiple predictions of the same ground truth instance are penalized as false positives.
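The metric described above can be illustrated with a minimal sketch. This is not the benchmark's evaluation script: the input format (a confidence plus a dictionary of overlaps per prediction), the function names, the greedy matching by confidence, the all-point interpolation of the precision-recall curve, and the choice to include 0.95 as a threshold are all assumptions made for illustration.

```python
import numpy as np


def average_precision(predictions, num_gt, threshold):
    """AP for one class at one overlap threshold (illustrative sketch).

    predictions: list of (confidence, {gt_id: overlap}) pairs; hypothetical format.
    num_gt: number of ground truth instances of this class.
    """
    if num_gt == 0:
        return float("nan")
    matched = set()
    hits = []
    # Visit predictions from most to least confident.
    for _, overlaps in sorted(predictions, key=lambda p: -p[0]):
        # Ground truth instances that clear the threshold and are still unmatched.
        free = [(ov, g) for g, ov in overlaps.items()
                if ov >= threshold and g not in matched]
        if free:
            matched.add(max(free)[1])  # claim the best-overlapping instance
            hits.append(1)
        else:
            # Too little overlap, or a duplicate of an already matched
            # instance: counted as a false positive.
            hits.append(0)
    hits = np.array(hits, dtype=float)
    tp = np.cumsum(hits)
    fp = np.cumsum(1.0 - hits)
    recall = np.concatenate([[0.0], tp / num_gt, [1.0]])
    precision = np.concatenate([[0.0], tp / np.maximum(tp + fp, 1.0), [0.0]])
    # Make precision non-increasing in recall, then integrate the curve.
    for i in range(len(precision) - 2, -1, -1):
        precision[i] = max(precision[i], precision[i + 1])
    return float(np.sum(np.diff(recall) * precision[1:]))


def ap_over_range(predictions, num_gt, thresholds=np.linspace(0.5, 0.95, 10)):
    """Mean AP over a range of thresholds (endpoint inclusion is an assumption)."""
    return float(np.mean([average_precision(predictions, num_gt, t)
                          for t in thresholds]))


# Toy data: two ground truth instances (ids 0 and 1); the second
# prediction is a duplicate detection of instance 0.
toy_predictions = [(0.9, {0: 0.82}), (0.8, {0: 0.72}), (0.7, {1: 0.62})]
ap_25 = average_precision(toy_predictions, num_gt=2, threshold=0.25)
ap_50 = average_precision(toy_predictions, num_gt=2, threshold=0.5)
ap = ap_over_range(toy_predictions, num_gt=2)
```

In the toy data, the duplicate lowers AP 50% below what the same predictions would score without it, which is the penalty the text describes.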



This table lists the benchmark results for the 3D semantic instance scenario.




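Methods are sorted by average AP 50%. Each cell shows the score followed by the method's rank in that column in parentheses.

| Method | avg AP 50% | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | window |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| DCNet | 0.607 (1) | 1.000 (1) | 0.907 (1) | 0.792 (1) | 0.462 (2) | 0.788 (1) | 0.151 (1) | 0.535 (5) | 0.292 (2) | 0.395 (3) | 0.501 (1) | 0.263 (3) | 0.600 (1) | 1.000 (1) | 0.598 (1) | 0.857 (1) | 0.502 (5) | 0.918 (5) | 0.368 (4) |
| MTML | 0.549 (2) | 1.000 (1) | 0.807 (2) | 0.588 (4) | 0.327 (5) | 0.647 (2) | 0.004 (12) | 0.815 (1) | 0.180 (5) | 0.418 (1) | 0.364 (6) | 0.182 (5) | 0.445 (3) | 1.000 (1) | 0.442 (4) | 0.688 (3) | 0.571 (2) | 1.000 (1) | 0.396 (2) |
| Occipital-SCS | 0.512 (3) | 1.000 (1) | 0.716 (4) | 0.509 (5) | 0.506 (1) | 0.611 (4) | 0.092 (5) | 0.602 (4) | 0.177 (6) | 0.346 (5) | 0.383 (5) | 0.165 (6) | 0.442 (4) | 0.850 (5) | 0.386 (7) | 0.618 (5) | 0.543 (3) | 0.889 (8) | 0.389 (3) |
| 3D-BoNet | 0.488 (4) | 1.000 (1) | 0.672 (7) | 0.590 (3) | 0.301 (6) | 0.484 (9) | 0.098 (4) | 0.620 (2) | 0.306 (1) | 0.341 (6) | 0.259 (8) | 0.125 (8) | 0.434 (5) | 0.796 (6) | 0.402 (6) | 0.499 (10) | 0.513 (4) | 0.909 (7) | 0.439 (1) |
| PanopticFusion-inst | 0.478 (5) | 0.667 (7) | 0.712 (6) | 0.595 (2) | 0.259 (8) | 0.550 (8) | 0.000 (15) | 0.613 (3) | 0.175 (7) | 0.250 (9) | 0.434 (2) | 0.437 (1) | 0.411 (7) | 0.857 (3) | 0.485 (2) | 0.591 (8) | 0.267 (12) | 0.944 (4) | 0.359 (5) |
| ResNet-backbone | 0.459 (6) | 1.000 (1) | 0.737 (3) | 0.159 (13) | 0.259 (7) | 0.587 (6) | 0.138 (2) | 0.475 (7) | 0.217 (4) | 0.416 (2) | 0.408 (4) | 0.128 (7) | 0.315 (8) | 0.714 (7) | 0.411 (5) | 0.536 (9) | 0.590 (1) | 0.873 (10) | 0.304 (6) |
| MASC (permissive) | 0.447 (7) | 0.528 (10) | 0.555 (9) | 0.381 (7) | 0.382 (3) | 0.633 (3) | 0.002 (13) | 0.509 (6) | 0.260 (3) | 0.361 (4) | 0.432 (3) | 0.327 (2) | 0.451 (2) | 0.571 (8) | 0.367 (8) | 0.639 (4) | 0.386 (7) | 0.980 (2) | 0.276 (7) |
| 3D-SIS (permissive) | 0.382 (8) | 1.000 (1) | 0.432 (11) | 0.245 (10) | 0.190 (10) | 0.577 (7) | 0.013 (10) | 0.263 (10) | 0.033 (13) | 0.320 (7) | 0.240 (10) | 0.075 (11) | 0.422 (6) | 0.857 (3) | 0.117 (12) | 0.699 (2) | 0.271 (11) | 0.883 (9) | 0.235 (10) |
| DPC-instance | 0.355 (9) | 0.500 (11) | 0.517 (10) | 0.467 (6) | 0.228 (9) | 0.422 (11) | 0.133 (3) | 0.405 (8) | 0.111 (9) | 0.205 (10) | 0.241 (9) | 0.075 (10) | 0.233 (9) | 0.306 (12) | 0.445 (3) | 0.439 (11) | 0.457 (6) | 0.974 (3) | 0.239 (9) |
| UNet-backbone | 0.319 (10) | 0.667 (7) | 0.715 (5) | 0.233 (11) | 0.189 (11) | 0.479 (10) | 0.008 (11) | 0.218 (11) | 0.067 (12) | 0.201 (11) | 0.173 (11) | 0.107 (9) | 0.123 (11) | 0.438 (9) | 0.150 (10) | 0.615 (6) | 0.355 (8) | 0.916 (6) | 0.093 (14) |
| R-PointNet | 0.306 (11) | 0.500 (11) | 0.405 (12) | 0.311 (8) | 0.348 (4) | 0.589 (5) | 0.054 (6) | 0.068 (13) | 0.126 (8) | 0.283 (8) | 0.290 (7) | 0.028 (12) | 0.219 (10) | 0.214 (13) | 0.331 (9) | 0.396 (12) | 0.275 (10) | 0.821 (12) | 0.245 (8) |
| 3D-BEVIS | 0.248 (12) | 0.667 (7) | 0.566 (8) | 0.076 (14) | 0.035 (15) | 0.394 (12) | 0.027 (8) | 0.035 (14) | 0.098 (10) | 0.099 (13) | 0.030 (14) | 0.025 (13) | 0.098 (12) | 0.375 (10) | 0.126 (11) | 0.604 (7) | 0.181 (13) | 0.854 (11) | 0.171 (11) |
| Seg-Cluster (permissive) | 0.215 (13) | 0.370 (13) | 0.337 (14) | 0.285 (9) | 0.105 (12) | 0.325 (13) | 0.025 (9) | 0.282 (9) | 0.085 (11) | 0.105 (12) | 0.107 (12) | 0.007 (15) | 0.079 (13) | 0.317 (11) | 0.114 (13) | 0.309 (14) | 0.304 (9) | 0.587 (13) | 0.123 (13) |
| Sgpn_scannet | 0.143 (14) | 0.208 (15) | 0.390 (13) | 0.169 (12) | 0.065 (13) | 0.275 (14) | 0.029 (7) | 0.069 (12) | 0.000 (14) | 0.087 (14) | 0.043 (13) | 0.014 (14) | 0.027 (15) | 0.000 (14) | 0.112 (14) | 0.351 (13) | 0.168 (14) | 0.438 (14) | 0.138 (12) |
| MaskRCNN 2d->3d Proj | 0.058 (15) | 0.333 (14) | 0.002 (15) | 0.000 (15) | 0.053 (14) | 0.002 (15) | 0.002 (14) | 0.021 (15) | 0.000 (14) | 0.045 (15) | 0.024 (15) | 0.238 (4) | 0.065 (14) | 0.000 (14) | 0.014 (15) | 0.107 (15) | 0.020 (15) | 0.110 (15) | 0.006 (15) |

[PanopticFusion-inst] Gaku Narita, Takashi Seno, Tomoya Ishikawa, Yohsuke Kaji: PanopticFusion: Online Volumetric Semantic Mapping at the Level of Stuff and Things. IROS 2019 (to appear)
[MASC] Chen Liu, Yasutaka Furukawa: MASC: Multi-scale Affinity with Sparse Convolution for 3D Instance Segmentation.
[3D-SIS] Ji Hou, Angela Dai, Matthias Niessner: 3D-SIS: 3D Semantic Instance Segmentation of RGB-D Scans. CVPR 2019
[DPC-instance] Francis Engelmann, Theodora Kontogianni, Bastian Leibe: Dilated Point Convolutions. arXiv
[3D-BEVIS] Cathrin Elich, Francis Engelmann, Jonas Schult, Theodora Kontogianni, Bastian Leibe: 3D-BEVIS: Birds-Eye-View Instance Segmentation.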
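
The "avg ap 50%" column appears to be the unweighted mean of the 18 per-class scores. The page does not state this; it is an inference from the numbers. A quick check using the DCNet row (the variable names are hypothetical, the values are copied from the table):

```python
from statistics import mean

# Per-class AP 50% scores for DCNet, copied from the results table.
dcnet_class_ap50 = {
    "bathtub": 1.000, "bed": 0.907, "bookshelf": 0.792, "cabinet": 0.462,
    "chair": 0.788, "counter": 0.151, "curtain": 0.535, "desk": 0.292,
    "door": 0.395, "otherfurniture": 0.501, "picture": 0.263,
    "refrigerator": 0.600, "shower curtain": 1.000, "sink": 0.598,
    "sofa": 0.857, "table": 0.502, "toilet": 0.918, "window": 0.368,
}

# Unweighted mean over classes; the table reports 0.607 for DCNet.
dcnet_avg_ap50 = mean(dcnet_class_ap50.values())
print(round(dcnet_avg_ap50, 3))
```

Because the mean is unweighted, a class with few instances (for example bathtub) influences the average as much as a frequent one (for example chair).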