This table lists the benchmark results for the 3D semantic label scenario.


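Each cell shows the score with its rank in that column in parentheses.

| Method | Info | avg iou | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | floor | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | wall | window |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| SparseConvNet | | 0.725 (1) | 0.647 (4) | 0.821 (1) | 0.846 (1) | 0.721 (1) | 0.869 (1) | 0.533 (1) | 0.754 (3) | 0.603 (2) | 0.614 (1) | 0.955 (1) | 0.572 (1) | 0.325 (1) | 0.710 (1) | 0.870 (1) | 0.724 (1) | 0.823 (1) | 0.628 (2) | 0.934 (1) | 0.865 (1) | 0.683 (1) |
| MinkowskiNet34 | | 0.679 (2) | 0.811 (1) | 0.734 (3) | 0.739 (3) | 0.641 (2) | 0.804 (3) | 0.413 (5) | 0.759 (2) | 0.696 (1) | 0.545 (3) | 0.938 (6) | 0.518 (2) | 0.141 (13) | 0.623 (2) | 0.757 (2) | 0.680 (2) | 0.723 (3) | 0.684 (1) | 0.896 (2) | 0.821 (2) | 0.651 (2) |
| joint point-based | | 0.634 (3) | 0.614 (6) | 0.778 (2) | 0.667 (5) | 0.633 (3) | 0.825 (2) | 0.420 (4) | 0.804 (1) | 0.467 (5) | 0.561 (2) | 0.951 (2) | 0.494 (3) | 0.291 (2) | 0.566 (3) | 0.458 (5) | 0.579 (3) | 0.764 (2) | 0.559 (3) | 0.838 (3) | 0.814 (3) | 0.598 (3) |
| TextureNet | | 0.566 (4) | 0.672 (2) | 0.664 (5) | 0.671 (4) | 0.494 (5) | 0.719 (5) | 0.445 (2) | 0.678 (4) | 0.411 (9) | 0.396 (7) | 0.935 (7) | 0.356 (7) | 0.225 (3) | 0.412 (7) | 0.535 (4) | 0.565 (4) | 0.636 (6) | 0.464 (8) | 0.794 (6) | 0.680 (11) | 0.568 (4) |
| DVVNet | | 0.562 (5) | 0.648 (3) | 0.700 (4) | 0.770 (2) | 0.586 (4) | 0.687 (9) | 0.333 (7) | 0.650 (5) | 0.514 (3) | 0.475 (4) | 0.906 (13) | 0.359 (6) | 0.223 (4) | 0.340 (9) | 0.442 (6) | 0.422 (11) | 0.668 (4) | 0.501 (6) | 0.708 (9) | 0.779 (4) | 0.534 (6) |
| PointConv | | 0.556 (6) | 0.636 (5) | 0.640 (8) | 0.574 (9) | 0.472 (7) | 0.739 (4) | 0.430 (3) | 0.433 (8) | 0.418 (8) | 0.445 (6) | 0.944 (3) | 0.372 (5) | 0.185 (8) | 0.464 (5) | 0.575 (3) | 0.540 (5) | 0.639 (5) | 0.505 (5) | 0.827 (4) | 0.762 (5) | 0.515 (7) |
| 3DMV, FTSDF | | 0.501 (7) | 0.558 (10) | 0.608 (11) | 0.424 (15) | 0.478 (6) | 0.690 (8) | 0.246 (13) | 0.586 (6) | 0.468 (4) | 0.450 (5) | 0.911 (11) | 0.394 (4) | 0.160 (10) | 0.438 (6) | 0.212 (12) | 0.432 (10) | 0.541 (10) | 0.475 (7) | 0.742 (8) | 0.727 (7) | 0.477 (9) |
| PCNN | | 0.498 (8) | 0.559 (9) | 0.644 (7) | 0.560 (10) | 0.420 (9) | 0.711 (7) | 0.229 (15) | 0.414 (9) | 0.436 (6) | 0.352 (9) | 0.941 (5) | 0.324 (8) | 0.155 (11) | 0.238 (13) | 0.387 (7) | 0.493 (6) | 0.529 (11) | 0.509 (4) | 0.813 (5) | 0.751 (6) | 0.504 (8) |
| 3DMV | | 0.484 (9) | 0.484 (13) | 0.538 (13) | 0.643 (6) | 0.424 (8) | 0.606 (15) | 0.310 (8) | 0.574 (7) | 0.433 (7) | 0.378 (8) | 0.796 (16) | 0.301 (9) | 0.214 (5) | 0.537 (4) | 0.208 (13) | 0.472 (9) | 0.507 (14) | 0.413 (13) | 0.693 (10) | 0.602 (15) | 0.539 (5) |
| PointCNN with RGB | permissive | 0.458 (10) | 0.577 (8) | 0.611 (10) | 0.356 (17) | 0.321 (14) | 0.715 (6) | 0.299 (10) | 0.376 (12) | 0.328 (14) | 0.319 (10) | 0.944 (3) | 0.285 (11) | 0.164 (9) | 0.216 (15) | 0.229 (11) | 0.484 (8) | 0.545 (9) | 0.456 (9) | 0.755 (7) | 0.709 (8) | 0.475 (10) |
| PNET2 | | 0.442 (11) | 0.548 (11) | 0.548 (12) | 0.597 (8) | 0.363 (11) | 0.628 (13) | 0.300 (9) | 0.292 (13) | 0.374 (11) | 0.307 (11) | 0.881 (15) | 0.268 (12) | 0.186 (7) | 0.238 (13) | 0.204 (14) | 0.407 (12) | 0.506 (15) | 0.449 (10) | 0.667 (11) | 0.620 (14) | 0.462 (11) |
| SurfaceConvPF | | 0.442 (11) | 0.505 (12) | 0.622 (9) | 0.380 (16) | 0.342 (13) | 0.654 (11) | 0.227 (16) | 0.397 (11) | 0.367 (12) | 0.276 (13) | 0.924 (9) | 0.240 (13) | 0.198 (6) | 0.359 (8) | 0.262 (9) | 0.366 (13) | 0.581 (7) | 0.435 (11) | 0.640 (12) | 0.668 (12) | 0.398 (12) |
| Tangent Convolutions | permissive | 0.438 (13) | 0.437 (15) | 0.646 (6) | 0.474 (12) | 0.369 (10) | 0.645 (12) | 0.353 (6) | 0.258 (15) | 0.282 (16) | 0.279 (12) | 0.918 (10) | 0.298 (10) | 0.147 (12) | 0.283 (10) | 0.294 (8) | 0.487 (7) | 0.562 (8) | 0.427 (12) | 0.619 (13) | 0.633 (13) | 0.352 (14) |
| SPLAT Net | copyleft | 0.393 (14) | 0.472 (14) | 0.511 (14) | 0.606 (7) | 0.311 (15) | 0.656 (10) | 0.245 (14) | 0.405 (10) | 0.328 (14) | 0.197 (16) | 0.927 (8) | 0.227 (15) | 0.000 (18) | 0.001 (18) | 0.249 (10) | 0.271 (18) | 0.510 (12) | 0.383 (15) | 0.593 (14) | 0.699 (9) | 0.267 (16) |
| ScanNet+FTSDF | | 0.383 (15) | 0.297 (17) | 0.491 (15) | 0.432 (14) | 0.358 (12) | 0.612 (14) | 0.274 (11) | 0.116 (17) | 0.411 (9) | 0.265 (14) | 0.904 (14) | 0.229 (14) | 0.079 (16) | 0.250 (11) | 0.185 (15) | 0.320 (16) | 0.510 (12) | 0.385 (14) | 0.548 (15) | 0.597 (16) | 0.394 (13) |
| PointNet++ | permissive | 0.339 (16) | 0.584 (7) | 0.478 (16) | 0.458 (13) | 0.256 (17) | 0.360 (18) | 0.250 (12) | 0.247 (16) | 0.278 (17) | 0.261 (15) | 0.677 (18) | 0.183 (16) | 0.117 (14) | 0.212 (16) | 0.145 (17) | 0.364 (14) | 0.346 (18) | 0.232 (18) | 0.548 (15) | 0.523 (17) | 0.252 (17) |
| SSC-UNet | permissive | 0.308 (17) | 0.353 (16) | 0.290 (18) | 0.278 (18) | 0.166 (18) | 0.553 (16) | 0.169 (18) | 0.286 (14) | 0.147 (18) | 0.148 (18) | 0.908 (12) | 0.182 (17) | 0.064 (17) | 0.023 (17) | 0.018 (18) | 0.354 (15) | 0.363 (16) | 0.345 (16) | 0.546 (17) | 0.685 (10) | 0.278 (15) |
| ScanNet | permissive | 0.306 (18) | 0.203 (18) | 0.366 (17) | 0.501 (11) | 0.311 (15) | 0.524 (17) | 0.211 (17) | 0.002 (18) | 0.342 (13) | 0.189 (17) | 0.786 (17) | 0.145 (18) | 0.102 (15) | 0.245 (12) | 0.152 (16) | 0.318 (17) | 0.348 (17) | 0.300 (17) | 0.460 (18) | 0.437 (18) | 0.182 (18) |

3DMV: Angela Dai, Matthias Niessner: 3DMV: Joint 3D-Multi-View Prediction for 3D Semantic Scene Segmentation. ECCV'18
PointCNN with RGB: Yangyan Li, Rui Bu, Mingchao Sun, Baoquan Chen: PointCNN. NeurIPS 2018
SurfaceConvPF: Hao Pan, Shilin Liu, Yang Liu, Xin Tong: Convolutional Neural Networks on 3D Surfaces Using Parallel Frames.
Tangent Convolutions: Maxim Tatarchenko, Jaesik Park, Vladlen Koltun, Qian-Yi Zhou: Tangent Convolutions for Dense Prediction in 3D. CVPR 2018
SPLAT Net: Hang Su, Varun Jampani, Deqing Sun, Subhransu Maji, Evangelos Kalogerakis, Ming-Hsuan Yang, Jan Kautz: SPLATNet: Sparse Lattice Networks for Point Cloud Processing. CVPR 2018
PointNet++: Charles R. Qi, Li Yi, Hao Su, Leonidas J. Guibas: PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space.
ScanNet: Angela Dai, Angel X. Chang, Manolis Savva, Maciej Halber, Thomas Funkhouser, Matthias Nießner: ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes. CVPR'17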
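
As a quick consistency check on the 3D semantic label results, the avg iou value appears to be the unweighted mean of the 20 per-class IoU scores. The sketch below checks this for the SparseConvNet row only; the helper name `mean_iou` is illustrative and not part of any benchmark tooling, and the per-class values are copied from the row above.

```python
# Per-class IoU scores for SparseConvNet, copied from the table above
# in column order (bathtub ... window, 20 classes).
sparseconvnet_iou = [
    0.647, 0.821, 0.846, 0.721, 0.869, 0.533, 0.754, 0.603, 0.614, 0.955,
    0.572, 0.325, 0.710, 0.870, 0.724, 0.823, 0.628, 0.934, 0.865, 0.683,
]


def mean_iou(per_class):
    """Unweighted mean over classes (hypothetical helper, an assumption
    about how the avg iou column is derived)."""
    return sum(per_class) / len(per_class)


avg = round(mean_iou(sparseconvnet_iou), 3)
print(avg)  # 0.725, matching the reported avg iou for SparseConvNet
```

The same check can be repeated for any other row by swapping in its per-class values.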

This table lists the benchmark results for the 3D semantic instance scenario.




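Each cell shows the score with its rank in that column in parentheses.

| Method | Info | avg ap 50% | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | window |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 3D-SIS | | 0.382 (1) | 1.000 (1) | 0.432 (3) | 0.245 (4) | 0.190 (2) | 0.577 (2) | 0.013 (4) | 0.263 (2) | 0.033 (4) | 0.320 (1) | 0.240 (2) | 0.075 (2) | 0.422 (1) | 0.857 (1) | 0.117 (2) | 0.699 (1) | 0.271 (3) | 0.883 (1) | 0.235 (2) |
| R-PointNet | | 0.306 (2) | 0.500 (4) | 0.405 (4) | 0.311 (2) | 0.348 (1) | 0.589 (1) | 0.054 (1) | 0.068 (6) | 0.126 (1) | 0.283 (2) | 0.290 (1) | 0.028 (3) | 0.219 (2) | 0.214 (4) | 0.331 (1) | 0.396 (4) | 0.275 (2) | 0.821 (2) | 0.245 (1) |
| 3D-BEVIS | | 0.218 (3) | 0.667 (2) | 0.455 (2) | 0.078 (6) | 0.053 (6) | 0.428 (3) | 0.000 (6) | 0.072 (4) | 0.098 (2) | 0.079 (6) | 0.030 (5) | 0.014 (4) | 0.098 (4) | 0.375 (2) | 0.056 (5) | 0.538 (2) | 0.158 (5) | 0.585 (4) | 0.147 (3) |
| Seg-Cluster | permissive | 0.215 (4) | 0.370 (5) | 0.337 (6) | 0.285 (3) | 0.105 (3) | 0.325 (5) | 0.025 (3) | 0.282 (1) | 0.085 (3) | 0.105 (3) | 0.107 (3) | 0.007 (6) | 0.079 (5) | 0.317 (3) | 0.114 (3) | 0.309 (6) | 0.304 (1) | 0.587 (3) | 0.123 (5) |
| MTML | | 0.212 (5) | 0.667 (2) | 0.614 (1) | 0.337 (1) | 0.027 (7) | 0.390 (4) | 0.000 (6) | 0.118 (3) | 0.001 (5) | 0.100 (4) | 0.028 (6) | 0.000 (7) | 0.167 (3) | 0.143 (5) | 0.046 (6) | 0.500 (3) | 0.105 (6) | 0.570 (5) | 0.003 (7) |
| Sgpn_scannet | | 0.143 (6) | 0.208 (7) | 0.390 (5) | 0.169 (5) | 0.065 (4) | 0.275 (6) | 0.029 (2) | 0.069 (5) | 0.000 (6) | 0.087 (5) | 0.043 (4) | 0.014 (5) | 0.027 (7) | 0.000 (6) | 0.112 (4) | 0.351 (5) | 0.168 (4) | 0.438 (6) | 0.138 (4) |
| MaskRCNN 2d->3d Proj | | 0.058 (7) | 0.333 (6) | 0.002 (7) | 0.000 (7) | 0.053 (5) | 0.002 (7) | 0.002 (5) | 0.021 (7) | 0.000 (6) | 0.045 (7) | 0.024 (7) | 0.238 (1) | 0.065 (6) | 0.000 (6) | 0.014 (7) | 0.107 (7) | 0.020 (7) | 0.110 (7) | 0.006 (6) |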

This table lists the benchmark results for the 2D semantic label scenario.


Each cell shows the score with its rank in that column in parentheses.

| Method | Info | avg iou | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | floor | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | wall | window |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| SSMA | copyleft | 0.577 (1) | 0.695 (1) | 0.716 (2) | 0.439 (4) | 0.563 (1) | 0.314 (3) | 0.444 (1) | 0.719 (1) | 0.551 (1) | 0.503 (1) | 0.887 (3) | 0.346 (1) | 0.348 (2) | 0.603 (1) | 0.353 (3) | 0.709 (1) | 0.600 (2) | 0.457 (1) | 0.901 (1) | 0.786 (1) | 0.599 (1) |
| FuseNet | permissive | 0.521 (2) | 0.591 (3) | 0.682 (3) | 0.220 (7) | 0.488 (3) | 0.279 (4) | 0.344 (5) | 0.610 (3) | 0.461 (3) | 0.475 (2) | 0.910 (1) | 0.293 (2) | 0.447 (1) | 0.512 (3) | 0.397 (2) | 0.618 (2) | 0.567 (4) | 0.452 (2) | 0.734 (5) | 0.782 (2) | 0.566 (2) |
| AdapNet++ | copyleft | 0.503 (3) | 0.613 (2) | 0.722 (1) | 0.418 (5) | 0.358 (7) | 0.337 (2) | 0.370 (4) | 0.479 (5) | 0.443 (4) | 0.368 (5) | 0.907 (2) | 0.207 (5) | 0.213 (6) | 0.464 (5) | 0.525 (1) | 0.618 (2) | 0.657 (1) | 0.450 (3) | 0.788 (3) | 0.721 (4) | 0.408 (6) |
| 3DMV (2d proj) | | 0.498 (4) | 0.481 (5) | 0.612 (4) | 0.579 (2) | 0.456 (4) | 0.343 (1) | 0.384 (2) | 0.623 (2) | 0.525 (2) | 0.381 (4) | 0.845 (4) | 0.254 (4) | 0.264 (4) | 0.557 (2) | 0.182 (5) | 0.581 (5) | 0.598 (3) | 0.429 (4) | 0.760 (4) | 0.661 (6) | 0.446 (5) |
| ILC-PSPNet | | 0.475 (5) | 0.490 (4) | 0.581 (5) | 0.289 (6) | 0.507 (2) | 0.067 (7) | 0.379 (3) | 0.610 (3) | 0.417 (6) | 0.435 (3) | 0.822 (6) | 0.278 (3) | 0.267 (3) | 0.503 (4) | 0.228 (4) | 0.616 (4) | 0.533 (5) | 0.375 (5) | 0.820 (2) | 0.729 (3) | 0.560 (3) |
| Enet (reimpl) | | 0.376 (6) | 0.264 (7) | 0.452 (7) | 0.452 (3) | 0.365 (5) | 0.181 (5) | 0.143 (7) | 0.456 (6) | 0.409 (7) | 0.346 (6) | 0.769 (7) | 0.164 (6) | 0.218 (5) | 0.359 (6) | 0.123 (7) | 0.403 (7) | 0.381 (7) | 0.313 (7) | 0.571 (6) | 0.685 (5) | 0.472 (4) |
| ScanNet (2d proj) | permissive | 0.330 (7) | 0.293 (6) | 0.521 (6) | 0.657 (1) | 0.361 (6) | 0.161 (6) | 0.250 (6) | 0.004 (7) | 0.440 (5) | 0.183 (7) | 0.836 (5) | 0.125 (7) | 0.060 (7) | 0.319 (7) | 0.132 (6) | 0.417 (6) | 0.412 (6) | 0.344 (6) | 0.541 (7) | 0.427 (7) | 0.109 (7) |

SSMA: Abhinav Valada, Rohit Mohan, Wolfram Burgard: Self-Supervised Model Adaptation for Multimodal Semantic Segmentation. arXiv
FuseNet: Caner Hazirbas, Lingni Ma, Csaba Domokos, Daniel Cremers: FuseNet: Incorporating Depth into Semantic Segmentation via Fusion-based CNN Architecture. ACCV 2016
AdapNet++: Abhinav Valada, Rohit Mohan, Wolfram Burgard: Self-Supervised Model Adaptation for Multimodal Semantic Segmentation. arXiv
3DMV (2d proj): Angela Dai, Matthias Niessner: 3DMV: Joint 3D-Multi-View Prediction for 3D Semantic Scene Segmentation. ECCV'18
Enet (reimpl): Re-implementation of Adam Paszke, Abhishek Chaurasia, Sangpil Kim, Eugenio Culurciello: ENet: A Deep Neural Network Architecture for Real-Time Semantic Segmentation.
ScanNet (2d proj): Angela Dai, Angel X. Chang, Manolis Savva, Maciej Halber, Thomas Funkhouser, Matthias Nießner: ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes. CVPR'17

This table lists the benchmark results for the 2D semantic instance scenario.




Each cell shows the score with its rank in that column in parentheses.

| Method | Info | avg ap | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | window |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| MaskRCNN_ScanNet | permissive | 0.119 (1) | 0.129 (1) | 0.212 (1) | 0.002 (1) | 0.112 (1) | 0.148 (1) | 0.014 (1) | 0.205 (1) | 0.044 (1) | 0.066 (1) | 0.078 (1) | 0.095 (1) | 0.142 (1) | 0.030 (1) | 0.128 (1) | 0.139 (1) | 0.080 (1) | 0.459 (1) | 0.057 (1) |

MaskRCNN_ScanNet: Re-implementation of Kaiming He, Georgia Gkioxari, Piotr Dollár, Ross Girshick: Mask R-CNN. ICCV'17

This table lists the benchmark results for the scene type classification scenario.




Each cell shows the score with its rank in that column in parentheses.

| Method | Info | avg recall | apartment | bathroom | bedroom / hotel | bookstore / library | conference room | copy/mail room | hallway | kitchen | laundry room | living room / lounge | misc | office | storage / basement / garage |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| SE-ResNeXt-SSMA | | 0.498 (1) | 0.000 (2) | 0.812 (1) | 0.941 (1) | 0.500 (1) | 0.500 (1) | 0.500 (1) | 0.500 (1) | 0.429 (2) | 0.500 (1) | 0.667 (1) | 0.500 (1) | 0.625 (1) | 0.000 (1) |
| resnet50_scannet | | 0.353 (2) | 0.250 (1) | 0.812 (1) | 0.529 (2) | 0.500 (1) | 0.500 (1) | 0.000 (2) | 0.500 (1) | 0.571 (1) | 0.000 (2) | 0.556 (2) | 0.000 (2) | 0.375 (2) | 0.000 (1) |

SE-ResNeXt-SSMA: Abhinav Valada, Rohit Mohan, Wolfram Burgard: Self-Supervised Model Adaptation for Multimodal Semantic Segmentation. arXiv