This table lists the benchmark results for the 3D semantic label scenario.


Each cell gives the score followed by the method's rank for that column in parentheses. Rows are sorted by the table column.

Method | Info | avg iou | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | floor | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | wall | window
MVPNet |  | 0.641 (4) | 0.831 (4) | 0.715 (7) | 0.671 (9) | 0.590 (6) | 0.781 (7) | 0.394 (10) | 0.679 (10) | 0.642 (1) | 0.553 (5) | 0.937 (12) | 0.462 (5) | 0.256 (7) | 0.649 (3) | 0.406 (15) | 0.626 (5) | 0.691 (6) | 0.666 (1) | 0.877 (5) | 0.792 (7) | 0.608 (6)
MinkowskiNet | permissive | 0.734 (1) | 0.858 (2) | 0.833 (1) | 0.834 (2) | 0.716 (2) | 0.855 (2) | 0.459 (3) | 0.836 (1) | 0.639 (2) | 0.641 (1) | 0.953 (2) | 0.541 (2) | 0.302 (2) | 0.743 (1) | 0.865 (2) | 0.726 (1) | 0.771 (3) | 0.664 (2) | 0.891 (3) | 0.851 (2) | 0.694 (1)
  C. Choy, J. Gwak, S. Savarese: 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks. CVPR 2019
SparseConvNet |  | 0.725 (2) | 0.647 (14) | 0.821 (2) | 0.846 (1) | 0.721 (1) | 0.869 (1) | 0.533 (1) | 0.754 (4) | 0.603 (4) | 0.614 (2) | 0.955 (1) | 0.572 (1) | 0.325 (1) | 0.710 (2) | 0.870 (1) | 0.724 (2) | 0.823 (1) | 0.628 (3) | 0.934 (1) | 0.865 (1) | 0.683 (2)
KP-FCNN |  | 0.684 (3) | 0.847 (3) | 0.758 (4) | 0.784 (3) | 0.647 (3) | 0.814 (4) | 0.473 (2) | 0.772 (3) | 0.605 (3) | 0.594 (3) | 0.935 (13) | 0.450 (6) | 0.181 (18) | 0.587 (4) | 0.805 (3) | 0.690 (3) | 0.785 (2) | 0.614 (4) | 0.882 (4) | 0.819 (3) | 0.632 (3)
  H. Thomas, C. Qi, J. Deschaud, B. Marcotegui, F. Goulette, L. Guibas: KPConv: Flexible and Deformable Convolution for Point Clouds. ICCV 2019
LAP-D |  | 0.594 (9) | 0.720 (9) | 0.692 (10) | 0.637 (14) | 0.456 (16) | 0.773 (8) | 0.391 (12) | 0.730 (5) | 0.587 (5) | 0.445 (12) | 0.940 (10) | 0.381 (9) | 0.288 (5) | 0.434 (14) | 0.453 (13) | 0.591 (8) | 0.649 (9) | 0.581 (5) | 0.777 (15) | 0.749 (15) | 0.610 (5)
HPEIN |  | 0.618 (7) | 0.729 (8) | 0.668 (12) | 0.647 (12) | 0.597 (5) | 0.766 (9) | 0.414 (8) | 0.680 (9) | 0.520 (7) | 0.525 (6) | 0.946 (5) | 0.432 (7) | 0.215 (12) | 0.493 (10) | 0.599 (7) | 0.638 (4) | 0.617 (16) | 0.570 (6) | 0.897 (2) | 0.806 (5) | 0.605 (7)
DPC |  | 0.592 (10) | 0.720 (9) | 0.700 (8) | 0.602 (17) | 0.480 (12) | 0.762 (10) | 0.380 (14) | 0.713 (6) | 0.585 (6) | 0.437 (14) | 0.940 (10) | 0.369 (13) | 0.288 (5) | 0.434 (14) | 0.509 (11) | 0.590 (9) | 0.639 (13) | 0.567 (7) | 0.772 (16) | 0.755 (13) | 0.592 (10)
  Francis Engelmann, Theodora Kontogianni, Bastian Leibe: Dilated Point Convolutions. arXiv
joint point-based | permissive | 0.634 (5) | 0.614 (16) | 0.778 (3) | 0.667 (11) | 0.633 (4) | 0.825 (3) | 0.420 (7) | 0.804 (2) | 0.467 (13) | 0.561 (4) | 0.951 (3) | 0.494 (3) | 0.291 (3) | 0.566 (6) | 0.458 (12) | 0.579 (10) | 0.764 (4) | 0.559 (8) | 0.838 (7) | 0.814 (4) | 0.598 (9)
  Hung-Yueh Chiang, Yen-Liang Lin, Yueh-Cheng Liu, Winston H. Hsu: A Unified Point-Based Framework for 3D Segmentation. 3DV 2019
MCCNN | permissive | 0.633 (6) | 0.866 (1) | 0.731 (5) | 0.771 (4) | 0.576 (8) | 0.809 (5) | 0.410 (9) | 0.684 (8) | 0.497 (9) | 0.491 (7) | 0.949 (4) | 0.466 (4) | 0.105 (24) | 0.581 (5) | 0.646 (6) | 0.620 (6) | 0.680 (7) | 0.542 (9) | 0.817 (10) | 0.795 (6) | 0.618 (4)
  P. Hermosilla, T. Ritschel, P.P. Vazquez, A. Vinacua, T. Ropinski: Monte Carlo Convolution for Learning on Non-Uniformly Sampled Point Clouds. SIGGRAPH Asia 2018
PCNN |  | 0.498 (18) | 0.559 (19) | 0.644 (17) | 0.560 (21) | 0.420 (18) | 0.711 (16) | 0.229 (25) | 0.414 (19) | 0.436 (16) | 0.352 (18) | 0.941 (9) | 0.324 (18) | 0.155 (21) | 0.238 (23) | 0.387 (16) | 0.493 (16) | 0.529 (22) | 0.509 (10) | 0.813 (11) | 0.751 (14) | 0.504 (18)
PointConv |  | 0.556 (15) | 0.636 (15) | 0.640 (18) | 0.574 (20) | 0.472 (14) | 0.739 (13) | 0.430 (5) | 0.433 (18) | 0.418 (19) | 0.445 (12) | 0.944 (7) | 0.372 (12) | 0.185 (17) | 0.464 (12) | 0.575 (8) | 0.540 (14) | 0.639 (13) | 0.505 (11) | 0.827 (8) | 0.762 (10) | 0.515 (16)
  Wenxuan Wu, Zhongang Qi, Li Fuxin: PointConv: Deep Convolutional Networks on 3D Point Clouds. CVPR 2019
DMC-Net |  | 0.608 (8) | 0.732 (7) | 0.729 (6) | 0.694 (6) | 0.536 (9) | 0.783 (6) | 0.427 (6) | 0.639 (13) | 0.438 (15) | 0.450 (10) | 0.934 (15) | 0.379 (10) | 0.289 (4) | 0.492 (11) | 0.674 (4) | 0.608 (7) | 0.709 (5) | 0.503 (12) | 0.800 (12) | 0.779 (8) | 0.567 (12)
DVVNet |  | 0.562 (13) | 0.648 (13) | 0.700 (8) | 0.770 (5) | 0.586 (7) | 0.687 (18) | 0.333 (16) | 0.650 (12) | 0.514 (8) | 0.475 (9) | 0.906 (23) | 0.359 (14) | 0.223 (11) | 0.340 (19) | 0.442 (14) | 0.422 (22) | 0.668 (8) | 0.501 (13) | 0.708 (20) | 0.779 (8) | 0.534 (15)
Pointnet++ & Feature | permissive | 0.557 (14) | 0.735 (6) | 0.661 (14) | 0.686 (7) | 0.491 (11) | 0.744 (11) | 0.392 (11) | 0.539 (17) | 0.451 (14) | 0.375 (17) | 0.946 (5) | 0.376 (11) | 0.205 (14) | 0.403 (17) | 0.356 (17) | 0.553 (13) | 0.643 (11) | 0.497 (14) | 0.824 (9) | 0.756 (12) | 0.515 (16)
CCRFNet |  | 0.589 (11) | 0.766 (5) | 0.659 (15) | 0.683 (8) | 0.470 (15) | 0.740 (12) | 0.387 (13) | 0.620 (14) | 0.490 (10) | 0.476 (8) | 0.922 (18) | 0.355 (16) | 0.245 (8) | 0.511 (8) | 0.511 (10) | 0.571 (11) | 0.643 (11) | 0.493 (15) | 0.872 (6) | 0.762 (10) | 0.600 (8)
3DMV, FTSDF |  | 0.501 (17) | 0.558 (20) | 0.608 (21) | 0.424 (26) | 0.478 (13) | 0.690 (17) | 0.246 (23) | 0.586 (15) | 0.468 (12) | 0.450 (10) | 0.911 (21) | 0.394 (8) | 0.160 (20) | 0.438 (13) | 0.212 (23) | 0.432 (21) | 0.541 (21) | 0.475 (16) | 0.742 (18) | 0.727 (16) | 0.477 (19)
TextureNet | permissive | 0.566 (12) | 0.672 (12) | 0.664 (13) | 0.671 (9) | 0.494 (10) | 0.719 (14) | 0.445 (4) | 0.678 (11) | 0.411 (20) | 0.396 (15) | 0.935 (13) | 0.356 (15) | 0.225 (10) | 0.412 (16) | 0.535 (9) | 0.565 (12) | 0.636 (15) | 0.464 (17) | 0.794 (14) | 0.680 (20) | 0.568 (11)
  Jingwei Huang, Haotian Zhang, Li Yi, Thomas Funkhouser, Matthias Niessner, Leonidas Guibas: TextureNet: Consistent Local Parametrizations for Learning from High-Resolution Signals on Meshes. CVPR
FCPN | permissive | 0.447 (21) | 0.679 (11) | 0.604 (22) | 0.578 (19) | 0.380 (20) | 0.682 (19) | 0.291 (20) | 0.106 (28) | 0.483 (11) | 0.258 (26) | 0.920 (19) | 0.258 (23) | 0.025 (28) | 0.231 (25) | 0.325 (18) | 0.480 (19) | 0.560 (19) | 0.463 (18) | 0.725 (19) | 0.666 (22) | 0.231 (28)
  Dario Rethage, Johanna Wald, Jürgen Sturm, Nassir Navab, Federico Tombari: Fully-Convolutional Point Networks for Large-Scale Point Clouds. ECCV 2018
PointCNN with RGB | permissive | 0.458 (20) | 0.577 (18) | 0.611 (20) | 0.356 (28) | 0.321 (25) | 0.715 (15) | 0.299 (19) | 0.376 (22) | 0.328 (25) | 0.319 (19) | 0.944 (7) | 0.285 (21) | 0.164 (19) | 0.216 (26) | 0.229 (22) | 0.484 (18) | 0.545 (20) | 0.456 (19) | 0.755 (17) | 0.709 (17) | 0.475 (20)
  Yangyan Li, Rui Bu, Mingchao Sun, Baoquan Chen: PointCNN. NeurIPS 2018
PNET2 |  | 0.442 (22) | 0.548 (21) | 0.548 (23) | 0.597 (18) | 0.363 (22) | 0.628 (24) | 0.300 (18) | 0.292 (23) | 0.374 (22) | 0.307 (20) | 0.881 (25) | 0.268 (22) | 0.186 (16) | 0.238 (23) | 0.204 (25) | 0.407 (23) | 0.506 (26) | 0.449 (20) | 0.667 (22) | 0.620 (24) | 0.462 (21)
PanopticFusion-label |  | 0.529 (16) | 0.491 (23) | 0.688 (11) | 0.604 (16) | 0.386 (19) | 0.632 (23) | 0.225 (27) | 0.705 (7) | 0.434 (17) | 0.293 (21) | 0.815 (26) | 0.348 (17) | 0.241 (9) | 0.499 (9) | 0.669 (5) | 0.507 (15) | 0.649 (9) | 0.442 (21) | 0.796 (13) | 0.602 (25) | 0.561 (13)
  Gaku Narita, Takashi Seno, Tomoya Ishikawa, Yohsuke Kaji: PanopticFusion: Online Volumetric Semantic Mapping at the Level of Stuff and Things. IROS 2019 (to appear)
SurfaceConvPF |  | 0.442 (22) | 0.505 (22) | 0.622 (19) | 0.380 (27) | 0.342 (24) | 0.654 (21) | 0.227 (26) | 0.397 (21) | 0.367 (23) | 0.276 (23) | 0.924 (17) | 0.240 (24) | 0.198 (15) | 0.359 (18) | 0.262 (20) | 0.366 (24) | 0.581 (17) | 0.435 (22) | 0.640 (23) | 0.668 (21) | 0.398 (22)
  Hao Pan, Shilin Liu, Yang Liu, Xin Tong: Convolutional Neural Networks on 3D Surfaces Using Parallel Frames.
Tangent Convolutions | permissive | 0.438 (24) | 0.437 (26) | 0.646 (16) | 0.474 (23) | 0.369 (21) | 0.645 (22) | 0.353 (15) | 0.258 (25) | 0.282 (27) | 0.279 (22) | 0.918 (20) | 0.298 (20) | 0.147 (22) | 0.283 (20) | 0.294 (19) | 0.487 (17) | 0.562 (18) | 0.427 (23) | 0.619 (24) | 0.633 (23) | 0.352 (24)
  Maxim Tatarchenko, Jaesik Park, Vladlen Koltun, Qian-Yi Zhou: Tangent Convolutions for Dense Prediction in 3D. CVPR 2018
3DMV |  | 0.484 (19) | 0.484 (24) | 0.538 (24) | 0.643 (13) | 0.424 (17) | 0.606 (26) | 0.310 (17) | 0.574 (16) | 0.433 (18) | 0.378 (16) | 0.796 (27) | 0.301 (19) | 0.214 (13) | 0.537 (7) | 0.208 (24) | 0.472 (20) | 0.507 (25) | 0.413 (24) | 0.693 (21) | 0.602 (25) | 0.539 (14)
  Angela Dai, Matthias Niessner: 3DMV: Joint 3D-Multi-View Prediction for 3D Semantic Scene Segmentation. ECCV'18
ScanNet+FTSDF |  | 0.383 (26) | 0.297 (28) | 0.491 (26) | 0.432 (25) | 0.358 (23) | 0.612 (25) | 0.274 (21) | 0.116 (27) | 0.411 (20) | 0.265 (24) | 0.904 (24) | 0.229 (25) | 0.079 (26) | 0.250 (21) | 0.185 (26) | 0.320 (27) | 0.510 (23) | 0.385 (25) | 0.548 (26) | 0.597 (27) | 0.394 (23)
SPLAT Net | copyleft | 0.393 (25) | 0.472 (25) | 0.511 (25) | 0.606 (15) | 0.311 (26) | 0.656 (20) | 0.245 (24) | 0.405 (20) | 0.328 (25) | 0.197 (27) | 0.927 (16) | 0.227 (26) | 0.000 (30) | 0.001 (30) | 0.249 (21) | 0.271 (29) | 0.510 (23) | 0.383 (26) | 0.593 (25) | 0.699 (18) | 0.267 (26)
  Hang Su, Varun Jampani, Deqing Sun, Subhransu Maji, Evangelos Kalogerakis, Ming-Hsuan Yang, Jan Kautz: SPLATNet: Sparse Lattice Networks for Point Cloud Processing. CVPR 2018
SSC-UNet | permissive | 0.308 (28) | 0.353 (27) | 0.290 (29) | 0.278 (29) | 0.166 (29) | 0.553 (27) | 0.169 (29) | 0.286 (24) | 0.147 (29) | 0.148 (29) | 0.908 (22) | 0.182 (28) | 0.064 (27) | 0.023 (29) | 0.018 (30) | 0.354 (26) | 0.363 (27) | 0.345 (27) | 0.546 (28) | 0.685 (19) | 0.278 (25)
ScanNet | permissive | 0.306 (29) | 0.203 (29) | 0.366 (28) | 0.501 (22) | 0.311 (26) | 0.524 (28) | 0.211 (28) | 0.002 (30) | 0.342 (24) | 0.189 (28) | 0.786 (28) | 0.145 (29) | 0.102 (25) | 0.245 (22) | 0.152 (27) | 0.318 (28) | 0.348 (28) | 0.300 (28) | 0.460 (29) | 0.437 (29) | 0.182 (29)
  Angela Dai, Angel X. Chang, Manolis Savva, Maciej Halber, Thomas Funkhouser, Matthias Nießner: ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes. CVPR'17
PointNet++ | permissive | 0.339 (27) | 0.584 (17) | 0.478 (27) | 0.458 (24) | 0.256 (28) | 0.360 (29) | 0.250 (22) | 0.247 (26) | 0.278 (28) | 0.261 (25) | 0.677 (29) | 0.183 (27) | 0.117 (23) | 0.212 (27) | 0.145 (28) | 0.364 (25) | 0.346 (29) | 0.232 (29) | 0.548 (26) | 0.523 (28) | 0.252 (27)
  Charles R. Qi, Li Yi, Hao Su, Leonidas J. Guibas: PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space.
ERROR |  | 0.054 (30) | 0.000 (30) | 0.041 (30) | 0.172 (30) | 0.030 (30) | 0.062 (30) | 0.001 (30) | 0.035 (29) | 0.004 (30) | 0.051 (30) | 0.143 (30) | 0.019 (30) | 0.003 (29) | 0.041 (28) | 0.050 (29) | 0.003 (30) | 0.054 (30) | 0.018 (30) | 0.005 (30) | 0.264 (30) | 0.082 (30)
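
The avg iou column appears to be the unweighted mean of the 20 per-class IoU scores in the same row. The page does not state this, so treat it as an assumption; the sketch below only checks it against the MVPNet row. The names `mvpnet_class_iou` and `mean_iou` are illustrative and not part of any benchmark tooling.

```python
# Per-class IoU scores copied from the MVPNet row of the 3D semantic label table.
mvpnet_class_iou = {
    "bathtub": 0.831, "bed": 0.715, "bookshelf": 0.671, "cabinet": 0.590,
    "chair": 0.781, "counter": 0.394, "curtain": 0.679, "desk": 0.642,
    "door": 0.553, "floor": 0.937, "otherfurniture": 0.462, "picture": 0.256,
    "refrigerator": 0.649, "shower curtain": 0.406, "sink": 0.626,
    "sofa": 0.691, "table": 0.666, "toilet": 0.877, "wall": 0.792,
    "window": 0.608,
}
reported_avg_iou = 0.641  # avg iou listed for MVPNet


def mean_iou(class_iou):
    """Unweighted mean over classes (assumed definition of avg iou)."""
    return sum(class_iou.values()) / len(class_iou)


computed = mean_iou(mvpnet_class_iou)
print(f"computed {computed:.3f}, reported {reported_avg_iou:.3f}")
```

For this row the computed mean agrees with the listed avg iou to three decimals. Because the table shows rounded scores, small differences in the last digit are possible for other rows.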

This table lists the benchmark results for the 3D semantic instance scenario.




Each cell gives the score followed by the method's rank for that column in parentheses. Rows are sorted by the table column.

Method | Info | avg ap 50% | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | window
ResNet-backbone |  | 0.459 (6) | 1.000 (1) | 0.737 (3) | 0.159 (13) | 0.259 (7) | 0.587 (6) | 0.138 (2) | 0.475 (7) | 0.217 (4) | 0.416 (2) | 0.408 (4) | 0.128 (7) | 0.315 (8) | 0.714 (7) | 0.411 (5) | 0.536 (9) | 0.590 (1) | 0.873 (10) | 0.304 (6)
MTML |  | 0.549 (2) | 1.000 (1) | 0.807 (2) | 0.588 (4) | 0.327 (5) | 0.647 (2) | 0.004 (12) | 0.815 (1) | 0.180 (5) | 0.418 (1) | 0.364 (6) | 0.182 (5) | 0.445 (3) | 1.000 (1) | 0.442 (4) | 0.688 (3) | 0.571 (2) | 1.000 (1) | 0.396 (2)
Occipital-SCS |  | 0.512 (3) | 1.000 (1) | 0.716 (4) | 0.509 (5) | 0.506 (1) | 0.611 (4) | 0.092 (5) | 0.602 (4) | 0.177 (6) | 0.346 (5) | 0.383 (5) | 0.165 (6) | 0.442 (4) | 0.850 (5) | 0.386 (7) | 0.618 (5) | 0.543 (3) | 0.889 (8) | 0.389 (3)
3D-BoNet |  | 0.488 (4) | 1.000 (1) | 0.672 (7) | 0.590 (3) | 0.301 (6) | 0.484 (9) | 0.098 (4) | 0.620 (2) | 0.306 (1) | 0.341 (6) | 0.259 (8) | 0.125 (8) | 0.434 (5) | 0.796 (6) | 0.402 (6) | 0.499 (10) | 0.513 (4) | 0.909 (7) | 0.439 (1)
DCNet |  | 0.607 (1) | 1.000 (1) | 0.907 (1) | 0.792 (1) | 0.462 (2) | 0.788 (1) | 0.151 (1) | 0.535 (5) | 0.292 (2) | 0.395 (3) | 0.501 (1) | 0.263 (3) | 0.600 (1) | 1.000 (1) | 0.598 (1) | 0.857 (1) | 0.502 (5) | 0.918 (5) | 0.368 (4)
DPC-instance |  | 0.355 (9) | 0.500 (11) | 0.517 (10) | 0.467 (6) | 0.228 (9) | 0.422 (11) | 0.133 (3) | 0.405 (8) | 0.111 (9) | 0.205 (10) | 0.241 (9) | 0.075 (10) | 0.233 (9) | 0.306 (12) | 0.445 (3) | 0.439 (11) | 0.457 (6) | 0.974 (3) | 0.239 (9)
  Francis Engelmann, Theodora Kontogianni, Bastian Leibe: Dilated Point Convolutions. arXiv
MASC | permissive | 0.447 (7) | 0.528 (10) | 0.555 (9) | 0.381 (7) | 0.382 (3) | 0.633 (3) | 0.002 (13) | 0.509 (6) | 0.260 (3) | 0.361 (4) | 0.432 (3) | 0.327 (2) | 0.451 (2) | 0.571 (8) | 0.367 (8) | 0.639 (4) | 0.386 (7) | 0.980 (2) | 0.276 (7)
  Chen Liu, Yasutaka Furukawa: MASC: Multi-scale Affinity with Sparse Convolution for 3D Instance Segmentation.
UNet-backbone |  | 0.319 (10) | 0.667 (7) | 0.715 (5) | 0.233 (11) | 0.189 (11) | 0.479 (10) | 0.008 (11) | 0.218 (11) | 0.067 (12) | 0.201 (11) | 0.173 (11) | 0.107 (9) | 0.123 (11) | 0.438 (9) | 0.150 (10) | 0.615 (6) | 0.355 (8) | 0.916 (6) | 0.093 (14)
Seg-Cluster | permissive | 0.215 (13) | 0.370 (13) | 0.337 (14) | 0.285 (9) | 0.105 (12) | 0.325 (13) | 0.025 (9) | 0.282 (9) | 0.085 (11) | 0.105 (12) | 0.107 (12) | 0.007 (15) | 0.079 (13) | 0.317 (11) | 0.114 (13) | 0.309 (14) | 0.304 (9) | 0.587 (13) | 0.123 (13)
R-PointNet |  | 0.306 (11) | 0.500 (11) | 0.405 (12) | 0.311 (8) | 0.348 (4) | 0.589 (5) | 0.054 (6) | 0.068 (13) | 0.126 (8) | 0.283 (8) | 0.290 (7) | 0.028 (12) | 0.219 (10) | 0.214 (13) | 0.331 (9) | 0.396 (12) | 0.275 (10) | 0.821 (12) | 0.245 (8)
3D-SIS | permissive | 0.382 (8) | 1.000 (1) | 0.432 (11) | 0.245 (10) | 0.190 (10) | 0.577 (7) | 0.013 (10) | 0.263 (10) | 0.033 (13) | 0.320 (7) | 0.240 (10) | 0.075 (11) | 0.422 (6) | 0.857 (3) | 0.117 (12) | 0.699 (2) | 0.271 (11) | 0.883 (9) | 0.235 (10)
  Ji Hou, Angela Dai, Matthias Niessner: 3D-SIS: 3D Semantic Instance Segmentation of RGB-D Scans. CVPR 2019
PanopticFusion-inst |  | 0.478 (5) | 0.667 (7) | 0.712 (6) | 0.595 (2) | 0.259 (8) | 0.550 (8) | 0.000 (15) | 0.613 (3) | 0.175 (7) | 0.250 (9) | 0.434 (2) | 0.437 (1) | 0.411 (7) | 0.857 (3) | 0.485 (2) | 0.591 (8) | 0.267 (12) | 0.944 (4) | 0.359 (5)
  Gaku Narita, Takashi Seno, Tomoya Ishikawa, Yohsuke Kaji: PanopticFusion: Online Volumetric Semantic Mapping at the Level of Stuff and Things. IROS 2019 (to appear)
3D-BEVIS |  | 0.248 (12) | 0.667 (7) | 0.566 (8) | 0.076 (14) | 0.035 (15) | 0.394 (12) | 0.027 (8) | 0.035 (14) | 0.098 (10) | 0.099 (13) | 0.030 (14) | 0.025 (13) | 0.098 (12) | 0.375 (10) | 0.126 (11) | 0.604 (7) | 0.181 (13) | 0.854 (11) | 0.171 (11)
  Cathrin Elich, Francis Engelmann, Jonas Schult, Theodora Kontogianni, Bastian Leibe: 3D-BEVIS: Birds-Eye-View Instance Segmentation.
Sgpn_scannet |  | 0.143 (14) | 0.208 (15) | 0.390 (13) | 0.169 (12) | 0.065 (13) | 0.275 (14) | 0.029 (7) | 0.069 (12) | 0.000 (14) | 0.087 (14) | 0.043 (13) | 0.014 (14) | 0.027 (15) | 0.000 (14) | 0.112 (14) | 0.351 (13) | 0.168 (14) | 0.438 (14) | 0.138 (12)
MaskRCNN 2d->3d Proj |  | 0.058 (15) | 0.333 (14) | 0.002 (15) | 0.000 (15) | 0.053 (14) | 0.002 (15) | 0.002 (14) | 0.021 (15) | 0.000 (14) | 0.045 (15) | 0.024 (15) | 0.238 (4) | 0.065 (14) | 0.000 (14) | 0.014 (15) | 0.107 (15) | 0.020 (15) | 0.110 (15) | 0.006 (15)

This table lists the benchmark results for the 2D semantic label scenario.


Each cell gives the score followed by the method's rank for that column in parentheses. Rows are sorted by the table column.

Method | Info | avg iou | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | floor | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | wall | window
RFBNet |  | 0.592 (1) | 0.616 (3) | 0.758 (1) | 0.659 (1) | 0.581 (1) | 0.330 (3) | 0.469 (1) | 0.655 (4) | 0.543 (3) | 0.524 (1) | 0.924 (1) | 0.355 (2) | 0.336 (4) | 0.572 (2) | 0.479 (2) | 0.671 (3) | 0.648 (2) | 0.480 (1) | 0.814 (4) | 0.814 (1) | 0.614 (2)
DCRedNet |  | 0.583 (2) | 0.682 (2) | 0.723 (2) | 0.542 (4) | 0.510 (4) | 0.310 (5) | 0.451 (2) | 0.668 (2) | 0.549 (2) | 0.520 (2) | 0.920 (2) | 0.375 (1) | 0.446 (1) | 0.528 (4) | 0.417 (3) | 0.670 (4) | 0.577 (6) | 0.478 (2) | 0.862 (2) | 0.806 (2) | 0.628 (1)
FuseNet | permissive | 0.535 (4) | 0.570 (5) | 0.681 (5) | 0.182 (9) | 0.512 (3) | 0.290 (6) | 0.431 (4) | 0.659 (3) | 0.504 (5) | 0.495 (4) | 0.903 (4) | 0.308 (4) | 0.428 (2) | 0.523 (5) | 0.365 (4) | 0.676 (2) | 0.621 (3) | 0.470 (3) | 0.762 (6) | 0.779 (4) | 0.541 (5)
  Caner Hazirbas, Lingni Ma, Csaba Domokos, Daniel Cremers: FuseNet: Incorporating Depth into Semantic Segmentation via Fusion-based CNN Architecture. ACCV 2016
SSMA | copyleft | 0.577 (3) | 0.695 (1) | 0.716 (4) | 0.439 (6) | 0.563 (2) | 0.314 (4) | 0.444 (3) | 0.719 (1) | 0.551 (1) | 0.503 (3) | 0.887 (5) | 0.346 (3) | 0.348 (3) | 0.603 (1) | 0.353 (5) | 0.709 (1) | 0.600 (4) | 0.457 (4) | 0.901 (1) | 0.786 (3) | 0.599 (3)
  Abhinav Valada, Rohit Mohan, Wolfram Burgard: Self-Supervised Model Adaptation for Multimodal Semantic Segmentation. International Journal of Computer Vision, 2019
AdapNet++ | copyleft | 0.503 (5) | 0.613 (4) | 0.722 (3) | 0.418 (7) | 0.358 (9) | 0.337 (2) | 0.370 (7) | 0.479 (7) | 0.443 (6) | 0.368 (7) | 0.907 (3) | 0.207 (7) | 0.213 (8) | 0.464 (7) | 0.525 (1) | 0.618 (5) | 0.657 (1) | 0.450 (5) | 0.788 (5) | 0.721 (6) | 0.408 (8)
  Abhinav Valada, Rohit Mohan, Wolfram Burgard: Self-Supervised Model Adaptation for Multimodal Semantic Segmentation. International Journal of Computer Vision, 2019
3DMV (2d proj) |  | 0.498 (6) | 0.481 (7) | 0.612 (6) | 0.579 (3) | 0.456 (6) | 0.343 (1) | 0.384 (5) | 0.623 (5) | 0.525 (4) | 0.381 (6) | 0.845 (6) | 0.254 (6) | 0.264 (6) | 0.557 (3) | 0.182 (7) | 0.581 (7) | 0.598 (5) | 0.429 (6) | 0.760 (7) | 0.661 (8) | 0.446 (7)
  Angela Dai, Matthias Niessner: 3DMV: Joint 3D-Multi-View Prediction for 3D Semantic Scene Segmentation. ECCV'18
ILC-PSPNet |  | 0.475 (7) | 0.490 (6) | 0.581 (7) | 0.289 (8) | 0.507 (5) | 0.067 (9) | 0.379 (6) | 0.610 (6) | 0.417 (8) | 0.435 (5) | 0.822 (8) | 0.278 (5) | 0.267 (5) | 0.503 (6) | 0.228 (6) | 0.616 (6) | 0.533 (7) | 0.375 (7) | 0.820 (3) | 0.729 (5) | 0.560 (4)
ScanNet (2d proj) | permissive | 0.330 (9) | 0.293 (8) | 0.521 (8) | 0.657 (2) | 0.361 (8) | 0.161 (8) | 0.250 (8) | 0.004 (9) | 0.440 (7) | 0.183 (9) | 0.836 (7) | 0.125 (9) | 0.060 (9) | 0.319 (9) | 0.132 (8) | 0.417 (8) | 0.412 (8) | 0.344 (8) | 0.541 (9) | 0.427 (9) | 0.109 (9)
  Angela Dai, Angel X. Chang, Manolis Savva, Maciej Halber, Thomas Funkhouser, Matthias Nießner: ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes. CVPR'17
Enet (reimpl) |  | 0.376 (8) | 0.264 (9) | 0.452 (9) | 0.452 (5) | 0.365 (7) | 0.181 (7) | 0.143 (9) | 0.456 (8) | 0.409 (9) | 0.346 (8) | 0.769 (9) | 0.164 (8) | 0.218 (7) | 0.359 (8) | 0.123 (9) | 0.403 (9) | 0.381 (9) | 0.313 (9) | 0.571 (8) | 0.685 (7) | 0.472 (6)
  Re-implementation of Adam Paszke, Abhishek Chaurasia, Sangpil Kim, Eugenio Culurciello: ENet: A Deep Neural Network Architecture for Real-Time Semantic Segmentation.

This table lists the benchmark results for the 2D semantic instance scenario.




Each cell gives the score followed by the method's rank for that column in parentheses. Rows are sorted by the table column.

Method | Info | avg ap | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | window
MaskRCNN_ScanNet | permissive | 0.119 (1) | 0.129 (1) | 0.212 (1) | 0.002 (1) | 0.112 (1) | 0.148 (1) | 0.014 (1) | 0.205 (1) | 0.044 (1) | 0.066 (1) | 0.078 (1) | 0.095 (1) | 0.142 (1) | 0.030 (1) | 0.128 (1) | 0.139 (1) | 0.080 (1) | 0.459 (1) | 0.057 (1)
  Re-implementation of Kaiming He, Georgia Gkioxari, Piotr Dollár, Ross Girshick: Mask R-CNN. ICCV'17

This table lists the benchmark results for the scene type classification scenario.




Each cell gives the score followed by the method's rank for that column in parentheses. Rows are sorted by the avg recall column.

Method | Info | avg recall | apartment | bathroom | bedroom / hotel | bookstore / library | conference room | copy/mail room | hallway | kitchen | laundry room | living room / lounge | misc | office | storage / basement / garage
SE-ResNeXt-SSMA |  | 0.498 (1) | 0.000 (2) | 0.812 (1) | 0.941 (1) | 0.500 (1) | 0.500 (1) | 0.500 (1) | 0.500 (1) | 0.429 (2) | 0.500 (1) | 0.667 (1) | 0.500 (1) | 0.625 (1) | 0.000 (1)
  Abhinav Valada, Rohit Mohan, Wolfram Burgard: Self-Supervised Model Adaptation for Multimodal Semantic Segmentation. arXiv
resnet50_scannet |  | 0.353 (2) | 0.250 (1) | 0.812 (1) | 0.529 (2) | 0.500 (1) | 0.500 (1) | 0.000 (2) | 0.500 (1) | 0.571 (1) | 0.000 (2) | 0.556 (2) | 0.000 (2) | 0.375 (2) | 0.000 (1)
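
The rank shown next to each score is consistent with competition ranking: the highest score in a column gets rank 1 and tied scores share the best rank. The sketch below reproduces the ranks for a few columns of the scene type classification table. It assumes this ranking rule, which the page does not document, and the helper name `competition_rank` is illustrative. Ranks on the page may be computed from unrounded scores, so two entries that display the same value can still receive different ranks.

```python
def competition_rank(scores):
    """Rank 1 is the highest score; tied scores share the best rank (1, 1, 3, ...)."""
    return [1 + sum(other > score for other in scores) for score in scores]


# Scores for (SE-ResNeXt-SSMA, resnet50_scannet), copied from the
# scene type classification table.
columns = {
    "avg recall": [0.498, 0.353],
    "apartment": [0.000, 0.250],
    "bathroom": [0.812, 0.812],
    "kitchen": [0.429, 0.571],
}

for name, scores in columns.items():
    print(name, competition_rank(scores))
```

For these four columns the computed ranks match the listed ones, including the shared rank 1 for the tied bathroom scores.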