This table lists the benchmark results for the 3D semantic label scenario. Each cell gives a score followed by the method's rank in that column; rows are sorted by the counter column.

Method | Info | avg iou | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | floor | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | wall | window
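SparseConvNet | | 0.725 (2) | 0.647 (17) | 0.821 (1) | 0.846 (1) | 0.721 (1) | 0.869 (1) | 0.533 (1) | 0.754 (5) | 0.603 (4) | 0.614 (2) | 0.955 (1) | 0.572 (1) | 0.325 (2) | 0.710 (2) | 0.870 (2) | 0.724 (1) | 0.823 (1) | 0.628 (3) | 0.934 (1) | 0.865 (1) | 0.683 (2)
MinkowskiNet | permissive | 0.736 (1) | 0.859 (2) | 0.818 (2) | 0.832 (2) | 0.709 (2) | 0.840 (2) | 0.521 (2) | 0.853 (1) | 0.660 (1) | 0.643 (1) | 0.951 (4) | 0.544 (2) | 0.286 (6) | 0.731 (1) | 0.893 (1) | 0.675 (4) | 0.772 (3) | 0.683 (1) | 0.874 (7) | 0.852 (2) | 0.727 (1)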
C. Choy, J. Gwak, S. Savarese: 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks. CVPR 2019
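PointConv | permissive | 0.666 (4) | 0.781 (6) | 0.759 (5) | 0.699 (9) | 0.644 (4) | 0.822 (4) | 0.475 (3) | 0.779 (3) | 0.564 (8) | 0.504 (9) | 0.953 (2) | 0.428 (8) | 0.203 (15) | 0.586 (5) | 0.754 (4) | 0.661 (5) | 0.753 (5) | 0.588 (5) | 0.902 (2) | 0.813 (5) | 0.642 (3)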
Wenxuan Wu, Zhongang Qi, Li Fuxin: PointConv: Deep Convolutional Networks on 3D Point Clouds. CVPR 2019
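KP-FCNN | | 0.684 (3) | 0.847 (4) | 0.758 (6) | 0.784 (3) | 0.647 (3) | 0.814 (5) | 0.473 (4) | 0.772 (4) | 0.605 (3) | 0.594 (3) | 0.935 (15) | 0.450 (6) | 0.181 (19) | 0.587 (4) | 0.805 (3) | 0.690 (2) | 0.785 (2) | 0.614 (4) | 0.882 (4) | 0.819 (3) | 0.632 (5)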
H. Thomas, C. Qi, J. Deschaud, B. Marcotegui, F. Goulette, L. Guibas: KPConv: Flexible and Deformable Convolution for Point Clouds. ICCV 2019
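FLConv | | 0.605 (11) | 0.751 (8) | 0.695 (13) | 0.701 (8) | 0.545 (11) | 0.758 (13) | 0.448 (5) | 0.596 (18) | 0.538 (9) | 0.428 (17) | 0.953 (2) | 0.398 (12) | 0.160 (21) | 0.457 (15) | 0.598 (10) | 0.612 (10) | 0.686 (10) | 0.551 (10) | 0.880 (5) | 0.768 (12) | 0.585 (13)
TextureNet | permissive | 0.566 (16) | 0.672 (15) | 0.664 (17) | 0.671 (12) | 0.494 (14) | 0.719 (17) | 0.445 (6) | 0.678 (14) | 0.411 (23) | 0.396 (18) | 0.935 (15) | 0.356 (18) | 0.225 (10) | 0.412 (19) | 0.535 (12) | 0.565 (16) | 0.636 (18) | 0.464 (20) | 0.794 (16) | 0.680 (23) | 0.568 (14)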
Jingwei Huang, Haotian Zhang, Li Yi, Thomas Funkhouser, Matthias Niessner, Leonidas Guibas: TextureNet: Consistent Local Parametrizations for Learning from High-Resolution Signals on Meshes. CVPR
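CDF-SM3D | | 0.597 (12) | 0.557 (23) | 0.728 (9) | 0.703 (7) | 0.572 (10) | 0.730 (16) | 0.435 (7) | 0.715 (8) | 0.504 (13) | 0.467 (13) | 0.863 (28) | 0.427 (9) | 0.184 (18) | 0.495 (13) | 0.569 (11) | 0.685 (3) | 0.717 (7) | 0.516 (14) | 0.866 (9) | 0.726 (19) | 0.479 (21)
joint point-based | permissive | 0.634 (6) | 0.614 (18) | 0.778 (3) | 0.667 (14) | 0.633 (5) | 0.825 (3) | 0.420 (8) | 0.804 (2) | 0.467 (18) | 0.561 (4) | 0.951 (4) | 0.494 (3) | 0.291 (3) | 0.566 (7) | 0.458 (15) | 0.579 (14) | 0.764 (4) | 0.559 (9) | 0.838 (11) | 0.814 (4) | 0.598 (11)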
Hung-Yueh Chiang, Yen-Liang Lin, Yueh-Cheng Liu, Winston H. Hsu: A Unified Point-Based Framework for 3D Segmentation. 3DV 2019
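HPEIN | | 0.618 (9) | 0.729 (10) | 0.668 (16) | 0.647 (15) | 0.597 (6) | 0.766 (11) | 0.414 (9) | 0.680 (12) | 0.520 (10) | 0.525 (7) | 0.946 (7) | 0.432 (7) | 0.215 (12) | 0.493 (14) | 0.599 (9) | 0.638 (6) | 0.617 (19) | 0.570 (7) | 0.897 (3) | 0.806 (6) | 0.605 (9)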
Li Jiang, Hengshuang Zhao, Shu Liu, Xiaoyong Shen, Chi-Wing Fu, Jiaya Jia: Hierarchical Point-Edge Interaction Network for Point Cloud Semantic Segmentation. ICCV 2019
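MCCNN | permissive | 0.633 (7) | 0.866 (1) | 0.731 (8) | 0.771 (4) | 0.576 (9) | 0.809 (6) | 0.410 (10) | 0.684 (11) | 0.497 (14) | 0.491 (10) | 0.949 (6) | 0.466 (4) | 0.105 (26) | 0.581 (6) | 0.646 (8) | 0.620 (9) | 0.680 (11) | 0.542 (13) | 0.817 (13) | 0.795 (8) | 0.618 (6)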
P. Hermosilla, T. Ritschel, P.P. Vazquez, A. Vinacua, T. Ropinski: Monte Carlo Convolution for Learning on Non-Uniformly Sampled Point Clouds. SIGGRAPH Asia 2018
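SPH3D-GCN | permissive | 0.610 (10) | 0.858 (3) | 0.772 (4) | 0.489 (25) | 0.532 (13) | 0.792 (7) | 0.404 (11) | 0.643 (16) | 0.570 (7) | 0.507 (8) | 0.935 (15) | 0.414 (11) | 0.046 (30) | 0.510 (10) | 0.702 (5) | 0.602 (11) | 0.705 (8) | 0.549 (12) | 0.859 (10) | 0.773 (11) | 0.534 (17)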
Huan Lei, Naveed Akhtar, and Ajmal Mian: Spherical Kernel for Efficient Graph Convolution on 3D Point Clouds.
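MVPNet | | 0.641 (5) | 0.831 (5) | 0.715 (10) | 0.671 (12) | 0.590 (7) | 0.781 (9) | 0.394 (12) | 0.679 (13) | 0.642 (2) | 0.553 (5) | 0.937 (14) | 0.462 (5) | 0.256 (7) | 0.649 (3) | 0.406 (18) | 0.626 (8) | 0.691 (9) | 0.666 (2) | 0.877 (6) | 0.792 (9) | 0.608 (8)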
Maximilian Jaritz, Jiayuan Gu, Hao Su: Multi-view PointNet for 3D Scene Understanding. GMDL Workshop, ICCV 2019
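Pointnet++ & Feature | permissive | 0.557 (18) | 0.735 (9) | 0.661 (18) | 0.686 (10) | 0.491 (15) | 0.744 (14) | 0.392 (13) | 0.539 (21) | 0.451 (19) | 0.375 (20) | 0.946 (7) | 0.376 (15) | 0.205 (14) | 0.403 (20) | 0.356 (20) | 0.553 (17) | 0.643 (15) | 0.497 (17) | 0.824 (12) | 0.756 (14) | 0.515 (19)
LAP-D | | 0.594 (13) | 0.720 (11) | 0.692 (14) | 0.637 (17) | 0.456 (19) | 0.773 (10) | 0.391 (14) | 0.730 (7) | 0.587 (5) | 0.445 (15) | 0.940 (12) | 0.381 (14) | 0.288 (4) | 0.434 (17) | 0.453 (16) | 0.591 (12) | 0.649 (13) | 0.581 (6) | 0.777 (17) | 0.749 (17) | 0.610 (7)
CCRFNet | | 0.589 (15) | 0.766 (7) | 0.659 (19) | 0.683 (11) | 0.470 (18) | 0.740 (15) | 0.387 (15) | 0.620 (17) | 0.490 (15) | 0.476 (11) | 0.922 (20) | 0.355 (19) | 0.245 (8) | 0.511 (9) | 0.511 (13) | 0.571 (15) | 0.643 (15) | 0.493 (18) | 0.872 (8) | 0.762 (13) | 0.600 (10)
DPC | | 0.592 (14) | 0.720 (11) | 0.700 (11) | 0.602 (20) | 0.480 (16) | 0.762 (12) | 0.380 (16) | 0.713 (9) | 0.585 (6) | 0.437 (16) | 0.940 (12) | 0.369 (16) | 0.288 (4) | 0.434 (17) | 0.509 (14) | 0.590 (13) | 0.639 (17) | 0.567 (8) | 0.772 (18) | 0.755 (15) | 0.592 (12)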
Francis Engelmann, Theodora Kontogianni, Bastian Leibe: Dilated Point Convolutions. arXiv
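Tangent Convolutions | permissive | 0.438 (27) | 0.437 (29) | 0.646 (20) | 0.474 (26) | 0.369 (24) | 0.645 (25) | 0.353 (17) | 0.258 (28) | 0.282 (30) | 0.279 (25) | 0.918 (22) | 0.298 (23) | 0.147 (24) | 0.283 (23) | 0.294 (22) | 0.487 (20) | 0.562 (21) | 0.427 (26) | 0.619 (27) | 0.633 (26) | 0.352 (27)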
Maxim Tatarchenko, Jaesik Park, Vladlen Koltun, Qian-Yi Zhou: Tangent Convolutions for Dense Prediction in 3D. CVPR 2018
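DMC-Net | | 0.630 (8) | 0.706 (13) | 0.738 (7) | 0.745 (6) | 0.535 (12) | 0.787 (8) | 0.335 (18) | 0.742 (6) | 0.512 (12) | 0.546 (6) | 0.941 (10) | 0.427 (9) | 0.330 (1) | 0.496 (12) | 0.653 (7) | 0.634 (7) | 0.719 (6) | 0.550 (11) | 0.754 (20) | 0.801 (7) | 0.642 (3)
DVVNet | | 0.562 (17) | 0.648 (16) | 0.700 (11) | 0.770 (5) | 0.586 (8) | 0.687 (21) | 0.333 (19) | 0.650 (15) | 0.514 (11) | 0.475 (12) | 0.906 (25) | 0.359 (17) | 0.223 (11) | 0.340 (22) | 0.442 (17) | 0.422 (25) | 0.668 (12) | 0.501 (16) | 0.708 (23) | 0.779 (10) | 0.534 (17)
3DMV | | 0.484 (22) | 0.484 (27) | 0.538 (27) | 0.643 (16) | 0.424 (20) | 0.606 (29) | 0.310 (20) | 0.574 (20) | 0.433 (22) | 0.378 (19) | 0.796 (30) | 0.301 (22) | 0.214 (13) | 0.537 (8) | 0.208 (27) | 0.472 (23) | 0.507 (28) | 0.413 (27) | 0.693 (24) | 0.602 (28) | 0.539 (16)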
Angela Dai, Matthias Niessner: 3DMV: Joint 3D-Multi-View Prediction for 3D Semantic Scene Segmentation. ECCV'18
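PNET2 | | 0.442 (25) | 0.548 (24) | 0.548 (26) | 0.597 (21) | 0.363 (25) | 0.628 (27) | 0.300 (21) | 0.292 (26) | 0.374 (25) | 0.307 (23) | 0.881 (27) | 0.268 (25) | 0.186 (17) | 0.238 (26) | 0.204 (28) | 0.407 (26) | 0.506 (29) | 0.449 (23) | 0.667 (25) | 0.620 (27) | 0.462 (24)
PointCNN with RGB | permissive | 0.458 (23) | 0.577 (20) | 0.611 (23) | 0.356 (31) | 0.321 (28) | 0.715 (18) | 0.299 (22) | 0.376 (25) | 0.328 (28) | 0.319 (22) | 0.944 (9) | 0.285 (24) | 0.164 (20) | 0.216 (29) | 0.229 (25) | 0.484 (21) | 0.545 (23) | 0.456 (22) | 0.755 (19) | 0.709 (20) | 0.475 (23)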
Yangyan Li, Rui Bu, Mingchao Sun, Baoquan Chen: PointCNN. NeurIPS 2018
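FCPN | permissive | 0.447 (24) | 0.679 (14) | 0.604 (25) | 0.578 (22) | 0.380 (23) | 0.682 (22) | 0.291 (23) | 0.106 (31) | 0.483 (16) | 0.258 (29) | 0.920 (21) | 0.258 (26) | 0.025 (31) | 0.231 (28) | 0.325 (21) | 0.480 (22) | 0.560 (22) | 0.463 (21) | 0.725 (22) | 0.666 (25) | 0.231 (31)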
Dario Rethage, Johanna Wald, Jürgen Sturm, Nassir Navab, Federico Tombari: Fully-Convolutional Point Networks for Large-Scale Point Clouds. ECCV 2018
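ScanNet+FTSDF | | 0.383 (29) | 0.297 (31) | 0.491 (29) | 0.432 (28) | 0.358 (26) | 0.612 (28) | 0.274 (24) | 0.116 (30) | 0.411 (23) | 0.265 (27) | 0.904 (26) | 0.229 (28) | 0.079 (28) | 0.250 (24) | 0.185 (29) | 0.320 (30) | 0.510 (26) | 0.385 (28) | 0.548 (29) | 0.597 (30) | 0.394 (26)
PointNet++ | permissive | 0.339 (30) | 0.584 (19) | 0.478 (30) | 0.458 (27) | 0.256 (31) | 0.360 (32) | 0.250 (25) | 0.247 (29) | 0.278 (31) | 0.261 (28) | 0.677 (32) | 0.183 (30) | 0.117 (25) | 0.212 (30) | 0.145 (31) | 0.364 (28) | 0.346 (32) | 0.232 (32) | 0.548 (29) | 0.523 (31) | 0.252 (30)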
Charles R. Qi, Li Yi, Hao Su, Leonidas J. Guibas: PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space.
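3DMV, FTSDF | | 0.501 (20) | 0.558 (22) | 0.608 (24) | 0.424 (29) | 0.478 (17) | 0.690 (20) | 0.246 (26) | 0.586 (19) | 0.468 (17) | 0.450 (14) | 0.911 (23) | 0.394 (13) | 0.160 (21) | 0.438 (16) | 0.212 (26) | 0.432 (24) | 0.541 (24) | 0.475 (19) | 0.742 (21) | 0.727 (18) | 0.477 (22)
SPLAT Net | copyleft | 0.393 (28) | 0.472 (28) | 0.511 (28) | 0.606 (18) | 0.311 (29) | 0.656 (23) | 0.245 (27) | 0.405 (23) | 0.328 (28) | 0.197 (30) | 0.927 (18) | 0.227 (29) | 0.000 (33) | 0.001 (33) | 0.249 (24) | 0.271 (32) | 0.510 (26) | 0.383 (29) | 0.593 (28) | 0.699 (21) | 0.267 (29)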
Hang Su, Varun Jampani, Deqing Sun, Subhransu Maji, Evangelos Kalogerakis, Ming-Hsuan Yang, Jan Kautz: SPLATNet: Sparse Lattice Networks for Point Cloud Processing. CVPR 2018
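PCNN | | 0.498 (21) | 0.559 (21) | 0.644 (21) | 0.560 (23) | 0.420 (21) | 0.711 (19) | 0.229 (28) | 0.414 (22) | 0.436 (20) | 0.352 (21) | 0.941 (10) | 0.324 (21) | 0.155 (23) | 0.238 (26) | 0.387 (19) | 0.493 (19) | 0.529 (25) | 0.509 (15) | 0.813 (14) | 0.751 (16) | 0.504 (20)
SurfaceConvPF | | 0.442 (25) | 0.505 (25) | 0.622 (22) | 0.380 (30) | 0.342 (27) | 0.654 (24) | 0.227 (29) | 0.397 (24) | 0.367 (26) | 0.276 (26) | 0.924 (19) | 0.240 (27) | 0.198 (16) | 0.359 (21) | 0.262 (23) | 0.366 (27) | 0.581 (20) | 0.435 (25) | 0.640 (26) | 0.668 (24) | 0.398 (25)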
Hao Pan, Shilin Liu, Yang Liu, Xin Tong: Convolutional Neural Networks on 3D Surfaces Using Parallel Frames.
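PanopticFusion-label | | 0.529 (19) | 0.491 (26) | 0.688 (15) | 0.604 (19) | 0.386 (22) | 0.632 (26) | 0.225 (30) | 0.705 (10) | 0.434 (21) | 0.293 (24) | 0.815 (29) | 0.348 (20) | 0.241 (9) | 0.499 (11) | 0.669 (6) | 0.507 (18) | 0.649 (13) | 0.442 (24) | 0.796 (15) | 0.602 (28) | 0.561 (15)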
Gaku Narita, Takashi Seno, Tomoya Ishikawa, Yohsuke Kaji: PanopticFusion: Online Volumetric Semantic Mapping at the Level of Stuff and Things. IROS 2019 (to appear)
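ScanNet | permissive | 0.306 (32) | 0.203 (32) | 0.366 (31) | 0.501 (24) | 0.311 (29) | 0.524 (31) | 0.211 (31) | 0.002 (33) | 0.342 (27) | 0.189 (31) | 0.786 (31) | 0.145 (32) | 0.102 (27) | 0.245 (25) | 0.152 (30) | 0.318 (31) | 0.348 (31) | 0.300 (31) | 0.460 (32) | 0.437 (32) | 0.182 (32)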
Angela Dai, Angel X. Chang, Manolis Savva, Maciej Halber, Thomas Funkhouser, Matthias Nießner: ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes. CVPR'17
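SSC-UNet | permissive | 0.308 (31) | 0.353 (30) | 0.290 (32) | 0.278 (32) | 0.166 (32) | 0.553 (30) | 0.169 (32) | 0.286 (27) | 0.147 (32) | 0.148 (32) | 0.908 (24) | 0.182 (31) | 0.064 (29) | 0.023 (32) | 0.018 (33) | 0.354 (29) | 0.363 (30) | 0.345 (30) | 0.546 (31) | 0.685 (22) | 0.278 (28)
ERROR | | 0.054 (33) | 0.000 (33) | 0.041 (33) | 0.172 (33) | 0.030 (33) | 0.062 (33) | 0.001 (33) | 0.035 (32) | 0.004 (33) | 0.051 (33) | 0.143 (33) | 0.019 (33) | 0.003 (32) | 0.041 (31) | 0.050 (32) | 0.003 (33) | 0.054 (33) | 0.018 (33) | 0.005 (33) | 0.264 (33) | 0.082 (33)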
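
The avg iou column appears to be the unweighted mean of the 20 per-class IoU columns. The table itself does not state this, so the equal-weight averaging is an assumption inferred from the numbers. As a quick sanity check, the sketch below recomputes the value for the MinkowskiNet row; the names `minkowski_iou` and `average_score` are illustrative only.

```python
from statistics import mean

# Per-class IoU values copied from the MinkowskiNet row above,
# in header order (bathtub ... window).
minkowski_iou = [
    0.859, 0.818, 0.832, 0.709, 0.840, 0.521, 0.853, 0.660, 0.643, 0.951,
    0.544, 0.286, 0.731, 0.893, 0.675, 0.772, 0.683, 0.874, 0.852, 0.727,
]


def average_score(per_class):
    """Unweighted mean over classes, rounded to the table's three decimals.

    Assumption: every class carries the same weight.
    """
    return round(mean(per_class), 3)


print(average_score(minkowski_iou))  # 0.736, matching the avg iou cell
```

Because every class counts equally, a weak class such as picture (0.286) lowers the average as much as a strong class such as floor (0.951) raises it.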

This table lists the benchmark results for the 3D semantic instance scenario. Each cell gives a score followed by the method's rank in that column; rows are sorted by the counter column.

Method | Info | avg ap 25% | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | window
DPC-instance | | 0.554 (10) | 0.875 (10) | 0.715 (10) | 0.668 (6) | 0.448 (11) | 0.563 (12) | 0.577 (1) | 0.424 (9) | 0.443 (12) | 0.478 (9) | 0.397 (9) | 0.147 (11) | 0.397 (9) | 0.476 (11) | 0.813 (3) | 0.517 (13) | 0.656 (6) | 0.974 (4) | 0.413 (9)
Francis Engelmann, Theodora Kontogianni, Bastian Leibe: Dilated Point Convolutions. arXiv
3D-BoNet | | 0.687 (6) | 1.000 (1) | 0.887 (5) | 0.836 (2) | 0.587 (9) | 0.643 (10) | 0.550 (2) | 0.620 (5) | 0.724 (4) | 0.522 (7) | 0.501 (7) | 0.243 (9) | 0.512 (6) | 1.000 (1) | 0.751 (6) | 0.807 (6) | 0.661 (4) | 0.909 (10) | 0.612 (4)
Bo Yang, Jianan Wang, Ronald Clark, Qingyong Hu, Sen Wang, Andrew Markham, Niki Trigoni: Learning Object Bounding Boxes for 3D Instance Segmentation on Point Clouds. NeurIPS 2019 Spotlight
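ResNet-backbone | | 0.695 (3) | 1.000 (1) | 0.855 (6) | 0.579 (10) | 0.589 (8) | 0.735 (7) | 0.484 (3) | 0.588 (6) | 0.856 (1) | 0.634 (2) | 0.571 (2) | 0.298 (6) | 0.500 (8) | 1.000 (1) | 0.824 (1) | 0.818 (4) | 0.702 (3) | 0.935 (9) | 0.545 (6)
DCNet | | 0.739 (1) | 1.000 (1) | 0.941 (2) | 0.844 (1) | 0.714 (3) | 0.851 (1) | 0.459 (4) | 0.632 (4) | 0.755 (3) | 0.635 (1) | 0.525 (6) | 0.428 (5) | 0.600 (1) | 1.000 (1) | 0.794 (4) | 0.819 (3) | 0.658 (5) | 1.000 (1) | 0.643 (1)
Occipital-SCS | | 0.688 (5) | 1.000 (1) | 0.913 (3) | 0.730 (5) | 0.737 (2) | 0.743 (6) | 0.442 (5) | 0.855 (2) | 0.655 (5) | 0.546 (5) | 0.546 (4) | 0.263 (8) | 0.508 (7) | 0.889 (5) | 0.568 (10) | 0.771 (8) | 0.705 (2) | 0.889 (11) | 0.625 (2)
R-PointNet | | 0.544 (11) | 0.500 (15) | 0.655 (12) | 0.661 (7) | 0.663 (4) | 0.765 (4) | 0.432 (6) | 0.214 (13) | 0.612 (6) | 0.584 (4) | 0.499 (8) | 0.204 (10) | 0.286 (12) | 0.429 (12) | 0.655 (7) | 0.650 (12) | 0.539 (8) | 0.950 (6) | 0.499 (8)
UNet-backbone | | 0.605 (8) | 1.000 (1) | 0.909 (4) | 0.764 (4) | 0.603 (7) | 0.704 (8) | 0.415 (7) | 0.301 (11) | 0.548 (9) | 0.461 (10) | 0.394 (10) | 0.267 (7) | 0.386 (10) | 0.857 (6) | 0.649 (8) | 0.817 (5) | 0.504 (10) | 0.959 (5) | 0.356 (12)
Seg-Cluster | permissive | 0.380 (14) | 0.625 (13) | 0.420 (14) | 0.456 (13) | 0.296 (13) | 0.473 (14) | 0.390 (8) | 0.433 (8) | 0.293 (14) | 0.322 (13) | 0.247 (12) | 0.066 (15) | 0.264 (13) | 0.325 (13) | 0.388 (12) | 0.486 (14) | 0.401 (13) | 0.614 (14) | 0.341 (13)
PanopticFusion-inst | | 0.693 (4) | 1.000 (1) | 0.852 (7) | 0.655 (8) | 0.616 (5) | 0.788 (2) | 0.334 (9) | 0.763 (3) | 0.771 (2) | 0.457 (11) | 0.555 (3) | 0.652 (1) | 0.518 (5) | 0.857 (6) | 0.765 (5) | 0.732 (10) | 0.631 (7) | 0.944 (7) | 0.577 (5)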
Gaku Narita, Takashi Seno, Tomoya Ishikawa, Yohsuke Kaji: PanopticFusion: Online Volumetric Semantic Mapping at the Level of Stuff and Things. IROS 2019 (to appear)
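MTML | | 0.731 (2) | 1.000 (1) | 0.992 (1) | 0.779 (3) | 0.609 (6) | 0.746 (5) | 0.308 (10) | 0.867 (1) | 0.601 (7) | 0.607 (3) | 0.539 (5) | 0.519 (2) | 0.550 (3) | 1.000 (1) | 0.824 (1) | 0.869 (1) | 0.729 (1) | 1.000 (1) | 0.616 (3)
MaskRCNN 2d->3d Proj | | 0.261 (15) | 0.903 (9) | 0.081 (15) | 0.008 (15) | 0.233 (14) | 0.175 (15) | 0.280 (11) | 0.106 (15) | 0.150 (15) | 0.203 (15) | 0.175 (14) | 0.480 (3) | 0.218 (15) | 0.143 (14) | 0.542 (11) | 0.404 (15) | 0.153 (15) | 0.393 (15) | 0.049 (15)
Sgpn_scannet | | 0.390 (13) | 0.556 (14) | 0.636 (13) | 0.493 (12) | 0.353 (12) | 0.539 (13) | 0.271 (12) | 0.160 (14) | 0.450 (11) | 0.359 (12) | 0.178 (13) | 0.146 (12) | 0.250 (14) | 0.143 (14) | 0.347 (15) | 0.698 (11) | 0.436 (12) | 0.667 (13) | 0.331 (14)
3D-SIS | permissive | 0.558 (9) | 1.000 (1) | 0.773 (9) | 0.614 (9) | 0.503 (10) | 0.691 (9) | 0.200 (13) | 0.412 (10) | 0.498 (10) | 0.546 (6) | 0.311 (11) | 0.103 (13) | 0.600 (1) | 0.857 (6) | 0.382 (13) | 0.799 (7) | 0.445 (11) | 0.938 (8) | 0.371 (10)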
Ji Hou, Angela Dai, Matthias Niessner: 3D-SIS: 3D Semantic Instance Segmentation of RGB-D Scans. CVPR 2019
3D-BEVIS | | 0.401 (12) | 0.667 (12) | 0.687 (11) | 0.419 (14) | 0.137 (15) | 0.587 (11) | 0.188 (14) | 0.235 (12) | 0.359 (13) | 0.211 (14) | 0.093 (15) | 0.080 (14) | 0.311 (11) | 0.571 (10) | 0.382 (13) | 0.754 (9) | 0.300 (14) | 0.874 (12) | 0.357 (11)
Cathrin Elich, Francis Engelmann, Jonas Schult, Theodora Kontogianni, Bastian Leibe: 3D-BEVIS: Birds-Eye-View Instance Segmentation.
MASC | permissive | 0.615 (7) | 0.711 (11) | 0.802 (8) | 0.540 (11) | 0.757 (1) | 0.777 (3) | 0.029 (15) | 0.577 (7) | 0.588 (8) | 0.521 (8) | 0.600 (1) | 0.436 (4) | 0.534 (4) | 0.697 (9) | 0.616 (9) | 0.838 (2) | 0.526 (9) | 0.980 (3) | 0.534 (7)
Chen Liu, Yasutaka Furukawa: MASC: Multi-scale Affinity with Sparse Convolution for 3D Instance Segmentation.
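
The per-column ranks look like standard competition ranking: tied scores share the best rank and the next ranks are skipped. The sketch below reproduces the bathtub column of the table above under that assumption. The function name `competition_ranks` is illustrative. Equal displayed values can still carry different ranks elsewhere in the table (the two 0.546 door entries are ranked 5 and 6), presumably because ranking uses unrounded scores, so this is an approximation of the site's behaviour, not a description of it.

```python
def competition_ranks(scores):
    """Rank scores in descending order with ties sharing the best rank.

    Assumption: standard competition ranking (1, 1, 3, ...), applied to
    the three-decimal values shown in the table.
    """
    ordered = sorted(scores.values(), reverse=True)
    # list.index returns the first position of a value, so tied
    # scores all map to the same (best) rank.
    return {name: ordered.index(value) + 1 for name, value in scores.items()}


# Bathtub AP values copied from the 3D semantic instance table above.
bathtub_ap = {
    "DPC-instance": 0.875, "3D-BoNet": 1.000, "ResNet-backbone": 1.000,
    "DCNet": 1.000, "Occipital-SCS": 1.000, "R-PointNet": 0.500,
    "UNet-backbone": 1.000, "Seg-Cluster": 0.625,
    "PanopticFusion-inst": 1.000, "MTML": 1.000,
    "MaskRCNN 2d->3d Proj": 0.903, "Sgpn_scannet": 0.556,
    "3D-SIS": 1.000, "3D-BEVIS": 0.667, "MASC": 0.711,
}

ranks = competition_ranks(bathtub_ap)
print(ranks["MTML"])                  # 1 (eight methods tie at 1.000)
print(ranks["MaskRCNN 2d->3d Proj"])  # 9 (ranks 2 to 8 are skipped)
```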

This table lists the benchmark results for the 2D semantic label scenario. Each cell gives a score followed by the method's rank in that column; rows are sorted by the counter column.

Method | Info | avg iou | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | floor | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | wall | window
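MCA-Net | | 0.595 (1) | 0.533 (6) | 0.756 (2) | 0.746 (1) | 0.590 (1) | 0.334 (3) | 0.506 (1) | 0.670 (2) | 0.587 (1) | 0.500 (4) | 0.905 (4) | 0.366 (2) | 0.352 (3) | 0.601 (2) | 0.506 (2) | 0.669 (5) | 0.648 (2) | 0.501 (1) | 0.839 (3) | 0.769 (5) | 0.516 (6)
RFBNet | | 0.592 (2) | 0.616 (3) | 0.758 (1) | 0.659 (2) | 0.581 (2) | 0.330 (4) | 0.469 (2) | 0.655 (5) | 0.543 (4) | 0.524 (1) | 0.924 (1) | 0.355 (3) | 0.336 (5) | 0.572 (3) | 0.479 (3) | 0.671 (3) | 0.648 (2) | 0.480 (2) | 0.814 (5) | 0.814 (1) | 0.614 (2)
DCRedNet | | 0.583 (3) | 0.682 (2) | 0.723 (3) | 0.542 (5) | 0.510 (5) | 0.310 (6) | 0.451 (3) | 0.668 (3) | 0.549 (3) | 0.520 (2) | 0.920 (2) | 0.375 (1) | 0.446 (1) | 0.528 (5) | 0.417 (4) | 0.670 (4) | 0.577 (7) | 0.478 (3) | 0.862 (2) | 0.806 (2) | 0.628 (1)
SSMA | copyleft | 0.577 (4) | 0.695 (1) | 0.716 (5) | 0.439 (7) | 0.563 (3) | 0.314 (5) | 0.444 (4) | 0.719 (1) | 0.551 (2) | 0.503 (3) | 0.887 (6) | 0.346 (4) | 0.348 (4) | 0.603 (1) | 0.353 (6) | 0.709 (1) | 0.600 (5) | 0.457 (5) | 0.901 (1) | 0.786 (3) | 0.599 (3)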
Abhinav Valada, Rohit Mohan, Wolfram Burgard: Self-Supervised Model Adaptation for Multimodal Semantic Segmentation. International Journal of Computer Vision, 2019
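FuseNet | permissive | 0.535 (5) | 0.570 (5) | 0.681 (6) | 0.182 (10) | 0.512 (4) | 0.290 (7) | 0.431 (5) | 0.659 (4) | 0.504 (6) | 0.495 (5) | 0.903 (5) | 0.308 (5) | 0.428 (2) | 0.523 (6) | 0.365 (5) | 0.676 (2) | 0.621 (4) | 0.470 (4) | 0.762 (7) | 0.779 (4) | 0.541 (5)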
Caner Hazirbas, Lingni Ma, Csaba Domokos, Daniel Cremers: FuseNet: Incorporating Depth into Semantic Segmentation via Fusion-based CNN Architecture. ACCV 2016
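3DMV (2d proj) | | 0.498 (7) | 0.481 (8) | 0.612 (7) | 0.579 (4) | 0.456 (7) | 0.343 (1) | 0.384 (6) | 0.623 (6) | 0.525 (5) | 0.381 (7) | 0.845 (7) | 0.254 (7) | 0.264 (7) | 0.557 (4) | 0.182 (8) | 0.581 (8) | 0.598 (6) | 0.429 (7) | 0.760 (8) | 0.661 (9) | 0.446 (8)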
Angela Dai, Matthias Niessner: 3DMV: Joint 3D-Multi-View Prediction for 3D Semantic Scene Segmentation. ECCV'18
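ILC-PSPNet | | 0.475 (8) | 0.490 (7) | 0.581 (8) | 0.289 (9) | 0.507 (6) | 0.067 (10) | 0.379 (7) | 0.610 (7) | 0.417 (9) | 0.435 (6) | 0.822 (9) | 0.278 (6) | 0.267 (6) | 0.503 (7) | 0.228 (7) | 0.616 (7) | 0.533 (8) | 0.375 (8) | 0.820 (4) | 0.729 (6) | 0.560 (4)
AdapNet++ | copyleft | 0.503 (6) | 0.613 (4) | 0.722 (4) | 0.418 (8) | 0.358 (10) | 0.337 (2) | 0.370 (8) | 0.479 (8) | 0.443 (7) | 0.368 (8) | 0.907 (3) | 0.207 (8) | 0.213 (9) | 0.464 (8) | 0.525 (1) | 0.618 (6) | 0.657 (1) | 0.450 (6) | 0.788 (6) | 0.721 (7) | 0.408 (9)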
Abhinav Valada, Rohit Mohan, Wolfram Burgard: Self-Supervised Model Adaptation for Multimodal Semantic Segmentation. International Journal of Computer Vision, 2019
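ScanNet (2d proj) | permissive | 0.330 (10) | 0.293 (9) | 0.521 (9) | 0.657 (3) | 0.361 (9) | 0.161 (9) | 0.250 (9) | 0.004 (10) | 0.440 (8) | 0.183 (10) | 0.836 (8) | 0.125 (10) | 0.060 (10) | 0.319 (10) | 0.132 (9) | 0.417 (9) | 0.412 (9) | 0.344 (9) | 0.541 (10) | 0.427 (10) | 0.109 (10)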
Angela Dai, Angel X. Chang, Manolis Savva, Maciej Halber, Thomas Funkhouser, Matthias Nießner: ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes. CVPR'17
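Enet (reimpl) | | 0.376 (9) | 0.264 (10) | 0.452 (10) | 0.452 (6) | 0.365 (8) | 0.181 (8) | 0.143 (10) | 0.456 (9) | 0.409 (10) | 0.346 (9) | 0.769 (10) | 0.164 (9) | 0.218 (8) | 0.359 (9) | 0.123 (10) | 0.403 (10) | 0.381 (10) | 0.313 (10) | 0.571 (9) | 0.685 (8) | 0.472 (7)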
Re-implementation of Adam Paszke, Abhishek Chaurasia, Sangpil Kim, Eugenio Culurciello: ENet: A Deep Neural Network Architecture for Real-Time Semantic Segmentation.

This table lists the benchmark results for the 2D semantic instance scenario. Each cell gives a score followed by the method's rank in that column; rows are sorted by the counter column.

Method | Info | avg ap | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | window
MaskRCNN_ScanNet | permissive | 0.119 (1) | 0.129 (1) | 0.212 (1) | 0.002 (1) | 0.112 (1) | 0.148 (1) | 0.014 (1) | 0.205 (1) | 0.044 (1) | 0.066 (1) | 0.078 (1) | 0.095 (1) | 0.142 (1) | 0.030 (1) | 0.128 (1) | 0.139 (1) | 0.080 (1) | 0.459 (1) | 0.057 (1)
Re-implementation of Kaiming He, Georgia Gkioxari, Piotr Dollár, Ross Girshick: Mask R-CNN. ICCV'17

This table lists the benchmark results for the scene type classification scenario. Each cell gives a score followed by the method's rank in that column; rows are sorted by the avg recall column.

Method | Info | avg recall | apartment | bathroom | bedroom / hotel | bookstore / library | conference room | copy/mail room | hallway | kitchen | laundry room | living room / lounge | misc | office | storage / basement / garage
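SSCN | | 0.534 (1) | 0.500 (1) | 0.938 (1) | 0.824 (2) | 1.000 (1) | 0.500 (1) | 0.000 (2) | 1.000 (1) | 0.857 (1) | 0.000 (2) | 0.444 (3) | 0.000 (2) | 0.875 (1) | 0.000 (1)
SE-ResNeXt-SSMA | | 0.498 (2) | 0.000 (3) | 0.812 (2) | 0.941 (1) | 0.500 (2) | 0.500 (1) | 0.500 (1) | 0.500 (2) | 0.429 (3) | 0.500 (1) | 0.667 (1) | 0.500 (1) | 0.625 (2) | 0.000 (1)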
Abhinav Valada, Rohit Mohan, Wolfram Burgard: Self-Supervised Model Adaptation for Multimodal Semantic Segmentation. arXiv
resnet50_scannet | | 0.353 (3) | 0.250 (2) | 0.812 (2) | 0.529 (3) | 0.500 (2) | 0.500 (1) | 0.000 (2) | 0.500 (2) | 0.571 (2) | 0.000 (2) | 0.556 (2) | 0.000 (2) | 0.375 (3) | 0.000 (1)
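
The avg recall column is consistent with a macro average, in which each of the 13 scene types counts equally regardless of how many scenes it contains. The table does not state this; it is inferred from the numbers. The sketch below recomputes the average for all three methods, and the names `recall_by_method` and `macro_recall` are illustrative.

```python
from statistics import mean

# Per-class recall copied from the table above, in header order
# (apartment ... storage / basement / garage).
recall_by_method = {
    "SSCN": [0.500, 0.938, 0.824, 1.000, 0.500, 0.000, 1.000,
             0.857, 0.000, 0.444, 0.000, 0.875, 0.000],
    "SE-ResNeXt-SSMA": [0.000, 0.812, 0.941, 0.500, 0.500, 0.500, 0.500,
                        0.429, 0.500, 0.667, 0.500, 0.625, 0.000],
    "resnet50_scannet": [0.250, 0.812, 0.529, 0.500, 0.500, 0.000, 0.500,
                         0.571, 0.000, 0.556, 0.000, 0.375, 0.000],
}

# Assumption: macro average, i.e. an unweighted mean over scene types.
macro_recall = {
    name: round(mean(values), 3) for name, values in recall_by_method.items()
}

for name, value in macro_recall.items():
    print(f"{name}: {value:.3f}")
```

Under this reading, SSCN leads on average even though four of its classes have zero recall, because it scores highly on most of the remaining ones.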