Presenting the ScanNet200 Benchmark

We present the ScanNet200 benchmark, which studies an order of magnitude more class categories than the previous version of ScanNet. The scene geometry is shared between the two tasks, but the parsing of the surface annotations allows for a larger vocabulary and a more realistic setting for in-the-wild 3D understanding methods.

The ScanNet200 benchmark includes both finer-grained categories and a large number of previously unaddressed classes. This makes for a much more challenging setting, one that captures the diversity of semantic classes naturally observed in the raw ScanNet RGB-D data, including the class imbalances that occur in practice. The difference in category frequencies between ScanNet and ScanNet200 can be seen in the figure above.
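To make the class imbalance concrete, the sketch below ranks categories by how often they occur and cuts the ranking into three groups, in the spirit of the head, common and tail columns reported in the tables. It assumes the groups are defined by frequency; the equal-size cut, the function name `split_by_frequency` and the toy label list are illustrative assumptions, not the benchmark's actual split.

```python
from collections import Counter


def split_by_frequency(labels, n_groups=3):
    """Rank classes from most to least frequent and cut the ranking into n_groups parts."""
    counts = Counter(labels)
    ranked = [name for name, _ in counts.most_common()]
    n = len(ranked)
    # Integer boundaries so that exactly n_groups slices are always returned.
    bounds = [i * n // n_groups for i in range(n_groups + 1)]
    return [ranked[bounds[i]:bounds[i + 1]] for i in range(n_groups)]


# Toy per-point labels with a long-tailed distribution (made-up counts).
labels = (["wall"] * 50 + ["chair"] * 20 + ["table"] * 10
          + ["lamp"] * 4 + ["guitar"] * 2 + ["dumbbell"] * 1)

head, common, tail = split_by_frequency(labels)
print(head, common, tail)
```

With real data, `labels` would hold one category name per annotated surface point or per instance, depending on how frequency is measured.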

ScanNet200 Benchmark

This table lists the benchmark results for the ScanNet200 3D semantic label scenario. Each entry gives a method's score followed by its rank in that column.




Method | Info | avg iou | head iou | common iou | tail iou | alarm clock | armchair | backpack | bag | ball | bar | basket | bathroom cabinet | bathroom counter | bathroom stall | bathroom stall door | bathroom vanity | bathtub | bed | bench | bicycle | bin | blackboard | blanket | blinds | board | book | bookshelf | bottle | bowl | box | broom | bucket | bulletin board | cabinet | calendar | candle | cart | case of water bottles | cd case | ceiling | ceiling light | chair | clock | closet | closet door | closet rod | closet wall | clothes | clothes dryer | coat rack | coffee kettle | coffee maker | coffee table | column | computer tower | container | copier | couch | counter | crate | cup | curtain | cushion | decoration | desk | dining table | dish rack | dishwasher | divider | door | doorframe | dresser | dumbbell | dustpan | end table | fan | file cabinet | fire alarm | fire extinguisher | fireplace | floor | folded chair | furniture | guitar | guitar case | hair dryer | handicap bar | hat | headphones | ironing board | jacket | keyboard | keyboard piano | kitchen cabinet | kitchen counter | ladder | lamp | laptop | laundry basket | laundry detergent | laundry hamper | ledge | light | light switch | luggage | machine | mailbox | mat | mattress | microwave | mini fridge | mirror | monitor | mouse | music stand | nightstand | object | office chair | ottoman | oven | paper | paper bag | paper cutter | paper towel dispenser | paper towel roll | person | piano | picture | pillar | pillow | pipe | plant | plate | plunger | poster | potted plant | power outlet | power strip | printer | projector | projector screen | purse | rack | radiator | rail | range hood | recycling bin | refrigerator | scale | seat | shelf | shoe | shower | shower curtain | shower curtain rod | shower door | shower floor | shower head | shower wall | sign | sink | soap dish | soap dispenser | sofa chair | speaker | stair rail | stairs | stand | stool | storage bin | storage container | storage organizer | stove | structure | stuffed animal | suitcase | table | telephone | tissue box | toaster | toaster oven | toilet | toilet paper | toilet paper dispenser | toilet paper holder | toilet seat cover dispenser | towel | trash bin | trash can | tray | tube | tv | tv stand | vacuum cleaner | vent | wall | wardrobe | washing machine | water bottle | water cooler | water pitcher | whiteboard | window | windowsill
ALS-MinkowskiNetcopyleft0.414 20.610 20.322 30.271 20.542 20.153 30.159 110.000 30.000 70.000 10.404 40.503 50.532 60.672 160.804 50.285 10.888 20.000 30.900 20.226 20.087 20.598 40.342 50.671 10.217 100.087 30.449 40.000 10.000 30.253 30.477 61.000 10.000 10.118 50.000 30.905 10.071 130.710 20.076 20.047 160.665 10.376 80.981 10.000 10.000 20.466 70.632 70.113 40.769 10.956 40.795 20.031 90.314 10.936 10.000 10.390 20.601 30.000 70.458 80.366 20.719 30.440 50.564 10.699 40.314 10.464 70.784 20.200 10.283 60.973 10.142 90.000 10.250 70.285 60.220 70.718 10.752 60.723 20.460 10.248 150.475 100.463 130.000 40.000 10.446 80.021 50.025 110.285 10.000 40.972 10.149 80.769 10.230 30.535 10.879 20.252 80.000 10.693 10.129 20.000 140.000 40.000 10.447 10.958 10.662 90.159 20.598 30.780 110.344 20.646 30.106 60.893 30.135 30.455 30.000 10.194 30.259 10.726 30.475 40.000 90.000 10.741 10.865 10.571 20.817 30.445 30.000 10.506 20.630 30.230 120.916 20.728 10.635 11.000 10.252 60.000 10.804 20.697 70.137 110.043 70.717 20.807 30.000 10.510 130.245 20.000 70.000 10.709 30.000 20.000 10.703 20.572 40.646 20.223 100.531 50.984 10.397 30.813 10.798 10.135 120.800 10.000 10.097 20.832 20.752 80.842 70.000 10.852 10.149 90.846 100.000 10.666 50.359 50.252 80.777 10.690 2
Guangda Ji, Silvan Weder, Francis Engelmann, Marc Pollefeys, Hermann Blum: ARKit LabelMaker: A New Scale for Indoor 3D Scene Understanding. arXiv
DITR0.449 10.629 10.392 10.289 10.650 10.168 20.862 10.000 30.313 30.000 10.580 10.568 20.564 30.766 70.867 10.238 50.949 10.000 30.866 30.300 10.000 90.664 10.482 10.508 120.317 10.420 10.551 20.000 10.000 30.486 20.519 10.662 40.000 10.385 10.000 30.901 30.079 90.727 10.000 70.160 30.606 30.417 40.967 20.000 10.000 20.498 50.596 110.130 20.728 30.998 10.805 10.000 170.314 10.934 20.000 10.278 40.636 10.000 70.403 120.367 10.741 20.484 10.500 21.000 10.113 120.828 10.815 10.000 70.733 20.969 40.374 20.000 10.579 11.000 10.230 50.617 50.983 10.729 10.423 40.855 10.508 60.622 20.018 30.000 10.591 30.034 40.028 100.066 110.869 10.904 70.334 20.651 50.716 10.514 20.871 60.315 30.000 10.664 30.128 30.014 100.000 40.000 10.392 20.851 20.817 10.153 30.823 10.991 10.318 30.680 10.134 30.913 10.157 20.448 40.000 10.000 80.000 30.826 10.978 10.091 60.000 10.660 40.647 30.571 20.804 40.001 90.000 10.480 30.700 10.421 50.947 10.433 140.411 30.148 60.262 50.000 10.849 10.709 60.138 100.150 20.714 30.889 10.000 10.698 10.222 40.000 70.000 10.720 20.000 20.000 10.805 10.600 10.642 30.268 90.904 10.982 20.477 10.632 60.718 20.139 90.776 20.000 10.178 10.886 10.962 10.839 80.000 10.851 20.043 120.869 40.000 10.710 10.315 60.348 30.753 20.397 8
Karim Abou Zeid, Kadir Yilmaz, Daan de Geus, Alexander Hermans, David Adrian, Timm Linder, Bastian Leibe: DINO in the Room: Leveraging 2D Foundation Models for 3D Segmentation.
PTv3 ScanNet2000.393 30.592 30.330 20.216 30.520 30.109 50.108 160.000 30.337 10.000 10.310 120.394 90.494 110.753 90.848 20.256 30.717 80.000 30.842 40.192 50.065 30.449 100.346 40.546 60.190 130.000 90.384 70.000 10.000 30.218 40.505 20.791 30.000 10.136 40.000 30.903 20.073 120.687 60.000 70.168 20.551 50.387 70.941 30.000 10.000 20.397 120.654 30.000 100.714 50.759 150.752 70.118 40.264 40.926 30.000 10.048 60.575 50.000 70.597 20.366 20.755 10.469 20.474 30.798 20.140 100.617 30.692 70.000 70.592 40.971 20.188 40.000 10.133 90.593 20.349 10.650 30.717 80.699 30.455 20.790 20.523 40.636 10.301 10.000 10.622 20.000 110.017 150.259 30.000 40.921 30.337 10.733 20.210 40.514 20.860 80.407 10.000 10.688 20.109 80.000 140.000 40.000 10.151 50.671 80.782 20.115 130.641 20.903 20.349 10.616 40.088 70.832 80.000 60.480 20.000 10.428 10.000 30.497 100.000 50.000 90.000 10.662 30.690 20.612 10.828 10.575 10.000 10.404 70.644 20.325 70.887 40.728 10.009 160.134 70.026 170.000 10.761 30.731 40.172 60.077 40.528 80.727 70.000 10.603 50.220 50.022 30.000 10.740 10.000 20.000 10.661 40.586 20.566 40.436 40.531 50.978 30.457 20.708 30.583 60.141 70.748 30.000 10.026 50.822 30.871 40.879 50.000 10.851 20.405 20.914 10.000 10.682 30.000 150.281 40.738 30.463 6
Xiaoyang Wu, Li Jiang, Peng-Shuai Wang, Zhijian Liu, Xihui Liu, Yu Qiao, Wanli Ouyang, Tong He, Hengshuang Zhao: Point Transformer V3: Simpler, Faster, Stronger. CVPR 2024 (Oral)
ODIN - Sem200permissive0.368 40.562 40.297 40.207 40.380 170.196 10.828 20.000 30.321 20.000 10.400 50.775 10.460 130.501 170.769 120.065 150.870 30.000 30.913 10.213 30.000 90.000 170.389 20.554 40.312 30.000 90.591 10.000 10.000 30.491 10.487 30.894 20.000 10.378 20.303 10.796 170.088 60.669 130.081 10.216 10.256 170.334 130.898 70.000 10.000 20.370 140.599 100.000 100.581 160.988 20.749 80.090 60.242 50.921 40.000 10.202 50.609 20.000 70.655 10.214 130.654 90.346 150.408 70.485 90.169 80.631 20.704 60.000 70.814 10.940 100.127 160.000 10.000 120.462 40.227 60.641 40.885 30.657 50.434 30.000 170.550 20.393 150.000 40.000 10.590 40.000 110.048 20.077 90.000 40.784 160.131 100.557 100.316 20.359 80.833 140.373 20.000 10.661 40.108 90.001 120.000 40.000 10.301 30.612 110.565 150.129 100.482 80.468 160.274 50.561 80.376 10.912 20.181 10.440 60.000 10.166 40.000 30.641 50.000 50.426 20.000 10.642 50.626 70.259 110.787 80.429 40.000 10.589 10.523 80.246 110.857 60.000 170.228 90.000 110.265 40.000 10.752 60.832 10.090 160.157 10.791 10.578 160.000 10.373 150.539 10.000 70.000 10.685 50.000 20.000 10.632 80.575 30.663 10.152 110.358 90.926 130.397 30.454 150.610 40.119 150.685 70.000 10.000 120.803 80.740 90.441 140.000 10.800 100.000 170.871 30.000 10.220 170.487 10.862 10.682 60.054 17
Ayush Jain, Pushkal Katara, Nikolaos Gkanatsios, Adam W. Harley, Gabriel Sarch, Kriti Aggarwal, Vishrav Chaudhary, Katerina Fragkiadaki: ODIN: A Single Model for 2D and 3D Segmentation. CVPR 2024
BFANet ScanNet200permissive0.360 50.553 70.293 50.193 50.483 100.096 60.266 60.000 30.000 70.000 10.298 130.255 120.661 10.810 50.810 30.194 100.785 70.000 30.000 170.161 60.000 90.494 90.382 30.574 30.258 50.000 90.372 90.000 10.000 30.043 140.436 80.000 110.000 10.239 30.000 30.901 30.105 10.689 40.025 40.128 40.614 20.436 10.493 170.000 10.000 20.526 40.546 130.109 50.651 140.953 50.753 60.101 50.143 130.897 50.000 10.431 10.469 150.000 70.522 60.337 50.661 60.459 30.409 60.666 50.102 140.508 60.757 40.000 70.060 140.970 30.497 10.000 10.376 30.511 30.262 40.688 20.921 20.617 100.321 120.590 60.491 90.556 40.000 40.000 10.481 50.093 10.043 30.284 20.000 40.875 140.135 90.669 40.124 130.394 60.849 110.298 40.000 10.476 170.088 130.042 70.000 40.000 10.254 40.653 100.741 60.215 10.573 50.852 50.266 100.654 20.056 120.835 60.000 60.492 10.000 10.000 80.000 30.612 90.000 50.000 90.000 10.616 60.469 170.460 50.698 140.516 20.000 10.378 80.563 40.476 40.863 50.574 90.330 60.000 110.282 30.000 10.760 40.710 50.233 10.000 100.641 50.814 20.000 10.585 100.053 110.000 70.000 10.629 100.000 20.000 10.678 30.528 130.534 50.129 140.596 40.973 40.264 120.772 20.526 100.139 90.707 40.000 10.000 120.764 140.591 160.848 60.000 10.827 40.338 30.806 120.000 10.568 90.151 100.358 20.659 100.510 4
Weiguang Zhao, Rui Zhang, Qiufeng Wang, Guangliang Cheng, Kaizhu Huang: BFANet: Revisiting 3D Semantic Segmentation with Boundary Feature Analysis. CVPR 2025
OctFormer ScanNet200permissive0.326 130.539 100.265 100.131 120.499 60.110 40.522 30.000 30.000 70.000 10.318 110.427 70.455 150.743 110.765 130.175 110.842 40.000 30.828 50.204 40.033 60.429 110.335 60.601 20.312 30.000 90.357 100.000 10.000 30.047 110.423 90.000 110.000 10.105 90.000 30.873 90.079 90.670 120.000 70.117 50.471 130.432 30.829 110.000 10.000 20.584 20.417 170.089 60.684 90.837 120.705 160.021 120.178 110.892 60.000 10.028 80.505 130.000 70.457 90.200 140.662 40.412 90.244 150.496 80.000 170.451 80.626 90.000 70.102 110.943 90.138 130.000 10.000 120.149 80.291 30.534 90.722 70.632 70.331 100.253 140.453 110.487 110.000 40.000 10.479 60.000 110.022 130.000 120.000 40.900 100.128 110.684 30.164 100.413 40.854 100.000 120.000 10.512 160.074 150.003 110.000 40.000 10.000 110.469 150.613 120.132 80.529 70.871 30.227 160.582 70.026 170.787 120.000 60.339 150.000 10.000 80.000 30.626 70.000 50.029 80.000 10.587 90.612 80.411 70.724 100.000 100.000 10.407 60.552 50.513 30.849 100.655 40.408 40.000 110.296 20.000 10.686 150.645 140.145 80.022 80.414 140.633 110.000 10.637 20.224 30.000 70.000 10.650 80.000 20.000 10.622 90.535 120.343 120.483 30.230 130.943 100.289 100.618 70.596 50.140 80.679 80.000 10.022 60.783 110.620 120.906 10.000 10.806 80.137 100.865 50.000 10.378 120.000 150.168 150.680 80.227 13
Peng-Shuai Wang: OctFormer: Octree-based Transformers for 3D Point Clouds. SIGGRAPH 2023
L3DETR-ScanNet_2000.336 80.533 110.279 60.155 100.508 50.073 110.101 170.000 30.058 60.000 10.294 140.233 140.548 40.927 10.788 100.264 20.463 110.000 30.638 120.098 130.014 70.411 120.226 130.525 100.225 90.010 70.397 60.000 10.000 30.192 60.380 140.598 60.000 10.117 60.000 30.883 60.082 80.689 40.000 70.032 170.549 60.417 40.910 50.000 10.000 20.448 80.613 90.000 100.697 70.960 30.759 40.158 20.293 30.883 70.000 10.312 30.583 40.079 40.422 110.068 170.660 70.418 70.298 120.430 120.114 110.526 50.776 30.051 30.679 30.946 60.152 70.000 10.183 80.000 150.211 80.511 100.409 160.565 120.355 80.448 80.512 50.557 30.000 40.000 10.420 90.000 110.007 170.104 60.000 40.125 170.330 30.514 150.146 120.321 130.860 80.174 110.000 10.629 60.075 140.000 140.000 40.000 10.002 100.671 80.712 70.141 60.339 120.856 40.261 120.529 100.067 100.835 60.000 60.369 120.000 10.259 20.000 30.629 60.000 50.487 10.000 10.579 110.646 40.107 170.720 110.122 70.000 10.333 140.505 100.303 90.908 30.503 130.565 20.074 80.324 10.000 10.740 80.661 110.109 130.000 100.427 130.563 170.000 10.579 110.108 80.000 70.000 10.664 60.000 20.000 10.641 70.539 110.416 70.515 20.256 110.940 120.312 60.209 170.620 30.138 110.636 110.000 10.000 120.775 130.861 50.765 120.000 10.801 90.119 110.860 80.000 10.687 20.001 140.192 140.679 90.699 1
Yanmin Wu, Qiankun Gao, Renrui Zhang, Jian Zhang: Language-Assisted 3D Scene Understanding. arXiv 23.12
PPT-SpUNet-F.T.0.332 120.556 60.270 70.123 140.519 40.091 70.349 40.000 30.000 70.000 10.339 90.383 100.498 100.833 40.807 40.241 40.584 90.000 30.755 70.124 80.000 90.608 30.330 80.530 90.314 20.000 90.374 80.000 10.000 30.197 50.459 70.000 110.000 10.117 60.000 30.876 70.095 20.682 90.000 70.086 80.518 70.433 20.930 40.000 10.000 20.563 30.542 140.077 70.715 40.858 110.756 50.008 160.171 120.874 80.000 10.039 70.550 110.000 70.545 50.256 80.657 80.453 40.351 100.449 110.213 60.392 120.611 110.000 70.037 150.946 60.138 130.000 10.000 120.063 110.308 20.537 80.796 50.673 40.323 110.392 100.400 140.509 70.000 40.000 10.649 10.000 110.023 120.000 120.000 40.914 60.002 160.506 160.163 110.359 80.872 50.000 120.000 10.623 70.112 60.001 120.000 40.000 10.021 90.753 50.565 150.150 40.579 40.806 90.267 90.616 40.042 140.783 130.000 60.374 110.000 10.000 80.000 30.620 80.000 50.000 90.000 10.572 130.634 50.350 90.792 50.000 100.000 10.376 90.535 60.378 60.855 70.672 30.074 130.000 110.185 100.000 10.727 120.660 120.076 170.000 100.432 120.646 100.000 10.594 80.006 130.000 70.000 10.658 70.000 20.000 10.661 40.549 100.300 140.291 80.045 140.942 110.304 80.600 80.572 70.135 120.695 50.000 10.008 90.793 90.942 20.899 20.000 10.816 60.181 70.897 20.000 10.679 40.223 80.264 50.691 50.345 12
Xiaoyang Wu, Zhuotao Tian, Xin Wen, Bohao Peng, Xihui Liu, Kaicheng Yu, Hengshuang Zhao: Towards Large-scale 3D Representation Learning with Multi-dataset Point Prompt Training. CVPR 2024
GSTran0.334 100.533 120.250 120.179 80.487 80.041 160.139 130.003 10.273 50.000 10.273 160.189 160.465 120.704 140.794 80.198 80.831 50.000 30.712 80.055 160.063 50.518 60.306 90.459 160.217 100.028 40.282 140.000 10.000 30.044 120.405 110.558 80.000 10.080 120.000 30.873 90.020 160.684 70.000 70.075 120.496 90.363 90.651 150.000 10.000 20.425 90.648 50.000 100.669 110.914 60.741 100.009 150.200 90.864 90.000 10.000 110.560 90.000 70.357 140.233 120.633 110.408 110.411 40.320 160.242 50.440 90.598 140.047 40.205 80.940 100.139 110.000 10.372 40.138 90.191 100.495 110.618 130.624 90.321 120.595 40.496 70.499 80.000 40.000 10.340 120.014 60.032 70.136 40.000 40.903 80.279 50.601 90.198 50.345 110.849 110.260 60.000 10.573 140.072 170.060 50.000 40.000 10.089 70.838 40.775 40.125 110.381 110.752 130.274 50.517 130.032 150.841 50.000 60.354 140.000 10.047 60.000 30.439 130.787 30.252 40.000 10.512 160.507 160.158 160.717 120.000 100.000 10.337 120.483 130.570 10.853 80.614 70.121 110.070 90.229 70.000 10.732 110.773 20.193 30.000 100.606 70.791 50.000 10.593 90.000 140.010 50.000 10.574 160.000 20.000 10.507 120.554 90.361 110.136 130.608 30.948 70.304 80.593 100.533 80.011 160.634 120.000 10.060 30.821 40.613 130.797 100.000 10.799 110.036 130.782 140.000 10.609 70.423 30.133 170.647 120.213 15
IMFSegNet0.334 90.532 130.251 110.179 70.486 90.041 160.139 130.003 10.283 40.000 10.274 150.191 150.457 140.704 140.795 70.197 90.830 60.000 30.710 90.055 160.064 40.518 60.305 100.458 170.216 120.027 50.284 130.000 10.000 30.044 120.406 100.561 70.000 10.080 120.000 30.873 90.021 150.683 80.000 70.076 90.494 100.363 90.648 160.000 10.000 20.425 90.649 40.000 100.668 120.908 70.740 110.010 140.206 80.862 100.000 10.000 110.560 90.000 70.359 130.237 110.631 120.408 110.411 40.322 150.246 40.439 100.599 130.047 40.213 70.940 100.139 110.000 10.369 50.124 100.188 120.495 110.624 110.626 80.320 140.595 40.495 80.496 100.000 40.000 10.340 120.014 60.032 70.135 50.000 40.903 80.277 60.612 80.196 70.344 120.848 130.260 60.000 10.574 130.073 160.062 40.000 40.000 10.091 60.839 30.776 30.123 120.392 90.756 120.274 50.518 120.029 160.842 40.000 60.357 130.000 10.035 70.000 30.444 120.793 20.245 50.000 10.512 160.512 150.159 150.713 130.000 100.000 10.336 130.484 120.569 20.852 90.615 60.120 120.068 100.228 80.000 10.733 100.773 20.190 40.000 100.608 60.792 40.000 10.597 70.000 140.025 20.000 10.573 170.000 20.000 10.508 110.555 80.363 100.139 120.610 20.947 80.305 70.594 90.527 90.009 170.633 130.000 10.060 30.820 50.604 150.799 90.000 10.799 110.034 140.784 130.000 10.618 60.424 20.134 160.646 130.214 14
OA-CNN-L_ScanNet2000.333 110.558 50.269 90.124 130.448 140.080 90.272 50.000 30.000 70.000 10.342 80.515 40.524 70.713 130.789 90.158 120.384 120.000 30.806 60.125 70.000 90.496 80.332 70.498 140.227 80.024 60.474 30.000 10.003 20.071 90.487 30.000 110.000 10.110 80.000 30.876 70.013 170.703 30.000 70.076 90.473 120.355 110.906 60.000 10.000 20.476 60.706 10.000 100.672 100.835 130.748 90.015 130.223 70.860 110.000 10.000 110.572 70.000 70.509 70.313 70.662 40.398 130.396 80.411 130.276 20.527 40.711 50.000 70.076 130.946 60.166 60.000 10.022 100.160 70.183 130.493 130.699 90.637 60.403 60.330 120.406 130.526 60.024 20.000 10.392 110.000 110.016 160.000 120.196 30.915 50.112 120.557 100.197 60.352 100.877 30.000 120.000 10.592 120.103 110.000 140.067 10.000 10.089 70.735 70.625 110.130 90.568 60.836 70.271 80.534 90.043 130.799 110.001 50.445 50.000 10.000 80.024 20.661 40.000 50.262 30.000 10.591 80.517 130.373 80.788 70.021 80.000 10.455 40.517 90.320 80.823 120.200 160.001 170.150 50.100 120.000 10.736 90.668 100.103 140.052 60.662 40.720 80.000 10.602 60.112 70.002 60.000 10.637 90.000 20.000 10.621 100.569 50.398 90.412 50.234 120.949 60.363 50.492 140.495 110.251 40.665 90.000 10.001 110.805 70.833 60.794 110.000 10.821 50.314 50.843 110.000 10.560 100.245 70.262 60.713 40.370 11
CeCo0.340 70.551 90.247 130.181 60.475 120.057 150.142 120.000 30.000 70.000 10.387 60.463 60.499 90.924 20.774 110.213 60.257 130.000 30.546 150.100 110.006 80.615 20.177 170.534 70.246 60.000 90.400 50.000 10.338 10.006 160.484 50.609 50.000 10.083 110.000 30.873 90.089 50.661 140.000 70.048 150.560 40.408 60.892 80.000 10.000 20.586 10.616 80.000 100.692 80.900 80.721 120.162 10.228 60.860 110.000 10.000 110.575 50.083 30.550 40.347 40.624 130.410 100.360 90.740 30.109 130.321 150.660 80.000 70.121 90.939 130.143 80.000 10.400 20.003 130.190 110.564 60.652 100.615 110.421 50.304 130.579 10.547 50.000 40.000 10.296 140.000 110.030 90.096 70.000 40.916 40.037 130.551 120.171 90.376 70.865 70.286 50.000 10.633 50.102 120.027 80.011 30.000 10.000 110.474 140.742 50.133 70.311 130.824 80.242 130.503 140.068 90.828 90.000 60.429 70.000 10.063 50.000 30.781 20.000 50.000 90.000 10.665 20.633 60.450 60.818 20.000 100.000 10.429 50.532 70.226 130.825 110.510 110.377 50.709 20.079 140.000 10.753 50.683 80.102 150.063 50.401 160.620 130.000 10.619 30.000 140.000 70.000 10.595 130.000 20.000 10.345 140.564 60.411 80.603 10.384 80.945 90.266 110.643 50.367 140.304 10.663 100.000 10.010 70.726 150.767 70.898 30.000 10.784 130.435 10.861 70.000 10.447 110.000 150.257 70.656 110.377 10
Zhisheng Zhong, Jiequan Cui, Yibo Yang, Xiaoyang Wu, Xiaojuan Qi, Xiangyu Zhang, Jiaya Jia: Understanding Imbalanced Semantic Segmentation Through Neural Collapse. CVPR 2023
PonderV2 ScanNet2000.346 60.552 80.270 80.175 90.497 70.070 120.239 70.000 30.000 70.000 10.232 170.412 80.584 20.842 30.804 50.212 70.540 100.000 30.433 160.106 100.000 90.590 50.290 120.548 50.243 70.000 90.356 110.000 10.000 30.062 100.398 130.441 100.000 10.104 100.000 30.888 50.076 110.682 90.030 30.094 70.491 110.351 120.869 100.000 10.063 10.403 110.700 20.000 100.660 130.881 90.761 30.050 80.186 100.852 130.000 10.007 90.570 80.100 20.565 30.326 60.641 100.431 60.290 140.621 60.259 30.408 110.622 100.125 20.082 120.950 50.179 50.000 10.263 60.424 50.193 90.558 70.880 40.545 130.375 70.727 30.445 120.499 80.000 40.000 10.475 70.002 90.034 60.083 80.000 40.924 20.290 40.636 60.115 140.400 50.874 40.186 100.000 10.611 80.128 30.113 20.000 40.000 10.000 110.584 120.636 100.103 140.385 100.843 60.283 40.603 60.080 80.825 100.000 60.377 100.000 10.000 80.000 30.457 110.000 50.000 90.000 10.574 120.608 90.481 40.792 50.394 50.000 10.357 100.503 110.261 100.817 130.504 120.304 70.472 40.115 110.000 10.750 70.677 90.202 20.000 100.509 90.729 60.000 10.519 120.000 140.000 70.000 10.620 120.000 20.000 10.660 60.560 70.486 60.384 60.346 100.952 50.247 140.667 40.436 120.269 30.691 60.000 10.010 70.787 100.889 30.880 40.000 10.810 70.336 40.860 80.000 10.606 80.009 110.248 90.681 70.392 9
Haoyi Zhu, Honghui Yang, Xiaoyang Wu, Di Huang, Sha Zhang, Xianglong He, Tong He, Hengshuang Zhao, Chunhua Shen, Yu Qiao, Wanli Ouyang: PonderV2: Pave the Way for 3D Foundation Model with A Universal Pre-training Paradigm.
Minkowski 34Dpermissive0.253 160.463 160.154 170.102 160.381 160.084 80.134 150.000 30.000 70.000 10.386 70.141 170.279 170.737 120.703 160.014 170.164 150.000 30.663 100.092 140.000 90.224 150.291 110.531 80.056 170.000 90.242 160.000 10.000 30.013 150.331 160.000 110.000 10.035 170.001 20.858 140.059 140.650 160.000 70.056 140.353 150.299 150.670 130.000 10.000 20.284 160.484 150.071 80.594 150.720 160.710 150.027 110.068 170.813 140.000 10.005 100.492 140.164 10.274 160.111 160.571 160.307 170.293 130.307 170.150 90.163 170.531 160.002 60.545 50.932 150.093 170.000 10.000 120.002 140.159 150.368 170.581 150.440 170.228 170.406 90.282 170.294 160.000 40.000 10.189 160.060 20.036 50.000 120.000 40.897 110.000 170.525 140.025 170.205 170.771 170.000 120.000 10.593 110.108 90.044 60.000 40.000 10.000 110.282 170.589 140.094 160.169 160.466 170.227 160.419 170.125 50.757 140.002 40.334 160.000 10.000 80.000 30.357 150.000 50.000 90.000 10.582 100.513 140.337 100.612 170.000 100.000 10.250 160.352 170.136 170.724 160.655 40.280 80.000 110.046 160.000 10.606 170.559 150.159 70.102 30.445 100.655 90.000 10.310 170.117 60.000 70.000 10.581 150.026 10.000 10.265 170.483 160.084 170.097 170.044 150.865 170.142 170.588 110.351 150.272 20.596 170.000 10.003 100.622 160.720 100.096 170.000 10.771 160.016 150.772 150.000 10.302 140.194 90.214 120.621 160.197 16
C. Choy, J. Gwak, S. Savarese: 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks. CVPR 2019
AWCS0.305 140.508 140.225 140.142 110.463 130.063 130.195 90.000 30.000 70.000 10.467 30.551 30.504 80.773 60.764 140.142 130.029 170.000 30.626 130.100 110.000 90.360 130.179 150.507 130.137 150.006 80.300 120.000 10.000 30.172 80.364 150.512 90.000 10.056 140.000 30.865 130.093 40.634 170.000 70.071 130.396 140.296 160.876 90.000 10.000 20.373 130.436 160.063 90.749 20.877 100.721 120.131 30.124 140.804 150.000 10.000 110.515 120.010 60.452 100.252 90.578 140.417 80.179 170.484 100.171 70.337 140.606 120.000 70.115 100.937 140.142 90.000 10.008 110.000 150.157 160.484 140.402 170.501 150.339 90.553 70.529 30.478 120.000 40.000 10.404 100.001 100.022 130.077 90.000 40.894 120.219 70.628 70.093 150.305 140.886 10.233 90.000 10.603 90.112 60.023 90.000 40.000 10.000 110.741 60.664 80.097 150.253 140.782 100.264 110.523 110.154 20.707 160.000 60.411 80.000 10.000 80.000 30.332 160.000 50.000 90.000 10.602 70.595 100.185 130.656 160.159 60.000 10.355 110.424 150.154 150.729 150.516 100.220 100.620 30.084 130.000 10.707 140.651 130.173 50.014 90.381 170.582 140.000 10.619 30.049 120.000 70.000 10.702 40.000 20.000 10.302 160.489 150.317 130.334 70.392 70.922 140.254 130.533 130.394 130.129 140.613 150.000 10.000 120.820 50.649 110.749 130.000 10.782 140.282 60.863 60.000 10.288 150.006 120.220 110.633 140.542 3
Long-Tailed 3D Semantic Segmentation with Adaptive Weight Constraint and Sampling. ICRA 2024
LGroundpermissive0.272 150.485 150.184 150.106 150.476 110.077 100.218 80.000 30.000 70.000 10.547 20.295 110.540 50.746 100.745 150.058 160.112 160.005 10.658 110.077 150.000 90.322 140.178 160.512 110.190 130.199 20.277 150.000 10.000 30.173 70.399 120.000 110.000 10.039 160.000 30.858 140.085 70.676 110.002 50.103 60.498 80.323 140.703 120.000 10.000 20.296 150.549 120.216 10.702 60.768 140.718 140.028 100.092 160.786 160.000 10.000 110.453 160.022 50.251 170.252 90.572 150.348 140.321 110.514 70.063 150.279 160.552 150.000 70.019 160.932 150.132 150.000 10.000 120.000 150.156 170.457 150.623 120.518 140.265 160.358 110.381 150.395 140.000 40.000 10.127 170.012 80.051 10.000 120.000 40.886 130.014 140.437 170.179 80.244 150.826 150.000 120.000 10.599 100.136 10.085 30.000 40.000 10.000 110.565 130.612 130.143 50.207 150.566 140.232 150.446 150.127 40.708 150.000 60.384 90.000 10.000 80.000 30.402 140.000 50.059 70.000 10.525 150.566 110.229 120.659 150.000 100.000 10.265 150.446 140.147 160.720 170.597 80.066 140.000 110.187 90.000 10.726 130.467 170.134 120.000 100.413 150.629 120.000 10.363 160.055 100.022 30.000 10.626 110.000 20.000 10.323 150.479 170.154 160.117 150.028 160.901 150.243 150.415 160.295 170.143 60.610 160.000 10.000 120.777 120.397 170.324 160.000 10.778 150.179 80.702 160.000 10.274 160.404 40.233 100.622 150.398 7
David Rozenberszki, Or Litany, Angela Dai: Language-Grounded Indoor 3D Semantic Segmentation in the Wild. arXiv
CSC-Pretrainpermissive0.249 170.455 170.171 160.079 170.418 150.059 140.186 100.000 30.000 70.000 10.335 100.250 130.316 160.766 70.697 170.142 130.170 140.003 20.553 140.112 90.097 10.201 160.186 140.476 150.081 160.000 90.216 170.000 10.000 30.001 170.314 170.000 110.000 10.055 150.000 30.832 160.094 30.659 150.002 50.076 90.310 160.293 170.664 140.000 10.000 20.175 170.634 60.130 20.552 170.686 170.700 170.076 70.110 150.770 170.000 10.000 110.430 170.000 70.319 150.166 150.542 170.327 160.205 160.332 140.052 160.375 130.444 170.000 70.012 170.930 170.203 30.000 10.000 120.046 120.175 140.413 160.592 140.471 160.299 150.152 160.340 160.247 170.000 40.000 10.225 150.058 30.037 40.000 120.207 20.862 150.014 140.548 130.033 160.233 160.816 160.000 120.000 10.542 150.123 50.121 10.019 20.000 10.000 110.463 160.454 170.045 170.128 170.557 150.235 140.441 160.063 110.484 170.000 60.308 170.000 10.000 80.000 30.318 170.000 50.000 90.000 10.545 140.543 120.164 140.734 90.000 100.000 10.215 170.371 160.198 140.743 140.205 150.062 150.000 110.079 140.000 10.683 160.547 160.142 90.000 100.441 110.579 150.000 10.464 140.098 90.041 10.000 10.590 140.000 20.000 10.373 130.494 140.174 150.105 160.001 170.895 160.222 160.537 120.307 160.180 50.625 140.000 10.000 120.591 170.609 140.398 150.000 10.766 170.014 160.638 170.000 10.377 130.004 130.206 130.609 170.465 5
Ji Hou, Benjamin Graham, Matthias Nießner, Saining Xie: Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts. CVPR 2021
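
The iou columns above can be read as intersection-over-union scores per class, with the avg, head, common and tail columns averaging over all classes or over one class group. The following is a minimal sketch of that computation, assuming the standard definition IoU = TP / (TP + FP + FN) over per-point labels. Skipping classes that never occur is an assumption made here for illustration, and the function names and toy labels are hypothetical rather than taken from the official evaluation script.

```python
def per_class_iou(pred, gt, num_classes):
    """Return IoU per class from per-point predicted and ground-truth class ids."""
    tp = [0] * num_classes
    fp = [0] * num_classes
    fn = [0] * num_classes
    for p, g in zip(pred, gt):
        if p == g:
            tp[g] += 1
        else:
            fp[p] += 1  # point wrongly assigned to class p
            fn[g] += 1  # point of class g that was missed
    ious = []
    for c in range(num_classes):
        denom = tp[c] + fp[c] + fn[c]
        # Assumption: a class absent from both pred and gt gets no score.
        ious.append(tp[c] / denom if denom else None)
    return ious


def mean_iou(ious, class_ids):
    """Average IoU over a subset of classes, ignoring classes without a score."""
    vals = [ious[c] for c in class_ids if ious[c] is not None]
    return sum(vals) / len(vals) if vals else float("nan")


# Toy example with three classes and six points.
gt = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
ious = per_class_iou(pred, gt, num_classes=3)
avg = mean_iou(ious, range(3))
print(ious, avg)
```

A group score such as head iou would then be `mean_iou(ious, head_ids)`, where `head_ids` is the list of class ids assigned to that group.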


This table lists the benchmark results for the ScanNet200 3D semantic instance scenario. Each entry gives a method's score followed by its rank in that column.




Method | Info | avg ap 25% | head ap 25% | common ap 25% | tail ap 25% | alarm clock | armchair | backpack | bag | ball | bar | basket | bathroom cabinet | bathroom counter | bathroom stall | bathroom stall door | bathroom vanity | bathtub | bed | bench | bicycle | bin | blackboard | blanket | blinds | board | book | bookshelf | bottle | bowl | box | broom | bucket | bulletin board | cabinet | calendar | candle | cart | case of water bottles | cd case | ceiling | ceiling light | chair | clock | closet | closet door | closet rod | closet wall | clothes | clothes dryer | coat rack | coffee kettle | coffee maker | coffee table | column | computer tower | container | copier | couch | counter | crate | cup | curtain | cushion | decoration | desk | dining table | dish rack | dishwasher | divider | door | doorframe | dresser | dumbbell | dustpan | end table | fan | file cabinet | fire alarm | fire extinguisher | fireplace | folded chair | furniture | guitar | guitar case | hair dryer | handicap bar | hat | headphones | ironing board | jacket | keyboard | keyboard piano | kitchen cabinet | kitchen counter | ladder | lamp | laptop | laundry basket | laundry detergent | laundry hamper | ledge | light | light switch | luggage | machine | mailbox | mat | mattress | microwave | mini fridge | mirror | monitor | mouse | music stand | nightstand | object | office chair | ottoman | oven | paper | paper bag | paper cutter | paper towel dispenser | paper towel roll | person | piano | picture | pillar | pillow | pipe | plant | plate | plunger | poster | potted plant | power outlet | power strip | printer | projector | projector screen | purse | rack | radiator | rail | range hood | recycling bin | refrigerator | scale | seat | shelf | shoe | shower | shower curtain | shower curtain rod | shower door | shower floor | shower head | shower wall | sign | sink | soap dish | soap dispenser | sofa chair | speaker | stair rail | stairs | stand | stool | storage bin | storage container | storage organizer | stove | structure | stuffed animal | suitcase | table | telephone | tissue box | toaster | toaster oven | toilet | toilet paper | toilet paper dispenser | toilet paper holder | toilet seat cover dispenser | towel | trash bin | trash can | tray | tube | tv | tv stand | vacuum cleaner | vent | wardrobe | washing machine | water bottle | water cooler | water pitcher | whiteboard | window | windowsill
ODIN - Ins200permissive0.451 10.637 20.407 10.277 10.583 50.116 10.500 10.000 10.125 10.000 10.599 20.823 20.407 40.667 60.941 30.542 31.000 10.000 31.000 10.162 30.000 20.028 50.357 20.695 30.550 10.000 10.475 10.000 10.000 20.714 10.626 11.000 10.000 10.500 10.125 10.749 20.080 20.742 60.528 10.078 30.500 20.334 10.667 10.333 10.000 10.278 60.723 50.250 40.859 41.000 10.826 60.108 30.221 10.763 10.000 30.250 10.742 30.500 30.750 10.400 30.855 10.769 10.701 10.469 40.203 10.406 20.870 20.000 20.963 10.200 30.000 10.000 30.500 10.370 10.886 11.000 10.782 20.504 30.429 40.494 10.337 30.000 10.000 10.600 10.000 40.215 30.226 20.000 10.944 20.200 30.887 10.750 10.874 10.877 30.438 10.000 10.867 30.089 30.003 30.500 10.000 20.333 11.000 10.742 20.125 10.671 10.417 40.616 50.637 10.238 10.873 10.528 10.494 50.000 10.250 30.000 20.688 10.000 11.000 10.000 10.872 10.833 20.275 10.779 51.000 10.000 30.441 10.577 10.167 21.000 10.500 50.777 30.000 20.778 20.000 30.910 20.800 20.232 40.019 30.717 10.833 50.000 30.638 10.284 10.000 30.000 20.778 10.000 10.000 10.597 10.699 30.850 10.333 30.250 30.944 50.571 10.677 30.795 10.264 40.852 20.000 10.000 20.824 11.000 10.668 30.000 10.000 40.667 30.000 10.333 50.333 20.760 10.679 30.404 2
TD3D Scannet200permissive0.379 30.603 30.306 30.190 30.635 20.073 30.500 10.000 10.000 20.000 10.495 40.735 30.275 61.000 10.979 20.590 20.000 50.021 20.000 40.146 40.000 20.356 20.173 60.795 10.226 30.000 10.173 30.000 10.000 20.226 30.390 30.000 30.000 10.250 20.000 20.706 30.061 40.885 10.093 30.186 20.259 50.200 20.667 10.000 30.000 10.667 20.825 10.250 40.834 51.000 10.958 10.553 10.111 40.748 20.220 20.051 30.866 20.792 10.390 60.045 60.800 30.302 60.517 20.533 30.113 30.427 10.843 30.000 20.458 20.600 10.000 10.101 20.000 20.259 20.717 30.500 30.615 30.520 20.526 20.457 20.270 50.000 10.000 10.400 30.088 20.294 20.181 30.000 11.000 10.400 10.710 60.103 40.477 60.905 20.061 30.000 10.906 20.102 20.232 10.125 30.000 20.003 30.792 41.000 10.000 30.102 40.125 50.559 60.523 40.075 30.715 20.000 30.424 60.000 10.396 20.250 10.638 20.000 10.000 30.000 10.622 60.833 20.221 20.970 10.250 30.038 10.260 30.415 20.125 31.000 11.000 10.857 20.000 20.908 10.012 10.869 40.836 10.635 10.111 10.625 21.000 10.020 20.510 20.003 40.009 21.000 10.778 10.000 10.000 10.370 40.755 10.288 30.333 30.274 21.000 10.557 20.731 20.456 30.433 30.769 60.000 10.000 20.621 51.000 10.458 50.000 10.196 20.817 10.000 10.472 10.222 40.205 60.689 20.274 4
Maksim Kolodiazhnyi, Anna Vorontsova, Anton Konushin, Danila Rukhovich: Top-Down Beats Bottom-Up in 3D Instance Segmentation. WACV 2024
Minkowski 34D Inst.permissive0.280 50.488 50.192 60.124 50.593 40.010 50.500 10.000 10.000 20.000 10.447 50.535 50.445 31.000 10.861 50.400 40.225 30.000 30.000 40.142 50.000 20.074 40.342 40.467 60.067 40.000 10.119 60.000 10.000 20.000 50.337 60.000 30.000 10.000 50.000 20.506 60.070 30.804 40.000 40.000 50.333 40.172 40.150 60.000 30.000 10.479 50.745 30.000 60.830 61.000 10.904 30.167 20.090 50.732 30.000 30.000 40.443 50.000 40.500 40.542 10.772 60.396 50.077 60.385 50.044 50.118 60.777 50.000 20.000 50.200 30.000 10.000 30.000 20.148 50.502 50.500 30.419 50.159 60.281 50.404 60.317 40.000 10.000 10.200 40.000 40.077 40.000 40.000 10.750 40.200 30.715 50.021 50.551 30.828 60.000 40.000 10.743 50.059 60.000 40.000 40.000 20.000 40.125 60.648 40.000 30.191 30.500 10.669 40.502 50.000 60.568 50.000 30.516 40.000 10.000 40.000 20.305 60.000 10.000 30.000 10.825 20.833 20.021 60.918 20.000 40.000 30.191 50.346 50.100 50.981 41.000 10.286 50.000 20.000 60.000 30.868 50.648 60.292 30.000 40.375 41.000 10.000 30.500 30.000 50.333 10.000 20.538 60.000 10.000 10.213 60.518 50.098 50.528 10.250 30.997 30.284 60.677 30.398 40.167 50.790 50.000 10.000 20.618 60.903 60.200 60.000 10.333 10.333 50.000 10.442 30.083 50.213 50.587 50.131 6
C. Choy, J. Gwak, S. Savarese: 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks. CVPR 2019
LGround Inst.permissive0.314 40.529 40.225 40.155 40.578 60.010 40.500 10.000 10.000 20.000 10.515 30.556 40.696 11.000 10.927 40.400 40.083 40.000 31.000 10.252 10.000 20.167 30.350 30.731 20.067 40.000 10.123 50.000 10.000 20.036 40.372 40.000 30.000 10.250 20.000 20.569 50.031 60.810 30.000 40.000 50.630 10.183 30.278 40.000 30.000 10.582 40.589 60.500 20.863 31.000 10.940 20.000 50.144 20.716 40.000 30.000 40.484 40.000 40.500 40.400 30.798 40.500 30.278 50.750 10.093 40.166 50.783 40.000 20.200 30.400 20.000 10.000 30.000 20.219 30.539 40.500 30.578 40.413 40.181 60.457 30.375 20.000 10.000 10.050 60.000 40.077 50.000 40.000 10.500 60.000 60.743 40.250 30.488 50.846 40.000 40.000 10.800 40.069 40.000 40.000 40.000 20.000 41.000 10.607 50.000 30.200 20.500 10.694 20.528 30.063 40.659 30.000 30.594 20.000 10.000 40.000 20.571 30.000 10.000 30.000 10.716 50.647 60.221 30.857 40.000 40.000 30.217 40.346 40.071 60.530 61.000 10.429 40.000 20.286 40.000 30.826 60.706 40.208 50.000 40.250 50.744 60.000 30.500 30.042 20.000 30.000 20.746 40.000 10.000 10.517 20.625 40.085 60.333 30.000 51.000 10.378 50.533 60.376 50.042 60.814 40.000 10.000 20.765 41.000 10.600 40.000 10.000 40.667 30.000 10.472 10.333 20.337 40.605 40.305 3
David Rozenberszki, Or Litany, Angela Dai: Language-Grounded Indoor 3D Semantic Segmentation in the Wild.
Mask3D Scannet2000.445 20.653 10.392 20.254 20.648 10.097 20.125 60.000 10.000 20.000 10.657 10.971 10.451 21.000 11.000 10.640 10.500 20.045 11.000 10.241 20.409 10.363 10.440 10.686 40.300 20.000 10.201 20.000 10.009 10.290 20.556 21.000 10.000 10.063 40.000 20.830 10.573 10.844 20.333 20.204 10.058 60.158 60.552 30.056 20.000 11.000 10.725 40.750 10.927 11.000 10.888 40.042 40.120 30.615 50.226 10.250 10.890 10.792 10.677 30.510 20.818 20.699 20.512 30.167 60.125 20.315 30.943 10.309 10.017 40.200 30.000 10.188 10.000 20.183 40.815 21.000 10.827 10.741 10.442 30.414 50.600 10.000 10.000 10.458 20.049 30.321 10.381 10.000 10.908 30.400 10.841 20.260 20.710 20.966 10.265 20.000 10.924 10.152 10.025 20.500 10.027 10.028 21.000 10.556 60.016 20.080 60.500 10.694 30.608 20.084 20.604 40.194 20.538 30.000 10.500 10.000 20.354 50.000 11.000 10.000 10.761 30.930 10.053 50.890 31.000 10.008 20.262 20.358 31.000 11.000 10.792 40.966 11.000 10.765 30.004 20.930 10.780 30.330 20.027 20.625 20.974 40.050 10.412 60.021 30.000 30.000 20.778 10.000 10.000 10.493 30.746 20.454 20.335 20.396 10.930 60.551 31.000 10.552 20.606 10.853 10.000 10.004 10.806 21.000 10.727 20.000 10.042 30.745 20.000 10.399 40.391 10.630 20.721 10.619 1
Jonas Schult, Francis Engelmann, Alexander Hermans, Or Litany, Siyu Tang, Bastian Leibe: Mask3D for 3D Semantic Instance Segmentation. ICRA 2023
CSC-Pretrain Inst.permissive0.275 60.466 60.218 50.110 60.625 30.007 60.500 10.000 10.000 20.000 10.000 60.222 60.377 51.000 10.661 60.400 40.000 50.000 30.000 40.119 60.000 20.000 60.277 50.685 50.067 40.000 10.132 40.000 10.000 20.000 50.367 50.000 30.000 10.000 50.000 20.591 40.055 50.783 50.000 40.014 40.500 20.161 50.278 40.000 30.000 10.667 20.768 20.500 20.866 21.000 10.829 50.000 50.019 60.555 60.000 30.000 40.305 60.000 40.750 10.200 50.783 50.429 40.395 40.677 20.020 60.286 40.584 60.000 20.000 50.115 60.000 10.000 30.000 20.145 60.423 60.500 30.364 60.369 50.571 10.448 40.206 60.000 10.000 10.200 40.106 10.065 60.000 40.000 10.750 40.200 30.774 30.000 60.501 40.841 50.000 40.000 10.692 60.063 50.000 40.000 40.000 20.000 40.500 50.649 30.000 30.084 50.125 50.719 10.413 60.004 50.450 60.000 30.638 10.000 10.000 40.000 20.505 40.000 10.000 30.000 10.727 40.833 20.221 30.779 50.000 40.000 30.168 60.311 60.125 30.571 50.500 50.143 60.000 20.250 50.000 30.869 30.667 50.162 60.000 40.250 51.000 10.000 30.500 30.000 50.000 30.000 20.689 50.000 10.000 10.312 50.383 60.114 40.333 30.000 50.997 30.420 40.613 50.212 60.500 20.819 30.000 10.000 20.768 31.000 10.918 10.000 10.000 40.278 60.000 10.333 50.000 60.353 30.546 60.258 5
Ji Hou, Benjamin Graham, Matthias Nießner, Saining Xie: Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts. CVPR 2021


ScanNet Benchmark

This table lists the benchmark results for the 3D semantic label scenario.
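In this plain-text copy of the table, each score appears to be fused with its rank and with the next score (for example `0.798 10.911 11` reads as score 0.798 at rank 1, then score 0.911 at rank 11). The following is a minimal sketch of how one row could be split back into columns. It assumes that every score has exactly three decimals, that the license tag is either `permissive` or `copyleft`, and that "avg iou" is the unweighted mean of the 20 class columns, which the first row is consistent with. The names `parse_row`, `PAIR`, and `COLUMNS` are hypothetical helpers, not part of any ScanNet tooling.

```python
import re

# Column order copied from the header of the ScanNet table.
COLUMNS = [
    "avg iou", "bathtub", "bed", "bookshelf", "cabinet", "chair", "counter",
    "curtain", "desk", "door", "floor", "otherfurniture", "picture",
    "refrigerator", "shower curtain", "sink", "sofa", "table", "toilet",
    "wall", "window",
]

# One score ("d.ddd"), a space, then the shortest run of rank digits that is
# followed by the next score or by the end of the row (assumed format).
PAIR = re.compile(r"(\d\.\d{3}) (\d+?)(?=\d\.\d{3}|$)")


def parse_row(row):
    """Split a fused leaderboard row into (method, info, {column: (score, rank)})."""
    row = row.strip()
    first = PAIR.search(row)
    method, info = row[:first.start()], ""
    # The license tag, when present, is glued to the end of the method name.
    for tag in ("permissive", "copyleft"):
        if method.endswith(tag):
            method, info = method[:-len(tag)], tag
    pairs = [(float(s), int(r)) for s, r in PAIR.findall(row[first.start():])]
    return method.strip(), info, dict(zip(COLUMNS, pairs))


# First row of the table, copied verbatim.
ROW = ("PTv3-PPT-ALCcopyleft0.798 10.911 110.812 220.854 80.770 120.856 15"
       "0.555 170.943 10.660 260.735 20.979 10.606 70.492 10.792 40.934 4"
       "0.841 20.819 60.716 90.947 100.906 10.822 1")

method, info, scores = parse_row(ROW)
class_ious = [scores[c][0] for c in COLUMNS[1:]]
mean_iou = sum(class_ious) / len(class_ious)
print(method, info, scores["avg iou"], round(mean_iou, 4))
```

The ScanNet200 rows follow the same fused pattern but carry additional head, common, and tail IoU columns plus the 200 class columns, so they would need their own column list.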


Method | Info | avg iou | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | floor | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | wall | window
PTv3-PPT-ALCcopyleft0.798 10.911 110.812 220.854 80.770 120.856 150.555 170.943 10.660 260.735 20.979 10.606 70.492 10.792 40.934 40.841 20.819 60.716 90.947 100.906 10.822 1
Guangda Ji, Silvan Weder, Francis Engelmann, Marc Pollefeys, Hermann Blum: ARKit LabelMaker: A New Scale for Indoor 3D Scene Understanding. arXiv
DITR ScanNet0.797 20.727 760.869 10.882 10.785 60.868 70.578 50.943 10.744 10.727 30.979 10.627 20.364 90.824 10.949 20.779 150.844 10.757 10.982 10.905 20.802 3
Karim Abou Zeid, Kadir Yilmaz, Daan de Geus, Alexander Hermans, David Adrian, Timm Linder, Bastian Leibe: DINO in the Room: Leveraging 2D Foundation Models for 3D Segmentation.
PTv3 ScanNet0.794 30.941 30.813 210.851 110.782 70.890 20.597 10.916 60.696 110.713 50.979 10.635 10.384 30.793 30.907 100.821 50.790 360.696 140.967 40.903 30.805 2
Xiaoyang Wu, Li Jiang, Peng-Shuai Wang, Zhijian Liu, Xihui Liu, Yu Qiao, Wanli Ouyang, Tong He, Hengshuang Zhao: Point Transformer V3: Simpler, Faster, Stronger. CVPR 2024 (Oral)
PonderV20.785 40.978 10.800 300.833 290.788 40.853 200.545 210.910 90.713 30.705 60.979 10.596 90.390 20.769 150.832 450.821 50.792 350.730 20.975 20.897 60.785 7
Haoyi Zhu, Honghui Yang, Xiaoyang Wu, Di Huang, Sha Zhang, Xianglong He, Tong He, Hengshuang Zhao, Chunhua Shen, Yu Qiao, Wanli Ouyang: PonderV2: Pave the Way for 3D Foundation Model with A Universal Pre-training Paradigm.
Mix3Dpermissive0.781 50.964 20.855 20.843 200.781 80.858 130.575 80.831 380.685 170.714 40.979 10.594 100.310 300.801 20.892 190.841 20.819 60.723 60.940 150.887 80.725 28
Alexey Nekrasov, Jonas Schult, Or Litany, Bastian Leibe, Francis Engelmann: Mix3D: Out-of-Context Data Augmentation for 3D Scenes. 3DV 2021 (Oral)
Swin3Dpermissive0.779 60.861 230.818 160.836 260.790 30.875 40.576 70.905 100.704 70.739 10.969 120.611 30.349 120.756 250.958 10.702 510.805 190.708 100.916 390.898 50.801 4
TTT-KD0.773 70.646 970.818 160.809 410.774 100.878 30.581 30.943 10.687 150.704 70.978 60.607 60.336 190.775 110.912 80.838 40.823 40.694 150.967 40.899 40.794 6
Lisa Weijler, Muhammad Jehanzeb Mirza, Leon Sick, Can Ekkazan, Pedro Hermosilla: TTT-KD: Test-Time Training for 3D Semantic Segmentation through Knowledge Distillation from Foundation Models.
ResLFE_HDS0.772 80.939 40.824 70.854 80.771 110.840 350.564 130.900 120.686 160.677 140.961 180.537 360.348 130.769 150.903 120.785 130.815 90.676 260.939 160.880 130.772 11
PPT-SpUNet-Joint0.766 90.932 50.794 360.829 310.751 260.854 180.540 250.903 110.630 390.672 170.963 160.565 260.357 100.788 50.900 140.737 310.802 200.685 200.950 80.887 80.780 8
Xiaoyang Wu, Zhuotao Tian, Xin Wen, Bohao Peng, Xihui Liu, Kaicheng Yu, Hengshuang Zhao: Towards Large-scale 3D Representation Learning with Multi-dataset Point Prompt Training. CVPR 2024
OctFormerpermissive0.766 90.925 70.808 260.849 130.786 50.846 300.566 120.876 190.690 130.674 160.960 190.576 220.226 720.753 270.904 110.777 160.815 90.722 70.923 310.877 160.776 10
Peng-Shuai Wang: OctFormer: Octree-based Transformers for 3D Point Clouds. SIGGRAPH 2023
CU-Hybrid Net0.764 110.924 80.819 140.840 230.757 210.853 200.580 40.848 300.709 50.643 270.958 230.587 160.295 380.753 270.884 230.758 230.815 90.725 50.927 270.867 270.743 19
OccuSeg+Semantic0.764 110.758 610.796 340.839 240.746 300.907 10.562 140.850 290.680 190.672 170.978 60.610 40.335 210.777 90.819 490.847 10.830 30.691 170.972 30.885 100.727 26
O-CNNpermissive0.762 130.924 80.823 80.844 190.770 120.852 220.577 60.847 320.711 40.640 310.958 230.592 110.217 780.762 200.888 200.758 230.813 130.726 40.932 250.868 260.744 18
Peng-Shuai Wang, Yang Liu, Yu-Xiao Guo, Chun-Yu Sun, Xin Tong: O-CNN: Octree-based Convolutional Neural Networks for 3D Shape Analysis. SIGGRAPH 2017
DiffSegNet0.758 140.725 780.789 410.843 200.762 170.856 150.562 140.920 40.657 290.658 210.958 230.589 140.337 180.782 60.879 240.787 110.779 410.678 220.926 290.880 130.799 5
DTC0.757 150.843 290.820 120.847 160.791 20.862 110.511 380.870 220.707 60.652 230.954 400.604 80.279 480.760 210.942 30.734 320.766 500.701 130.884 610.874 220.736 20
OA-CNN-L_ScanNet200.756 160.783 470.826 60.858 60.776 90.837 390.548 200.896 150.649 310.675 150.962 170.586 170.335 210.771 140.802 540.770 190.787 380.691 170.936 200.880 130.761 13
ConDaFormer0.755 170.927 60.822 100.836 260.801 10.849 250.516 350.864 260.651 300.680 130.958 230.584 190.282 450.759 230.855 350.728 340.802 200.678 220.880 660.873 230.756 16
Lunhao Duan, Shanshan Zhao, Nan Xue, Mingming Gong, Guisong Xia, Dacheng Tao: ConDaFormer: Disassembled Transformer with Local Structure Enhancement for 3D Point Cloud Understanding. NeurIPS 2023
LSK3DNetpermissive0.755 170.899 160.823 80.843 200.764 160.838 380.584 20.845 330.717 20.638 330.956 300.580 210.229 710.640 480.900 140.750 260.813 130.729 30.920 350.872 240.757 14
Tuo Feng, Wenguan Wang, Fan Ma, Yi Yang: LSK3DNet: Towards Effective and Efficient 3D Perception with Large Sparse Kernels. CVPR 2024
PNE0.755 170.786 450.835 50.834 280.758 190.849 250.570 100.836 370.648 320.668 190.978 60.581 200.367 70.683 390.856 330.804 80.801 240.678 220.961 60.889 70.716 35
P. Hermosilla: Point Neighborhood Embeddings.
PointTransformerV20.752 200.742 680.809 250.872 20.758 190.860 120.552 180.891 170.610 460.687 80.960 190.559 300.304 330.766 180.926 60.767 200.797 280.644 380.942 130.876 190.722 31
Xiaoyang Wu, Yixing Lao, Li Jiang, Xihui Liu, Hengshuang Zhao: Point Transformer V2: Grouped Vector Attention and Partition-based Pooling. NeurIPS 2022
DMF-Net0.752 200.906 140.793 380.802 470.689 450.825 520.556 160.867 230.681 180.602 500.960 190.555 320.365 80.779 80.859 300.747 270.795 320.717 80.917 380.856 350.764 12
C.Yang, Y.Yan, W.Zhao, J.Ye, X.Yang, A.Hussain, B.Dong, K.Huang: Towards Deeper and Better Multi-view Feature Fusion for 3D Semantic Segmentation. ICONIP 2023
PointConvFormer0.749 220.793 430.790 390.807 430.750 280.856 150.524 310.881 180.588 580.642 300.977 100.591 120.274 510.781 70.929 50.804 80.796 290.642 390.947 100.885 100.715 36
Wenxuan Wu, Qi Shan, Li Fuxin: PointConvFormer: Revenge of the Point-based Convolution.
BPNetcopyleft0.749 220.909 120.818 160.811 390.752 240.839 370.485 530.842 340.673 210.644 260.957 280.528 420.305 320.773 120.859 300.788 100.818 80.693 160.916 390.856 350.723 30
Wenbo Hu, Hengshuang Zhao, Li Jiang, Jiaya Jia, Tien-Tsin Wong: Bidirectional Projection Network for Cross Dimension Scene Understanding. CVPR 2021 (Oral)
MSP0.748 240.623 1000.804 280.859 50.745 310.824 540.501 420.912 80.690 130.685 100.956 300.567 250.320 270.768 170.918 70.720 390.802 200.676 260.921 330.881 120.779 9
StratifiedFormerpermissive0.747 250.901 150.803 290.845 180.757 210.846 300.512 370.825 410.696 110.645 250.956 300.576 220.262 620.744 330.861 290.742 290.770 480.705 110.899 510.860 320.734 21
Xin Lai*, Jianhui Liu*, Li Jiang, Liwei Wang, Hengshuang Zhao, Shu Liu, Xiaojuan Qi, Jiaya Jia: Stratified Transformer for 3D Point Cloud Segmentation. CVPR 2022
VMNetpermissive0.746 260.870 210.838 30.858 60.729 360.850 240.501 420.874 200.587 590.658 210.956 300.564 270.299 350.765 190.900 140.716 420.812 150.631 440.939 160.858 330.709 37
Zeyu Hu, Xuyang Bai, Jiaxiang Shang, Runze Zhang, Jiayu Dong, Xin Wang, Guangyuan Sun, Hongbo Fu, Chiew-Lan Tai: VMNet: Voxel-Mesh Network for Geodesic-Aware 3D Semantic Segmentation. ICCV 2021 (Oral)
Virtual MVFusion0.746 260.771 550.819 140.848 150.702 430.865 100.397 900.899 130.699 90.664 200.948 620.588 150.330 230.746 320.851 390.764 210.796 290.704 120.935 210.866 280.728 24
Abhijit Kundu, Xiaoqi Yin, Alireza Fathi, David Ross, Brian Brewington, Thomas Funkhouser, Caroline Pantofaru: Virtual Multi-view Fusion for 3D Semantic Segmentation. ECCV 2020
DiffSeg3D20.745 280.725 780.814 200.837 250.751 260.831 460.514 360.896 150.674 200.684 110.960 190.564 270.303 340.773 120.820 480.713 450.798 270.690 190.923 310.875 200.757 14
ODINpermissive0.744 290.658 930.752 640.870 30.714 400.843 330.569 110.919 50.703 80.622 400.949 590.591 120.343 150.736 340.784 560.816 70.838 20.672 310.918 370.854 390.725 28
Ayush Jain, Pushkal Katara, Nikolaos Gkanatsios, Adam W. Harley, Gabriel Sarch, Kriti Aggarwal, Vishrav Chaudhary, Katerina Fragkiadaki: ODIN: A Single Model for 2D and 3D Segmentation. CVPR 2024
Retro-FPN0.744 290.842 300.800 300.767 610.740 320.836 410.541 230.914 70.672 220.626 370.958 230.552 330.272 530.777 90.886 220.696 520.801 240.674 290.941 140.858 330.717 33
Peng Xiang*, Xin Wen*, Yu-Shen Liu, Hui Zhang, Yi Fang, Zhizhong Han: Retrospective Feature Pyramid Network for Point Cloud Semantic Segmentation. ICCV 2023
EQ-Net0.743 310.620 1010.799 330.849 130.730 350.822 560.493 500.897 140.664 230.681 120.955 340.562 290.378 40.760 210.903 120.738 300.801 240.673 300.907 430.877 160.745 17
Zetong Yang*, Li Jiang*, Yanan Sun, Bernt Schiele, Jiaya Jia: A Unified Query-based Paradigm for Point Cloud Understanding. CVPR 2022
SAT0.742 320.860 240.765 550.819 340.769 140.848 270.533 270.829 390.663 240.631 360.955 340.586 170.274 510.753 270.896 170.729 330.760 560.666 330.921 330.855 370.733 22
LRPNet0.742 320.816 380.806 270.807 430.752 240.828 500.575 80.839 360.699 90.637 340.954 400.520 450.320 270.755 260.834 430.760 220.772 450.676 260.915 410.862 300.717 33
LargeKernel3D0.739 340.909 120.820 120.806 450.740 320.852 220.545 210.826 400.594 570.643 270.955 340.541 350.263 610.723 370.858 320.775 180.767 490.678 220.933 230.848 430.694 42
Yukang Chen*, Jianhui Liu*, Xiangyu Zhang, Xiaojuan Qi, Jiaya Jia: LargeKernel3D: Scaling up Kernels in 3D Sparse CNNs. CVPR 2023
RPN0.736 350.776 510.790 390.851 110.754 230.854 180.491 520.866 240.596 560.686 90.955 340.536 370.342 160.624 550.869 260.787 110.802 200.628 450.927 270.875 200.704 39
MinkowskiNetpermissive0.736 350.859 250.818 160.832 300.709 410.840 350.521 330.853 280.660 260.643 270.951 510.544 340.286 430.731 350.893 180.675 600.772 450.683 210.874 720.852 410.727 26
C. Choy, J. Gwak, S. Savarese: 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks. CVPR 2019
IPCA0.731 370.890 170.837 40.864 40.726 370.873 50.530 300.824 420.489 920.647 240.978 60.609 50.336 190.624 550.733 640.758 230.776 430.570 700.949 90.877 160.728 24
online3d0.727 380.715 830.777 480.854 80.748 290.858 130.497 470.872 210.572 650.639 320.957 280.523 430.297 370.750 300.803 530.744 280.810 160.587 660.938 180.871 250.719 32
PointTransformer++0.725 390.727 760.811 240.819 340.765 150.841 340.502 410.814 470.621 420.623 390.955 340.556 310.284 440.620 570.866 270.781 140.757 600.648 360.932 250.862 300.709 37
SparseConvNet0.725 390.647 960.821 110.846 170.721 380.869 60.533 270.754 630.603 520.614 420.955 340.572 240.325 250.710 380.870 250.724 370.823 40.628 450.934 220.865 290.683 45
MatchingNet0.724 410.812 400.812 220.810 400.735 340.834 430.495 490.860 270.572 650.602 500.954 400.512 470.280 470.757 240.845 410.725 360.780 400.606 550.937 190.851 420.700 41
INS-Conv-semantic0.717 420.751 640.759 580.812 380.704 420.868 70.537 260.842 340.609 480.608 460.953 440.534 390.293 390.616 580.864 280.719 410.793 330.640 400.933 230.845 470.663 50
PointMetaBase0.714 430.835 310.785 430.821 320.684 470.846 300.531 290.865 250.614 430.596 540.953 440.500 500.246 670.674 400.888 200.692 530.764 520.624 470.849 870.844 480.675 47
contrastBoundarypermissive0.705 440.769 580.775 490.809 410.687 460.820 590.439 780.812 480.661 250.591 560.945 700.515 460.171 970.633 520.856 330.720 390.796 290.668 320.889 580.847 440.689 43
Liyao Tang, Yibing Zhan, Zhe Chen, Baosheng Yu, Dacheng Tao: Contrastive Boundary Learning for Point Cloud Segmentation. CVPR 2022
ClickSeg_Semantic0.703 450.774 530.800 300.793 520.760 180.847 290.471 570.802 510.463 990.634 350.968 140.491 530.271 550.726 360.910 90.706 470.815 90.551 820.878 670.833 490.570 82
RFCR0.702 460.889 180.745 690.813 370.672 500.818 630.493 500.815 460.623 400.610 440.947 640.470 620.249 660.594 620.848 400.705 480.779 410.646 370.892 560.823 550.611 65
Jingyu Gong, Jiachen Xu, Xin Tan, Haichuan Song, Yanyun Qu, Yuan Xie, Lizhuang Ma: Omni-Supervised Point Cloud Segmentation via Gradual Receptive Field Component Reasoning. CVPR 2021
One Thing One Click0.701 470.825 350.796 340.723 680.716 390.832 450.433 800.816 440.634 370.609 450.969 120.418 880.344 140.559 740.833 440.715 430.808 180.560 760.902 480.847 440.680 46
JSENetpermissive0.699 480.881 200.762 560.821 320.667 510.800 760.522 320.792 540.613 440.607 470.935 900.492 520.205 840.576 670.853 370.691 540.758 580.652 350.872 750.828 520.649 54
Zeyu Hu, Mingmin Zhen, Xuyang Bai, Hongbo Fu, Chiew-Lan Tai: JSENet: Joint Semantic Segmentation and Edge Detection Network for 3D Point Clouds. ECCV 2020
One-Thing-One-Click0.693 490.743 670.794 360.655 910.684 470.822 560.497 470.719 730.622 410.617 410.977 100.447 750.339 170.750 300.664 810.703 500.790 360.596 590.946 120.855 370.647 55
Zhengzhe Liu, Xiaojuan Qi, Chi-Wing Fu: One Thing One Click: A Self-Training Approach for Weakly Supervised 3D Semantic Segmentation. CVPR 2021
PicassoNet-IIpermissive0.692 500.732 720.772 500.786 530.677 490.866 90.517 340.848 300.509 850.626 370.952 490.536 370.225 740.545 800.704 710.689 570.810 160.564 750.903 470.854 390.729 23
Huan Lei, Naveed Akhtar, Mubarak Shah, and Ajmal Mian: Geometric feature learning for 3D meshes.
Feature_GeometricNetpermissive0.690 510.884 190.754 620.795 500.647 580.818 630.422 820.802 510.612 450.604 480.945 700.462 650.189 920.563 730.853 370.726 350.765 510.632 430.904 450.821 580.606 69
Kangcheng Liu, Ben M. Chen: https://arxiv.org/abs/2012.09439. arXiv Preprint
FusionNet0.688 520.704 850.741 730.754 650.656 530.829 480.501 420.741 680.609 480.548 630.950 550.522 440.371 50.633 520.756 590.715 430.771 470.623 480.861 830.814 610.658 51
Feihu Zhang, Jin Fang, Benjamin Wah, Philip Torr: Deep FusionNet for Point Cloud Semantic Segmentation. ECCV 2020
Feature-Geometry Netpermissive0.685 530.866 220.748 660.819 340.645 600.794 790.450 680.802 510.587 590.604 480.945 700.464 640.201 870.554 760.840 420.723 380.732 710.602 570.907 430.822 570.603 72
VACNN++0.684 540.728 750.757 610.776 580.690 440.804 740.464 620.816 440.577 640.587 570.945 700.508 490.276 500.671 410.710 690.663 650.750 640.589 640.881 640.832 510.653 53
DGNet0.684 540.712 840.784 440.782 570.658 520.835 420.499 460.823 430.641 340.597 530.950 550.487 550.281 460.575 680.619 850.647 730.764 520.620 500.871 780.846 460.688 44
KP-FCNN0.684 540.847 280.758 600.784 550.647 580.814 660.473 560.772 570.605 500.594 550.935 900.450 730.181 950.587 630.805 520.690 550.785 390.614 510.882 630.819 590.632 61
H. Thomas, C. Qi, J. Deschaud, B. Marcotegui, F. Goulette, L. Guibas: KPConv: Flexible and Deformable Convolution for Point Clouds. ICCV 2019
Superpoint Network0.683 570.851 270.728 770.800 490.653 550.806 720.468 590.804 490.572 650.602 500.946 670.453 720.239 700.519 850.822 460.689 570.762 550.595 610.895 540.827 530.630 62
PointContrast_LA_SEM0.683 570.757 620.784 440.786 530.639 620.824 540.408 850.775 560.604 510.541 650.934 940.532 400.269 570.552 770.777 570.645 760.793 330.640 400.913 420.824 540.671 48
VI-PointConv0.676 590.770 570.754 620.783 560.621 660.814 660.552 180.758 610.571 680.557 610.954 400.529 410.268 590.530 830.682 750.675 600.719 740.603 560.888 590.833 490.665 49
Xingyi Li, Wenxuan Wu, Xiaoli Z. Fern, Li Fuxin: The Devils in the Point Clouds: Studying the Robustness of Point Cloud Convolutions.
ROSMRF3D0.673 600.789 440.748 660.763 630.635 640.814 660.407 870.747 650.581 630.573 580.950 550.484 560.271 550.607 590.754 600.649 700.774 440.596 590.883 620.823 550.606 69
SALANet0.670 610.816 380.770 530.768 600.652 560.807 710.451 650.747 650.659 280.545 640.924 1000.473 610.149 1070.571 700.811 510.635 800.746 650.623 480.892 560.794 740.570 82
O3DSeg0.668 620.822 360.771 520.496 1110.651 570.833 440.541 230.761 600.555 740.611 430.966 150.489 540.370 60.388 1040.580 880.776 170.751 620.570 700.956 70.817 600.646 56
PointASNLpermissive0.666 630.703 860.781 460.751 670.655 540.830 470.471 570.769 580.474 950.537 670.951 510.475 600.279 480.635 500.698 740.675 600.751 620.553 810.816 940.806 650.703 40
Xu Yan, Chaoda Zheng, Zhen Li, Sheng Wang, Shuguang Cui: PointASNL: Robust Point Clouds Processing using Nonlocal Neural Networks with Adaptive Sampling. CVPR 2020
PointConvpermissive0.666 630.781 480.759 580.699 760.644 610.822 560.475 550.779 550.564 710.504 820.953 440.428 820.203 860.586 650.754 600.661 660.753 610.588 650.902 480.813 630.642 57
Wenxuan Wu, Zhongang Qi, Li Fuxin: PointConv: Deep Convolutional Networks on 3D Point Clouds. CVPR 2019
PPCNN++permissive0.663 650.746 650.708 800.722 690.638 630.820 590.451 650.566 1010.599 540.541 650.950 550.510 480.313 290.648 460.819 490.616 850.682 890.590 630.869 790.810 640.656 52
Pyunghwan Ahn, Juyoung Yang, Eojindl Yi, Chanho Lee, Junmo Kim: Projection-based Point Convolution for Efficient Point Cloud Segmentation. IEEE Access
DCM-Net0.658 660.778 490.702 830.806 450.619 670.813 690.468 590.693 810.494 880.524 730.941 820.449 740.298 360.510 870.821 470.675 600.727 730.568 730.826 920.803 670.637 59
Jonas Schult*, Francis Engelmann*, Theodora Kontogianni, Bastian Leibe: DualConvMesh-Net: Joint Geodesic and Euclidean Convolutions on 3D Meshes. CVPR 2020 (Oral)
HPGCNN0.656 670.698 880.743 710.650 920.564 840.820 590.505 400.758 610.631 380.479 860.945 700.480 580.226 720.572 690.774 580.690 550.735 690.614 510.853 860.776 890.597 75
Jisheng Dang, Qingyong Hu, Yulan Guo, Jun Yang: HPGCNN.
SAFNet-segpermissive0.654 680.752 630.734 750.664 890.583 790.815 650.399 890.754 630.639 350.535 690.942 800.470 620.309 310.665 420.539 910.650 690.708 790.635 420.857 850.793 760.642 57
Linqing Zhao, Jiwen Lu, Jie Zhou: Similarity-Aware Fusion Network for 3D Semantic Segmentation. IROS 2021
RandLA-Netpermissive0.645 690.778 490.731 760.699 760.577 800.829 480.446 700.736 690.477 940.523 750.945 700.454 690.269 570.484 940.749 630.618 830.738 670.599 580.827 910.792 790.621 64
PointConv-SFPN0.641 700.776 510.703 820.721 700.557 870.826 510.451 650.672 860.563 720.483 850.943 790.425 850.162 1020.644 470.726 650.659 670.709 780.572 690.875 700.786 840.559 88
MVPNetpermissive0.641 700.831 320.715 780.671 860.590 750.781 850.394 910.679 830.642 330.553 620.937 870.462 650.256 630.649 450.406 1040.626 810.691 860.666 330.877 680.792 790.608 68
Maximilian Jaritz, Jiayuan Gu, Hao Su: Multi-view PointNet for 3D Scene Understanding. GMDL Workshop, ICCV 2019
PointMRNet0.640 720.717 820.701 840.692 790.576 810.801 750.467 610.716 740.563 720.459 920.953 440.429 810.169 990.581 660.854 360.605 860.710 760.550 830.894 550.793 760.575 80
FPConvpermissive0.639 730.785 460.760 570.713 740.603 700.798 770.392 930.534 1060.603 520.524 730.948 620.457 670.250 650.538 810.723 670.598 900.696 840.614 510.872 750.799 690.567 85
Yiqun Lin, Zizheng Yan, Haibin Huang, Dong Du, Ligang Liu, Shuguang Cui, Xiaoguang Han: FPConv: Learning Local Flattening for Point Convolution. CVPR 2020
PD-Net0.638 740.797 420.769 540.641 970.590 750.820 590.461 630.537 1050.637 360.536 680.947 640.388 950.206 830.656 430.668 790.647 730.732 710.585 670.868 800.793 760.473 108
PointSPNet0.637 750.734 710.692 910.714 730.576 810.797 780.446 700.743 670.598 550.437 970.942 800.403 910.150 1060.626 540.800 550.649 700.697 830.557 790.846 880.777 880.563 86
SConv0.636 760.830 330.697 870.752 660.572 830.780 870.445 720.716 740.529 780.530 700.951 510.446 760.170 980.507 890.666 800.636 790.682 890.541 890.886 600.799 690.594 76
Supervoxel-CNN0.635 770.656 940.711 790.719 710.613 680.757 960.444 750.765 590.534 770.566 590.928 980.478 590.272 530.636 490.531 930.664 640.645 990.508 970.864 820.792 790.611 65
joint point-basedpermissive0.634 780.614 1020.778 470.667 880.633 650.825 520.420 830.804 490.467 970.561 600.951 510.494 510.291 400.566 710.458 990.579 960.764 520.559 780.838 890.814 610.598 74
Hung-Yueh Chiang, Yen-Liang Lin, Yueh-Cheng Liu, Winston H. Hsu: A Unified Point-Based Framework for 3D Segmentation. 3DV 2019
PointMTL0.632 790.731 730.688 940.675 830.591 740.784 840.444 750.565 1020.610 460.492 830.949 590.456 680.254 640.587 630.706 700.599 890.665 950.612 540.868 800.791 820.579 79
3DSM_DMMF0.631 800.626 990.745 690.801 480.607 690.751 970.506 390.729 720.565 700.491 840.866 1140.434 770.197 900.595 610.630 840.709 460.705 810.560 760.875 700.740 990.491 103
PointNet2-SFPN0.631 800.771 550.692 910.672 840.524 930.837 390.440 770.706 790.538 760.446 940.944 760.421 870.219 770.552 770.751 620.591 920.737 680.543 880.901 500.768 910.557 89
APCF-Net0.631 800.742 680.687 960.672 840.557 870.792 820.408 850.665 880.545 750.508 790.952 490.428 820.186 930.634 510.702 720.620 820.706 800.555 800.873 730.798 710.581 78
Haojia Lin: Adaptive Pyramid Context Fusion for Point Cloud Perception. GRSL
FusionAwareConv0.630 830.604 1040.741 730.766 620.590 750.747 980.501 420.734 700.503 870.527 710.919 1040.454 690.323 260.550 790.420 1030.678 590.688 870.544 860.896 530.795 730.627 63
Jiazhao Zhang, Chenyang Zhu, Lintao Zheng, Kai Xu: Fusion-Aware Point Convolution for Online Semantic 3D Scene Segmentation. CVPR 2020
DenSeR0.628 840.800 410.625 1060.719 710.545 900.806 720.445 720.597 960.448 1020.519 770.938 860.481 570.328 240.489 930.499 980.657 680.759 570.592 620.881 640.797 720.634 60
SegGroup_sempermissive0.627 850.818 370.747 680.701 750.602 710.764 930.385 970.629 930.490 900.508 790.931 970.409 900.201 870.564 720.725 660.618 830.692 850.539 900.873 730.794 740.548 92
An Tao, Yueqi Duan, Yi Wei, Jiwen Lu, Jie Zhou: SegGroup: Seg-Level Supervision for 3D Instance and Semantic Segmentation. TIP 2022
SIConv0.625 860.830 330.694 890.757 640.563 850.772 910.448 690.647 910.520 810.509 780.949 590.431 800.191 910.496 910.614 860.647 730.672 930.535 930.876 690.783 850.571 81
dtc_net0.625 860.703 860.751 650.794 510.535 910.848 270.480 540.676 850.528 790.469 890.944 760.454 690.004 1190.464 960.636 830.704 490.758 580.548 850.924 300.787 830.492 102
Weakly-Openseg v30.625 860.924 80.787 420.620 990.555 890.811 700.393 920.666 870.382 1100.520 760.953 440.250 1140.208 810.604 600.670 770.644 770.742 660.538 910.919 360.803 670.513 100
HPEIN0.618 890.729 740.668 970.647 940.597 730.766 920.414 840.680 820.520 810.525 720.946 670.432 780.215 790.493 920.599 870.638 780.617 1040.570 700.897 520.806 650.605 71
Li Jiang, Hengshuang Zhao, Shu Liu, Xiaoyong Shen, Chi-Wing Fu, Jiaya Jia: Hierarchical Point-Edge Interaction Network for Point Cloud Semantic Segmentation. ICCV 2019
SPH3D-GCNpermissive0.610 900.858 260.772 500.489 1120.532 920.792 820.404 880.643 920.570 690.507 810.935 900.414 890.046 1160.510 870.702 720.602 880.705 810.549 840.859 840.773 900.534 95
Huan Lei, Naveed Akhtar, and Ajmal Mian: Spherical Kernel for Efficient Graph Convolution on 3D Point Clouds. TPAMI 2020
AttAN0.609 910.760 600.667 980.649 930.521 940.793 800.457 640.648 900.528 790.434 990.947 640.401 920.153 1050.454 970.721 680.648 720.717 750.536 920.904 450.765 920.485 104
Gege Zhang, Qinghua Ma, Licheng Jiao, Fang Liu and Qigong Sun: AttAN: Attention Adversarial Networks for 3D Point Cloud Semantic Segmentation. IJCAI 2020
wsss-transformer0.600 920.634 980.743 710.697 780.601 720.781 850.437 790.585 990.493 890.446 940.933 950.394 930.011 1180.654 440.661 820.603 870.733 700.526 940.832 900.761 940.480 105
LAP-D0.594 930.720 800.692 910.637 980.456 1030.773 900.391 950.730 710.587 590.445 960.940 840.381 960.288 410.434 1000.453 1010.591 920.649 970.581 680.777 980.749 980.610 67
DPC0.592 940.720 800.700 850.602 1030.480 990.762 950.380 980.713 770.585 620.437 970.940 840.369 980.288 410.434 1000.509 970.590 940.639 1020.567 740.772 990.755 960.592 77
Francis Engelmann, Theodora Kontogianni, Bastian Leibe: Dilated Point Convolutions: On the Receptive Field Size of Point Convolutions on 3D Point Clouds. ICRA 2020
CCRFNet0.589 950.766 590.659 1010.683 810.470 1020.740 1000.387 960.620 950.490 900.476 870.922 1020.355 1010.245 680.511 860.511 960.571 970.643 1000.493 1010.872 750.762 930.600 73
ROSMRF0.580 960.772 540.707 810.681 820.563 850.764 930.362 1000.515 1070.465 980.465 910.936 890.427 840.207 820.438 980.577 890.536 1000.675 920.486 1020.723 1050.779 860.524 97
SD-DETR0.576 970.746 650.609 1100.445 1160.517 950.643 1110.366 990.714 760.456 1000.468 900.870 1130.432 780.264 600.558 750.674 760.586 950.688 870.482 1030.739 1030.733 1010.537 94
SQN_0.1%0.569 980.676 900.696 880.657 900.497 960.779 880.424 810.548 1030.515 830.376 1040.902 1110.422 860.357 100.379 1050.456 1000.596 910.659 960.544 860.685 1080.665 1120.556 90
TextureNetpermissive0.566 990.672 920.664 990.671 860.494 970.719 1010.445 720.678 840.411 1080.396 1020.935 900.356 1000.225 740.412 1020.535 920.565 980.636 1030.464 1050.794 970.680 1090.568 84
Jingwei Huang, Haotian Zhang, Li Yi, Thomas Funkhouser, Matthias Niessner, Leonidas Guibas: TextureNet: Consistent Local Parametrizations for Learning from High-Resolution Signals on Meshes. CVPR
DVVNet0.562 1000.648 950.700 850.770 590.586 780.687 1050.333 1040.650 890.514 840.475 880.906 1080.359 990.223 760.340 1070.442 1020.422 1110.668 940.501 980.708 1060.779 860.534 95
Pointnet++ & Featurepermissive0.557 1010.735 700.661 1000.686 800.491 980.744 990.392 930.539 1040.451 1010.375 1050.946 670.376 970.205 840.403 1030.356 1070.553 990.643 1000.497 990.824 930.756 950.515 98
GMLPs0.538 1020.495 1120.693 900.647 940.471 1010.793 800.300 1070.477 1080.505 860.358 1060.903 1100.327 1040.081 1130.472 950.529 940.448 1090.710 760.509 950.746 1010.737 1000.554 91
PanopticFusion-label0.529 1030.491 1130.688 940.604 1020.386 1080.632 1120.225 1180.705 800.434 1050.293 1120.815 1160.348 1020.241 690.499 900.669 780.507 1020.649 970.442 1110.796 960.602 1160.561 87
Gaku Narita, Takashi Seno, Tomoya Ishikawa, Yohsuke Kaji: PanopticFusion: Online Volumetric Semantic Mapping at the Level of Stuff and Things. IROS 2019 (to appear)
subcloud_weak0.516 1040.676 900.591 1130.609 1000.442 1040.774 890.335 1030.597 960.422 1070.357 1070.932 960.341 1030.094 1120.298 1090.528 950.473 1070.676 910.495 1000.602 1140.721 1040.349 116
Online SegFusion0.515 1050.607 1030.644 1040.579 1050.434 1050.630 1130.353 1010.628 940.440 1030.410 1000.762 1190.307 1060.167 1000.520 840.403 1050.516 1010.565 1070.447 1090.678 1090.701 1060.514 99
Davide Menini, Suryansh Kumar, Martin R. Oswald, Erik Sandstroem, Cristian Sminchisescu, Luc van Gool: A Real-Time Learning Framework for Joint 3D Reconstruction and Semantic Segmentation. Robotics and Automation Letters Submission
3DMV, FTSDF0.501 1060.558 1080.608 1110.424 1180.478 1000.690 1040.246 1140.586 980.468 960.450 930.911 1060.394 930.160 1030.438 980.212 1140.432 1100.541 1120.475 1040.742 1020.727 1020.477 106
PCNN0.498 1070.559 1070.644 1040.560 1070.420 1070.711 1030.229 1160.414 1090.436 1040.352 1080.941 820.324 1050.155 1040.238 1140.387 1060.493 1030.529 1130.509 950.813 950.751 970.504 101
3DMV0.484 1080.484 1140.538 1160.643 960.424 1060.606 1160.310 1050.574 1000.433 1060.378 1030.796 1170.301 1070.214 800.537 820.208 1150.472 1080.507 1160.413 1140.693 1070.602 1160.539 93
Angela Dai, Matthias Niessner: 3DMV: Joint 3D-Multi-View Prediction for 3D Semantic Scene Segmentation. ECCV'18
PointCNN with RGBpermissive0.458 1090.577 1060.611 1090.356 1200.321 1160.715 1020.299 1090.376 1130.328 1160.319 1100.944 760.285 1090.164 1010.216 1170.229 1120.484 1050.545 1110.456 1070.755 1000.709 1050.475 107
Yangyan Li, Rui Bu, Mingchao Sun, Baoquan Chen: PointCNN. NeurIPS 2018
FCPNpermissive0.447 1100.679 890.604 1120.578 1060.380 1090.682 1060.291 1100.106 1200.483 930.258 1180.920 1030.258 1130.025 1170.231 1160.325 1080.480 1060.560 1090.463 1060.725 1040.666 1110.231 120
Dario Rethage, Johanna Wald, Jürgen Sturm, Nassir Navab, Federico Tombari: Fully-Convolutional Point Networks for Large-Scale Point Clouds. ECCV 2018
DGCNN_reproducecopyleft0.446 1110.474 1150.623 1070.463 1140.366 1110.651 1090.310 1050.389 1120.349 1140.330 1090.937 870.271 1110.126 1090.285 1100.224 1130.350 1160.577 1060.445 1100.625 1120.723 1030.394 112
Yue Wang, Yongbin Sun, Ziwei Liu, Sanjay E. Sarma, Michael M. Bronstein, Justin M. Solomon: Dynamic Graph CNN for Learning on Point Clouds. TOG 2019
PNET20.442 1120.548 1090.548 1150.597 1040.363 1120.628 1140.300 1070.292 1150.374 1110.307 1110.881 1120.268 1120.186 930.238 1140.204 1160.407 1120.506 1170.449 1080.667 1100.620 1150.462 110
SurfaceConvPF0.442 1120.505 1110.622 1080.380 1190.342 1140.654 1080.227 1170.397 1110.367 1120.276 1140.924 1000.240 1150.198 890.359 1060.262 1100.366 1130.581 1050.435 1120.640 1110.668 1100.398 111
Hao Pan, Shilin Liu, Yang Liu, Xin Tong: Convolutional Neural Networks on 3D Surfaces Using Parallel Frames.
Tangent Convolutionspermissive0.438 1140.437 1170.646 1030.474 1130.369 1100.645 1100.353 1010.258 1170.282 1190.279 1130.918 1050.298 1080.147 1080.283 1110.294 1090.487 1040.562 1080.427 1130.619 1130.633 1140.352 115
Maxim Tatarchenko, Jaesik Park, Vladlen Koltun, Qian-Yi Zhou: Tangent Convolutions for Dense Prediction in 3D. CVPR 2018
3DWSSS0.425 1150.525 1100.647 1020.522 1080.324 1150.488 1200.077 1210.712 780.353 1130.401 1010.636 1210.281 1100.176 960.340 1070.565 900.175 1200.551 1100.398 1150.370 1210.602 1160.361 114
SPLAT Netcopyleft0.393 1160.472 1160.511 1170.606 1010.311 1170.656 1070.245 1150.405 1100.328 1160.197 1190.927 990.227 1170.000 1210.001 1220.249 1110.271 1190.510 1140.383 1170.593 1150.699 1070.267 118
Hang Su, Varun Jampani, Deqing Sun, Subhransu Maji, Evangelos Kalogerakis, Ming-Hsuan Yang, Jan Kautz: SPLATNet: Sparse Lattice Networks for Point Cloud Processing. CVPR 2018
ScanNet+FTSDF0.383 1170.297 1190.491 1180.432 1170.358 1130.612 1150.274 1120.116 1190.411 1080.265 1150.904 1090.229 1160.079 1140.250 1120.185 1170.320 1170.510 1140.385 1160.548 1160.597 1190.394 112
PointNet++permissive0.339 1180.584 1050.478 1190.458 1150.256 1190.360 1210.250 1130.247 1180.278 1200.261 1170.677 1200.183 1180.117 1100.212 1180.145 1190.364 1140.346 1210.232 1210.548 1160.523 1200.252 119
Charles R. Qi, Li Yi, Hao Su, Leonidas J. Guibas: PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space.
GrowSP++0.323 1190.114 1210.589 1140.499 1100.147 1210.555 1170.290 1110.336 1140.290 1180.262 1160.865 1150.102 1210.000 1210.037 1200.000 1220.000 1220.462 1180.381 1180.389 1200.664 1130.473 108
SSC-UNetpermissive0.308 1200.353 1180.290 1210.278 1210.166 1200.553 1180.169 1200.286 1160.147 1210.148 1210.908 1070.182 1190.064 1150.023 1210.018 1210.354 1150.363 1190.345 1190.546 1180.685 1080.278 117
ScanNetpermissive0.306 1210.203 1200.366 1200.501 1090.311 1170.524 1190.211 1190.002 1220.342 1150.189 1200.786 1180.145 1200.102 1110.245 1130.152 1180.318 1180.348 1200.300 1200.460 1190.437 1210.182 121
Angela Dai, Angel X. Chang, Manolis Savva, Maciej Halber, Thomas Funkhouser, Matthias Nießner: ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes. CVPR'17
ERROR0.054 1220.000 1220.041 1220.172 1220.030 1220.062 1230.001 1220.035 1210.004 1220.051 1220.143 1220.019 1220.003 1200.041 1190.050 1200.003 1210.054 1220.018 1220.005 1230.264 1220.082 122
MVF-GNN0.014 1230.000 1220.000 1230.000 1230.007 1230.086 1220.000 1230.000 1230.001 1230.000 1230.029 1230.001 1230.000 1210.000 1230.000 1220.000 1220.000 1230.018 1220.015 1220.115 1230.000 123


This table lists the benchmark results for the 3D semantic instance scenario.




Method | Info | avg AP 25% | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | window
PointRel0.901 11.000 10.978 230.928 30.879 10.962 50.882 50.749 380.947 30.912 20.802 30.753 190.820 21.000 10.984 40.919 50.894 41.000 10.815 15
Relation3D (PointRel): Enhancing Relation Modeling for Point Cloud Instance Segmentation.
Competitor-MAFT0.896 21.000 11.000 10.872 160.847 110.967 30.955 10.778 330.901 150.919 10.784 50.812 10.770 131.000 10.949 90.865 350.868 181.000 10.840 5
OneFormer3Dcopyleft0.896 21.000 11.000 10.913 60.858 60.951 100.786 160.837 190.916 120.908 40.778 80.803 60.750 151.000 10.976 60.926 40.882 80.995 480.849 2
Maxim Kolodiazhnyi, Anna Vorontsova, Anton Konushin, Danila Rukhovich: OneFormer3D: One Transformer for Unified Point Cloud Segmentation.
MG-Former0.887 41.000 10.991 140.837 260.801 250.935 190.887 40.857 110.946 40.891 100.748 180.805 50.739 171.000 10.993 20.809 590.876 151.000 10.842 4
DCD0.885 51.000 10.933 410.856 220.832 150.959 70.930 20.858 100.802 380.859 180.767 90.796 100.709 211.000 10.971 70.871 290.904 21.000 10.874 1
UniPerception0.884 61.000 10.979 200.872 160.869 30.892 280.806 130.890 60.835 290.892 90.755 140.811 20.779 100.955 490.951 80.876 230.914 10.997 400.840 6
KmaxOneFormerNetpermissive0.883 71.000 11.000 10.798 410.848 100.971 10.853 70.903 30.827 320.910 30.748 170.809 40.724 191.000 10.980 50.855 410.844 241.000 10.832 7
InsSSM0.883 71.000 10.996 60.800 400.865 40.960 60.808 120.852 160.940 60.899 80.785 40.810 30.700 231.000 10.912 200.851 440.895 30.997 400.827 9
Lei Yao, Yi Wang, Moyun Liu, Lap-Pui Chau: SGIFormer: Semantic-guided and Geometric-enhanced Interleaving Transformer for 3D Instance Segmentation. TCSVT, 2024
Competitor-SPFormer0.881 91.000 11.000 10.845 240.854 70.962 40.714 230.857 120.904 140.902 60.782 70.789 130.662 291.000 10.988 30.874 260.886 70.997 400.847 3
TST3D0.879 101.000 10.994 90.921 50.807 240.939 160.771 170.887 70.923 100.862 170.722 230.768 160.756 141.000 10.910 310.904 70.836 270.999 390.824 11
Duc Tran Dang Trung, Byeongkeun Kang, Yeejin Lee: MSTA3D: Multi-scale Twin-attention for 3D Instance Segmentation. ACM Multimedia 2024
SIM3D0.878 111.000 10.972 250.863 190.817 220.952 90.821 100.783 300.890 180.902 70.735 210.797 80.799 91.000 10.931 170.893 130.853 221.000 10.792 18
EV3D0.877 121.000 10.996 80.873 140.854 80.950 110.691 270.783 310.926 70.889 130.754 150.794 120.820 21.000 10.912 200.900 90.860 201.000 10.779 21
Spherical Mask(CtoF)0.875 131.000 10.991 150.873 140.850 90.946 130.691 270.752 370.926 70.889 120.759 120.794 110.820 21.000 10.912 200.900 90.878 121.000 10.769 23
TD3Dpermissive0.875 131.000 10.976 240.877 120.783 310.970 20.889 30.828 200.945 50.803 240.713 250.720 260.709 201.000 10.936 150.934 30.873 161.000 10.791 19
Maksim Kolodiazhnyi, Anna Vorontsova, Anton Konushin, Danila Rukhovich: Top-Down Beats Bottom-Up in 3D Instance Segmentation. WACV 2024
SoftGroup++0.874 151.000 10.972 260.947 10.839 140.898 270.556 420.913 20.881 210.756 260.828 20.748 210.821 11.000 10.937 140.937 10.887 61.000 10.821 12
Queryformer0.874 151.000 10.978 220.809 380.876 20.936 180.702 240.716 430.920 110.875 160.766 100.772 150.818 61.000 10.995 10.916 60.892 51.000 10.767 24
Mask3D0.870 171.000 10.985 170.782 480.818 210.938 170.760 180.749 380.923 90.877 150.760 110.785 140.820 21.000 10.912 200.864 370.878 120.983 540.825 10
Jonas Schult, Francis Engelmann, Alexander Hermans, Or Litany, Siyu Tang, Bastian Leibe: Mask3D for 3D Semantic Instance Segmentation. ICRA 2023
ExtMask3D0.867 181.000 11.000 10.756 550.816 230.940 150.795 140.760 360.862 230.888 140.739 190.763 170.774 111.000 10.929 180.878 220.879 101.000 10.819 14
SoftGrouppermissive0.865 191.000 10.969 270.860 200.860 50.913 230.558 390.899 40.911 130.760 250.828 10.736 230.802 80.981 460.919 190.875 240.877 141.000 10.820 13
Thang Vu, Kookhoi Kim, Tung M. Luu, Xuan Thanh Nguyen, Chang D. Yoo: SoftGroup for 3D Instance Segmentation on Point Clouds. CVPR 2022 [Oral]
MAFT0.860 201.000 10.990 160.810 370.829 160.949 120.809 110.688 490.836 280.904 50.751 160.796 90.741 161.000 10.864 410.848 460.837 251.000 10.828 8
SPFormerpermissive0.851 211.000 10.994 100.806 390.774 330.942 140.637 310.849 170.859 250.889 110.720 240.730 240.665 281.000 10.911 280.868 340.873 171.000 10.796 17
Sun Jiahao, Qing Chunmei, Tan Junpeng, Xu Xiangmin: Superpoint Transformer for 3D Scene Instance Segmentation. AAAI 2023 [Oral]
IPCA-Inst0.851 211.000 10.968 280.884 110.842 130.862 410.693 260.812 250.888 200.677 380.783 60.698 270.807 71.000 10.911 280.865 360.865 191.000 10.757 27
ODIN - Inspermissive0.847 231.000 10.951 340.834 310.828 170.875 330.871 60.767 340.821 340.816 210.690 320.800 70.771 121.000 10.912 200.891 140.821 280.886 700.713 34
Ayush Jain, Pushkal Katara, Nikolaos Gkanatsios, Adam W. Harley, Gabriel Sarch, Kriti Aggarwal, Vishrav Chaudhary, Katerina Fragkiadaki: ODIN: A Single Model for 2D and 3D Segmentation. CVPR 2024
Mask3D_evaluation0.843 241.000 10.955 330.847 230.795 270.932 200.750 200.780 320.891 170.818 200.737 200.633 360.703 221.000 10.902 330.870 300.820 290.941 620.805 16
ISBNetpermissive0.835 251.000 10.950 350.731 570.819 190.918 210.790 150.740 400.851 270.831 190.661 340.742 220.650 321.000 10.937 130.814 580.836 261.000 10.765 25
Tuan Duc Ngo, Binh-Son Hua, Khoi Nguyen: ISBNet: a 3D Point Cloud Instance Segmentation Network with Instance-aware Sampling and Box-aware Dynamic Convolution. CVPR 2023
SphereSeg0.835 251.000 10.963 310.891 90.794 280.954 80.822 90.710 440.961 20.721 300.693 310.530 490.653 311.000 10.867 400.857 400.859 210.991 510.771 22
GraphCut0.832 271.000 10.922 500.724 590.798 260.902 260.701 250.856 140.859 240.715 310.706 260.748 200.640 431.000 10.934 160.862 380.880 91.000 10.729 30
TopoSeg0.832 271.000 10.981 190.933 20.819 200.826 500.524 480.841 180.811 350.681 370.759 130.687 280.727 180.981 460.911 280.883 180.853 231.000 10.756 28
PBNetpermissive0.825 291.000 10.963 300.837 280.843 120.865 360.822 80.647 520.878 220.733 280.639 410.683 290.650 321.000 10.853 420.870 310.820 301.000 10.744 29
Weiguang Zhao, Yuyao Yan, Chaolong Yang, Jianan Ye, Xi Yang, Kaizhu Huang: Divide and Conquer: 3D Instance Segmentation With Point-Wise Binarization. ICCV 2023
SSEC0.820 301.000 10.983 180.924 40.826 180.817 530.415 570.899 50.793 400.673 390.731 220.636 340.653 301.000 10.939 120.804 610.878 111.000 10.780 20
DKNet0.815 311.000 10.930 420.844 250.765 370.915 220.534 460.805 270.805 370.807 230.654 350.763 180.650 321.000 10.794 540.881 190.766 341.000 10.758 26
Yizheng Wu, Min Shi, Shuaiyuan Du, Hao Lu, Zhiguo Cao, Weicai Zhong: 3D Instances as 1D Kernels. ECCV 2022
RPGN0.806 321.000 10.992 120.789 430.723 500.891 290.650 300.810 260.832 300.665 410.699 290.658 300.700 231.000 10.881 350.832 500.774 320.997 400.613 51
Shichao Dong, Guosheng Lin, Tzu-Yi Hung: Learning Regional Purity for Instance Segmentation on 3D Point Clouds. ECCV 2022
HAISpermissive0.803 331.000 10.994 100.820 330.759 380.855 420.554 430.882 80.827 330.615 470.676 330.638 330.646 411.000 10.912 200.797 640.767 330.994 490.726 31
Shaoyu Chen, Jiemin Fang, Qian Zhang, Wenyu Liu, Xinggang Wang: Hierarchical Aggregation for 3D Instance Segmentation. ICCV 2021
Box2Mask0.803 331.000 10.962 320.874 130.707 540.887 320.686 290.598 570.961 10.715 320.694 300.469 540.700 231.000 10.912 200.902 80.753 390.997 400.637 45
Julian Chibane, Francis Engelmann, Tuan Anh Tran, Gerard Pons-Moll: Box2Mask: Weakly Supervised 3D Semantic Instance Segmentation Using Bounding Boxes. ECCV 2022
Mask-Group0.792 351.000 10.968 290.812 340.766 360.864 370.460 510.815 240.888 190.598 510.651 380.639 320.600 490.918 520.941 100.896 120.721 461.000 10.723 32
Min Zhong, Xinghao Chen, Xiaokang Chen, Gang Zeng, Yunhe Wang: MaskGroup: Hierarchical Point Grouping and Masking for 3D Instance Segmentation. ICME 2022
CSC-Pretrained0.791 361.000 10.996 60.829 320.767 350.889 310.600 340.819 230.770 450.594 520.620 450.541 460.700 231.000 10.941 100.889 160.763 351.000 10.526 61
SSTNetpermissive0.789 371.000 10.840 640.888 100.717 510.835 460.717 220.684 500.627 600.724 290.652 370.727 250.600 491.000 10.912 200.822 530.757 381.000 10.691 39
Zhihao Liang, Zhihao Li, Songcen Xu, Mingkui Tan, Kui Jia: Instance Segmentation in 3D Scenes using Semantic Superpoint Tree Networks. ICCV 2021
GICN0.788 381.000 10.978 210.867 180.781 320.833 470.527 470.824 210.806 360.549 600.596 480.551 420.700 231.000 10.853 420.935 20.733 431.000 10.651 42
DENet0.786 391.000 10.929 430.736 560.750 440.720 660.755 190.934 10.794 390.590 530.561 540.537 470.650 321.000 10.882 340.804 620.789 311.000 10.719 33
DANCENET0.786 391.000 10.936 380.783 460.737 470.852 440.742 210.647 520.765 470.811 220.624 440.579 390.632 461.000 10.909 320.898 110.696 510.944 580.601 54
DualGroup0.782 411.000 10.927 440.811 350.772 340.853 430.631 330.805 270.773 420.613 480.611 460.610 370.650 320.835 630.881 350.879 210.750 411.000 10.675 40
PointGroup0.778 421.000 10.900 540.798 420.715 520.863 380.493 490.706 450.895 160.569 580.701 270.576 400.639 441.000 10.880 370.851 430.719 470.997 400.709 36
Li Jiang, Hengshuang Zhao, Shaoshuai Shi, Shu Liu, Chi-Wing Fu, Jiaya Jia: PointGroup: Dual-Set Point Grouping for 3D Instance Segmentation. CVPR 2020 [oral]
PE0.776 431.000 10.900 550.860 200.728 490.869 340.400 580.857 130.774 410.568 590.701 280.602 380.646 410.933 510.843 450.890 150.691 550.997 400.709 35
Biao Zhang, Peter Wonka: Point Cloud Instance Segmentation using Probabilistic Embeddings. CVPR 2021
AOIA0.767 441.000 10.937 370.810 360.740 460.906 240.550 440.800 290.706 520.577 570.624 430.544 450.596 540.857 550.879 390.880 200.750 400.992 500.658 41
DD-UNet+Group0.764 451.000 10.897 570.837 270.753 410.830 490.459 530.824 210.699 540.629 450.653 360.438 570.650 321.000 10.880 370.858 390.690 561.000 10.650 43
H. Liu, R. Liu, K. Yang, J. Zhang, K. Peng, R. Stiefelhagen: HIDA: Towards Holistic Indoor Understanding for the Visually Impaired via Semantic Instance Segmentation with a Wearable Solid-State LiDAR Sensor. ICCVW 2021
INS-Conv-instance0.762 461.000 10.923 470.765 510.785 300.905 250.600 340.655 510.646 590.683 360.647 390.530 480.650 321.000 10.824 470.830 510.693 540.944 580.644 44
Dyco3Dcopyleft0.761 471.000 10.935 390.893 80.752 430.863 390.600 340.588 580.742 490.641 430.633 420.546 440.550 560.857 550.789 560.853 420.762 360.987 520.699 37
Tong He, Chunhua Shen, Anton van den Hengel: DyCo3D: Robust Instance Segmentation of 3D Point Clouds through Dynamic Convolution. CVPR 2021
OccuSeg+instance0.742 481.000 10.923 470.785 440.745 450.867 350.557 400.578 610.729 500.670 400.644 400.488 520.577 551.000 10.794 540.830 510.620 641.000 10.550 57
Lei Han, Tian Zheng, Lan Xu, Lu Fang: OccuSeg: Occupancy-aware 3D Instance Segmentation. CVPR 2020
RWSeg0.739 491.000 10.899 560.759 530.753 420.823 510.282 630.691 480.658 570.582 560.594 490.547 430.628 471.000 10.795 530.868 330.728 451.000 10.692 38
3D-MPA0.737 501.000 10.933 400.785 440.794 290.831 480.279 650.588 580.695 550.616 460.559 550.556 410.650 321.000 10.809 510.875 250.696 521.000 10.608 53
Francis Engelmann, Martin Bokeloh, Alireza Fathi, Bastian Leibe, Matthias Nießner: 3D-MPA: Multi Proposal Aggregation for 3D Semantic Instance Segmentation. CVPR 2020
MTML0.731 511.000 10.992 120.779 500.609 630.746 610.308 620.867 90.601 630.607 490.539 580.519 500.550 561.000 10.824 470.869 320.729 441.000 10.616 49
Jean Lahoud, Bernard Ghanem, Marc Pollefeys, Martin R. Oswald: 3D Instance Segmentation via Multi-task Metric Learning. ICCV 2019 [oral]
OSIS0.725 521.000 10.885 600.653 650.657 600.801 540.576 380.695 470.828 310.698 340.534 590.457 560.500 630.857 550.831 460.841 480.627 621.000 10.619 48
SSEN0.724 531.000 10.926 450.781 490.661 580.845 450.596 370.529 640.764 480.653 420.489 650.461 550.500 630.859 540.765 570.872 280.761 371.000 10.577 55
Dongsu Zhang, Junha Chun, Sang Kyun Cha, Young Min Kim: Spatial Semantic Embedding Network: Fast 3D Instance Segmentation with Deep Metric Learning. arXiv
NeuralBF0.718 541.000 10.945 360.901 70.754 400.817 520.460 510.700 460.772 430.688 350.568 530.000 760.500 630.981 460.606 670.872 270.740 421.000 10.614 50
Weiwei Sun, Daniel Rebain, Renjie Liao, Vladimir Tankovich, Soroosh Yazdani, Kwang Moo Yi, Andrea Tagliasacchi: NeuralBF: Neural Bilateral Filtering for Top-down Instance Segmentation on Point Clouds. WACV 2023
Sparse R-CNN0.714 551.000 10.926 460.694 600.699 560.890 300.636 320.516 650.693 560.743 270.588 500.369 610.601 480.594 690.800 520.886 170.676 570.986 530.546 58
SALoss-ResNet0.695 561.000 10.855 620.579 700.589 650.735 640.484 500.588 580.856 260.634 440.571 520.298 620.500 631.000 10.824 470.818 540.702 500.935 650.545 59
Zhidong Liang, Ming Yang, Hao Li, Chunxiang Wang: 3D Instance Embedding Learning With a Structure-Aware Loss Function for Point Cloud Segmentation. IEEE Robotics and Automation Letters (IROS2020)
PanopticFusion-inst0.693 571.000 10.852 630.655 640.616 620.788 560.334 600.763 350.771 440.457 700.555 560.652 310.518 600.857 550.765 570.732 700.631 600.944 580.577 56
Gaku Narita, Takashi Seno, Tomoya Ishikawa, Yohsuke Kaji: PanopticFusion: Online Volumetric Semantic Mapping at the Level of Stuff and Things. IROS 2019 (to appear)
Occipital-SCS0.688 581.000 10.913 510.730 580.737 480.743 630.442 540.855 150.655 580.546 610.546 570.263 640.508 620.889 530.568 680.771 670.705 490.889 680.625 47
3D-BoNet0.687 591.000 10.887 590.836 290.587 660.643 730.550 440.620 540.724 510.522 650.501 630.243 650.512 611.000 10.751 590.807 600.661 590.909 670.612 52
Bo Yang, Jianan Wang, Ronald Clark, Qingyong Hu, Sen Wang, Andrew Markham, Niki Trigoni: Learning Object Bounding Boxes for 3D Instance Segmentation on Point Clouds. NeurIPS 2019 Spotlight
ClickSeg_Instance0.685 601.000 10.818 660.600 680.715 530.795 550.557 400.533 630.591 650.601 500.519 610.429 590.638 450.938 500.706 620.817 560.624 630.944 580.502 63
PCJC0.684 611.000 10.895 580.757 540.659 590.862 400.189 720.739 410.606 620.712 330.581 510.515 510.650 320.857 550.357 730.785 650.631 610.889 680.635 46
SPG_WSIS0.678 621.000 10.880 610.836 290.701 550.727 650.273 670.607 560.706 530.541 630.515 620.174 680.600 490.857 550.716 610.846 470.711 481.000 10.506 62
One_Thing_One_Clickpermissive0.675 631.000 10.823 650.782 470.621 610.766 580.211 690.736 420.560 670.586 540.522 600.636 350.453 670.641 670.853 420.850 450.694 530.997 400.411 68
Zhengzhe Liu, Xiaojuan Qi, Chi-Wing Fu: One Thing One Click: A Self-Training Approach for Weakly Supervised 3D Semantic Segmentation. CVPR 2021
SegGroup_inspermissive0.637 641.000 10.923 490.593 690.561 670.746 620.143 740.504 660.766 460.485 680.442 660.372 600.530 590.714 640.815 500.775 660.673 581.000 10.431 67
An Tao, Yueqi Duan, Yi Wei, Jiwen Lu, Jie Zhou: SegGroup: Seg-Level Supervision for 3D Instance and Semantic Segmentation. TIP 2022
MASCpermissive0.615 650.711 720.802 670.540 710.757 390.777 570.029 750.577 620.588 660.521 660.600 470.436 580.534 580.697 650.616 660.838 490.526 660.980 550.534 60
Chen Liu, Yasutaka Furukawa: MASC: Multi-scale Affinity with Sparse Convolution for 3D Instance Segmentation.
UNet-backbone0.605 661.000 10.909 520.764 520.603 640.704 670.415 560.301 710.548 680.461 690.394 670.267 630.386 690.857 550.649 650.817 550.504 680.959 560.356 71
3D-SISpermissive0.558 671.000 10.773 680.614 670.503 700.691 690.200 700.412 670.498 710.546 620.311 720.103 720.600 490.857 550.382 700.799 630.445 740.938 640.371 69
Ji Hou, Angela Dai, Matthias Niessner: 3D-SIS: 3D Semantic Instance Segmentation of RGB-D Scans. CVPR 2019
R-PointNet0.544 680.500 750.655 740.661 630.663 570.765 590.432 550.214 740.612 610.584 550.499 640.204 670.286 730.429 720.655 640.650 750.539 650.950 570.499 64
Hier3Dcopyleft0.540 691.000 10.727 690.626 660.467 730.693 680.200 700.412 670.480 720.528 640.318 710.077 750.600 490.688 660.382 700.768 680.472 700.941 620.350 72
Tan: HCFS3D: Hierarchical Coupled Feature Selection Network for 3D Semantic and Instance Segmentation.
Region-18class0.497 700.250 770.902 530.689 610.540 680.747 600.276 660.610 550.268 760.489 670.348 680.000 760.243 760.220 750.663 630.814 570.459 720.928 660.496 65
Sem_Recon_ins0.484 710.764 710.608 760.470 730.521 690.637 740.311 610.218 730.348 750.365 740.223 730.222 660.258 740.629 680.734 600.596 760.509 670.858 720.444 66
tmp0.474 721.000 10.727 690.433 750.481 720.673 710.022 770.380 690.517 700.436 720.338 700.128 700.343 710.429 720.291 750.728 710.473 690.833 730.300 74
SemRegionNet-20cls0.470 731.000 10.727 690.447 740.481 710.678 700.024 760.380 690.518 690.440 710.339 690.128 700.350 700.429 720.212 760.711 720.465 710.833 730.290 75
ASIS0.422 740.333 760.707 720.676 620.401 740.650 720.350 590.177 750.594 640.376 730.202 740.077 740.404 680.571 700.197 770.674 740.447 730.500 760.260 76
3D-BEVIS0.401 750.667 730.687 730.419 760.137 770.587 750.188 730.235 720.359 740.211 760.093 770.080 730.311 720.571 700.382 700.754 690.300 760.874 710.357 70
Cathrin Elich, Francis Engelmann, Jonas Schult, Theodora Kontogianni, Bastian Leibe: 3D-BEVIS: Birds-Eye-View Instance Segmentation.
Sgpn_scannet0.390 760.556 740.636 750.493 720.353 750.539 760.271 680.160 760.450 730.359 750.178 750.146 690.250 750.143 760.347 740.698 730.436 750.667 750.331 73
MaskRCNN 2d->3d Proj0.261 770.903 700.081 770.008 770.233 760.175 770.280 640.106 770.150 770.203 770.175 760.480 530.218 770.143 760.542 690.404 770.153 770.393 770.049 77
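
Each row in the tables above gives a method name followed by one score/rank pair per column, where the score has three decimals and the rank is an integer. As a minimal sketch of how such a row could be split into pairs, assuming that layout holds for every row, the Python below may help; `split_row` and `parse_cells` are hypothetical helper names, not part of any ScanNet tooling, and a license tag such as `permissive` stays attached to the method name.

```python
import re
from typing import List, Tuple

# Assumed cell layout: a score with exactly three decimals, one space,
# then an integer rank. The rank is matched lazily, and the lookahead
# stops it right before the next score (or at the end of the row).
_CELL = re.compile(r"(\d\.\d{3}) (\d+?)(?=\d\.\d{3} |$)")

# The first score/rank pair marks where the method name ends.
_FIRST_CELL = re.compile(r"\d\.\d{3} \d")


def parse_cells(cells: str) -> List[Tuple[float, int]]:
    """Split a run of fused 'score rank' cells into (score, rank) pairs."""
    return [(float(score), int(rank)) for score, rank in _CELL.findall(cells.strip())]


def split_row(row: str) -> Tuple[str, List[Tuple[float, int]]]:
    """Return the method name and the (score, rank) pairs of one table row."""
    match = _FIRST_CELL.search(row)
    if match is None:
        raise ValueError("row contains no score/rank pair")
    name = row[: match.start()].strip()
    return name, parse_cells(row[match.start():])


if __name__ == "__main__":
    # Shortened example row; real rows carry one pair per table column.
    name, pairs = split_row("3D-MPA0.737 501.000 10.933 40")
    print(name, pairs)
```

The lookahead is what separates a rank from the score that follows it: in `0.737 501.000`, the rank can only be `50`, because a score must start with a single digit before the decimal point.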


This table lists the benchmark results for the 2D semantic label scenario.


Method | Info | avg IoU | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | floor | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | wall | window
Virtual MVFusion (R)0.745 10.861 10.839 10.881 10.672 20.512 10.422 170.898 10.723 10.714 10.954 20.454 10.509 10.773 10.895 10.756 10.820 10.653 10.935 10.891 10.728 1
Abhijit Kundu, Xiaoqi Yin, Alireza Fathi, David Ross, Brian Brewington, Thomas Funkhouser, Caroline Pantofaru: Virtual Multi-view Fusion for 3D Semantic Segmentation. ECCV 2020
BPNet_2Dcopyleft0.670 20.822 30.795 30.836 20.659 30.481 20.451 130.769 40.656 30.567 40.931 30.395 60.390 50.700 40.534 40.689 100.770 20.574 30.865 90.831 30.675 5
Wenbo Hu, Hengshuang Zhao, Li Jiang, Jiaya Jia and Tien-Tsin Wong: Bidirectional Projection Network for Cross Dimension Scene Understanding. CVPR 2021 (Oral)
MVF-GNN(2D)0.636 30.606 140.794 40.434 160.688 10.337 80.464 120.798 30.632 50.589 30.908 80.420 20.329 120.743 20.594 20.738 20.676 50.527 40.906 20.818 60.715 3
CU-Hybrid-2D Net0.636 30.825 20.820 20.179 230.648 40.463 30.549 20.742 70.676 20.628 20.961 10.420 20.379 60.684 80.381 180.732 30.723 30.599 20.827 160.851 20.634 7
CMX0.613 50.681 80.725 120.502 120.634 60.297 180.478 100.830 20.651 40.537 70.924 40.375 70.315 140.686 70.451 140.714 50.543 210.504 60.894 70.823 50.688 4
DMMF_3d0.605 60.651 90.744 100.782 30.637 50.387 40.536 30.732 80.590 70.540 60.856 210.359 110.306 150.596 140.539 30.627 200.706 40.497 80.785 210.757 190.476 22
EMSANet0.600 70.716 40.746 90.395 180.614 90.382 50.523 40.713 110.571 110.503 100.922 60.404 50.397 40.655 90.400 160.626 210.663 60.469 130.900 40.827 40.577 14
Daniel Seichter, Söhnke Fischedick, Mona Köhler, Horst-Michael Gross: EMSANet: Efficient Multi-Task RGB-D Scene Analysis for Indoor Environments. IJCNN 2022
MCA-Net0.595 80.533 200.756 80.746 40.590 100.334 100.506 70.670 150.587 80.500 120.905 100.366 100.352 90.601 130.506 80.669 160.648 90.501 70.839 150.769 150.516 21
RFBNet0.592 90.616 110.758 70.659 50.581 110.330 110.469 110.655 180.543 140.524 80.924 40.355 130.336 110.572 170.479 100.671 140.648 90.480 100.814 190.814 70.614 10
FAN_NV_RVC0.586 100.510 210.764 60.079 260.620 80.330 110.494 80.753 50.573 90.556 50.884 160.405 40.303 160.718 30.452 130.672 130.658 70.509 50.898 50.813 80.727 2
DCRedNet0.583 110.682 70.723 130.542 110.510 200.310 150.451 130.668 160.549 130.520 90.920 70.375 70.446 20.528 200.417 150.670 150.577 180.478 110.862 100.806 90.628 9
MIX6D_RVC0.582 120.695 50.687 170.225 210.632 70.328 130.550 10.748 60.623 60.494 150.890 140.350 150.254 230.688 60.454 120.716 40.597 170.489 90.881 80.768 160.575 15
SSMAcopyleft0.577 130.695 50.716 150.439 140.563 140.314 140.444 150.719 90.551 120.503 100.887 150.346 160.348 100.603 120.353 200.709 60.600 150.457 140.901 30.786 110.599 13
Abhinav Valada, Rohit Mohan, Wolfram Burgard: Self-Supervised Model Adaptation for Multimodal Semantic Segmentation. International Journal of Computer Vision, 2019
DMMF0.567 140.623 100.767 50.238 200.571 130.347 60.413 190.719 90.472 200.418 220.895 130.357 120.260 220.696 50.523 70.666 170.642 110.437 180.895 60.793 100.603 12
UNIV_CNP_RVC_UE0.566 150.569 190.686 190.435 150.524 170.294 190.421 180.712 120.543 140.463 170.872 170.320 170.363 80.611 110.477 110.686 110.627 120.443 170.862 100.775 140.639 6
EMSAFormer0.564 160.581 160.736 110.564 100.546 160.219 230.517 50.675 140.486 190.427 210.904 110.352 140.320 130.589 150.528 50.708 70.464 240.413 220.847 140.786 110.611 11
SN_RN152pyrx8_RVCcopyleft0.546 170.572 170.663 210.638 70.518 180.298 170.366 240.633 210.510 170.446 190.864 190.296 200.267 190.542 190.346 210.704 80.575 190.431 190.853 130.766 170.630 8
UDSSEG_RVC0.545 180.610 130.661 220.588 80.556 150.268 210.482 90.642 200.572 100.475 160.836 230.312 180.367 70.630 100.189 230.639 190.495 230.452 150.826 170.756 200.541 17
segfomer with 6d0.542 190.594 150.687 170.146 240.579 120.308 160.515 60.703 130.472 200.498 130.868 180.369 90.282 170.589 150.390 170.701 90.556 200.416 210.860 120.759 180.539 19
FuseNetpermissive0.535 200.570 180.681 200.182 220.512 190.290 200.431 160.659 170.504 180.495 140.903 120.308 190.428 30.523 210.365 190.676 120.621 140.470 120.762 220.779 130.541 17
Caner Hazirbas, Lingni Ma, Csaba Domokos, Daniel Cremers: FuseNet: Incorporating Depth into Semantic Segmentation via Fusion-based CNN Architecture. ACCV 2016
AdapNet++copyleft0.503 210.613 120.722 140.418 170.358 260.337 80.370 230.479 240.443 220.368 240.907 90.207 230.213 250.464 240.525 60.618 220.657 80.450 160.788 200.721 230.408 25
Abhinav Valada, Rohit Mohan, Wolfram Burgard: Self-Supervised Model Adaptation for Multimodal Semantic Segmentation. International Journal of Computer Vision, 2019
3DMV (2d proj)0.498 220.481 240.612 230.579 90.456 220.343 70.384 210.623 220.525 160.381 230.845 220.254 220.264 210.557 180.182 240.581 240.598 160.429 200.760 230.661 250.446 24
Angela Dai, Matthias Niessner: 3DMV: Joint 3D-Multi-View Prediction for 3D Semantic Scene Segmentation. ECCV'18
MSeg1080_RVCpermissive0.485 230.505 220.709 160.092 250.427 230.241 220.411 200.654 190.385 260.457 180.861 200.053 260.279 180.503 220.481 90.645 180.626 130.365 240.748 240.725 220.529 20
John Lambert*, Zhuang Liu*, Ozan Sener, James Hays, Vladlen Koltun: MSeg: A Composite Dataset for Multi-domain Semantic Segmentation. CVPR 2020
ILC-PSPNet0.475 240.490 230.581 240.289 190.507 210.067 260.379 220.610 230.417 240.435 200.822 250.278 210.267 190.503 220.228 220.616 230.533 220.375 230.820 180.729 210.560 16
Enet (reimpl)0.376 250.264 260.452 260.452 130.365 240.181 240.143 260.456 250.409 250.346 250.769 260.164 240.218 240.359 250.123 260.403 260.381 260.313 260.571 250.685 240.472 23
Re-implementation of Adam Paszke, Abhishek Chaurasia, Sangpil Kim, Eugenio Culurciello: ENet: A Deep Neural Network Architecture for Real-Time Semantic Segmentation.
ScanNet (2d proj)permissive0.330 260.293 250.521 250.657 60.361 250.161 250.250 250.004 260.440 230.183 260.836 230.125 250.060 260.319 260.132 250.417 250.412 250.344 250.541 260.427 260.109 26
Angela Dai, Angel X. Chang, Manolis Savva, Maciej Halber, Thomas Funkhouser, Matthias Nießner: ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes. CVPR'17


This table lists the benchmark results for the 2D semantic instance scenario.




Method | Info | avg AP | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | window
EMSANet (Instance) | | 0.241 (1) | 0.401 (1) | 0.439 (1) | 0.085 (1) | 0.242 (1) | 0.220 (1) | 0.081 (1) | 0.289 (2) | 0.117 (2) | 0.121 (1) | 0.182 (1) | 0.126 (1) | 0.346 (1) | 0.181 (2) | 0.181 (2) | 0.358 (1) | 0.156 (1) | 0.675 (2) | 0.131 (1)
Daniel Seichter, Söhnke Fischedick, Mona Köhler, Horst-Michael Gross: EMSANet: Efficient Multi-Task RGB-D Scene Analysis for Indoor Environments. IJCNN 2022
UniDet_RVC | | 0.205 (2) | 0.381 (2) | 0.323 (3) | 0.037 (3) | 0.226 (3) | 0.177 (3) | 0.063 (2) | 0.277 (3) | 0.120 (1) | 0.067 (3) | 0.131 (3) | 0.074 (3) | 0.317 (2) | 0.080 (3) | 0.235 (1) | 0.289 (3) | 0.141 (3) | 0.678 (1) | 0.080 (3)
FKNet | | 0.204 (3) | 0.334 (3) | 0.358 (2) | 0.038 (2) | 0.234 (2) | 0.184 (2) | 0.025 (3) | 0.318 (1) | 0.042 (4) | 0.088 (2) | 0.141 (2) | 0.053 (4) | 0.300 (3) | 0.207 (1) | 0.171 (3) | 0.292 (2) | 0.149 (2) | 0.636 (3) | 0.109 (2)
MaskRCNN_ScanNet | permissive | 0.119 (4) | 0.129 (4) | 0.212 (4) | 0.002 (4) | 0.112 (4) | 0.148 (4) | 0.014 (4) | 0.205 (4) | 0.044 (3) | 0.066 (4) | 0.078 (4) | 0.095 (2) | 0.142 (4) | 0.030 (4) | 0.128 (4) | 0.139 (4) | 0.080 (4) | 0.459 (4) | 0.057 (4)
Re-implementation of Kaiming He, Georgia Gkioxari, Piotr Dollár, Ross Girshick: Mask R-CNN. ICCV'17


This table lists the benchmark results for the scene type classification scenario.




Method | Info | avg recall | apartment | bathroom | bedroom / hotel | bookstore / library | conference room | copy/mail room | hallway | kitchen | laundry room | living room / lounge | misc | office | storage / basement / garage
SE-ResNeXt-SSMA | | 0.498 (4) | 0.000 (5) | 0.812 (4) | 0.941 (2) | 0.500 (3) | 0.500 (4) | 0.500 (3) | 0.500 (2) | 0.429 (5) | 0.500 (2) | 0.667 (3) | 0.500 (1) | 0.625 (4) | 0.000 (3)
Abhinav Valada, Rohit Mohan, Wolfram Burgard: Self-Supervised Model Adaptation for Multimodal Semantic Segmentation. arXiv
LAST-PCL-type | | 0.780 (1) | 0.250 (3) | 1.000 (1) | 1.000 (1) | 1.000 (1) | 1.000 (1) | 1.000 (1) | 0.500 (2) | 1.000 (1) | 0.500 (2) | 0.889 (1) | 0.000 (2) | 1.000 (1) | 1.000 (1)
Yanmin Wu, Qiankun Gao, Renrui Zhang, and Jian Zhang: Language-Assisted 3D Scene Understanding. arXiv 23.12
3DASPP-SCE | | 0.691 (3) | 0.500 (1) | 0.938 (3) | 0.824 (4) | 1.000 (1) | 1.000 (1) | 0.500 (3) | 1.000 (1) | 0.857 (3) | 0.500 (2) | 0.556 (4) | 0.000 (2) | 0.812 (3) | 0.500 (2)
multi-task | permissive | 0.700 (2) | 0.500 (1) | 1.000 (1) | 0.882 (3) | 0.500 (3) | 1.000 (1) | 1.000 (1) | 0.500 (2) | 1.000 (1) | 1.000 (1) | 0.778 (2) | 0.000 (2) | 0.938 (2) | 0.000 (3)
Shengyu Huang, Mikhail Usvyatsov, Konrad Schindler: Indoor Scene Recognition in 3D. IROS 2020
resnet50_scannet | | 0.353 (5) | 0.250 (3) | 0.812 (4) | 0.529 (5) | 0.500 (3) | 0.500 (4) | 0.000 (5) | 0.500 (2) | 0.571 (4) | 0.000 (5) | 0.556 (4) | 0.000 (2) | 0.375 (5) | 0.000 (3)