Presenting the ScanNet200 Benchmark

We present the ScanNet200 benchmark, which studies an order of magnitude more class categories than the previous version of ScanNet. The scene geometry is shared between ScanNet and ScanNet200, but the finer parsing of the surface annotations allows for a larger vocabulary and a more realistic setting for in-the-wild 3D understanding methods.

The ScanNet200 benchmark includes both finer-grained categories and a large number of previously unaddressed classes. This makes for a much more challenging setting: it covers the diversity of semantic classes naturally observed in the raw ScanNet RGB-D data, and the data also reflects naturally encountered class imbalances. The difference in category frequencies between ScanNet and ScanNet200 can be seen in the figure above.
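The results below report an average IoU alongside separate head, common and tail IoU columns. As a minimal sketch of how such grouped scores could be computed, the following assumes that classes are ranked by how often they occur and then cut into consecutive groups; the exact split used by the benchmark is not stated here, and the function names, the `frequency` input and the group sizes are hypothetical.

```python
def per_class_iou(pred, gt, num_classes):
    """Per-class IoU = TP / (TP + FP + FN) over paired label sequences."""
    tp = [0] * num_classes
    fp = [0] * num_classes
    fn = [0] * num_classes
    for p, g in zip(pred, gt):
        if p == g:
            tp[p] += 1
        else:
            fp[p] += 1  # predicted class p where it is not present
            fn[g] += 1  # missed the true class g
    ious = []
    for c in range(num_classes):
        denom = tp[c] + fp[c] + fn[c]
        # Classes that never occur in pred or gt get no score.
        ious.append(tp[c] / denom if denom else None)
    return ious


def grouped_mean_iou(ious, frequency, group_sizes):
    """Mean IoU per frequency group (assumed grouping: most frequent first)."""
    order = sorted(range(len(ious)), key=lambda c: -frequency[c])
    means, start = [], 0
    for size in group_sizes:
        vals = [ious[c] for c in order[start:start + size] if ious[c] is not None]
        means.append(sum(vals) / len(vals) if vals else None)
        start += size
    return means


# Toy example with three classes and one class per group.
pred = [0, 0, 1, 1, 2, 2]
gt = [0, 0, 1, 2, 2, 2]
ious = per_class_iou(pred, gt, num_classes=3)
frequency = [gt.count(c) for c in range(3)]
head, common, tail = grouped_mean_iou(ious, frequency, group_sizes=(1, 1, 1))
```

Reporting the groups separately keeps rare classes from being hidden behind the frequent ones, which is the point of the imbalance discussed above.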

ScanNet200 Benchmark

This table lists the benchmark results for the ScanNet200 3D semantic label scenario. In each method row, every score is immediately followed by the method's rank in that column.




Method | Info | avg iou | head iou | common iou | tail iou | wall | chair | floor | table | door | couch | cabinet | shelf | desk | office chair | bed | pillow | sink | picture | window | toilet | bookshelf | monitor | curtain | book | armchair | coffee table | box | refrigerator | lamp | kitchen cabinet | towel | clothes | tv | nightstand | counter | dresser | stool | cushion | plant | ceiling | bathtub | end table | dining table | keyboard | bag | backpack | toilet paper | printer | tv stand | whiteboard | blanket | shower curtain | trash can | closet | stairs | microwave | stove | shoe | computer tower | bottle | bin | ottoman | bench | board | washing machine | mirror | copier | basket | sofa chair | file cabinet | fan | laptop | shower | paper | person | paper towel dispenser | oven | blinds | rack | plate | blackboard | piano | suitcase | rail | radiator | recycling bin | container | wardrobe | soap dispenser | telephone | bucket | clock | stand | light | laundry basket | pipe | clothes dryer | guitar | toilet paper holder | seat | speaker | column | bicycle | ladder | bathroom stall | shower wall | cup | jacket | storage bin | coffee maker | dishwasher | paper towel roll | machine | mat | windowsill | bar | toaster | bulletin board | ironing board | fireplace | soap dish | kitchen counter | doorframe | toilet paper dispenser | mini fridge | fire extinguisher | ball | hat | shower curtain rod | water cooler | paper cutter | tray | shower door | pillar | ledge | toaster oven | mouse | toilet seat cover dispenser | furniture | cart | storage container | scale | tissue box | light switch | crate | power outlet | decoration | sign | projector | closet door | vacuum cleaner | candle | plunger | stuffed animal | headphones | dish rack | broom | guitar case | range hood | dustpan | hair dryer | water bottle | handicap bar | purse | vent | shower floor | water pitcher | mailbox | bowl | paper bag | alarm clock | music stand | projector screen | divider | laundry detergent | bathroom counter | object | bathroom vanity | closet wall | laundry hamper | bathroom stall door | ceiling light | trash bin | dumbbell | stair rail | tube | bathroom cabinet | cd case | closet rod | coffee kettle | structure | shower head | keyboard piano | case of water bottles | coat rack | storage organizer | folded chair | fire alarm | power strip | calendar | poster | potted plant | luggage | mattress
ALS-MinkowskiNetcopyleft0.414 20.610 20.322 30.271 20.852 10.710 20.973 10.572 40.719 30.795 20.477 60.506 20.601 30.000 140.804 50.646 30.804 20.344 20.777 10.984 10.671 10.879 20.936 10.342 50.632 70.449 40.817 30.475 100.723 20.798 10.376 80.832 20.693 10.031 90.564 10.510 130.000 10.893 30.905 10.672 160.314 10.000 70.718 10.153 30.542 20.397 30.726 30.752 80.252 80.226 20.916 20.800 10.047 160.807 30.769 10.709 30.630 30.769 10.217 100.000 30.285 10.598 40.846 100.535 10.956 40.000 70.137 110.784 20.464 70.463 130.230 120.000 10.598 30.662 90.000 40.087 20.000 10.135 30.900 20.780 110.703 20.741 10.571 20.149 90.697 70.646 20.000 30.076 20.000 10.025 110.000 40.106 60.981 10.000 10.043 70.113 40.888 20.248 150.404 40.252 60.314 10.220 70.245 20.466 70.366 20.159 20.000 40.149 80.690 20.000 30.531 50.253 30.285 60.460 10.440 50.813 10.230 30.283 60.159 110.000 10.728 10.666 50.958 10.000 10.021 50.252 80.118 50.000 70.445 30.223 100.285 10.194 30.390 20.000 10.475 40.842 70.000 10.455 30.000 10.250 70.458 80.000 10.865 10.000 10.000 10.635 10.359 50.972 10.087 30.447 10.000 10.000 90.000 10.129 20.532 60.446 80.503 50.071 130.135 120.699 40.717 20.097 20.000 10.665 10.000 20.000 21.000 10.752 60.000 30.000 10.000 10.142 90.200 10.259 11.000 10.000 1
Guangda Ji, Silvan Weder, Francis Engelmann, Marc Pollefeys, Hermann Blum: ARKit LabelMaker: A New Scale for Indoor 3D Scene Understanding. arXiv
DITR0.449 10.629 10.392 10.289 10.851 20.727 10.969 40.600 10.741 20.805 10.519 10.480 30.636 10.014 100.867 10.680 10.849 10.318 30.753 20.982 20.508 120.871 60.934 20.482 10.596 110.551 20.804 40.508 60.729 10.718 20.417 40.886 10.664 30.000 170.500 20.698 10.000 10.913 10.901 30.766 70.113 120.000 70.617 50.168 20.650 10.477 10.826 10.962 10.348 30.300 10.947 10.776 20.160 30.889 10.651 50.720 20.700 10.728 30.317 10.000 30.238 50.664 10.869 40.514 20.998 10.313 30.138 100.815 10.828 10.622 20.421 50.000 10.823 10.817 10.000 40.000 90.000 10.157 20.866 30.991 10.805 10.660 40.571 20.043 120.709 60.642 30.000 30.000 70.000 10.028 100.018 30.134 30.967 20.000 10.150 20.130 20.949 10.855 10.580 10.262 50.314 10.230 50.222 40.498 50.367 10.153 30.869 10.334 20.397 80.000 30.904 10.486 21.000 10.423 40.484 10.632 60.716 10.733 20.862 10.000 10.433 140.710 10.851 20.000 10.034 40.315 30.385 10.000 70.001 90.268 90.066 110.000 80.278 40.000 10.978 10.839 80.000 10.448 40.000 10.579 10.403 120.000 10.647 30.000 10.000 10.411 30.315 60.904 70.420 10.392 20.000 10.091 60.000 10.128 30.564 30.591 30.568 20.079 90.139 91.000 10.714 30.178 10.000 10.606 30.000 20.000 20.148 60.983 10.000 30.000 10.000 10.374 20.000 70.000 30.662 40.000 1
Karim Abou Zeid, Kadir Yilmaz, Daan de Geus, Alexander Hermans, David Adrian, Timm Linder, Bastian Leibe: DINO in the Room: Leveraging 2D Foundation Models for 3D Segmentation.
PTv3 ScanNet2000.393 30.592 30.330 20.216 30.851 20.687 60.971 20.586 20.755 10.752 70.505 20.404 70.575 50.000 140.848 20.616 40.761 30.349 10.738 30.978 30.546 60.860 80.926 30.346 40.654 30.384 70.828 10.523 40.699 30.583 60.387 70.822 30.688 20.118 40.474 30.603 50.000 10.832 80.903 20.753 90.140 100.000 70.650 30.109 50.520 30.457 20.497 100.871 40.281 40.192 50.887 40.748 30.168 20.727 70.733 20.740 10.644 20.714 50.190 130.000 30.256 30.449 100.914 10.514 20.759 150.337 10.172 60.692 70.617 30.636 10.325 70.000 10.641 20.782 20.000 40.065 30.000 10.000 60.842 40.903 20.661 40.662 30.612 10.405 20.731 40.566 40.000 30.000 70.000 10.017 150.301 10.088 70.941 30.000 10.077 40.000 100.717 80.790 20.310 120.026 170.264 40.349 10.220 50.397 120.366 20.115 130.000 40.337 10.463 60.000 30.531 50.218 40.593 20.455 20.469 20.708 30.210 40.592 40.108 160.000 10.728 10.682 30.671 80.000 10.000 110.407 10.136 40.022 30.575 10.436 40.259 30.428 10.048 60.000 10.000 50.879 50.000 10.480 20.000 10.133 90.597 20.000 10.690 20.000 10.000 10.009 160.000 150.921 30.000 90.151 50.000 10.000 90.000 10.109 80.494 110.622 20.394 90.073 120.141 70.798 20.528 80.026 50.000 10.551 50.000 20.000 20.134 70.717 80.000 30.000 10.000 10.188 40.000 70.000 30.791 30.000 1
Xiaoyang Wu, Li Jiang, Peng-Shuai Wang, Zhijian Liu, Xihui Liu, Yu Qiao, Wanli Ouyang, Tong He, Hengshuang Zhao: Point Transformer V3: Simpler, Faster, Stronger. CVPR 2024 (Oral)
OA-CNN-L_ScanNet2000.333 110.558 50.269 90.124 130.821 50.703 30.946 60.569 50.662 40.748 90.487 30.455 40.572 70.000 140.789 90.534 90.736 90.271 80.713 40.949 60.498 140.877 30.860 110.332 70.706 10.474 30.788 70.406 130.637 60.495 110.355 110.805 70.592 120.015 130.396 80.602 60.000 10.799 110.876 70.713 130.276 20.000 70.493 130.080 90.448 140.363 50.661 40.833 60.262 60.125 70.823 120.665 90.076 90.720 80.557 100.637 90.517 90.672 100.227 80.000 30.158 120.496 80.843 110.352 100.835 130.000 70.103 140.711 50.527 40.526 60.320 80.000 10.568 60.625 110.067 10.000 90.000 10.001 50.806 60.836 70.621 100.591 80.373 80.314 50.668 100.398 90.003 20.000 70.000 10.016 160.024 20.043 130.906 60.000 10.052 60.000 100.384 120.330 120.342 80.100 120.223 70.183 130.112 70.476 60.313 70.130 90.196 30.112 120.370 110.000 30.234 120.071 90.160 70.403 60.398 130.492 140.197 60.076 130.272 50.000 10.200 160.560 100.735 70.000 10.000 110.000 120.110 80.002 60.021 80.412 50.000 120.000 80.000 110.000 10.000 50.794 110.000 10.445 50.000 10.022 100.509 70.000 10.517 130.000 10.000 10.001 170.245 70.915 50.024 60.089 70.000 10.262 30.000 10.103 110.524 70.392 110.515 40.013 170.251 40.411 130.662 40.001 110.000 10.473 120.000 20.000 20.150 50.699 90.000 30.000 10.000 10.166 60.000 70.024 20.000 110.000 1
PPT-SpUNet-F.T.0.332 120.556 60.270 70.123 140.816 60.682 90.946 60.549 100.657 80.756 50.459 70.376 90.550 110.001 120.807 40.616 40.727 120.267 90.691 50.942 110.530 90.872 50.874 80.330 80.542 140.374 80.792 50.400 140.673 40.572 70.433 20.793 90.623 70.008 160.351 100.594 80.000 10.783 130.876 70.833 40.213 60.000 70.537 80.091 70.519 40.304 80.620 80.942 20.264 50.124 80.855 70.695 50.086 80.646 100.506 160.658 70.535 60.715 40.314 20.000 30.241 40.608 30.897 20.359 80.858 110.000 70.076 170.611 110.392 120.509 70.378 60.000 10.579 40.565 150.000 40.000 90.000 10.000 60.755 70.806 90.661 40.572 130.350 90.181 70.660 120.300 140.000 30.000 70.000 10.023 120.000 40.042 140.930 40.000 10.000 100.077 70.584 90.392 100.339 90.185 100.171 120.308 20.006 130.563 30.256 80.150 40.000 40.002 160.345 120.000 30.045 140.197 50.063 110.323 110.453 40.600 80.163 110.037 150.349 40.000 10.672 30.679 40.753 50.000 10.000 110.000 120.117 60.000 70.000 100.291 80.000 120.000 80.039 70.000 10.000 50.899 20.000 10.374 110.000 10.000 120.545 50.000 10.634 50.000 10.000 10.074 130.223 80.914 60.000 90.021 90.000 10.000 90.000 10.112 60.498 100.649 10.383 100.095 20.135 120.449 110.432 120.008 90.000 10.518 70.000 20.000 20.000 110.796 50.000 30.000 10.000 10.138 130.000 70.000 30.000 110.000 1
Xiaoyang Wu, Zhuotao Tian, Xin Wen, Bohao Peng, Xihui Liu, Kaicheng Yu, Hengshuang Zhao: Towards Large-scale 3D Representation Learning with Multi-dataset Point Prompt Training. CVPR 2024
ODIN - Sem200permissive0.368 40.562 40.297 40.207 40.800 100.669 130.940 100.575 30.654 90.749 80.487 30.589 10.609 20.001 120.769 120.561 80.752 60.274 50.682 60.926 130.554 40.833 140.921 40.389 20.599 100.591 10.787 80.550 20.657 50.610 40.334 130.803 80.661 40.090 60.408 70.373 150.000 10.912 20.796 170.501 170.169 80.000 70.641 40.196 10.380 170.397 30.641 50.740 90.862 10.213 30.857 60.685 70.216 10.578 160.557 100.685 50.523 80.581 160.312 30.000 30.065 150.000 170.871 30.359 80.988 20.321 20.090 160.704 60.631 20.393 150.246 110.000 10.482 80.565 150.000 40.000 90.000 10.181 10.913 10.468 160.632 80.642 50.259 110.000 170.832 10.663 10.000 30.081 10.000 10.048 20.000 40.376 10.898 70.000 10.157 10.000 100.870 30.000 170.400 50.265 40.242 50.227 60.539 10.370 140.214 130.129 100.000 40.131 100.054 170.000 30.358 90.491 10.462 40.434 30.346 150.454 150.316 20.814 10.828 20.000 10.000 170.220 170.612 110.000 10.000 110.373 20.378 20.000 70.429 40.152 110.077 90.166 40.202 50.000 10.000 50.441 140.000 10.440 60.000 10.000 120.655 10.000 10.626 70.000 10.000 10.228 90.487 10.784 160.000 90.301 30.000 10.426 20.000 10.108 90.460 130.590 40.775 10.088 60.119 150.485 90.791 10.000 120.000 10.256 170.000 20.000 20.000 110.885 30.303 10.000 10.000 10.127 160.000 70.000 30.894 20.000 1
Ayush Jain, Pushkal Katara, Nikolaos Gkanatsios, Adam W. Harley, Gabriel Sarch, Kriti Aggarwal, Vishrav Chaudhary, Katerina Fragkiadaki: ODIN: A Single Model for 2D and 3D Segmentation. CVPR 2024
PonderV2 ScanNet2000.346 60.552 80.270 80.175 90.810 70.682 90.950 50.560 70.641 100.761 30.398 130.357 100.570 80.113 20.804 50.603 60.750 70.283 40.681 70.952 50.548 50.874 40.852 130.290 120.700 20.356 110.792 50.445 120.545 130.436 120.351 120.787 100.611 80.050 80.290 140.519 120.000 10.825 100.888 50.842 30.259 30.100 20.558 70.070 120.497 70.247 140.457 110.889 30.248 90.106 100.817 130.691 60.094 70.729 60.636 60.620 120.503 110.660 130.243 70.000 30.212 70.590 50.860 80.400 50.881 90.000 70.202 20.622 100.408 110.499 80.261 100.000 10.385 100.636 100.000 40.000 90.000 10.000 60.433 160.843 60.660 60.574 120.481 40.336 40.677 90.486 60.000 30.030 30.000 10.034 60.000 40.080 80.869 100.000 10.000 100.000 100.540 100.727 30.232 170.115 110.186 100.193 90.000 140.403 110.326 60.103 140.000 40.290 40.392 90.000 30.346 100.062 100.424 50.375 70.431 60.667 40.115 140.082 120.239 70.000 10.504 120.606 80.584 120.000 10.002 90.186 100.104 100.000 70.394 50.384 60.083 80.000 80.007 90.000 10.000 50.880 40.000 10.377 100.000 10.263 60.565 30.000 10.608 90.000 10.000 10.304 70.009 110.924 20.000 90.000 110.000 10.000 90.000 10.128 30.584 20.475 70.412 80.076 110.269 30.621 60.509 90.010 70.000 10.491 110.063 10.000 20.472 40.880 40.000 30.000 10.000 10.179 50.125 20.000 30.441 100.000 1
Haoyi Zhu, Honghui Yang, Xiaoyang Wu, Di Huang, Sha Zhang, Xianglong He, Tong He, Hengshuang Zhao, Chunhua Shen, Yu Qiao, Wanli Ouyang: PonderV2: Pave the Way for 3D Foundataion Model with A Universal Pre-training Paradigm.
OctFormer ScanNet200permissive0.326 130.539 100.265 100.131 120.806 80.670 120.943 90.535 120.662 40.705 160.423 90.407 60.505 130.003 110.765 130.582 70.686 150.227 160.680 80.943 100.601 20.854 100.892 60.335 60.417 170.357 100.724 100.453 110.632 70.596 50.432 30.783 110.512 160.021 120.244 150.637 20.000 10.787 120.873 90.743 110.000 170.000 70.534 90.110 40.499 60.289 100.626 70.620 120.168 150.204 40.849 100.679 80.117 50.633 110.684 30.650 80.552 50.684 90.312 30.000 30.175 110.429 110.865 50.413 40.837 120.000 70.145 80.626 90.451 80.487 110.513 30.000 10.529 70.613 120.000 40.033 60.000 10.000 60.828 50.871 30.622 90.587 90.411 70.137 100.645 140.343 120.000 30.000 70.000 10.022 130.000 40.026 170.829 110.000 10.022 80.089 60.842 40.253 140.318 110.296 20.178 110.291 30.224 30.584 20.200 140.132 80.000 40.128 110.227 130.000 30.230 130.047 110.149 80.331 100.412 90.618 70.164 100.102 110.522 30.000 10.655 40.378 120.469 150.000 10.000 110.000 120.105 90.000 70.000 100.483 30.000 120.000 80.028 80.000 10.000 50.906 10.000 10.339 150.000 10.000 120.457 90.000 10.612 80.000 10.000 10.408 40.000 150.900 100.000 90.000 110.000 10.029 80.000 10.074 150.455 150.479 60.427 70.079 90.140 80.496 80.414 140.022 60.000 10.471 130.000 20.000 20.000 110.722 70.000 30.000 10.000 10.138 130.000 70.000 30.000 110.000 1
Peng-Shuai Wang: OctFormer: Octree-based Transformers for 3D Point Clouds. SIGGRAPH 2023
L3DETR-ScanNet_2000.336 80.533 110.279 60.155 100.801 90.689 40.946 60.539 110.660 70.759 40.380 140.333 140.583 40.000 140.788 100.529 100.740 80.261 120.679 90.940 120.525 100.860 80.883 70.226 130.613 90.397 60.720 110.512 50.565 120.620 30.417 40.775 130.629 60.158 20.298 120.579 110.000 10.835 60.883 60.927 10.114 110.079 40.511 100.073 110.508 50.312 60.629 60.861 50.192 140.098 130.908 30.636 110.032 170.563 170.514 150.664 60.505 100.697 70.225 90.000 30.264 20.411 120.860 80.321 130.960 30.058 60.109 130.776 30.526 50.557 30.303 90.000 10.339 120.712 70.000 40.014 70.000 10.000 60.638 120.856 40.641 70.579 110.107 170.119 110.661 110.416 70.000 30.000 70.000 10.007 170.000 40.067 100.910 50.000 10.000 100.000 100.463 110.448 80.294 140.324 10.293 30.211 80.108 80.448 80.068 170.141 60.000 40.330 30.699 10.000 30.256 110.192 60.000 150.355 80.418 70.209 170.146 120.679 30.101 170.000 10.503 130.687 20.671 80.000 10.000 110.174 110.117 60.000 70.122 70.515 20.104 60.259 20.312 30.000 10.000 50.765 120.000 10.369 120.000 10.183 80.422 110.000 10.646 40.000 10.000 10.565 20.001 140.125 170.010 70.002 100.000 10.487 10.000 10.075 140.548 40.420 90.233 140.082 80.138 110.430 120.427 130.000 120.000 10.549 60.000 20.000 20.074 80.409 160.000 30.000 10.000 10.152 70.051 30.000 30.598 60.000 1
Yanmin Wu, Qiankun Gao, Renrui Zhang, Jian Zhang: Language-Assisted 3D Scene Understanding. arXiv 23.12
BFANet ScanNet200permissive0.360 50.553 70.293 50.193 50.827 40.689 40.970 30.528 130.661 60.753 60.436 80.378 80.469 150.042 70.810 30.654 20.760 40.266 100.659 100.973 40.574 30.849 110.897 50.382 30.546 130.372 90.698 140.491 90.617 100.526 100.436 10.764 140.476 170.101 50.409 60.585 100.000 10.835 60.901 30.810 50.102 140.000 70.688 20.096 60.483 100.264 120.612 90.591 160.358 20.161 60.863 50.707 40.128 40.814 20.669 40.629 100.563 40.651 140.258 50.000 30.194 100.494 90.806 120.394 60.953 50.000 70.233 10.757 40.508 60.556 40.476 40.000 10.573 50.741 60.000 40.000 90.000 10.000 60.000 170.852 50.678 30.616 60.460 50.338 30.710 50.534 50.000 30.025 40.000 10.043 30.000 40.056 120.493 170.000 10.000 100.109 50.785 70.590 60.298 130.282 30.143 130.262 40.053 110.526 40.337 50.215 10.000 40.135 90.510 40.000 30.596 40.043 140.511 30.321 120.459 30.772 20.124 130.060 140.266 60.000 10.574 90.568 90.653 100.000 10.093 10.298 40.239 30.000 70.516 20.129 140.284 20.000 80.431 10.000 10.000 50.848 60.000 10.492 10.000 10.376 30.522 60.000 10.469 170.000 10.000 10.330 60.151 100.875 140.000 90.254 40.000 10.000 90.000 10.088 130.661 10.481 50.255 120.105 10.139 90.666 50.641 50.000 120.000 10.614 20.000 20.000 20.000 110.921 20.000 30.000 10.000 10.497 10.000 70.000 30.000 110.000 1
Weiguang Zhao, Rui Zhang, Qiufeng Wang, Guangliang Cheng, Kaizhu Huang: BFANet: Revisiting 3D Semantic Segmentation with Boundary Feature Analysis. CVPR 2025
CeCo0.340 70.551 90.247 130.181 60.784 130.661 140.939 130.564 60.624 130.721 120.484 50.429 50.575 50.027 80.774 110.503 140.753 50.242 130.656 110.945 90.534 70.865 70.860 110.177 170.616 80.400 50.818 20.579 10.615 110.367 140.408 60.726 150.633 50.162 10.360 90.619 30.000 10.828 90.873 90.924 20.109 130.083 30.564 60.057 150.475 120.266 110.781 20.767 70.257 70.100 110.825 110.663 100.048 150.620 130.551 120.595 130.532 70.692 80.246 60.000 30.213 60.615 20.861 70.376 70.900 80.000 70.102 150.660 80.321 150.547 50.226 130.000 10.311 130.742 50.011 30.006 80.000 10.000 60.546 150.824 80.345 140.665 20.450 60.435 10.683 80.411 80.338 10.000 70.000 10.030 90.000 40.068 90.892 80.000 10.063 50.000 100.257 130.304 130.387 60.079 140.228 60.190 110.000 140.586 10.347 40.133 70.000 40.037 130.377 100.000 30.384 80.006 160.003 130.421 50.410 100.643 50.171 90.121 90.142 120.000 10.510 110.447 110.474 140.000 10.000 110.286 50.083 110.000 70.000 100.603 10.096 70.063 50.000 110.000 10.000 50.898 30.000 10.429 70.000 10.400 20.550 40.000 10.633 60.000 10.000 10.377 50.000 150.916 40.000 90.000 110.000 10.000 90.000 10.102 120.499 90.296 140.463 60.089 50.304 10.740 30.401 160.010 70.000 10.560 40.000 20.000 20.709 20.652 100.000 30.000 10.000 10.143 80.000 70.000 30.609 50.000 1
Zhisheng Zhong, Jiequan Cui, Yibo Yang, Xiaoyang Wu, Xiaojuan Qi, Xiangyu Zhang, Jiaya Jia: Understanding Imbalanced Semantic Segmentation Through Neural Collapse. CVPR 2023
GSTran0.334 100.533 120.250 120.179 80.799 110.684 70.940 100.554 90.633 110.741 100.405 110.337 120.560 90.060 50.794 80.517 130.732 110.274 50.647 120.948 70.459 160.849 110.864 90.306 90.648 50.282 140.717 120.496 70.624 90.533 80.363 90.821 40.573 140.009 150.411 40.593 90.000 10.841 50.873 90.704 140.242 50.000 70.495 110.041 160.487 80.304 80.439 130.613 130.133 170.055 160.853 80.634 120.075 120.791 50.601 90.574 160.483 130.669 110.217 100.000 30.198 80.518 60.782 140.345 110.914 60.273 50.193 30.598 140.440 90.499 80.570 10.000 10.381 110.775 40.000 40.063 50.000 10.000 60.712 80.752 130.507 120.512 160.158 160.036 130.773 20.361 110.000 30.000 70.000 10.032 70.000 40.032 150.651 150.000 10.000 100.000 100.831 50.595 40.273 160.229 70.200 90.191 100.000 140.425 90.233 120.125 110.000 40.279 50.213 150.003 10.608 30.044 120.138 90.321 120.408 110.593 100.198 50.205 80.139 130.000 10.614 70.609 70.838 40.000 10.014 60.260 60.080 120.010 50.000 100.136 130.136 40.047 60.000 110.000 10.787 30.797 100.000 10.354 140.000 10.372 40.357 140.000 10.507 160.000 10.000 10.121 110.423 30.903 80.028 40.089 70.000 10.252 40.000 10.072 170.465 120.340 120.189 160.020 160.011 160.320 160.606 70.060 30.000 10.496 90.000 20.000 20.070 90.618 130.000 30.000 10.000 10.139 110.047 40.000 30.558 80.000 1
IMFSegNet0.334 90.532 130.251 110.179 70.799 110.683 80.940 100.555 80.631 120.740 110.406 100.336 130.560 90.062 40.795 70.518 120.733 100.274 50.646 130.947 80.458 170.848 130.862 100.305 100.649 40.284 130.713 130.495 80.626 80.527 90.363 90.820 50.574 130.010 140.411 40.597 70.000 10.842 40.873 90.704 140.246 40.000 70.495 110.041 160.486 90.305 70.444 120.604 150.134 160.055 160.852 90.633 130.076 90.792 40.612 80.573 170.484 120.668 120.216 120.000 30.197 90.518 60.784 130.344 120.908 70.283 40.190 40.599 130.439 100.496 100.569 20.000 10.392 90.776 30.000 40.064 40.000 10.000 60.710 90.756 120.508 110.512 160.159 150.034 140.773 20.363 100.000 30.000 70.000 10.032 70.000 40.029 160.648 160.000 10.000 100.000 100.830 60.595 40.274 150.228 80.206 80.188 120.000 140.425 90.237 110.123 120.000 40.277 60.214 140.003 10.610 20.044 120.124 100.320 140.408 110.594 90.196 70.213 70.139 130.000 10.615 60.618 60.839 30.000 10.014 60.260 60.080 120.025 20.000 100.139 120.135 50.035 70.000 110.000 10.793 20.799 90.000 10.357 130.000 10.369 50.359 130.000 10.512 150.000 10.000 10.120 120.424 20.903 80.027 50.091 60.000 10.245 50.000 10.073 160.457 140.340 120.191 150.021 150.009 170.322 150.608 60.060 30.000 10.494 100.000 20.000 20.068 100.624 110.000 30.000 10.000 10.139 110.047 40.000 30.561 70.000 1
AWCS0.305 140.508 140.225 140.142 110.782 140.634 170.937 140.489 150.578 140.721 120.364 150.355 110.515 120.023 90.764 140.523 110.707 140.264 110.633 140.922 140.507 130.886 10.804 150.179 150.436 160.300 120.656 160.529 30.501 150.394 130.296 160.820 50.603 90.131 30.179 170.619 30.000 10.707 160.865 130.773 60.171 70.010 60.484 140.063 130.463 130.254 130.332 160.649 110.220 110.100 110.729 150.613 150.071 130.582 140.628 70.702 40.424 150.749 20.137 150.000 30.142 130.360 130.863 60.305 140.877 100.000 70.173 50.606 120.337 140.478 120.154 150.000 10.253 140.664 80.000 40.000 90.000 10.000 60.626 130.782 100.302 160.602 70.185 130.282 60.651 130.317 130.000 30.000 70.000 10.022 130.000 40.154 20.876 90.000 10.014 90.063 90.029 170.553 70.467 30.084 130.124 140.157 160.049 120.373 130.252 90.097 150.000 40.219 70.542 30.000 30.392 70.172 80.000 150.339 90.417 80.533 130.093 150.115 100.195 90.000 10.516 100.288 150.741 60.000 10.001 100.233 90.056 140.000 70.159 60.334 70.077 90.000 80.000 110.000 10.000 50.749 130.000 10.411 80.000 10.008 110.452 100.000 10.595 100.000 10.000 10.220 100.006 120.894 120.006 80.000 110.000 10.000 90.000 10.112 60.504 80.404 100.551 30.093 40.129 140.484 100.381 170.000 120.000 10.396 140.000 20.000 20.620 30.402 170.000 30.000 10.000 10.142 90.000 70.000 30.512 90.000 1
: Long-Tailed 3D Semantic Segmentation with Adaptive Weight Constraint and Sampling. ICRA 2024
LGroundpermissive0.272 150.485 150.184 150.106 150.778 150.676 110.932 150.479 170.572 150.718 140.399 120.265 150.453 160.085 30.745 150.446 150.726 130.232 150.622 150.901 150.512 110.826 150.786 160.178 160.549 120.277 150.659 150.381 150.518 140.295 170.323 140.777 120.599 100.028 100.321 110.363 160.000 10.708 150.858 140.746 100.063 150.022 50.457 150.077 100.476 110.243 150.402 140.397 170.233 100.077 150.720 170.610 160.103 60.629 120.437 170.626 110.446 140.702 60.190 130.005 10.058 160.322 140.702 160.244 150.768 140.000 70.134 120.552 150.279 160.395 140.147 160.000 10.207 150.612 130.000 40.000 90.000 10.000 60.658 110.566 140.323 150.525 150.229 120.179 80.467 170.154 160.000 30.002 50.000 10.051 10.000 40.127 40.703 120.000 10.000 100.216 10.112 160.358 110.547 20.187 90.092 160.156 170.055 100.296 150.252 90.143 50.000 40.014 140.398 70.000 30.028 160.173 70.000 150.265 160.348 140.415 160.179 80.019 160.218 80.000 10.597 80.274 160.565 130.000 10.012 80.000 120.039 160.022 30.000 100.117 150.000 120.000 80.000 110.000 10.000 50.324 160.000 10.384 90.000 10.000 120.251 170.000 10.566 110.000 10.000 10.066 140.404 40.886 130.199 20.000 110.000 10.059 70.000 10.136 10.540 50.127 170.295 110.085 70.143 60.514 70.413 150.000 120.000 10.498 80.000 20.000 20.000 110.623 120.000 30.000 10.000 10.132 150.000 70.000 30.000 110.000 1
David Rozenberszki, Or Litany, Angela Dai: Language-Grounded Indoor 3D Semantic Segmentation in the Wild. arXiv
Minkowski 34Dpermissive0.253 160.463 160.154 170.102 160.771 160.650 160.932 150.483 160.571 160.710 150.331 160.250 160.492 140.044 60.703 160.419 170.606 170.227 160.621 160.865 170.531 80.771 170.813 140.291 110.484 150.242 160.612 170.282 170.440 170.351 150.299 150.622 160.593 110.027 110.293 130.310 170.000 10.757 140.858 140.737 120.150 90.164 10.368 170.084 80.381 160.142 170.357 150.720 100.214 120.092 140.724 160.596 170.056 140.655 90.525 140.581 150.352 170.594 150.056 170.000 30.014 170.224 150.772 150.205 170.720 160.000 70.159 70.531 160.163 170.294 160.136 170.000 10.169 160.589 140.000 40.000 90.000 10.002 40.663 100.466 170.265 170.582 100.337 100.016 150.559 150.084 170.000 30.000 70.000 10.036 50.000 40.125 50.670 130.000 10.102 30.071 80.164 150.406 90.386 70.046 160.068 170.159 150.117 60.284 160.111 160.094 160.000 40.000 170.197 160.000 30.044 150.013 150.002 140.228 170.307 170.588 110.025 170.545 50.134 150.000 10.655 40.302 140.282 170.000 10.060 20.000 120.035 170.000 70.000 100.097 170.000 120.000 80.005 100.000 10.000 50.096 170.000 10.334 160.000 10.000 120.274 160.000 10.513 140.000 10.000 10.280 80.194 90.897 110.000 90.000 110.000 10.000 90.000 10.108 90.279 170.189 160.141 170.059 140.272 20.307 170.445 100.003 100.000 10.353 150.000 20.026 10.000 110.581 150.001 20.000 10.000 10.093 170.002 60.000 30.000 110.000 1
C. Choy, J. Gwak, S. Savarese: 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks. CVPR 2019
CSC-Pretrainpermissive0.249 170.455 170.171 160.079 170.766 170.659 150.930 170.494 140.542 170.700 170.314 170.215 170.430 170.121 10.697 170.441 160.683 160.235 140.609 170.895 160.476 150.816 160.770 170.186 140.634 60.216 170.734 90.340 160.471 160.307 160.293 170.591 170.542 150.076 70.205 160.464 140.000 10.484 170.832 160.766 70.052 160.000 70.413 160.059 140.418 150.222 160.318 170.609 140.206 130.112 90.743 140.625 140.076 90.579 150.548 130.590 140.371 160.552 170.081 160.003 20.142 130.201 160.638 170.233 160.686 170.000 70.142 90.444 170.375 130.247 170.198 140.000 10.128 170.454 170.019 20.097 10.000 10.000 60.553 140.557 150.373 130.545 140.164 140.014 160.547 160.174 150.000 30.002 50.000 10.037 40.000 40.063 110.664 140.000 10.000 100.130 20.170 140.152 160.335 100.079 140.110 150.175 140.098 90.175 170.166 150.045 170.207 20.014 140.465 50.000 30.001 170.001 170.046 120.299 150.327 160.537 120.033 160.012 170.186 100.000 10.205 150.377 130.463 160.000 10.058 30.000 120.055 150.041 10.000 100.105 160.000 120.000 80.000 110.000 10.000 50.398 150.000 10.308 170.000 10.000 120.319 150.000 10.543 120.000 10.000 10.062 150.004 130.862 150.000 90.000 110.000 10.000 90.000 10.123 50.316 160.225 150.250 130.094 30.180 50.332 140.441 110.000 120.000 10.310 160.000 20.000 20.000 110.592 140.000 30.000 10.000 10.203 30.000 70.000 30.000 110.000 1
Ji Hou, Benjamin Graham, Matthias Nießner, Saining Xie: Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts. CVPR 2021
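The result rows in these tables run the method name, an optional marker and every score/rank pair together without separators (for example `0.000 140.804 5` is score 0.000 with rank 14 followed by score 0.804 with rank 5). The sketch below shows one way to split such a row, assuming every score has exactly one digit before the decimal point and three after it, and that `permissive` and `copyleft` are license markers carried over from the original page. The function name `parse_row` is hypothetical.

```python
import re

# A score looks like 0.414 or 1.000; the rank is the shortest digit run
# that is followed by the next score (or by the end of the row).
PAIR = re.compile(r"([01]\.\d{3}) (\d+?)(?=[01]\.\d{3} |$)")

# Assumption: these suffixes are license markers fused onto the method name.
FLAGS = ("permissive", "copyleft")


def parse_row(row):
    """Split a fused leaderboard row into (method, flag, [(score, rank), ...])."""
    row = row.strip()
    first = re.search(r"[01]\.\d{3} \d", row)
    if first is None:
        raise ValueError("no score/rank pairs found")
    name, flag = row[:first.start()], None
    for candidate in FLAGS:
        if name.endswith(candidate):
            name, flag = name[:-len(candidate)], candidate
            break
    pairs = [(float(s), int(r)) for s, r in PAIR.findall(row[first.start():])]
    return name.strip(), flag, pairs


# First four columns of one row from the semantic label table.
method, flag, pairs = parse_row(
    "ALS-MinkowskiNetcopyleft0.414 20.610 20.322 30.271 2"
)
```

The lazy rank match is what resolves two-digit ranks: a one-digit rank is accepted only if a well-formed score follows it, otherwise the match extends by one digit.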


This table lists the benchmark results for the ScanNet200 3D semantic instance scenario. In each method row, every score is immediately followed by the method's rank in that column.




Method | Info | avg ap | head ap | common ap | tail ap | chair | table | door | couch | cabinet | shelf | desk | office chair | bed | pillow | sink | picture | window | toilet | bookshelf | monitor | curtain | book | armchair | coffee table | box | refrigerator | lamp | kitchen cabinet | towel | clothes | tv | nightstand | counter | dresser | stool | cushion | plant | ceiling | bathtub | end table | dining table | keyboard | bag | backpack | toilet paper | printer | tv stand | whiteboard | blanket | shower curtain | trash can | closet | stairs | microwave | stove | shoe | computer tower | bottle | bin | ottoman | bench | board | washing machine | mirror | copier | basket | sofa chair | file cabinet | fan | laptop | shower | paper | person | paper towel dispenser | oven | blinds | rack | plate | blackboard | piano | suitcase | rail | radiator | recycling bin | container | wardrobe | soap dispenser | telephone | bucket | clock | stand | light | laundry basket | pipe | clothes dryer | guitar | toilet paper holder | seat | speaker | column | bicycle | ladder | bathroom stall | shower wall | cup | jacket | storage bin | coffee maker | dishwasher | paper towel roll | machine | mat | windowsill | bar | toaster | bulletin board | ironing board | fireplace | soap dish | kitchen counter | doorframe | toilet paper dispenser | mini fridge | fire extinguisher | ball | hat | shower curtain rod | water cooler | paper cutter | tray | shower door | pillar | ledge | toaster oven | mouse | toilet seat cover dispenser | furniture | cart | storage container | scale | tissue box | light switch | crate | power outlet | decoration | sign | projector | closet door | vacuum cleaner | candle | plunger | stuffed animal | headphones | dish rack | broom | guitar case | range hood | dustpan | hair dryer | water bottle | handicap bar | purse | vent | shower floor | water pitcher | mailbox | bowl | paper bag | alarm clock | music stand | projector screen | divider | laundry detergent | bathroom counter | object | bathroom vanity | closet wall | laundry hamper | bathroom stall door | ceiling light | trash bin | dumbbell | stair rail | tube | bathroom cabinet | cd case | closet rod | coffee kettle | structure | shower head | keyboard piano | case of water bottles | coat rack | storage organizer | folded chair | fire alarm | power strip | calendar | poster | potted plant | luggage | mattress
Mask3D Scannet2000.278 10.383 10.263 20.168 10.661 20.465 10.572 10.665 30.391 20.121 50.304 10.015 20.647 10.349 10.474 10.489 10.321 10.816 60.351 30.722 10.402 40.195 10.515 40.082 20.795 10.215 20.396 10.377 20.082 50.724 10.586 10.015 30.277 10.377 60.201 10.475 30.572 10.778 30.089 20.759 10.556 20.068 10.506 10.467 10.323 40.778 20.427 20.027 30.789 10.744 10.003 20.570 20.561 10.337 20.265 10.711 10.258 20.031 10.569 10.311 10.441 20.179 11.000 10.000 20.233 20.411 20.283 20.380 10.667 10.016 10.048 40.418 30.139 20.173 10.000 10.086 20.014 30.500 10.384 10.497 10.044 40.032 20.752 10.287 20.003 10.000 20.007 10.208 10.000 10.001 30.349 20.008 20.014 20.509 10.500 20.323 10.023 30.176 20.107 20.105 40.000 20.605 10.378 10.016 20.000 10.400 10.192 10.000 10.048 30.037 30.000 20.275 10.119 10.810 10.258 20.006 40.083 60.000 10.568 20.377 20.708 10.000 10.005 20.147 20.014 30.000 20.556 20.085 10.325 10.500 10.083 20.004 20.000 10.590 10.000 10.365 10.000 10.116 10.491 10.000 10.626 10.000 10.000 10.579 10.391 10.050 50.000 10.028 20.000 10.222 20.000 10.063 10.302 10.356 20.149 50.573 10.415 10.013 60.002 50.004 10.000 10.005 50.000 10.000 10.444 10.514 10.000 20.028 10.000 20.156 20.267 10.000 21.000 10.000 1
Jonas Schult, Francis Engelmann, Alexander Hermans, Or Litany, Siyu Tang, Bastian Leibe: Mask3D for 3D Semantic Instance Segmentation. ICRA 2023
TD3D Scannet200 | permissive | 0.211 3 | 0.332 3 | 0.177 3 | 0.103 3 | 0.662 1 | 0.413 2 | 0.463 3 | 0.705 1 | 0.192 4 | 0.145 2 | 0.266 2 | 0.215 1 | 0.452 5 | 0.209 3 | 0.222 6 | 0.219 6 | 0.315 2 | 0.893 1 | 0.380 2 | 0.617 2 | 0.439 2 | 0.047 5 | 0.646 1 | 0.080 3 | 0.610 3 | 0.253 1 | 0.237 3 | 0.293 3 | 0.135 2 | 0.379 6 | 0.494 2 | 0.048 1 | 0.252 2 | 0.451 3 | 0.184 2 | 0.483 2 | 0.395 2 | 0.852 1 | 0.083 3 | 0.551 2 | 0.278 3 | 0.036 3 | 0.337 3 | 0.266 3 | 0.544 2 | 0.963 1 | 0.079 6 | 0.039 1 | 0.740 3 | 0.604 3 | 0.000 3 | 0.586 1 | 0.283 3 | 0.282 3 | 0.059 3 | 0.633 3 | 0.028 3 | 0.004 2 | 0.559 2 | 0.309 2 | 0.420 3 | 0.028 6 | 1.000 1 | 0.000 2 | 0.456 1 | 0.411 1 | 0.372 1 | 0.060 5 | 0.046 4 | 0.000 2 | 0.040 5 | 0.694 1 | 0.083 3 | 0.000 2 | 0.000 1 | 0.000 3 | 0.000 4 | 0.083 5 | 0.252 3 | 0.260 5 | 0.200 2 | 0.160 1 | 0.669 2 | 0.111 3 | 0.000 2 | 0.000 2 | 0.006 2 | 0.169 2 | 0.000 1 | 0.007 2 | 0.296 3 | 0.032 1 | 0.074 1 | 0.139 4 | 0.000 3 | 0.321 2 | 0.031 2 | 0.108 3 | 0.088 3 | 0.157 2 | 0.000 2 | 0.231 5 | 0.026 6 | 0.000 3 | 0.000 1 | 0.356 2 | 0.052 3 | 0.000 1 | 0.240 2 | 0.147 2 | 0.000 2 | 0.015 3 | 0.046 4 | 0.144 4 | 0.073 4 | 0.414 2 | 0.222 5 | 0.000 1 | 0.806 1 | 0.343 3 | 0.486 3 | 0.000 1 | 0.008 1 | 0.038 3 | 0.083 2 | 0.002 1 | 0.028 3 | 0.074 2 | 0.032 3 | 0.150 3 | 0.039 3 | 0.008 1 | 0.000 1 | 0.250 5 | 0.000 1 | 0.125 4 | 0.000 1 | 0.052 2 | 0.260 4 | 0.000 1 | 0.143 6 | 0.000 1 | 0.000 1 | 0.543 2 | 0.207 3 | 0.404 1 | 0.000 1 | 0.003 3 | 0.000 1 | 0.000 3 | 0.000 1 | 0.037 2 | 0.093 5 | 0.272 3 | 0.342 2 | 0.039 5 | 0.281 2 | 0.249 3 | 0.224 1 | 0.000 2 | 0.000 1 | 0.074 2 | 0.000 1 | 0.000 1 | 0.000 2 | 0.278 3 | 0.000 2 | 0.000 2 | 0.889 1 | 0.323 1 | 0.000 2 | 0.014 1 | 0.000 3 | 0.000 1
Maksim Kolodiazhnyi, Anna Vorontsova, Anton Konushin, Danila Rukhovich: Top-Down Beats Bottom-Up in 3D Instance Segmentation. WACV 2024
ODIN - Ins200 | permissive | 0.265 2 | 0.349 2 | 0.268 1 | 0.163 2 | 0.485 6 | 0.366 4 | 0.549 2 | 0.492 6 | 0.421 1 | 0.229 1 | 0.265 3 | 0.003 3 | 0.609 2 | 0.297 2 | 0.320 2 | 0.327 2 | 0.251 3 | 0.848 4 | 0.314 5 | 0.526 3 | 0.324 5 | 0.138 2 | 0.529 2 | 0.178 1 | 0.440 5 | 0.186 6 | 0.306 2 | 0.546 1 | 0.160 1 | 0.494 4 | 0.476 3 | 0.016 2 | 0.231 3 | 0.594 1 | 0.000 3 | 0.615 1 | 0.357 3 | 0.630 4 | 0.141 1 | 0.167 3 | 0.665 1 | 0.054 2 | 0.360 2 | 0.451 2 | 0.610 1 | 0.769 4 | 0.640 1 | 0.032 2 | 0.746 2 | 0.698 2 | 0.040 1 | 0.389 4 | 0.550 2 | 0.371 1 | 0.257 2 | 0.617 4 | 0.310 1 | 0.000 3 | 0.481 3 | 0.022 5 | 0.463 1 | 0.160 2 | 1.000 1 | 0.125 1 | 0.193 3 | 0.267 3 | 0.253 3 | 0.156 3 | 0.000 5 | 0.000 2 | 0.332 1 | 0.606 2 | 0.444 1 | 0.000 2 | 0.000 1 | 0.281 1 | 1.000 1 | 0.417 3 | 0.344 2 | 0.238 6 | 0.218 1 | 0.000 3 | 0.655 3 | 0.506 1 | 0.000 2 | 0.052 1 | 0.000 3 | 0.091 3 | 0.000 1 | 0.035 1 | 0.370 1 | 0.000 3 | 0.000 3 | 0.250 2 | 0.903 1 | 0.037 6 | 0.031 1 | 0.221 1 | 0.197 1 | 0.285 1 | 0.037 1 | 0.191 6 | 0.200 3 | 0.083 1 | 0.000 1 | 0.200 3 | 0.115 2 | 0.000 1 | 0.250 1 | 0.552 1 | 0.278 1 | 0.077 2 | 0.107 2 | 0.389 2 | 0.674 1 | 0.565 1 | 0.278 1 | 0.000 1 | 0.361 6 | 0.333 4 | 0.361 4 | 0.000 1 | 0.000 3 | 0.438 1 | 0.451 1 | 0.000 2 | 1.000 1 | 0.074 2 | 0.204 2 | 0.250 2 | 0.250 1 | 0.000 3 | 0.000 1 | 0.493 2 | 0.000 1 | 0.083 5 | 0.000 1 | 0.000 3 | 0.317 2 | 0.000 1 | 0.481 2 | 0.000 1 | 0.000 1 | 0.188 3 | 0.333 2 | 0.345 2 | 0.000 1 | 0.333 1 | 0.000 1 | 0.333 1 | 0.000 1 | 0.035 3 | 0.266 2 | 0.478 1 | 0.506 1 | 0.054 3 | 0.205 3 | 0.119 5 | 0.067 2 | 0.000 2 | 0.000 1 | 0.210 1 | 0.000 1 | 0.000 1 | 0.000 2 | 0.389 2 | 0.097 1 | 0.000 2 | 0.000 2 | 0.111 3 | 0.000 2 | 0.000 2 | 0.889 2 | 0.000 1
LGround Inst. | permissive | 0.154 4 | 0.275 4 | 0.108 4 | 0.060 4 | 0.573 3 | 0.381 3 | 0.434 4 | 0.654 4 | 0.190 5 | 0.141 3 | 0.097 4 | 0.000 4 | 0.503 4 | 0.180 4 | 0.252 4 | 0.242 5 | 0.242 4 | 0.881 3 | 0.448 1 | 0.494 4 | 0.429 3 | 0.078 3 | 0.364 6 | 0.024 4 | 0.654 2 | 0.213 4 | 0.222 4 | 0.239 4 | 0.099 4 | 0.616 2 | 0.363 4 | 0.000 4 | 0.092 4 | 0.444 4 | 0.000 3 | 0.383 5 | 0.209 6 | 0.815 2 | 0.030 4 | 0.000 4 | 0.166 4 | 0.002 5 | 0.295 6 | 0.099 5 | 0.364 3 | 0.778 2 | 0.177 4 | 0.001 5 | 0.427 6 | 0.585 5 | 0.000 3 | 0.470 3 | 0.268 6 | 0.205 4 | 0.045 4 | 0.642 2 | 0.007 4 | 0.000 3 | 0.333 6 | 0.148 3 | 0.407 4 | 0.130 3 | 1.000 1 | 0.000 2 | 0.156 5 | 0.189 4 | 0.097 5 | 0.169 2 | 0.000 5 | 0.000 2 | 0.056 3 | 0.400 4 | 0.000 4 | 0.000 2 | 0.000 1 | 0.000 3 | 0.556 2 | 0.278 4 | 0.203 4 | 0.323 4 | 0.019 5 | 0.000 3 | 0.402 5 | 0.026 4 | 0.000 2 | 0.000 2 | 0.000 3 | 0.044 4 | 0.000 1 | 0.000 4 | 0.037 5 | 0.000 3 | 0.000 3 | 0.181 3 | 0.000 3 | 0.127 3 | 0.006 5 | 0.028 5 | 0.023 4 | 0.115 3 | 0.000 2 | 0.327 2 | 0.267 2 | 0.000 3 | 0.000 1 | 0.000 5 | 0.028 4 | 0.000 1 | 0.000 4 | 0.000 4 | 0.000 2 | 0.003 4 | 0.048 3 | 0.135 5 | 0.222 3 | 0.089 3 | 0.278 1 | 0.000 1 | 0.514 3 | 0.333 4 | 0.611 2 | 0.000 1 | 0.000 3 | 0.000 4 | 0.000 4 | 0.000 2 | 0.000 4 | 0.037 4 | 0.000 4 | 0.000 4 | 0.000 4 | 0.000 3 | 0.000 1 | 0.322 3 | 0.000 1 | 0.209 2 | 0.000 1 | 0.000 3 | 0.278 3 | 0.000 1 | 0.302 4 | 0.000 1 | 0.000 1 | 0.143 4 | 0.148 4 | 0.000 6 | 0.000 1 | 0.000 4 | 0.000 1 | 0.000 3 | 0.000 1 | 0.015 4 | 0.064 6 | 0.000 4 | 0.272 3 | 0.031 6 | 0.000 5 | 0.257 2 | 0.028 3 | 0.000 2 | 0.000 1 | 0.041 3 | 0.000 1 | 0.000 1 | 0.000 2 | 0.222 6 | 0.000 2 | 0.000 2 | 0.000 2 | 0.000 6 | 0.000 2 | 0.000 2 | 0.000 3 | 0.000 1
David Rozenberszki, Or Litany, Angela Dai: Language-Grounded Indoor 3D Semantic Segmentation in the Wild.
Minkowski 34D Inst. | permissive | 0.130 5 | 0.246 5 | 0.083 5 | 0.043 6 | 0.547 5 | 0.236 5 | 0.415 5 | 0.672 2 | 0.141 6 | 0.133 4 | 0.067 5 | 0.000 4 | 0.521 3 | 0.114 6 | 0.238 5 | 0.289 3 | 0.232 5 | 0.883 2 | 0.182 6 | 0.373 6 | 0.486 1 | 0.076 4 | 0.488 5 | 0.022 5 | 0.529 4 | 0.199 5 | 0.110 5 | 0.217 5 | 0.100 3 | 0.460 5 | 0.319 5 | 0.000 4 | 0.025 6 | 0.472 2 | 0.000 3 | 0.394 4 | 0.210 5 | 0.537 5 | 0.004 5 | 0.000 4 | 0.083 6 | 0.000 6 | 0.299 5 | 0.061 6 | 0.201 6 | 0.761 5 | 0.084 5 | 0.008 4 | 0.720 4 | 0.557 6 | 0.000 3 | 0.317 6 | 0.280 4 | 0.094 6 | 0.020 6 | 0.564 6 | 0.000 5 | 0.000 3 | 0.400 4 | 0.048 4 | 0.259 5 | 0.101 4 | 1.000 1 | 0.000 2 | 0.190 4 | 0.142 6 | 0.094 6 | 0.137 4 | 0.089 3 | 0.000 2 | 0.101 2 | 0.355 6 | 0.000 4 | 0.000 2 | 0.000 1 | 0.000 3 | 0.000 4 | 0.444 2 | 0.082 6 | 0.384 2 | 0.000 6 | 0.000 3 | 0.334 6 | 0.004 6 | 0.000 2 | 0.000 2 | 0.000 3 | 0.041 5 | 0.000 1 | 0.000 4 | 0.026 6 | 0.000 3 | 0.000 3 | 0.000 5 | 0.000 3 | 0.082 5 | 0.022 4 | 0.000 6 | 0.021 5 | 0.088 5 | 0.000 2 | 0.241 4 | 0.033 5 | 0.000 3 | 0.000 1 | 0.067 4 | 0.000 6 | 0.000 1 | 0.000 4 | 0.000 4 | 0.000 2 | 0.000 5 | 0.026 5 | 0.262 3 | 0.016 5 | 0.000 5 | 0.278 1 | 0.000 1 | 0.500 4 | 0.394 1 | 0.028 6 | 0.000 1 | 0.000 3 | 0.000 4 | 0.000 4 | 0.000 2 | 0.000 4 | 0.019 5 | 0.000 4 | 0.000 4 | 0.000 4 | 0.000 3 | 0.000 1 | 0.156 6 | 0.000 1 | 0.032 6 | 0.000 1 | 0.000 3 | 0.194 6 | 0.000 1 | 0.248 5 | 0.000 1 | 0.000 1 | 0.099 5 | 0.019 5 | 0.308 3 | 0.000 1 | 0.000 4 | 0.000 1 | 0.000 3 | 0.000 1 | 0.007 5 | 0.122 3 | 0.000 4 | 0.175 4 | 0.063 2 | 0.000 5 | 0.271 1 | 0.000 6 | 0.000 2 | 0.000 1 | 0.000 6 | 0.000 1 | 0.000 1 | 0.000 2 | 0.278 3 | 0.000 2 | 0.000 2 | 0.000 2 | 0.111 3 | 0.000 2 | 0.000 2 | 0.000 3 | 0.000 1
C. Choy, J. Gwak, S. Savarese: 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks. CVPR 2019
CSC-Pretrain Inst. | permissive | 0.123 6 | 0.223 6 | 0.082 6 | 0.046 5 | 0.564 4 | 0.152 6 | 0.394 6 | 0.578 5 | 0.235 3 | 0.116 6 | 0.034 6 | 0.000 4 | 0.348 6 | 0.119 5 | 0.297 3 | 0.285 4 | 0.202 6 | 0.838 5 | 0.323 4 | 0.407 5 | 0.184 6 | 0.037 6 | 0.516 3 | 0.013 6 | 0.424 6 | 0.214 3 | 0.093 6 | 0.105 6 | 0.078 6 | 0.542 3 | 0.250 6 | 0.000 4 | 0.064 5 | 0.444 4 | 0.000 3 | 0.224 6 | 0.231 4 | 0.537 5 | 0.001 6 | 0.000 4 | 0.126 5 | 0.004 4 | 0.308 4 | 0.193 4 | 0.244 5 | 0.343 6 | 0.228 3 | 0.000 6 | 0.441 5 | 0.588 4 | 0.000 3 | 0.338 5 | 0.275 5 | 0.189 5 | 0.030 5 | 0.600 5 | 0.000 5 | 0.000 3 | 0.378 5 | 0.000 6 | 0.108 6 | 0.098 5 | 1.000 1 | 0.000 2 | 0.096 6 | 0.172 5 | 0.144 4 | 0.011 6 | 0.125 2 | 0.000 2 | 0.000 6 | 0.376 5 | 0.000 4 | 0.000 2 | 0.000 1 | 0.000 3 | 0.000 4 | 0.042 6 | 0.141 5 | 0.377 3 | 0.051 3 | 0.000 3 | 0.483 4 | 0.017 5 | 0.000 2 | 0.000 2 | 0.000 3 | 0.022 6 | 0.000 1 | 0.000 4 | 0.065 4 | 0.000 3 | 0.000 3 | 0.000 5 | 0.000 3 | 0.094 4 | 0.000 6 | 0.042 4 | 0.000 6 | 0.064 6 | 0.000 2 | 0.259 3 | 0.089 4 | 0.000 3 | 0.000 1 | 0.000 5 | 0.022 5 | 0.000 1 | 0.000 4 | 0.000 4 | 0.000 2 | 0.000 5 | 0.018 6 | 0.111 6 | 0.000 6 | 0.000 5 | 0.278 1 | 0.000 1 | 0.444 5 | 0.333 4 | 0.333 5 | 0.000 1 | 0.000 3 | 0.000 4 | 0.000 4 | 0.000 2 | 0.000 4 | 0.000 6 | 0.000 4 | 0.000 4 | 0.000 4 | 0.000 3 | 0.000 1 | 0.267 4 | 0.000 1 | 0.184 3 | 0.000 1 | 0.000 3 | 0.211 5 | 0.000 1 | 0.378 3 | 0.000 1 | 0.000 1 | 0.063 6 | 0.000 6 | 0.275 4 | 0.000 1 | 0.000 4 | 0.000 1 | 0.000 3 | 0.000 1 | 0.007 6 | 0.105 4 | 0.000 4 | 0.032 6 | 0.045 4 | 0.198 4 | 0.171 4 | 0.028 3 | 0.000 2 | 0.000 1 | 0.006 4 | 0.000 1 | 0.000 1 | 0.000 2 | 0.278 3 | 0.000 2 | 0.000 2 | 0.000 2 | 0.044 5 | 0.000 2 | 0.000 2 | 0.000 3 | 0.000 1
Ji Hou, Benjamin Graham, Matthias Nießner, Saining Xie: Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts. CVPR 2021


ScanNet Benchmark

This table lists the benchmark results for the 3D semantic label scenario.
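As a rough illustration of how the columns relate, the sketch below assumes the usual intersection-over-union definition, IoU = TP / (TP + FP + FN), and treats "avg iou" as the unweighted mean of the 20 per-class columns. The function names are hypothetical and are not part of any benchmark tooling; the per-class values are copied from the TTT-KD row of the table.

```python
def iou(tp: int, fp: int, fn: int) -> float:
    """Per-class IoU from true-positive, false-positive and false-negative counts.

    Assumption: a class with no predictions and no ground truth scores 0.0.
    """
    denom = tp + fp + fn
    return tp / denom if denom else 0.0


def mean_iou(per_class):
    """Unweighted mean over per-class IoU values (assumed meaning of 'avg iou')."""
    return sum(per_class) / len(per_class)


# Per-class IoU values of the TTT-KD row, bathtub through window.
ttt_kd = [
    0.646, 0.818, 0.809, 0.774, 0.878, 0.581, 0.943, 0.687, 0.704, 0.978,
    0.607, 0.336, 0.775, 0.912, 0.838, 0.823, 0.694, 0.967, 0.899, 0.794,
]

print(round(mean_iou(ttt_kd), 3))  # 0.773, matching the row's avg iou entry
```

Under these assumptions the mean of the 20 class values reproduces the reported average, which suggests that every class counts equally regardless of how many points it covers.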


Method | Info | avg iou | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | floor | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | wall | window
Rows are sorted by curtain IoU. Each entry gives the IoU followed by the method's rank in that column.
TTT-KD | - | 0.773 7 | 0.646 97 | 0.818 16 | 0.809 41 | 0.774 10 | 0.878 3 | 0.581 3 | 0.943 1 | 0.687 15 | 0.704 7 | 0.978 6 | 0.607 6 | 0.336 19 | 0.775 11 | 0.912 8 | 0.838 4 | 0.823 4 | 0.694 15 | 0.967 4 | 0.899 4 | 0.794 6
Lisa Weijler, Muhammad Jehanzeb Mirza, Leon Sick, Can Ekkazan, Pedro Hermosilla: TTT-KD: Test-Time Training for 3D Semantic Segmentation through Knowledge Distillation from Foundation Models.
DITR ScanNet | - | 0.797 2 | 0.727 76 | 0.869 1 | 0.882 1 | 0.785 6 | 0.868 7 | 0.578 5 | 0.943 1 | 0.744 1 | 0.727 3 | 0.979 1 | 0.627 2 | 0.364 9 | 0.824 1 | 0.949 2 | 0.779 15 | 0.844 1 | 0.757 1 | 0.982 1 | 0.905 2 | 0.802 3
Karim Abou Zeid, Kadir Yilmaz, Daan de Geus, Alexander Hermans, David Adrian, Timm Linder, Bastian Leibe: DINO in the Room: Leveraging 2D Foundation Models for 3D Segmentation.
PTv3-PPT-ALC | copyleft | 0.798 1 | 0.911 11 | 0.812 22 | 0.854 8 | 0.770 12 | 0.856 15 | 0.555 17 | 0.943 1 | 0.660 26 | 0.735 2 | 0.979 1 | 0.606 7 | 0.492 1 | 0.792 4 | 0.934 4 | 0.841 2 | 0.819 6 | 0.716 9 | 0.947 10 | 0.906 1 | 0.822 1
Guangda Ji, Silvan Weder, Francis Engelmann, Marc Pollefeys, Hermann Blum: ARKit LabelMaker: A New Scale for Indoor 3D Scene Understanding. arXiv
DiffSegNet | - | 0.758 14 | 0.725 78 | 0.789 41 | 0.843 20 | 0.762 17 | 0.856 15 | 0.562 14 | 0.920 4 | 0.657 29 | 0.658 21 | 0.958 23 | 0.589 14 | 0.337 18 | 0.782 6 | 0.879 24 | 0.787 11 | 0.779 41 | 0.678 22 | 0.926 29 | 0.880 13 | 0.799 5
ODIN | permissive | 0.744 29 | 0.658 93 | 0.752 64 | 0.870 3 | 0.714 40 | 0.843 33 | 0.569 11 | 0.919 5 | 0.703 8 | 0.622 40 | 0.949 59 | 0.591 12 | 0.343 15 | 0.736 34 | 0.784 56 | 0.816 7 | 0.838 2 | 0.672 31 | 0.918 37 | 0.854 39 | 0.725 28
Ayush Jain, Pushkal Katara, Nikolaos Gkanatsios, Adam W. Harley, Gabriel Sarch, Kriti Aggarwal, Vishrav Chaudhary, Katerina Fragkiadaki: ODIN: A Single Model for 2D and 3D Segmentation. CVPR 2024
PTv3 ScanNet | - | 0.794 3 | 0.941 3 | 0.813 21 | 0.851 11 | 0.782 7 | 0.890 2 | 0.597 1 | 0.916 6 | 0.696 11 | 0.713 5 | 0.979 1 | 0.635 1 | 0.384 3 | 0.793 3 | 0.907 10 | 0.821 5 | 0.790 36 | 0.696 14 | 0.967 4 | 0.903 3 | 0.805 2
Xiaoyang Wu, Li Jiang, Peng-Shuai Wang, Zhijian Liu, Xihui Liu, Yu Qiao, Wanli Ouyang, Tong He, Hengshuang Zhao: Point Transformer V3: Simpler, Faster, Stronger. CVPR 2024 (Oral)
Retro-FPN | - | 0.744 29 | 0.842 30 | 0.800 30 | 0.767 61 | 0.740 32 | 0.836 41 | 0.541 23 | 0.914 7 | 0.672 22 | 0.626 37 | 0.958 23 | 0.552 33 | 0.272 53 | 0.777 9 | 0.886 22 | 0.696 52 | 0.801 24 | 0.674 29 | 0.941 14 | 0.858 33 | 0.717 33
Peng Xiang*, Xin Wen*, Yu-Shen Liu, Hui Zhang, Yi Fang, Zhizhong Han: Retrospective Feature Pyramid Network for Point Cloud Semantic Segmentation. ICCV 2023
MSP | - | 0.748 24 | 0.623 100 | 0.804 28 | 0.859 5 | 0.745 31 | 0.824 54 | 0.501 42 | 0.912 8 | 0.690 13 | 0.685 10 | 0.956 30 | 0.567 25 | 0.320 27 | 0.768 17 | 0.918 7 | 0.720 39 | 0.802 20 | 0.676 26 | 0.921 33 | 0.881 12 | 0.779 9
PonderV2 | - | 0.785 4 | 0.978 1 | 0.800 30 | 0.833 29 | 0.788 4 | 0.853 20 | 0.545 21 | 0.910 9 | 0.713 3 | 0.705 6 | 0.979 1 | 0.596 9 | 0.390 2 | 0.769 15 | 0.832 45 | 0.821 5 | 0.792 35 | 0.730 2 | 0.975 2 | 0.897 6 | 0.785 7
Haoyi Zhu, Honghui Yang, Xiaoyang Wu, Di Huang, Sha Zhang, Xianglong He, Tong He, Hengshuang Zhao, Chunhua Shen, Yu Qiao, Wanli Ouyang: PonderV2: Pave the Way for 3D Foundation Model with A Universal Pre-training Paradigm.
Swin3D | permissive | 0.779 6 | 0.861 23 | 0.818 16 | 0.836 26 | 0.790 3 | 0.875 4 | 0.576 7 | 0.905 10 | 0.704 7 | 0.739 1 | 0.969 12 | 0.611 3 | 0.349 12 | 0.756 25 | 0.958 1 | 0.702 51 | 0.805 19 | 0.708 10 | 0.916 39 | 0.898 5 | 0.801 4
PPT-SpUNet-Joint | - | 0.766 9 | 0.932 5 | 0.794 36 | 0.829 31 | 0.751 26 | 0.854 18 | 0.540 25 | 0.903 11 | 0.630 39 | 0.672 17 | 0.963 16 | 0.565 26 | 0.357 10 | 0.788 5 | 0.900 14 | 0.737 31 | 0.802 20 | 0.685 20 | 0.950 8 | 0.887 8 | 0.780 8
Xiaoyang Wu, Zhuotao Tian, Xin Wen, Bohao Peng, Xihui Liu, Kaicheng Yu, Hengshuang Zhao: Towards Large-scale 3D Representation Learning with Multi-dataset Point Prompt Training. CVPR 2024
ResLFE_HDS | - | 0.772 8 | 0.939 4 | 0.824 7 | 0.854 8 | 0.771 11 | 0.840 35 | 0.564 13 | 0.900 12 | 0.686 16 | 0.677 14 | 0.961 18 | 0.537 36 | 0.348 13 | 0.769 15 | 0.903 12 | 0.785 13 | 0.815 9 | 0.676 26 | 0.939 16 | 0.880 13 | 0.772 11
Virtual MVFusion | - | 0.746 26 | 0.771 55 | 0.819 14 | 0.848 15 | 0.702 43 | 0.865 10 | 0.397 90 | 0.899 13 | 0.699 9 | 0.664 20 | 0.948 62 | 0.588 15 | 0.330 23 | 0.746 32 | 0.851 39 | 0.764 21 | 0.796 29 | 0.704 12 | 0.935 21 | 0.866 28 | 0.728 24
Abhijit Kundu, Xiaoqi Yin, Alireza Fathi, David Ross, Brian Brewington, Thomas Funkhouser, Caroline Pantofaru: Virtual Multi-view Fusion for 3D Semantic Segmentation. ECCV 2020
EQ-Net | - | 0.743 31 | 0.620 101 | 0.799 33 | 0.849 13 | 0.730 35 | 0.822 56 | 0.493 50 | 0.897 14 | 0.664 23 | 0.681 12 | 0.955 34 | 0.562 29 | 0.378 4 | 0.760 21 | 0.903 12 | 0.738 30 | 0.801 24 | 0.673 30 | 0.907 43 | 0.877 16 | 0.745 17
Zetong Yang*, Li Jiang*, Yanan Sun, Bernt Schiele, Jiaya Jia: A Unified Query-based Paradigm for Point Cloud Understanding. CVPR 2022
OA-CNN-L_ScanNet20 | - | 0.756 16 | 0.783 47 | 0.826 6 | 0.858 6 | 0.776 9 | 0.837 39 | 0.548 20 | 0.896 15 | 0.649 31 | 0.675 15 | 0.962 17 | 0.586 17 | 0.335 21 | 0.771 14 | 0.802 54 | 0.770 19 | 0.787 38 | 0.691 17 | 0.936 20 | 0.880 13 | 0.761 13
DiffSeg3D2 | - | 0.745 28 | 0.725 78 | 0.814 20 | 0.837 25 | 0.751 26 | 0.831 46 | 0.514 36 | 0.896 15 | 0.674 20 | 0.684 11 | 0.960 19 | 0.564 27 | 0.303 34 | 0.773 12 | 0.820 48 | 0.713 45 | 0.798 27 | 0.690 19 | 0.923 31 | 0.875 20 | 0.757 14
PointTransformerV2 | - | 0.752 20 | 0.742 68 | 0.809 25 | 0.872 2 | 0.758 19 | 0.860 12 | 0.552 18 | 0.891 17 | 0.610 46 | 0.687 8 | 0.960 19 | 0.559 30 | 0.304 33 | 0.766 18 | 0.926 6 | 0.767 20 | 0.797 28 | 0.644 38 | 0.942 13 | 0.876 19 | 0.722 31
Xiaoyang Wu, Yixing Lao, Li Jiang, Xihui Liu, Hengshuang Zhao: Point Transformer V2: Grouped Vector Attention and Partition-based Pooling. NeurIPS 2022
PointConvFormer | - | 0.749 22 | 0.793 43 | 0.790 39 | 0.807 43 | 0.750 28 | 0.856 15 | 0.524 31 | 0.881 18 | 0.588 58 | 0.642 30 | 0.977 10 | 0.591 12 | 0.274 51 | 0.781 7 | 0.929 5 | 0.804 8 | 0.796 29 | 0.642 39 | 0.947 10 | 0.885 10 | 0.715 36
Wenxuan Wu, Qi Shan, Li Fuxin: PointConvFormer: Revenge of the Point-based Convolution.
OctFormer | permissive | 0.766 9 | 0.925 7 | 0.808 26 | 0.849 13 | 0.786 5 | 0.846 30 | 0.566 12 | 0.876 19 | 0.690 13 | 0.674 16 | 0.960 19 | 0.576 22 | 0.226 72 | 0.753 27 | 0.904 11 | 0.777 16 | 0.815 9 | 0.722 7 | 0.923 31 | 0.877 16 | 0.776 10
Peng-Shuai Wang: OctFormer: Octree-based Transformers for 3D Point Clouds. SIGGRAPH 2023
VMNet | permissive | 0.746 26 | 0.870 21 | 0.838 3 | 0.858 6 | 0.729 36 | 0.850 24 | 0.501 42 | 0.874 20 | 0.587 59 | 0.658 21 | 0.956 30 | 0.564 27 | 0.299 35 | 0.765 19 | 0.900 14 | 0.716 42 | 0.812 15 | 0.631 44 | 0.939 16 | 0.858 33 | 0.709 37
Zeyu Hu, Xuyang Bai, Jiaxiang Shang, Runze Zhang, Jiayu Dong, Xin Wang, Guangyuan Sun, Hongbo Fu, Chiew-Lan Tai: VMNet: Voxel-Mesh Network for Geodesic-Aware 3D Semantic Segmentation. ICCV 2021 (Oral)
online3d | - | 0.727 38 | 0.715 83 | 0.777 48 | 0.854 8 | 0.748 29 | 0.858 13 | 0.497 47 | 0.872 21 | 0.572 65 | 0.639 32 | 0.957 28 | 0.523 43 | 0.297 37 | 0.750 30 | 0.803 53 | 0.744 28 | 0.810 16 | 0.587 66 | 0.938 18 | 0.871 25 | 0.719 32
DTC | - | 0.757 15 | 0.843 29 | 0.820 12 | 0.847 16 | 0.791 2 | 0.862 11 | 0.511 38 | 0.870 22 | 0.707 6 | 0.652 23 | 0.954 40 | 0.604 8 | 0.279 48 | 0.760 21 | 0.942 3 | 0.734 32 | 0.766 50 | 0.701 13 | 0.884 61 | 0.874 22 | 0.736 20
DMF-Net | - | 0.752 20 | 0.906 14 | 0.793 38 | 0.802 47 | 0.689 45 | 0.825 52 | 0.556 16 | 0.867 23 | 0.681 18 | 0.602 50 | 0.960 19 | 0.555 32 | 0.365 8 | 0.779 8 | 0.859 30 | 0.747 27 | 0.795 32 | 0.717 8 | 0.917 38 | 0.856 35 | 0.764 12
C.Yang, Y.Yan, W.Zhao, J.Ye, X.Yang, A.Hussain, B.Dong, K.Huang: Towards Deeper and Better Multi-view Feature Fusion for 3D Semantic Segmentation. ICONIP 2023
RPN | - | 0.736 35 | 0.776 51 | 0.790 39 | 0.851 11 | 0.754 23 | 0.854 18 | 0.491 52 | 0.866 24 | 0.596 56 | 0.686 9 | 0.955 34 | 0.536 37 | 0.342 16 | 0.624 55 | 0.869 26 | 0.787 11 | 0.802 20 | 0.628 45 | 0.927 27 | 0.875 20 | 0.704 39
PointMetaBase | - | 0.714 43 | 0.835 31 | 0.785 43 | 0.821 32 | 0.684 47 | 0.846 30 | 0.531 29 | 0.865 25 | 0.614 43 | 0.596 54 | 0.953 44 | 0.500 50 | 0.246 67 | 0.674 40 | 0.888 20 | 0.692 53 | 0.764 52 | 0.624 47 | 0.849 87 | 0.844 48 | 0.675 47
ConDaFormer | - | 0.755 17 | 0.927 6 | 0.822 10 | 0.836 26 | 0.801 1 | 0.849 25 | 0.516 35 | 0.864 26 | 0.651 30 | 0.680 13 | 0.958 23 | 0.584 19 | 0.282 45 | 0.759 23 | 0.855 35 | 0.728 34 | 0.802 20 | 0.678 22 | 0.880 66 | 0.873 23 | 0.756 16
Lunhao Duan, Shanshan Zhao, Nan Xue, Mingming Gong, Guisong Xia, Dacheng Tao: ConDaFormer: Disassembled Transformer with Local Structure Enhancement for 3D Point Cloud Understanding. NeurIPS 2023
MatchingNet | - | 0.724 41 | 0.812 40 | 0.812 22 | 0.810 40 | 0.735 34 | 0.834 43 | 0.495 49 | 0.860 27 | 0.572 65 | 0.602 50 | 0.954 40 | 0.512 47 | 0.280 47 | 0.757 24 | 0.845 41 | 0.725 36 | 0.780 40 | 0.606 55 | 0.937 19 | 0.851 42 | 0.700 41
MinkowskiNet | permissive | 0.736 35 | 0.859 25 | 0.818 16 | 0.832 30 | 0.709 41 | 0.840 35 | 0.521 33 | 0.853 28 | 0.660 26 | 0.643 27 | 0.951 51 | 0.544 34 | 0.286 43 | 0.731 35 | 0.893 18 | 0.675 60 | 0.772 45 | 0.683 21 | 0.874 72 | 0.852 41 | 0.727 26
C. Choy, J. Gwak, S. Savarese: 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks. CVPR 2019
OccuSeg+Semantic | - | 0.764 11 | 0.758 61 | 0.796 34 | 0.839 24 | 0.746 30 | 0.907 1 | 0.562 14 | 0.850 29 | 0.680 19 | 0.672 17 | 0.978 6 | 0.610 4 | 0.335 21 | 0.777 9 | 0.819 49 | 0.847 1 | 0.830 3 | 0.691 17 | 0.972 3 | 0.885 10 | 0.727 26
CU-Hybrid Net | - | 0.764 11 | 0.924 8 | 0.819 14 | 0.840 23 | 0.757 21 | 0.853 20 | 0.580 4 | 0.848 30 | 0.709 5 | 0.643 27 | 0.958 23 | 0.587 16 | 0.295 38 | 0.753 27 | 0.884 23 | 0.758 23 | 0.815 9 | 0.725 5 | 0.927 27 | 0.867 27 | 0.743 19
PicassoNet-II | permissive | 0.692 50 | 0.732 72 | 0.772 50 | 0.786 53 | 0.677 49 | 0.866 9 | 0.517 34 | 0.848 30 | 0.509 85 | 0.626 37 | 0.952 49 | 0.536 37 | 0.225 74 | 0.545 80 | 0.704 71 | 0.689 57 | 0.810 16 | 0.564 75 | 0.903 47 | 0.854 39 | 0.729 23
Huan Lei, Naveed Akhtar, Mubarak Shah, and Ajmal Mian: Geometric feature learning for 3D meshes.
O-CNN | permissive | 0.762 13 | 0.924 8 | 0.823 8 | 0.844 19 | 0.770 12 | 0.852 22 | 0.577 6 | 0.847 32 | 0.711 4 | 0.640 31 | 0.958 23 | 0.592 11 | 0.217 78 | 0.762 20 | 0.888 20 | 0.758 23 | 0.813 13 | 0.726 4 | 0.932 25 | 0.868 26 | 0.744 18
Peng-Shuai Wang, Yang Liu, Yu-Xiao Guo, Chun-Yu Sun, Xin Tong: O-CNN: Octree-based Convolutional Neural Networks for 3D Shape Analysis. SIGGRAPH 2017
LSK3DNet | permissive | 0.755 17 | 0.899 16 | 0.823 8 | 0.843 20 | 0.764 16 | 0.838 38 | 0.584 2 | 0.845 33 | 0.717 2 | 0.638 33 | 0.956 30 | 0.580 21 | 0.229 71 | 0.640 48 | 0.900 14 | 0.750 26 | 0.813 13 | 0.729 3 | 0.920 35 | 0.872 24 | 0.757 14
Tuo Feng, Wenguan Wang, Fan Ma, Yi Yang: LSK3DNet: Towards Effective and Efficient 3D Perception with Large Sparse Kernels. CVPR 2024
BPNet | copyleft | 0.749 22 | 0.909 12 | 0.818 16 | 0.811 39 | 0.752 24 | 0.839 37 | 0.485 53 | 0.842 34 | 0.673 21 | 0.644 26 | 0.957 28 | 0.528 42 | 0.305 32 | 0.773 12 | 0.859 30 | 0.788 10 | 0.818 8 | 0.693 16 | 0.916 39 | 0.856 35 | 0.723 30
Wenbo Hu, Hengshuang Zhao, Li Jiang, Jiaya Jia, Tien-Tsin Wong: Bidirectional Projection Network for Cross Dimension Scene Understanding. CVPR 2021 (Oral)
INS-Conv-semantic | - | 0.717 42 | 0.751 64 | 0.759 58 | 0.812 38 | 0.704 42 | 0.868 7 | 0.537 26 | 0.842 34 | 0.609 48 | 0.608 46 | 0.953 44 | 0.534 39 | 0.293 39 | 0.616 58 | 0.864 28 | 0.719 41 | 0.793 33 | 0.640 40 | 0.933 23 | 0.845 47 | 0.663 50
LRPNet | - | 0.742 32 | 0.816 38 | 0.806 27 | 0.807 43 | 0.752 24 | 0.828 50 | 0.575 8 | 0.839 36 | 0.699 9 | 0.637 34 | 0.954 40 | 0.520 45 | 0.320 27 | 0.755 26 | 0.834 43 | 0.760 22 | 0.772 45 | 0.676 26 | 0.915 41 | 0.862 30 | 0.717 33
PNE | - | 0.755 17 | 0.786 45 | 0.835 5 | 0.834 28 | 0.758 19 | 0.849 25 | 0.570 10 | 0.836 37 | 0.648 32 | 0.668 19 | 0.978 6 | 0.581 20 | 0.367 7 | 0.683 39 | 0.856 33 | 0.804 8 | 0.801 24 | 0.678 22 | 0.961 6 | 0.889 7 | 0.716 35
P. Hermosilla: Point Neighborhood Embeddings.
Mix3D | permissive | 0.781 5 | 0.964 2 | 0.855 2 | 0.843 20 | 0.781 8 | 0.858 13 | 0.575 8 | 0.831 38 | 0.685 17 | 0.714 4 | 0.979 1 | 0.594 10 | 0.310 30 | 0.801 2 | 0.892 19 | 0.841 2 | 0.819 6 | 0.723 6 | 0.940 15 | 0.887 8 | 0.725 28
Alexey Nekrasov, Jonas Schult, Or Litany, Bastian Leibe, Francis Engelmann: Mix3D: Out-of-Context Data Augmentation for 3D Scenes. 3DV 2021 (Oral)
SAT | - | 0.742 32 | 0.860 24 | 0.765 55 | 0.819 34 | 0.769 14 | 0.848 27 | 0.533 27 | 0.829 39 | 0.663 24 | 0.631 36 | 0.955 34 | 0.586 17 | 0.274 51 | 0.753 27 | 0.896 17 | 0.729 33 | 0.760 56 | 0.666 33 | 0.921 33 | 0.855 37 | 0.733 22
LargeKernel3D | - | 0.739 34 | 0.909 12 | 0.820 12 | 0.806 45 | 0.740 32 | 0.852 22 | 0.545 21 | 0.826 40 | 0.594 57 | 0.643 27 | 0.955 34 | 0.541 35 | 0.263 61 | 0.723 37 | 0.858 32 | 0.775 18 | 0.767 49 | 0.678 22 | 0.933 23 | 0.848 43 | 0.694 42
Yukang Chen*, Jianhui Liu*, Xiangyu Zhang, Xiaojuan Qi, Jiaya Jia: LargeKernel3D: Scaling up Kernels in 3D Sparse CNNs. CVPR 2023
StratifiedFormer | permissive | 0.747 25 | 0.901 15 | 0.803 29 | 0.845 18 | 0.757 21 | 0.846 30 | 0.512 37 | 0.825 41 | 0.696 11 | 0.645 25 | 0.956 30 | 0.576 22 | 0.262 62 | 0.744 33 | 0.861 29 | 0.742 29 | 0.770 48 | 0.705 11 | 0.899 51 | 0.860 32 | 0.734 21
Xin Lai*, Jianhui Liu*, Li Jiang, Liwei Wang, Hengshuang Zhao, Shu Liu, Xiaojuan Qi, Jiaya Jia: Stratified Transformer for 3D Point Cloud Segmentation. CVPR 2022
IPCA | - | 0.731 37 | 0.890 17 | 0.837 4 | 0.864 4 | 0.726 37 | 0.873 5 | 0.530 30 | 0.824 42 | 0.489 92 | 0.647 24 | 0.978 6 | 0.609 5 | 0.336 19 | 0.624 55 | 0.733 64 | 0.758 23 | 0.776 43 | 0.570 70 | 0.949 9 | 0.877 16 | 0.728 24
DGNet | - | 0.684 54 | 0.712 84 | 0.784 44 | 0.782 57 | 0.658 52 | 0.835 42 | 0.499 46 | 0.823 43 | 0.641 34 | 0.597 53 | 0.950 55 | 0.487 55 | 0.281 46 | 0.575 68 | 0.619 85 | 0.647 73 | 0.764 52 | 0.620 50 | 0.871 78 | 0.846 46 | 0.688 44
VACNN++ | - | 0.684 54 | 0.728 75 | 0.757 61 | 0.776 58 | 0.690 44 | 0.804 74 | 0.464 62 | 0.816 44 | 0.577 64 | 0.587 57 | 0.945 70 | 0.508 49 | 0.276 50 | 0.671 41 | 0.710 69 | 0.663 65 | 0.750 64 | 0.589 64 | 0.881 64 | 0.832 51 | 0.653 53
One Thing One Click | - | 0.701 47 | 0.825 35 | 0.796 34 | 0.723 68 | 0.716 39 | 0.832 45 | 0.433 80 | 0.816 44 | 0.634 37 | 0.609 45 | 0.969 12 | 0.418 88 | 0.344 14 | 0.559 74 | 0.833 44 | 0.715 43 | 0.808 18 | 0.560 76 | 0.902 48 | 0.847 44 | 0.680 46
RFCR | - | 0.702 46 | 0.889 18 | 0.745 69 | 0.813 37 | 0.672 50 | 0.818 63 | 0.493 50 | 0.815 46 | 0.623 40 | 0.610 44 | 0.947 64 | 0.470 62 | 0.249 66 | 0.594 62 | 0.848 40 | 0.705 48 | 0.779 41 | 0.646 37 | 0.892 56 | 0.823 55 | 0.611 65
Jingyu Gong, Jiachen Xu, Xin Tan, Haichuan Song, Yanyun Qu, Yuan Xie, Lizhuang Ma: Omni-Supervised Point Cloud Segmentation via Gradual Receptive Field Component Reasoning. CVPR 2021
PointTransformer++ | - | 0.725 39 | 0.727 76 | 0.811 24 | 0.819 34 | 0.765 15 | 0.841 34 | 0.502 41 | 0.814 47 | 0.621 42 | 0.623 39 | 0.955 34 | 0.556 31 | 0.284 44 | 0.620 57 | 0.866 27 | 0.781 14 | 0.757 60 | 0.648 36 | 0.932 25 | 0.862 30 | 0.709 37
contrastBoundary | permissive | 0.705 44 | 0.769 58 | 0.775 49 | 0.809 41 | 0.687 46 | 0.820 59 | 0.439 78 | 0.812 48 | 0.661 25 | 0.591 56 | 0.945 70 | 0.515 46 | 0.171 97 | 0.633 52 | 0.856 33 | 0.720 39 | 0.796 29 | 0.668 32 | 0.889 58 | 0.847 44 | 0.689 43
Liyao Tang, Yibing Zhan, Zhe Chen, Baosheng Yu, Dacheng Tao: Contrastive Boundary Learning for Point Cloud Segmentation. CVPR 2022
joint point-based | permissive | 0.634 78 | 0.614 102 | 0.778 47 | 0.667 88 | 0.633 65 | 0.825 52 | 0.420 83 | 0.804 49 | 0.467 97 | 0.561 60 | 0.951 51 | 0.494 51 | 0.291 40 | 0.566 71 | 0.458 99 | 0.579 96 | 0.764 52 | 0.559 78 | 0.838 89 | 0.814 61 | 0.598 74
Hung-Yueh Chiang, Yen-Liang Lin, Yueh-Cheng Liu, Winston H. Hsu: A Unified Point-Based Framework for 3D Segmentation. 3DV 2019
Superpoint Network | - | 0.683 57 | 0.851 27 | 0.728 77 | 0.800 49 | 0.653 55 | 0.806 72 | 0.468 59 | 0.804 49 | 0.572 65 | 0.602 50 | 0.946 67 | 0.453 72 | 0.239 70 | 0.519 85 | 0.822 46 | 0.689 57 | 0.762 55 | 0.595 61 | 0.895 54 | 0.827 53 | 0.630 62
Feature-Geometry Net | permissive | 0.685 53 | 0.866 22 | 0.748 66 | 0.819 34 | 0.645 60 | 0.794 79 | 0.450 68 | 0.802 51 | 0.587 59 | 0.604 48 | 0.945 70 | 0.464 64 | 0.201 87 | 0.554 76 | 0.840 42 | 0.723 38 | 0.732 71 | 0.602 57 | 0.907 43 | 0.822 57 | 0.603 72
ClickSeg_Semantic | - | 0.703 45 | 0.774 53 | 0.800 30 | 0.793 52 | 0.760 18 | 0.847 29 | 0.471 57 | 0.802 51 | 0.463 99 | 0.634 35 | 0.968 14 | 0.491 53 | 0.271 55 | 0.726 36 | 0.910 9 | 0.706 47 | 0.815 9 | 0.551 82 | 0.878 67 | 0.833 49 | 0.570 82
Feature_GeometricNet | permissive | 0.690 51 | 0.884 19 | 0.754 62 | 0.795 50 | 0.647 58 | 0.818 63 | 0.422 82 | 0.802 51 | 0.612 45 | 0.604 48 | 0.945 70 | 0.462 65 | 0.189 92 | 0.563 73 | 0.853 37 | 0.726 35 | 0.765 51 | 0.632 43 | 0.904 45 | 0.821 58 | 0.606 69
Kangcheng Liu, Ben M. Chen: https://arxiv.org/abs/2012.09439. arXiv Preprint
JSENet | permissive | 0.699 48 | 0.881 20 | 0.762 56 | 0.821 32 | 0.667 51 | 0.800 76 | 0.522 32 | 0.792 54 | 0.613 44 | 0.607 47 | 0.935 90 | 0.492 52 | 0.205 84 | 0.576 67 | 0.853 37 | 0.691 54 | 0.758 58 | 0.652 35 | 0.872 75 | 0.828 52 | 0.649 54
Zeyu Hu, Mingmin Zhen, Xuyang Bai, Hongbo Fu, Chiew-Lan Tai: JSENet: Joint Semantic Segmentation and Edge Detection Network for 3D Point Clouds. ECCV 2020
PointConv | permissive | 0.666 63 | 0.781 48 | 0.759 58 | 0.699 76 | 0.644 61 | 0.822 56 | 0.475 55 | 0.779 55 | 0.564 71 | 0.504 82 | 0.953 44 | 0.428 82 | 0.203 86 | 0.586 65 | 0.754 60 | 0.661 66 | 0.753 61 | 0.588 65 | 0.902 48 | 0.813 63 | 0.642 57
Wenxuan Wu, Zhongang Qi, Li Fuxin: PointConv: Deep Convolutional Networks on 3D Point Clouds. CVPR 2019
PointContrast_LA_SEM | - | 0.683 57 | 0.757 62 | 0.784 44 | 0.786 53 | 0.639 62 | 0.824 54 | 0.408 85 | 0.775 56 | 0.604 51 | 0.541 65 | 0.934 94 | 0.532 40 | 0.269 57 | 0.552 77 | 0.777 57 | 0.645 76 | 0.793 33 | 0.640 40 | 0.913 42 | 0.824 54 | 0.671 48
KP-FCNN | - | 0.684 54 | 0.847 28 | 0.758 60 | 0.784 55 | 0.647 58 | 0.814 66 | 0.473 56 | 0.772 57 | 0.605 50 | 0.594 55 | 0.935 90 | 0.450 73 | 0.181 95 | 0.587 63 | 0.805 52 | 0.690 55 | 0.785 39 | 0.614 51 | 0.882 63 | 0.819 59 | 0.632 61
H. Thomas, C. Qi, J. Deschaud, B. Marcotegui, F. Goulette, L. Guibas: KPConv: Flexible and Deformable Convolution for Point Clouds. ICCV 2019
PointASNL | permissive | 0.666 63 | 0.703 86 | 0.781 46 | 0.751 67 | 0.655 54 | 0.830 47 | 0.471 57 | 0.769 58 | 0.474 95 | 0.537 67 | 0.951 51 | 0.475 60 | 0.279 48 | 0.635 50 | 0.698 74 | 0.675 60 | 0.751 62 | 0.553 81 | 0.816 94 | 0.806 65 | 0.703 40
Xu Yan, Chaoda Zheng, Zhen Li, Sheng Wang, Shuguang Cui: PointASNL: Robust Point Clouds Processing using Nonlocal Neural Networks with Adaptive Sampling. CVPR 2020
Supervoxel-CNN | - | 0.635 77 | 0.656 94 | 0.711 79 | 0.719 71 | 0.613 68 | 0.757 96 | 0.444 75 | 0.765 59 | 0.534 77 | 0.566 59 | 0.928 98 | 0.478 59 | 0.272 53 | 0.636 49 | 0.531 93 | 0.664 64 | 0.645 99 | 0.508 97 | 0.864 82 | 0.792 79 | 0.611 65
O3DSeg | - | 0.668 62 | 0.822 36 | 0.771 52 | 0.496 111 | 0.651 57 | 0.833 44 | 0.541 23 | 0.761 60 | 0.555 74 | 0.611 43 | 0.966 15 | 0.489 54 | 0.370 6 | 0.388 104 | 0.580 88 | 0.776 17 | 0.751 62 | 0.570 70 | 0.956 7 | 0.817 60 | 0.646 56
VI-PointConv | - | 0.676 59 | 0.770 57 | 0.754 62 | 0.783 56 | 0.621 66 | 0.814 66 | 0.552 18 | 0.758 61 | 0.571 68 | 0.557 61 | 0.954 40 | 0.529 41 | 0.268 59 | 0.530 83 | 0.682 75 | 0.675 60 | 0.719 74 | 0.603 56 | 0.888 59 | 0.833 49 | 0.665 49
Xingyi Li, Wenxuan Wu, Xiaoli Z. Fern, Li Fuxin: The Devils in the Point Clouds: Studying the Robustness of Point Cloud Convolutions.
HPGCNN | - | 0.656 67 | 0.698 88 | 0.743 71 | 0.650 92 | 0.564 84 | 0.820 59 | 0.505 40 | 0.758 61 | 0.631 38 | 0.479 86 | 0.945 70 | 0.480 58 | 0.226 72 | 0.572 69 | 0.774 58 | 0.690 55 | 0.735 69 | 0.614 51 | 0.853 86 | 0.776 89 | 0.597 75
Jisheng Dang, Qingyong Hu, Yulan Guo, Jun Yang: HPGCNN.
SAFNet-seg | permissive | 0.654 68 | 0.752 63 | 0.734 75 | 0.664 89 | 0.583 79 | 0.815 65 | 0.399 89 | 0.754 63 | 0.639 35 | 0.535 69 | 0.942 80 | 0.470 62 | 0.309 31 | 0.665 42 | 0.539 91 | 0.650 69 | 0.708 79 | 0.635 42 | 0.857 85 | 0.793 76 | 0.642 57
Linqing Zhao, Jiwen Lu, Jie Zhou: Similarity-Aware Fusion Network for 3D Semantic Segmentation. IROS 2021
SparseConvNet | - | 0.725 39 | 0.647 96 | 0.821 11 | 0.846 17 | 0.721 38 | 0.869 6 | 0.533 27 | 0.754 63 | 0.603 52 | 0.614 42 | 0.955 34 | 0.572 24 | 0.325 25 | 0.710 38 | 0.870 25 | 0.724 37 | 0.823 4 | 0.628 45 | 0.934 22 | 0.865 29 | 0.683 45
SALANet | - | 0.670 61 | 0.816 38 | 0.770 53 | 0.768 60 | 0.652 56 | 0.807 71 | 0.451 65 | 0.747 65 | 0.659 28 | 0.545 64 | 0.924 100 | 0.473 61 | 0.149 107 | 0.571 70 | 0.811 51 | 0.635 80 | 0.746 65 | 0.623 48 | 0.892 56 | 0.794 74 | 0.570 82
ROSMRF3D | - | 0.673 60 | 0.789 44 | 0.748 66 | 0.763 63 | 0.635 64 | 0.814 66 | 0.407 87 | 0.747 65 | 0.581 63 | 0.573 58 | 0.950 55 | 0.484 56 | 0.271 55 | 0.607 59 | 0.754 60 | 0.649 70 | 0.774 44 | 0.596 59 | 0.883 62 | 0.823 55 | 0.606 69
PointSPNet | - | 0.637 75 | 0.734 71 | 0.692 91 | 0.714 73 | 0.576 81 | 0.797 78 | 0.446 70 | 0.743 67 | 0.598 55 | 0.437 97 | 0.942 80 | 0.403 91 | 0.150 106 | 0.626 54 | 0.800 55 | 0.649 70 | 0.697 83 | 0.557 79 | 0.846 88 | 0.777 88 | 0.563 86
FusionNet | - | 0.688 52 | 0.704 85 | 0.741 73 | 0.754 65 | 0.656 53 | 0.829 48 | 0.501 42 | 0.741 68 | 0.609 48 | 0.548 63 | 0.950 55 | 0.522 44 | 0.371 5 | 0.633 52 | 0.756 59 | 0.715 43 | 0.771 47 | 0.623 48 | 0.861 83 | 0.814 61 | 0.658 51
Feihu Zhang, Jin Fang, Benjamin Wah, Philip Torr: Deep FusionNet for Point Cloud Semantic Segmentation. ECCV 2020
RandLA-Net | permissive | 0.645 69 | 0.778 49 | 0.731 76 | 0.699 76 | 0.577 80 | 0.829 48 | 0.446 70 | 0.736 69 | 0.477 94 | 0.523 75 | 0.945 70 | 0.454 69 | 0.269 57 | 0.484 94 | 0.749 63 | 0.618 83 | 0.738 67 | 0.599 58 | 0.827 91 | 0.792 79 | 0.621 64
FusionAwareConv | - | 0.630 83 | 0.604 104 | 0.741 73 | 0.766 62 | 0.590 75 | 0.747 98 | 0.501 42 | 0.734 70 | 0.503 87 | 0.527 71 | 0.919 104 | 0.454 69 | 0.323 26 | 0.550 79 | 0.420 103 | 0.678 59 | 0.688 87 | 0.544 86 | 0.896 53 | 0.795 73 | 0.627 63
Jiazhao Zhang, Chenyang Zhu, Lintao Zheng, Kai Xu: Fusion-Aware Point Convolution for Online Semantic 3D Scene Segmentation. CVPR 2020
LAP-D | - | 0.594 93 | 0.720 80 | 0.692 91 | 0.637 98 | 0.456 103 | 0.773 90 | 0.391 95 | 0.730 71 | 0.587 59 | 0.445 96 | 0.940 84 | 0.381 96 | 0.288 41 | 0.434 100 | 0.453 101 | 0.591 92 | 0.649 97 | 0.581 68 | 0.777 98 | 0.749 98 | 0.610 67
3DSM_DMMF | - | 0.631 80 | 0.626 99 | 0.745 69 | 0.801 48 | 0.607 69 | 0.751 97 | 0.506 39 | 0.729 72 | 0.565 70 | 0.491 84 | 0.866 114 | 0.434 77 | 0.197 90 | 0.595 61 | 0.630 84 | 0.709 46 | 0.705 81 | 0.560 76 | 0.875 70 | 0.740 99 | 0.491 103
One-Thing-One-Click | - | 0.693 49 | 0.743 67 | 0.794 36 | 0.655 91 | 0.684 47 | 0.822 56 | 0.497 47 | 0.719 73 | 0.622 41 | 0.617 41 | 0.977 10 | 0.447 75 | 0.339 17 | 0.750 30 | 0.664 81 | 0.703 50 | 0.790 36 | 0.596 59 | 0.946 12 | 0.855 37 | 0.647 55
Zhengzhe Liu, Xiaojuan Qi, Chi-Wing Fu: One Thing One Click: A Self-Training Approach for Weakly Supervised 3D Semantic Segmentation. CVPR 2021
SConv | - | 0.636 76 | 0.830 33 | 0.697 87 | 0.752 66 | 0.572 83 | 0.780 87 | 0.445 72 | 0.716 74 | 0.529 78 | 0.530 70 | 0.951 51 | 0.446 76 | 0.170 98 | 0.507 89 | 0.666 80 | 0.636 79 | 0.682 89 | 0.541 89 | 0.886 60 | 0.799 69 | 0.594 76
PointMRNet | - | 0.640 72 | 0.717 82 | 0.701 84 | 0.692 79 | 0.576 81 | 0.801 75 | 0.467 61 | 0.716 74 | 0.563 72 | 0.459 92 | 0.953 44 | 0.429 81 | 0.169 99 | 0.581 66 | 0.854 36 | 0.605 86 | 0.710 76 | 0.550 83 | 0.894 55 | 0.793 76 | 0.575 80
SD-DETR | - | 0.576 97 | 0.746 65 | 0.609 110 | 0.445 116 | 0.517 95 | 0.643 111 | 0.366 99 | 0.714 76 | 0.456 100 | 0.468 90 | 0.870 113 | 0.432 78 | 0.264 60 | 0.558 75 | 0.674 76 | 0.586 95 | 0.688 87 | 0.482 103 | 0.739 103 | 0.733 101 | 0.537 94
DPC | - | 0.592 94 | 0.720 80 | 0.700 85 | 0.602 103 | 0.480 99 | 0.762 95 | 0.380 98 | 0.713 77 | 0.585 62 | 0.437 97 | 0.940 84 | 0.369 98 | 0.288 41 | 0.434 100 | 0.509 97 | 0.590 94 | 0.639 102 | 0.567 74 | 0.772 99 | 0.755 96 | 0.592 77
Francis Engelmann, Theodora Kontogianni, Bastian Leibe: Dilated Point Convolutions: On the Receptive Field Size of Point Convolutions on 3D Point Clouds. ICRA 2020
3DWSSS | - | 0.425 115 | 0.525 110 | 0.647 102 | 0.522 108 | 0.324 115 | 0.488 120 | 0.077 121 | 0.712 78 | 0.353 113 | 0.401 101 | 0.636 121 | 0.281 110 | 0.176 96 | 0.340 107 | 0.565 90 | 0.175 120 | 0.551 110 | 0.398 115 | 0.370 121 | 0.602 116 | 0.361 114
PointNet2-SFPN | - | 0.631 80 | 0.771 55 | 0.692 91 | 0.672 84 | 0.524 93 | 0.837 39 | 0.440 77 | 0.706 79 | 0.538 76 | 0.446 94 | 0.944 76 | 0.421 87 | 0.219 77 | 0.552 77 | 0.751 62 | 0.591 92 | 0.737 68 | 0.543 88 | 0.901 50 | 0.768 91 | 0.557 89
PanopticFusion-label | - | 0.529 103 | 0.491 113 | 0.688 94 | 0.604 102 | 0.386 108 | 0.632 112 | 0.225 118 | 0.705 80 | 0.434 105 | 0.293 112 | 0.815 116 | 0.348 102 | 0.241 69 | 0.499 90 | 0.669 78 | 0.507 102 | 0.649 97 | 0.442 111 | 0.796 96 | 0.602 116 | 0.561 87
Gaku Narita, Takashi Seno, Tomoya Ishikawa, Yohsuke Kaji: PanopticFusion: Online Volumetric Semantic Mapping at the Level of Stuff and Things. IROS 2019 (to appear)
DCM-Net | - | 0.658 66 | 0.778 49 | 0.702 83 | 0.806 45 | 0.619 67 | 0.813 69 | 0.468 59 | 0.693 81 | 0.494 88 | 0.524 73 | 0.941 82 | 0.449 74 | 0.298 36 | 0.510 87 | 0.821 47 | 0.675 60 | 0.727 73 | 0.568 73 | 0.826 92 | 0.803 67 | 0.637 59
Jonas Schult*, Francis Engelmann*, Theodora Kontogianni, Bastian Leibe: DualConvMesh-Net: Joint Geodesic and Euclidean Convolutions on 3D Meshes. CVPR 2020 [Oral]
HPEIN | - | 0.618 89 | 0.729 74 | 0.668 97 | 0.647 94 | 0.597 73 | 0.766 92 | 0.414 84 | 0.680 82 | 0.520 81 | 0.525 72 | 0.946 67 | 0.432 78 | 0.215 79 | 0.493 92 | 0.599 87 | 0.638 78 | 0.617 104 | 0.570 70 | 0.897 52 | 0.806 65 | 0.605 71
Li Jiang, Hengshuang Zhao, Shu Liu, Xiaoyong Shen, Chi-Wing Fu, Jiaya Jia: Hierarchical Point-Edge Interaction Network for Point Cloud Semantic Segmentation. ICCV 2019
MVPNet | permissive | 0.641 70 | 0.831 32 | 0.715 78 | 0.671 86 | 0.590 75 | 0.781 85 | 0.394 91 | 0.679 83 | 0.642 33 | 0.553 62 | 0.937 87 | 0.462 65 | 0.256 63 | 0.649 45 | 0.406 104 | 0.626 81 | 0.691 86 | 0.666 33 | 0.877 68 | 0.792 79 | 0.608 68
Maximilian Jaritz, Jiayuan Gu, Hao Su: Multi-view PointNet for 3D Scene Understanding. GMDL Workshop, ICCV 2019
TextureNet | permissive | 0.566 99 | 0.672 92 | 0.664 99 | 0.671 86 | 0.494 97 | 0.719 101 | 0.445 72 | 0.678 84 | 0.411 108 | 0.396 102 | 0.935 90 | 0.356 100 | 0.225 74 | 0.412 102 | 0.535 92 | 0.565 98 | 0.636 103 | 0.464 105 | 0.794 97 | 0.680 109 | 0.568 84
Jingwei Huang, Haotian Zhang, Li Yi, Thomas Funkhouser, Matthias Niessner, Leonidas Guibas: TextureNet: Consistent Local Parametrizations for Learning from High-Resolution Signals on Meshes. CVPR
dtc_net | - | 0.625 86 | 0.703 86 | 0.751 65 | 0.794 51 | 0.535 91 | 0.848 27 | 0.480 54 | 0.676 85 | 0.528 79 | 0.469 89 | 0.944 76 | 0.454 69 | 0.004 119 | 0.464 96 | 0.636 83 | 0.704 49 | 0.758 58 | 0.548 85 | 0.924 30 | 0.787 83 | 0.492 102
PointConv-SFPN | - | 0.641 70 | 0.776 51 | 0.703 82 | 0.721 70 | 0.557 87 | 0.826 51 | 0.451 65 | 0.672 86 | 0.563 72 | 0.483 85 | 0.943 79 | 0.425 85 | 0.162 102 | 0.644 47 | 0.726 65 | 0.659 67 | 0.709 78 | 0.572 69 | 0.875 70 | 0.786 84 | 0.559 88
Weakly-Openseg v3 | - | 0.625 86 | 0.924 8 | 0.787 42 | 0.620 99 | 0.555 89 | 0.811 70 | 0.393 92 | 0.666 87 | 0.382 110 | 0.520 76 | 0.953 44 | 0.250 114 | 0.208 81 | 0.604 60 | 0.670 77 | 0.644 77 | 0.742 66 | 0.538 91 | 0.919 36 | 0.803 67 | 0.513 100
APCF-Net | - | 0.631 80 | 0.742 68 | 0.687 96 | 0.672 84 | 0.557 87 | 0.792 82 | 0.408 85 | 0.665 88 | 0.545 75 | 0.508 79 | 0.952 49 | 0.428 82 | 0.186 93 | 0.634 51 | 0.702 72 | 0.620 82 | 0.706 80 | 0.555 80 | 0.873 73 | 0.798 71 | 0.581 78
Haojia Lin: Adaptive Pyramid Context Fusion for Point Cloud Perception. GRSL
DVVNet | - | 0.562 100 | 0.648 95 | 0.700 85 | 0.770 59 | 0.586 78 | 0.687 105 | 0.333 104 | 0.650 89 | 0.514 84 | 0.475 88 | 0.906 108 | 0.359 99 | 0.223 76 | 0.340 107 | 0.442 102 | 0.422 111 | 0.668 94 | 0.501 98 | 0.708 106 | 0.779 86 | 0.534 95
AttAN | - | 0.609 91 | 0.760 60 | 0.667 98 | 0.649 93 | 0.521 94 | 0.793 80 | 0.457 64 | 0.648 90 | 0.528 79 | 0.434 99 | 0.947 64 | 0.401 92 | 0.153 105 | 0.454 97 | 0.721 68 | 0.648 72 | 0.717 75 | 0.536 92 | 0.904 45 | 0.765 92 | 0.485 104
Gege Zhang, Qinghua Ma, Licheng Jiao, Fang Liu and Qigong Sun: AttAN: Attention Adversarial Networks for 3D Point Cloud Semantic Segmentation. IJCAI 2020
SIConv | - | 0.625 86 | 0.830 33 | 0.694 89 | 0.757 64 | 0.563 85 | 0.772 91 | 0.448 69 | 0.647 91 | 0.520 81 | 0.509 78 | 0.949 59 | 0.431 80 | 0.191 91 | 0.496 91 | 0.614 86 | 0.647 73 | 0.672 93 | 0.535 93 | 0.876 69 | 0.783 85 | 0.571 81
SPH3D-GCN | permissive | 0.610 90 | 0.858 26 | 0.772 50 | 0.489 112 | 0.532 92 | 0.792 82 | 0.404 88 | 0.643 92 | 0.570 69 | 0.507 81 | 0.935 90 | 0.414 89 | 0.046 116 | 0.510 87 | 0.702 72 | 0.602 88 | 0.705 81 | 0.549 84 | 0.859 84 | 0.773 90 | 0.534 95
Huan Lei, Naveed Akhtar, and Ajmal Mian: Spherical Kernel for Efficient Graph Convolution on 3D Point Clouds. TPAMI 2020
SegGroup_sem | permissive | 0.627 85 | 0.818 37 | 0.747 68 | 0.701 75 | 0.602 71 | 0.764 93 | 0.385 97 | 0.629 93 | 0.490 90 | 0.508 79 | 0.931 97 | 0.409 90 | 0.201 87 | 0.564 72 | 0.725 66 | 0.618 83 | 0.692 85 | 0.539 90 | 0.873 73 | 0.794 74 | 0.548 92
An Tao, Yueqi Duan, Yi Wei, Jiwen Lu, Jie Zhou: SegGroup: Seg-Level Supervision for 3D Instance and Semantic Segmentation. TIP 2022
Online SegFusion | - | 0.515 105 | 0.607 103 | 0.644 104 | 0.579 105 | 0.434 105 | 0.630 113 | 0.353 101 | 0.628 94 | 0.440 103 | 0.410 100 | 0.762 119 | 0.307 106 | 0.167 100 | 0.520 84 | 0.403 105 | 0.516 101 | 0.565 107 | 0.447 109 | 0.678 109 | 0.701 106 | 0.514 99
Davide Menini, Suryansh Kumar, Martin R. Oswald, Erik Sandstroem, Cristian Sminchisescu, Luc van Gool: A Real-Time Learning Framework for Joint 3D Reconstruction and Semantic Segmentation. Robotics and Automation Letters Submission
CCRFNet | - | 0.589 95 | 0.766 59 | 0.659 101 | 0.683 81 | 0.470 102 | 0.740 100 | 0.387 96 | 0.620 95 | 0.490 90 | 0.476 87 | 0.922 102 | 0.355 101 | 0.245 68 | 0.511 86 | 0.511 96 | 0.571 97 | 0.643 100 | 0.493 101 | 0.872 75 | 0.762 93 | 0.600 73
subcloud_weak | - | 0.516 104 | 0.676 90 | 0.591 113 | 0.609 100 | 0.442 104 | 0.774 89 | 0.335 103 | 0.597 96 | 0.422 107 | 0.357 107 | 0.932 96 | 0.341 103 | 0.094 112 | 0.298 109 | 0.528 95 | 0.473 107 | 0.676 91 | 0.495 100 | 0.602 114 | 0.721 104 | 0.349 116
DenSeR | - | 0.628 84 | 0.800 41 | 0.625 106 | 0.719 71 | 0.545 90 | 0.806 72 | 0.445 72 | 0.597 96 | 0.448 102 | 0.519 77 | 0.938 86 | 0.481 57 | 0.328 24 | 0.489 93 | 0.499 98 | 0.657 68 | 0.759 57 | 0.592 62 | 0.881 64 | 0.797 72 | 0.634 60
3DMV, FTSDF | - | 0.501 106 | 0.558 108 | 0.608 111 | 0.424 118 | 0.478 100 | 0.690 104 | 0.246 114 | 0.586 98 | 0.468 96 | 0.450 93 | 0.911 106 | 0.394 93 | 0.160 103 | 0.438 98 | 0.212 114 | 0.432 110 | 0.541 112 | 0.475 104 | 0.742 102 | 0.727 102 | 0.477 106
wsss-transformer | - | 0.600 92 | 0.634 98 | 0.743 71 | 0.697 78 | 0.601 72 | 0.781 85 | 0.437 79 | 0.585 99 | 0.493 89 | 0.446 94 | 0.933 95 | 0.394 93 | 0.011 118 | 0.654 44 | 0.661 82 | 0.603 87 | 0.733 70 | 0.526 94 | 0.832 90 | 0.761 94 | 0.480 105
3DMV | - | 0.484 108 | 0.484 114 | 0.538 116 | 0.643 96 | 0.424 106 | 0.606 116 | 0.310 105 | 0.574 100 | 0.433 106 | 0.378 103 | 0.796 117 | 0.301 107 | 0.214 80 | 0.537 82 | 0.208 115 | 0.472 108 | 0.507 116 | 0.413 114 | 0.693 107 | 0.602 116 | 0.539 93
Angela Dai, Matthias Niessner: 3DMV: Joint 3D-Multi-View Prediction for 3D Semantic Scene Segmentation. ECCV'18
PPCNN++permissive0.663 650.746 650.708 800.722 690.638 630.820 590.451 650.566 1010.599 540.541 650.950 550.510 480.313 290.648 460.819 490.616 850.682 890.590 630.869 790.810 640.656 52
Pyunghwan Ahn, Juyoung Yang, Eojindl Yi, Chanho Lee, Junmo Kim: Projection-based Point Convolution for Efficient Point Cloud Segmentation. IEEE Access
PointMTL0.632 790.731 730.688 940.675 830.591 740.784 840.444 750.565 1020.610 460.492 830.949 590.456 680.254 640.587 630.706 700.599 890.665 950.612 540.868 800.791 820.579 79
SQN_0.1%0.569 980.676 900.696 880.657 900.497 960.779 880.424 810.548 1030.515 830.376 1040.902 1110.422 860.357 100.379 1050.456 1000.596 910.659 960.544 860.685 1080.665 1120.556 90
Pointnet++ & Featurepermissive0.557 1010.735 700.661 1000.686 800.491 980.744 990.392 930.539 1040.451 1010.375 1050.946 670.376 970.205 840.403 1030.356 1070.553 990.643 1000.497 990.824 930.756 950.515 98
PD-Net0.638 740.797 420.769 540.641 970.590 750.820 590.461 630.537 1050.637 360.536 680.947 640.388 950.206 830.656 430.668 790.647 730.732 710.585 670.868 800.793 760.473 108
FPConvpermissive0.639 730.785 460.760 570.713 740.603 700.798 770.392 930.534 1060.603 520.524 730.948 620.457 670.250 650.538 810.723 670.598 900.696 840.614 510.872 750.799 690.567 85
Yiqun Lin, Zizheng Yan, Haibin Huang, Dong Du, Ligang Liu, Shuguang Cui, Xiaoguang Han: FPConv: Learning Local Flattening for Point Convolution. CVPR 2020
ROSMRF0.580 960.772 540.707 810.681 820.563 850.764 930.362 1000.515 1070.465 980.465 910.936 890.427 840.207 820.438 980.577 890.536 1000.675 920.486 1020.723 1050.779 860.524 97
GMLPs0.538 1020.495 1120.693 900.647 940.471 1010.793 800.300 1070.477 1080.505 860.358 1060.903 1100.327 1040.081 1130.472 950.529 940.448 1090.710 760.509 950.746 1010.737 1000.554 91
PCNN0.498 1070.559 1070.644 1040.560 1070.420 1070.711 1030.229 1160.414 1090.436 1040.352 1080.941 820.324 1050.155 1040.238 1140.387 1060.493 1030.529 1130.509 950.813 950.751 970.504 101
SPLAT Netcopyleft0.393 1160.472 1160.511 1170.606 1010.311 1170.656 1070.245 1150.405 1100.328 1160.197 1190.927 990.227 1170.000 1210.001 1220.249 1110.271 1190.510 1140.383 1170.593 1150.699 1070.267 118
Hang Su, Varun Jampani, Deqing Sun, Subhransu Maji, Evangelos Kalogerakis, Ming-Hsuan Yang, Jan Kautz: SPLATNet: Sparse Lattice Networks for Point Cloud Processing. CVPR 2018
SurfaceConvPF0.442 1120.505 1110.622 1080.380 1190.342 1140.654 1080.227 1170.397 1110.367 1120.276 1140.924 1000.240 1150.198 890.359 1060.262 1100.366 1130.581 1050.435 1120.640 1110.668 1100.398 111
Hao Pan, Shilin Liu, Yang Liu, Xin Tong: Convolutional Neural Networks on 3D Surfaces Using Parallel Frames.
DGCNN_reproducecopyleft0.446 1110.474 1150.623 1070.463 1140.366 1110.651 1090.310 1050.389 1120.349 1140.330 1090.937 870.271 1110.126 1090.285 1100.224 1130.350 1160.577 1060.445 1100.625 1120.723 1030.394 112
Yue Wang, Yongbin Sun, Ziwei Liu, Sanjay E. Sarma, Michael M. Bronstein, Justin M. Solomon: Dynamic Graph CNN for Learning on Point Clouds. TOG 2019
PointCNN with RGBpermissive0.458 1090.577 1060.611 1090.356 1200.321 1160.715 1020.299 1090.376 1130.328 1160.319 1100.944 760.285 1090.164 1010.216 1170.229 1120.484 1050.545 1110.456 1070.755 1000.709 1050.475 107
Yangyan Li, Rui Bu, Mingchao Sun, Baoquan Chen: PointCNN. NeurIPS 2018
GrowSP++0.323 1190.114 1210.589 1140.499 1100.147 1210.555 1170.290 1110.336 1140.290 1180.262 1160.865 1150.102 1210.000 1210.037 1200.000 1220.000 1220.462 1180.381 1180.389 1200.664 1130.473 108
PNET20.442 1120.548 1090.548 1150.597 1040.363 1120.628 1140.300 1070.292 1150.374 1110.307 1110.881 1120.268 1120.186 930.238 1140.204 1160.407 1120.506 1170.449 1080.667 1100.620 1150.462 110
SSC-UNetpermissive0.308 1200.353 1180.290 1210.278 1210.166 1200.553 1180.169 1200.286 1160.147 1210.148 1210.908 1070.182 1190.064 1150.023 1210.018 1210.354 1150.363 1190.345 1190.546 1180.685 1080.278 117
Tangent Convolutionspermissive0.438 1140.437 1170.646 1030.474 1130.369 1100.645 1100.353 1010.258 1170.282 1190.279 1130.918 1050.298 1080.147 1080.283 1110.294 1090.487 1040.562 1080.427 1130.619 1130.633 1140.352 115
Maxim Tatarchenko, Jaesik Park, Vladlen Koltun, Qian-Yi Zhou: Tangent Convolutions for Dense Prediction in 3D. CVPR 2018
PointNet++permissive0.339 1180.584 1050.478 1190.458 1150.256 1190.360 1210.250 1130.247 1180.278 1200.261 1170.677 1200.183 1180.117 1100.212 1180.145 1190.364 1140.346 1210.232 1210.548 1160.523 1200.252 119
Charles R. Qi, Li Yi, Hao Su, Leonidas J. Guibas: PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space.
ScanNet+FTSDF0.383 1170.297 1190.491 1180.432 1170.358 1130.612 1150.274 1120.116 1190.411 1080.265 1150.904 1090.229 1160.079 1140.250 1120.185 1170.320 1170.510 1140.385 1160.548 1160.597 1190.394 112
FCPNpermissive0.447 1100.679 890.604 1120.578 1060.380 1090.682 1060.291 1100.106 1200.483 930.258 1180.920 1030.258 1130.025 1170.231 1160.325 1080.480 1060.560 1090.463 1060.725 1040.666 1110.231 120
Dario Rethage, Johanna Wald, Jürgen Sturm, Nassir Navab, Federico Tombari: Fully-Convolutional Point Networks for Large-Scale Point Clouds. ECCV 2018
ERROR0.054 1220.000 1220.041 1220.172 1220.030 1220.062 1230.001 1220.035 1210.004 1220.051 1220.143 1220.019 1220.003 1200.041 1190.050 1200.003 1210.054 1220.018 1220.005 1230.264 1220.082 122
ScanNetpermissive0.306 1210.203 1200.366 1200.501 1090.311 1170.524 1190.211 1190.002 1220.342 1150.189 1200.786 1180.145 1200.102 1110.245 1130.152 1180.318 1180.348 1200.300 1200.460 1190.437 1210.182 121
Angela Dai, Angel X. Chang, Manolis Savva, Maciej Halber, Thomas Funkhouser, Matthias Nießner: ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes. CVPR'17
MVF-GNN0.014 1230.000 1220.000 1230.000 1230.007 1230.086 1220.000 1230.000 1230.001 1230.000 1230.029 1230.001 1230.000 1210.000 1230.000 1220.000 1220.000 1230.018 1220.015 1220.115 1230.000 123


This table lists the benchmark results for the 3D semantic instance scenario.




Method | Info | avg ap | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | window
Sorted by: curtain
DENet0.413 400.741 300.520 360.237 560.284 400.523 550.097 310.691 10.138 450.209 550.229 570.238 430.390 460.707 330.310 320.448 600.470 290.892 280.310 33
SIM3D0.617 30.952 40.629 170.539 110.426 160.768 110.302 60.681 20.425 90.473 160.511 160.701 20.717 10.821 60.467 150.774 10.559 150.914 180.448 3
SoftGrouppermissive0.504 260.667 440.579 240.372 420.381 230.694 260.072 330.677 30.303 260.387 240.531 130.319 320.582 150.754 200.318 310.643 260.492 270.907 220.388 14
Thang Vu, Kookhoi Kim, Tung M. Luu, Xuan Thanh Nguyen, Chang D. Yoo: SoftGroup for 3D Instance Segmentation on Point Clouds. CVPR 2022 [Oral]
RPGN0.428 390.630 520.508 430.367 430.249 480.658 350.016 560.673 40.131 480.234 450.383 400.270 380.434 350.748 220.274 390.609 300.406 380.842 470.267 46
Shichao Dong, Guosheng Lin, Tzu-Yi Hung: Learning Regional Purity for Instance Segmentation on 3D Point Clouds. ECCV 2022
SoftGroup++0.513 240.704 370.578 260.398 360.363 300.704 240.061 370.647 50.297 300.378 260.537 100.343 280.614 130.828 50.295 350.710 140.505 260.875 340.394 11
DD-UNet+Group0.436 350.630 520.508 440.480 170.310 380.624 450.065 350.638 60.174 410.256 420.384 390.194 490.428 370.759 190.289 360.574 400.400 390.849 420.291 38
H. Liu, R. Liu, K. Yang, J. Zhang, K. Peng, R. Stiefelhagen: HIDA: Towards Holistic Indoor Understanding for the Visually Impaired via Semantic Instance Segmentation with a Wearable Solid-State LiDAR Sensor. ICCVW 2021
SSEC0.465 320.667 440.578 250.502 130.362 310.641 400.035 460.605 70.291 310.323 320.451 290.296 340.417 430.677 370.245 440.501 530.506 250.900 250.366 18
EV3D0.615 50.946 50.652 130.555 60.433 140.773 60.271 130.604 80.447 50.506 70.544 70.698 30.716 20.775 160.480 90.747 50.572 130.925 140.435 6
Spherical Mask(CtoF)0.616 40.946 50.654 120.555 60.434 130.769 100.271 120.604 80.447 50.505 80.549 30.698 30.716 20.775 160.480 90.747 50.575 110.925 140.436 5
CSC-Pretrained0.405 420.738 320.465 500.331 490.205 540.655 360.051 410.601 100.092 550.211 540.329 450.198 480.459 310.775 140.195 510.524 510.400 400.878 310.184 56
Mask3D0.566 170.926 80.597 200.408 340.420 180.737 180.239 170.598 110.386 140.458 200.549 30.568 190.716 20.601 450.480 90.646 250.575 110.922 160.364 19
Jonas Schult, Francis Engelmann, Alexander Hermans, Or Litany, Siyu Tang, Bastian Leibe: Mask3D for 3D Semantic Instance Segmentation. ICRA 2023
PointRel0.622 10.926 80.710 30.541 100.502 20.772 70.314 40.598 110.425 80.504 100.565 20.650 70.716 20.809 70.476 120.747 40.618 20.963 40.364 20
Relation3D (PointRel): Enhancing Relation Modeling for Point Cloud Instance Segmentation.
GraphCut0.552 201.000 10.611 190.438 270.392 220.714 230.139 250.598 130.327 230.389 230.510 170.598 120.427 400.754 210.463 180.761 30.588 90.903 230.329 30
IPCA-Inst0.520 230.889 140.551 310.548 90.418 190.665 330.064 360.585 140.260 330.277 380.471 250.500 240.644 90.785 110.369 260.591 350.511 220.878 320.362 23
PE0.396 430.667 440.467 490.446 250.243 490.624 460.022 520.577 150.106 520.219 480.340 430.239 420.487 270.475 560.225 460.541 480.350 490.818 500.273 44
Biao Zhang, Peter Wonka: Point Cloud Instance Segmentation using Probabilistic Embeddings. CVPR 2021
MG-Former0.587 100.852 170.639 150.454 220.393 210.758 140.338 20.572 160.480 30.527 30.491 220.671 60.527 220.867 10.485 60.601 310.590 80.938 120.390 12
PointGroup0.407 410.639 510.496 450.415 330.243 500.645 390.021 530.570 170.114 510.211 530.359 420.217 470.428 380.660 400.256 420.562 440.341 510.860 380.291 37
Li Jiang, Hengshuang Zhao, Shaoshuai Shi, Shu Liu, Chi-Wing Fu, Jiaya Jia: PointGroup: Dual-Set Point Grouping for 3D Instance Segmentation. CVPR 2020 [oral]
SPFormerpermissive0.549 210.745 280.640 140.484 160.395 200.739 170.311 50.566 180.335 210.468 180.492 210.555 200.478 280.747 230.436 200.712 120.540 210.893 270.343 27
Sun Jiahao, Qing Chunmei, Tan Junpeng, Xu Xiangmin: Superpoint Transformer for 3D Scene Instance Segmentation. AAAI 2023 [Oral]
Competitor-SPFormer0.580 140.721 350.705 40.593 40.444 120.786 50.286 90.564 190.376 160.498 120.534 120.546 210.390 450.785 120.577 20.708 150.579 100.954 80.388 13
GICN0.341 550.580 560.371 560.344 470.198 550.469 580.052 400.564 200.093 530.212 510.212 590.127 570.347 500.537 470.206 480.525 500.329 540.729 580.241 50
MAFT0.596 80.889 140.721 20.448 230.460 90.768 120.251 160.558 210.408 110.504 90.539 90.616 110.618 120.858 30.482 80.684 190.551 180.931 130.450 2
TopoSeg0.479 300.704 370.564 280.467 200.366 280.633 410.068 340.554 220.262 320.328 300.447 310.323 300.534 200.722 270.288 370.614 290.482 280.912 190.358 25
HAISpermissive0.457 340.704 370.561 290.457 210.364 290.673 300.046 450.547 230.194 400.308 330.426 320.288 350.454 320.711 300.262 410.563 430.434 350.889 290.344 26
Shaoyu Chen, Jiemin Fang, Qian Zhang, Wenyu Liu, Xinggang Wang: Hierarchical Aggregation for 3D Instance Segmentation. ICCV 2021
DualGroup0.469 310.815 210.552 300.398 350.374 250.683 290.130 260.539 240.310 240.327 310.407 340.276 370.447 330.535 490.342 290.659 220.455 310.900 260.301 35
ISBNetpermissive0.559 190.939 70.655 110.383 400.426 170.763 130.180 210.534 250.386 130.499 110.509 180.621 100.427 390.704 340.467 140.649 230.571 140.948 100.401 9
Tuan Duc Ngo, Binh-Son Hua, Khoi Nguyen: ISBNet: a 3D Point Cloud Instance Segmentation Network with Instance-aware Sampling and Box-aware Dynamic Convolution. CVPR 2023
KmaxOneFormerNetpermissive0.581 130.745 280.692 80.551 80.458 100.798 30.264 150.531 260.369 190.513 50.531 140.632 90.494 250.798 90.567 30.648 240.558 170.950 90.362 22
UniPerception0.588 90.963 30.667 100.493 150.472 70.750 150.229 190.528 270.468 40.498 130.542 80.643 80.530 210.661 390.463 160.695 180.599 30.972 10.420 7
Queryformer0.583 120.926 80.702 50.393 370.504 10.733 210.276 110.527 280.373 170.479 150.534 110.533 230.697 70.720 290.436 210.745 70.592 70.958 70.363 21
Mask-Group0.434 370.778 250.516 390.471 190.330 340.658 340.029 480.526 290.249 340.256 410.400 350.309 330.384 480.296 650.368 270.575 390.425 360.877 330.362 24
Min Zhong, Xinghao Chen, Xiaokang Chen, Gang Zeng, Yunhe Wang: MaskGroup: Hierarchical Point Grouping and Masking for 3D Instance Segmentation. ICME 2022
TD3Dpermissive0.489 280.852 170.511 410.434 290.322 350.735 200.101 300.512 300.355 200.349 280.468 260.283 360.514 240.676 380.268 400.671 210.510 230.908 210.329 31
Maksim Kolodiazhnyi, Anna Vorontsova, Anton Konushin, Danila Rukhovich: Top-Down Beats Bottom-Up in 3D Instance Segmentation. WACV 2024
DKNet0.532 220.815 210.624 180.517 120.377 240.749 160.107 270.509 310.304 250.437 210.475 230.581 150.539 190.775 150.339 300.640 270.506 240.901 240.385 15
Yizheng Wu, Min Shi, Shuaiyuan Du, Hao Lu, Zhiguo Cao, Weicai Zhong: 3D Instances as 1D Kernels. ECCV 2022
Competitor-MAFT0.618 20.866 160.724 10.628 10.484 40.803 20.300 70.509 320.496 10.539 10.547 60.703 10.668 80.708 320.463 170.708 160.595 40.959 60.418 8
DCD0.614 60.892 130.633 160.434 280.495 30.810 10.292 80.501 330.408 100.525 40.582 10.688 50.625 100.801 80.608 10.672 200.649 10.965 30.476 1
One_Thing_One_Clickpermissive0.326 560.472 620.361 570.232 570.183 570.555 530.000 690.498 340.038 640.195 560.226 580.362 270.168 630.469 570.251 430.553 460.335 530.846 440.117 64
Zhengzhe Liu, Xiaojuan Qi, Chi-Wing Fu: One Thing One Click: A Self-Training Approach for Weakly Supervised 3D Semantic Segmentation. CVPR 2021
RWSeg0.348 540.475 610.456 520.320 500.275 440.476 570.020 540.491 350.056 620.212 520.320 460.261 390.302 530.520 500.182 530.557 450.285 610.867 360.197 53
ExtMask3D0.598 70.852 170.692 70.433 310.461 80.791 40.264 140.488 360.493 20.508 60.528 150.594 130.706 60.791 100.483 70.734 90.595 50.911 200.437 4
PBNetpermissive0.573 150.926 80.575 270.619 20.472 60.736 190.239 180.487 370.383 150.459 190.506 190.533 220.585 140.767 180.404 240.717 100.559 160.969 20.381 16
Weiguang Zhao, Yuyao Yan, Chaolong Yang, Jianan Ye, Xi Yang, Kaizhu Huang: Divide and Conquer: 3D Instance Segmentation With Point-Wise Binarization. ICCV 2023
AOIA0.387 460.704 370.515 400.385 390.225 530.669 310.005 630.482 380.126 490.181 580.269 540.221 460.426 410.478 550.218 470.592 340.371 440.851 390.242 49
TST3D0.569 160.778 250.675 90.598 30.451 110.727 220.280 100.476 390.395 120.472 170.457 280.583 140.580 160.777 130.462 190.735 80.547 200.919 170.333 28
Duc Tran Dang Trung, Byeongkeun Kang, Yeejin Lee: MSTA3D: Multi-scale Twin-attention for 3D Instance Segmentation. ACM Multimedia 2024
InsSSM0.586 111.000 10.593 210.440 260.480 50.771 80.345 10.437 400.444 70.495 140.548 50.579 160.621 110.720 280.409 230.712 110.593 60.960 50.395 10
Lei Yao, Yi Wang, Moyun Liu, Lap-Pui Chau: SGIFormer: Semantic-guided and Geometric-enhanced Interleaving Transformer for 3D Instance Segmentation. TCSVT, 2024
DANCENET0.504 260.926 80.579 230.472 180.367 270.626 430.165 230.432 410.221 350.408 220.449 300.411 260.564 170.746 240.421 220.707 170.438 330.846 430.288 40
PCJC0.375 490.704 370.542 330.284 530.197 560.649 370.006 600.426 420.138 460.242 430.304 490.183 520.388 470.629 420.141 620.546 470.344 500.738 570.283 41
MTML0.282 590.577 570.380 540.182 610.107 650.430 600.001 660.422 430.057 610.179 590.162 620.070 620.229 570.511 530.161 560.491 540.313 560.650 650.162 58
Jean Lahoud, Bernard Ghanem, Marc Pollefeys, Martin R. Oswald: 3D Instance Segmentation via Multi-task Metric Learning. ICCV 2019 [oral]
3D-BoNet0.253 620.519 590.324 620.251 550.137 620.345 680.031 470.419 440.069 570.162 620.131 640.052 630.202 610.338 630.147 610.301 690.303 600.651 640.178 57
Bo Yang, Jianan Wang, Ronald Clark, Qingyong Hu, Sen Wang, Andrew Markham, Niki Trigoni: Learning Object Bounding Boxes for 3D Instance Segmentation on Point Clouds. NeurIPS 2019 Spotlight
Occipital-SCS0.320 570.679 430.352 580.334 480.229 510.436 590.025 490.412 450.058 600.161 630.240 560.085 590.262 540.496 540.187 520.467 570.328 550.775 520.231 51
OneFormer3Dcopyleft0.566 170.781 240.697 60.562 50.431 150.770 90.331 30.400 460.373 180.529 20.504 200.568 180.475 290.732 260.470 130.762 20.550 190.871 350.379 17
Maxim Kolodiazhnyi, Anna Vorontsova, Anton Konushin, Danila Rukhovich: OneFormer3D: One Transformer for Unified Point Cloud Segmentation.
SSEN0.384 470.852 170.494 470.192 590.226 520.648 380.022 510.398 470.299 290.277 370.317 470.231 450.194 620.514 520.196 490.586 370.444 320.843 460.184 55
Dongsu Zhang, Junha Chun, Sang Kyun Cha, Young Min Kim: Spatial Semantic Embedding Network: Fast 3D Instance Segmentation with Deep Metric Learning. arXiv
INS-Conv-instance0.435 360.716 360.495 460.355 440.331 330.689 280.102 290.394 480.208 380.280 360.395 360.250 400.544 180.741 250.309 330.536 490.391 420.842 480.258 47
ODIN - Inspermissive0.463 330.738 320.589 220.344 460.358 320.560 520.139 240.393 490.331 220.373 270.392 370.496 250.493 260.709 310.377 250.599 330.359 470.752 550.332 29
Ayush Jain, Pushkal Katara, Nikolaos Gkanatsios, Adam W. Harley, Gabriel Sarch, Kriti Aggarwal, Vishrav Chaudhary, Katerina Fragkiadaki: ODIN: A Single Model for 2D and 3D Segmentation. CVPR 2024
SSTNetpermissive0.506 250.738 320.549 320.497 140.316 360.693 270.178 220.377 500.198 390.330 290.463 270.576 170.515 230.857 40.494 40.637 280.457 300.943 110.290 39
Zhihao Liang, Zhihao Li, Songcen Xu, Mingkui Tan, Kui Jia: Instance Segmentation in 3D Scenes using Semantic Superpoint Tree Networks. ICCV 2021
MASCpermissive0.254 610.463 630.249 670.113 630.167 580.412 630.000 680.374 510.073 560.173 610.243 550.130 560.228 580.368 610.160 570.356 640.208 640.711 610.136 62
Chen Liu, Yasutaka Furukawa: MASC: Multi-scale Affinity with Sparse Convolution for 3D Instance Segmentation.
PanopticFusion-inst0.214 650.250 710.330 610.275 540.103 660.228 740.000 690.345 520.024 660.088 670.203 610.186 510.167 640.367 620.125 630.221 720.112 740.666 630.162 59
Gaku Narita, Takashi Seno, Tomoya Ishikawa, Yohsuke Kaji: PanopticFusion: Online Volumetric Semantic Mapping at the Level of Stuff and Things. IROS 2019 (to appear)
3D-MPA0.355 520.457 640.484 480.299 510.277 430.591 500.047 440.332 530.212 370.217 490.278 500.193 500.413 440.410 590.195 500.574 410.352 480.849 410.213 52
Francis Engelmann, Martin Bokeloh, Alireza Fathi, Bastian Leibe, Matthias Nießner: 3D-MPA: Multi Proposal Aggregation for 3D Semantic Instance Segmentation. CVPR 2020
NeuralBF0.353 530.593 540.511 420.375 410.264 450.597 480.008 580.332 540.160 430.229 470.274 530.000 760.206 590.678 360.155 590.485 550.422 370.816 510.254 48
Weiwei Sun, Daniel Rebain, Renjie Liao, Vladimir Tankovich, Soroosh Yazdani, Kwang Moo Yi, Andrea Tagliasacchi: NeuralBF: Neural Bilateral Filtering for Top-down Instance Segmentation on Point Clouds. WACV 2023
OccuSeg+instance0.486 290.802 230.536 340.428 320.369 260.702 250.205 200.331 550.301 270.379 250.474 240.327 290.437 340.862 20.485 50.601 320.394 410.846 450.273 43
Lei Han, Tian Zheng, Lan Xu, Lu Fang: OccuSeg: Occupancy-aware 3D Instance Segmentation. CVPR 2020
OSIS0.392 450.778 250.530 350.220 580.278 420.567 510.083 320.330 560.299 280.270 390.310 480.143 550.260 550.624 430.277 380.568 420.361 460.865 370.301 34
SPG_WSIS0.251 630.380 660.274 650.289 520.144 600.413 620.000 690.311 570.065 580.113 650.130 650.029 680.204 600.388 600.108 650.459 580.311 570.769 530.127 63
Region-18class0.146 690.175 750.321 630.080 660.062 670.357 650.000 690.307 580.002 730.066 700.044 700.000 760.018 740.036 750.054 670.447 610.133 690.472 680.060 71
ClickSeg_Instance0.366 500.654 480.375 550.184 600.302 390.592 490.050 430.300 590.093 540.283 340.277 510.249 410.426 420.615 440.299 340.504 520.367 450.832 490.191 54
Box2Mask0.433 380.741 300.463 510.433 300.283 410.625 440.103 280.298 600.125 500.260 400.424 330.322 310.472 300.701 350.363 280.711 130.309 590.882 300.272 45
Julian Chibane, Francis Engelmann, Tuan Anh Tran, Gerard Pons-Moll: Box2Mask: Weakly Supervised 3D Semantic Instance Segmentation Using Bounding Boxes. ECCV 2022
SegGroup_inspermissive0.246 640.556 580.335 600.062 710.115 640.490 560.000 690.297 610.018 680.186 570.142 630.083 600.233 560.216 670.153 600.469 560.251 620.744 560.083 67
An Tao, Yueqi Duan, Yi Wei, Jiwen Lu, Jie Zhou: SegGroup: Seg-Level Supervision for 3D Instance and Semantic Segmentation. TIP 2022
SphereSeg0.357 510.651 490.411 530.345 450.264 460.630 420.059 380.289 620.212 360.240 440.336 440.158 540.305 520.557 460.159 580.455 590.341 520.726 590.294 36
Mask3D_evaluation0.382 480.593 540.520 370.390 380.314 370.600 470.018 550.287 630.151 440.281 350.387 380.169 530.429 360.654 410.172 550.578 380.384 430.670 620.278 42
SALoss-ResNet0.262 600.667 440.335 590.067 690.123 630.427 610.022 500.280 640.058 590.216 500.211 600.039 650.142 650.519 510.106 660.338 660.310 580.721 600.138 61
Zhidong Liang, Ming Yang, Hao Li, Chunxiang Wang: 3D Instance Embedding Learning With a Structure-Aware Loss Function for Point Cloud Segmentation. IEEE Robotics and Automation Letters (IROS2020)
Dyco3Dcopyleft0.395 440.642 500.518 380.447 240.259 470.666 320.050 420.251 650.166 420.231 460.362 410.232 440.331 510.535 480.229 450.587 360.438 340.850 400.317 32
Tong He, Chunhua Shen, Anton van den Hengel: DyCo3D: Robust Instance Segmentation of 3D Point Clouds through Dynamic Convolution. CVPR 2021
Sparse R-CNN0.292 580.704 370.213 680.153 620.154 590.551 540.053 390.212 660.132 470.174 600.274 520.070 610.363 490.441 580.176 540.424 620.234 630.758 540.161 60
SemRegionNet-20cls0.121 700.296 690.203 690.071 670.058 690.349 660.000 690.150 670.019 670.054 720.034 730.017 710.052 700.042 740.013 750.209 730.183 650.371 700.057 72
3D-SISpermissive0.161 660.407 650.155 730.068 680.043 720.346 670.001 650.134 680.005 710.088 660.106 670.037 660.135 670.321 640.028 720.339 650.116 730.466 690.093 66
Ji Hou, Angela Dai, Matthias Niessner: 3D-SIS: 3D Semantic Instance Segmentation of RGB-D Scans. CVPR 2019
Hier3Dcopyleft0.117 710.222 730.161 720.054 730.027 740.289 710.000 690.124 690.001 750.079 680.061 690.027 690.141 660.240 660.005 760.310 680.129 700.153 760.081 68
Tan: HCFS3D: Hierarchical Coupled Feature Selection Network for 3D Semantic and Instance Segmentation.
tmp0.113 730.333 680.151 740.056 720.053 700.344 690.000 690.105 700.016 690.049 730.035 720.020 700.053 690.048 730.013 740.183 750.173 670.344 730.054 73
UNet-backbone0.161 660.519 590.259 660.084 650.059 680.325 700.002 640.093 710.009 700.077 690.064 680.045 640.044 720.161 690.045 680.331 670.180 660.566 660.033 76
ASIS0.085 750.037 760.080 760.066 700.047 710.282 720.000 690.052 720.002 740.047 740.026 740.001 750.046 710.194 680.031 710.264 700.140 680.167 750.047 75
Sem_Recon_ins0.098 740.295 700.187 700.015 760.036 730.213 750.005 620.038 730.003 720.056 710.037 710.036 670.015 750.051 720.044 690.209 740.098 750.354 720.071 69
R-PointNet0.158 680.356 670.173 710.113 640.140 610.359 640.012 570.023 740.039 630.134 640.123 660.008 720.089 680.149 700.117 640.221 710.128 710.563 670.094 65
3D-BEVIS0.117 710.250 710.308 640.020 750.009 770.269 730.006 610.008 750.029 650.037 750.014 760.003 740.036 730.147 710.042 700.381 630.118 720.362 710.069 70
Cathrin Elich, Francis Engelmann, Jonas Schult, Theodora Kontogianni, Bastian Leibe: 3D-BEVIS: Birds-Eye-View Instance Segmentation.
Sgpn_scannet0.049 760.023 770.134 750.031 740.013 760.144 760.006 590.008 760.000 760.028 760.017 750.003 730.009 770.000 760.021 730.122 760.095 760.175 740.054 74
MaskRCNN 2d->3d Proj0.022 770.185 740.000 770.000 770.015 750.000 770.000 670.006 770.000 760.010 770.006 770.107 580.012 760.000 760.002 770.027 770.004 770.022 770.001 77
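
Each result row above fuses the method name with a run of score and rank pairs, so a cell such as `0.741 30` reads as a score of 0.741 at rank 30. The following is a minimal sketch of one way to split such a row, assuming every score has exactly three decimals and a single leading digit. `CELL` and `split_row` are hypothetical names for illustration, not part of any ScanNet tooling, and license tags such as `permissive` stay attached to the method name.

```python
import re

# One cell: a score with three decimals, a space, then an integer rank.
# The lookahead stops the rank just before the next score (or at end of line),
# which resolves runs like "201.000" into rank 20 followed by score 1.000.
CELL = re.compile(r"(\d\.\d{3}) (\d+)(?=\d\.\d{3} |$)")


def split_row(line):
    """Split a fused leaderboard row into (method_name, [(score, rank), ...])."""
    line = line.strip()
    first = CELL.search(line)
    if first is None:
        # No score cells found, e.g. a citation line.
        return line, []
    name = line[:first.start()]
    cells = [(float(s), int(r)) for s, r in CELL.findall(line[first.start():])]
    return name, cells


name, cells = split_row("UniDet_RVC0.205 20.381 20.323 3")
# name == "UniDet_RVC"; cells == [(0.205, 2), (0.381, 2), (0.323, 3)]
```

A method name that itself contains something shaped like a score cell would be split in the wrong place, so the output is worth checking against the column count of the table header.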


This table lists the benchmark results for the 2D semantic label scenario.


Method | Info | avg iou | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | floor | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | wall | window
Sorted by: curtain
Virtual MVFusion (R)0.745 10.861 10.839 10.881 10.672 20.512 10.422 170.898 10.723 10.714 10.954 20.454 10.509 10.773 10.895 10.756 10.820 10.653 10.935 10.891 10.728 1
Abhijit Kundu, Xiaoqi Yin, Alireza Fathi, David Ross, Brian Brewington, Thomas Funkhouser, Caroline Pantofaru: Virtual Multi-view Fusion for 3D Semantic Segmentation. ECCV 2020
CMX0.613 50.681 80.725 120.502 120.634 60.297 180.478 100.830 20.651 40.537 70.924 40.375 70.315 140.686 70.451 140.714 50.543 210.504 60.894 70.823 50.688 4
MVF-GNN(2D)0.636 30.606 140.794 40.434 160.688 10.337 80.464 120.798 30.632 50.589 30.908 80.420 20.329 120.743 20.594 20.738 20.676 50.527 40.906 20.818 60.715 3
BPNet_2Dcopyleft0.670 20.822 30.795 30.836 20.659 30.481 20.451 130.769 40.656 30.567 40.931 30.395 60.390 50.700 40.534 40.689 100.770 20.574 30.865 90.831 30.675 5
Wenbo Hu, Hengshuang Zhao, Li Jiang, Jiaya Jia and Tien-Tsin Wong: Bidirectional Projection Network for Cross Dimension Scene Understanding. CVPR 2021 (Oral)
FAN_NV_RVC0.586 100.510 210.764 60.079 260.620 80.330 110.494 80.753 50.573 90.556 50.884 160.405 40.303 160.718 30.452 130.672 130.658 70.509 50.898 50.813 80.727 2
MIX6D_RVC0.582 120.695 50.687 170.225 210.632 70.328 130.550 10.748 60.623 60.494 150.890 140.350 150.254 230.688 60.454 120.716 40.597 170.489 90.881 80.768 160.575 15
CU-Hybrid-2D Net0.636 30.825 20.820 20.179 230.648 40.463 30.549 20.742 70.676 20.628 20.961 10.420 20.379 60.684 80.381 180.732 30.723 30.599 20.827 160.851 20.634 7
DMMF_3d0.605 60.651 90.744 100.782 30.637 50.387 40.536 30.732 80.590 70.540 60.856 210.359 110.306 150.596 140.539 30.627 200.706 40.497 80.785 210.757 190.476 22
DMMF0.567 140.623 100.767 50.238 200.571 130.347 60.413 190.719 90.472 200.418 220.895 130.357 120.260 220.696 50.523 70.666 170.642 110.437 180.895 60.793 100.603 12
SSMAcopyleft0.577 130.695 50.716 150.439 140.563 140.314 140.444 150.719 90.551 120.503 100.887 150.346 160.348 100.603 120.353 200.709 60.600 150.457 140.901 30.786 110.599 13
Abhinav Valada, Rohit Mohan, Wolfram Burgard: Self-Supervised Model Adaptation for Multimodal Semantic Segmentation. International Journal of Computer Vision, 2019
EMSANet0.600 70.716 40.746 90.395 180.614 90.382 50.523 40.713 110.571 110.503 100.922 60.404 50.397 40.655 90.400 160.626 210.663 60.469 130.900 40.827 40.577 14
Daniel Seichter, Söhnke Fischedick, Mona Köhler, Horst-Michael Gross: EMSANet: Efficient Multi-Task RGB-D Scene Analysis for Indoor Environments. IJCNN 2022
UNIV_CNP_RVC_UE0.566 150.569 190.686 190.435 150.524 170.294 190.421 180.712 120.543 140.463 170.872 170.320 170.363 80.611 110.477 110.686 110.627 120.443 170.862 100.775 140.639 6
segfomer with 6d0.542 190.594 150.687 170.146 240.579 120.308 160.515 60.703 130.472 200.498 130.868 180.369 90.282 170.589 150.390 170.701 90.556 200.416 210.860 120.759 180.539 19
EMSAFormer0.564 160.581 160.736 110.564 100.546 160.219 230.517 50.675 140.486 190.427 210.904 110.352 140.320 130.589 150.528 50.708 70.464 240.413 220.847 140.786 110.611 11
MCA-Net0.595 80.533 200.756 80.746 40.590 100.334 100.506 70.670 150.587 80.500 120.905 100.366 100.352 90.601 130.506 80.669 160.648 90.501 70.839 150.769 150.516 21
DCRedNet0.583 110.682 70.723 130.542 110.510 200.310 150.451 130.668 160.549 130.520 90.920 70.375 70.446 20.528 200.417 150.670 150.577 180.478 110.862 100.806 90.628 9
FuseNetpermissive0.535 200.570 180.681 200.182 220.512 190.290 200.431 160.659 170.504 180.495 140.903 120.308 190.428 30.523 210.365 190.676 120.621 140.470 120.762 220.779 130.541 17
Caner Hazirbas, Lingni Ma, Csaba Domokos, Daniel Cremers: FuseNet: Incorporating Depth into Semantic Segmentation via Fusion-based CNN Architecture. ACCV 2016
RFBNet0.592 90.616 110.758 70.659 50.581 110.330 110.469 110.655 180.543 140.524 80.924 40.355 130.336 110.572 170.479 100.671 140.648 90.480 100.814 190.814 70.614 10
MSeg1080_RVCpermissive0.485 230.505 220.709 160.092 250.427 230.241 220.411 200.654 190.385 260.457 180.861 200.053 260.279 180.503 220.481 90.645 180.626 130.365 240.748 240.725 220.529 20
John Lambert*, Zhuang Liu*, Ozan Sener, James Hays, Vladlen Koltun: MSeg: A Composite Dataset for Multi-domain Semantic Segmentation. CVPR 2020
UDSSEG_RVC0.545 180.610 130.661 220.588 80.556 150.268 210.482 90.642 200.572 100.475 160.836 230.312 180.367 70.630 100.189 230.639 190.495 230.452 150.826 170.756 200.541 17
SN_RN152pyrx8_RVCcopyleft0.546 170.572 170.663 210.638 70.518 180.298 170.366 240.633 210.510 170.446 190.864 190.296 200.267 190.542 190.346 210.704 80.575 190.431 190.853 130.766 170.630 8
3DMV (2d proj)0.498 220.481 240.612 230.579 90.456 220.343 70.384 210.623 220.525 160.381 230.845 220.254 220.264 210.557 180.182 240.581 240.598 160.429 200.760 230.661 250.446 24
Angela Dai, Matthias Niessner: 3DMV: Joint 3D-Multi-View Prediction for 3D Semantic Scene Segmentation. ECCV'18
ILC-PSPNet0.475 240.490 230.581 240.289 190.507 210.067 260.379 220.610 230.417 240.435 200.822 250.278 210.267 190.503 220.228 220.616 230.533 220.375 230.820 180.729 210.560 16
AdapNet++copyleft0.503 210.613 120.722 140.418 170.358 260.337 80.370 230.479 240.443 220.368 240.907 90.207 230.213 250.464 240.525 60.618 220.657 80.450 160.788 200.721 230.408 25
Abhinav Valada, Rohit Mohan, Wolfram Burgard: Self-Supervised Model Adaptation for Multimodal Semantic Segmentation. International Journal of Computer Vision, 2019
Enet (reimpl)0.376 250.264 260.452 260.452 130.365 240.181 240.143 260.456 250.409 250.346 250.769 260.164 240.218 240.359 250.123 260.403 260.381 260.313 260.571 250.685 240.472 23
Re-implementation of Adam Paszke, Abhishek Chaurasia, Sangpil Kim, Eugenio Culurciello: ENet: A Deep Neural Network Architecture for Real-Time Semantic Segmentation.
ScanNet (2d proj)permissive0.330 260.293 250.521 250.657 60.361 250.161 250.250 250.004 260.440 230.183 260.836 230.125 250.060 260.319 260.132 250.417 250.412 250.344 250.541 260.427 260.109 26
Angela Dai, Angel X. Chang, Manolis Savva, Maciej Halber, Thomas Funkhouser, Matthias Nießner: ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes. CVPR'17


This table lists the benchmark results for the 2D semantic instance scenario.




Method | Info | avg ap | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | window
Sorted by: curtain. Each cell is score (rank).
FKNet | | 0.204 (3) | 0.334 (3) | 0.358 (2) | 0.038 (2) | 0.234 (2) | 0.184 (2) | 0.025 (3) | 0.318 (1) | 0.042 (4) | 0.088 (2) | 0.141 (2) | 0.053 (4) | 0.300 (3) | 0.207 (1) | 0.171 (3) | 0.292 (2) | 0.149 (2) | 0.636 (3) | 0.109 (2)
EMSANet (Instance) | | 0.241 (1) | 0.401 (1) | 0.439 (1) | 0.085 (1) | 0.242 (1) | 0.220 (1) | 0.081 (1) | 0.289 (2) | 0.117 (2) | 0.121 (1) | 0.182 (1) | 0.126 (1) | 0.346 (1) | 0.181 (2) | 0.181 (2) | 0.358 (1) | 0.156 (1) | 0.675 (2) | 0.131 (1)
Daniel Seichter, Söhnke Fischedick, Mona Köhler, Horst-Michael Gross: EMSANet: Efficient Multi-Task RGB-D Scene Analysis for Indoor Environments. IJCNN 2022
UniDet_RVC | | 0.205 (2) | 0.381 (2) | 0.323 (3) | 0.037 (3) | 0.226 (3) | 0.177 (3) | 0.063 (2) | 0.277 (3) | 0.120 (1) | 0.067 (3) | 0.131 (3) | 0.074 (3) | 0.317 (2) | 0.080 (3) | 0.235 (1) | 0.289 (3) | 0.141 (3) | 0.678 (1) | 0.080 (3)
MaskRCNN_ScanNet | permissive | 0.119 (4) | 0.129 (4) | 0.212 (4) | 0.002 (4) | 0.112 (4) | 0.148 (4) | 0.014 (4) | 0.205 (4) | 0.044 (3) | 0.066 (4) | 0.078 (4) | 0.095 (2) | 0.142 (4) | 0.030 (4) | 0.128 (4) | 0.139 (4) | 0.080 (4) | 0.459 (4) | 0.057 (4)
Re-implementation of Kaiming He, Georgia Gkioxari, Piotr Dollár, Ross Girshick: Mask R-CNN. ICCV'17
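
The avg ap column in this table appears to be the unweighted mean of the 18 per-class AP values. This is an inference from the numbers, not something the page states. As a small worked check, the sketch below averages the per-class values from the EMSANet (Instance) row; the variable names are illustrative only.

```python
# Per-class AP values for EMSANet (Instance), copied from its table row in
# column order (bathtub, bed, bookshelf, ..., toilet, window).
per_class_ap = [
    0.401, 0.439, 0.085, 0.242, 0.220, 0.081,
    0.289, 0.117, 0.121, 0.182, 0.126, 0.346,
    0.181, 0.181, 0.358, 0.156, 0.675, 0.131,
]

# Assumption: avg ap is a plain (unweighted) mean over the 18 classes.
avg_ap = sum(per_class_ap) / len(per_class_ap)
print(round(avg_ap, 3))  # 0.241, matching the avg ap cell of that row
```

Because the listed per-class values are already rounded to three decimals, a recomputed mean for other rows can differ from the reported one in the last digit.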


This table lists the benchmark results for the scene type classification scenario.




Method | Info | avg recall | apartment | bathroom | bedroom / hotel | bookstore / library | conference room | copy/mail room | hallway | kitchen | laundry room | living room / lounge | misc | office | storage / basement / garage
Sorted by: living room / lounge. Each cell is score (rank).
LAST-PCL-type | | 0.780 (1) | 0.250 (3) | 1.000 (1) | 1.000 (1) | 1.000 (1) | 1.000 (1) | 1.000 (1) | 0.500 (2) | 1.000 (1) | 0.500 (2) | 0.889 (1) | 0.000 (2) | 1.000 (1) | 1.000 (1)
Yanmin Wu, Qiankun Gao, Renrui Zhang, and Jian Zhang: Language-Assisted 3D Scene Understanding. arxiv23.12
multi-task | permissive | 0.700 (2) | 0.500 (1) | 1.000 (1) | 0.882 (3) | 0.500 (3) | 1.000 (1) | 1.000 (1) | 0.500 (2) | 1.000 (1) | 1.000 (1) | 0.778 (2) | 0.000 (2) | 0.938 (2) | 0.000 (3)
Shengyu Huang, Mikhail Usvyatsov, Konrad Schindler: Indoor Scene Recognition in 3D. IROS 2020
SE-ResNeXt-SSMA | | 0.498 (4) | 0.000 (5) | 0.812 (4) | 0.941 (2) | 0.500 (3) | 0.500 (4) | 0.500 (3) | 0.500 (2) | 0.429 (5) | 0.500 (2) | 0.667 (3) | 0.500 (1) | 0.625 (4) | 0.000 (3)
Abhinav Valada, Rohit Mohan, Wolfram Burgard: Self-Supervised Model Adaptation for Multimodal Semantic Segmentation. arXiv
3DASPP-SCE | | 0.691 (3) | 0.500 (1) | 0.938 (3) | 0.824 (4) | 1.000 (1) | 1.000 (1) | 0.500 (3) | 1.000 (1) | 0.857 (3) | 0.500 (2) | 0.556 (4) | 0.000 (2) | 0.812 (3) | 0.500 (2)
resnet50_scannet | | 0.353 (5) | 0.250 (3) | 0.812 (4) | 0.529 (5) | 0.500 (3) | 0.500 (4) | 0.000 (5) | 0.500 (2) | 0.571 (4) | 0.000 (5) | 0.556 (4) | 0.000 (2) | 0.375 (5) | 0.000 (3)
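
The per-column ranks in this table are consistent with competition ranking, where tied scores share a rank and the following rank is skipped. This is an observation from the apartment column rather than a documented rule, and ties elsewhere on the page may be broken on unrounded scores. A minimal sketch, with `competition_ranks` as a hypothetical helper name:

```python
def competition_ranks(scores):
    """Rank 1 is the highest score; ties share a rank and the next rank is skipped."""
    return {
        name: 1 + sum(other > score for other in scores.values())
        for name, score in scores.items()
    }


# Apartment recall values copied from the scene type classification table.
apartment = {
    "LAST-PCL-type": 0.250,
    "multi-task": 0.500,
    "SE-ResNeXt-SSMA": 0.000,
    "3DASPP-SCE": 0.500,
    "resnet50_scannet": 0.250,
}

ranks = competition_ranks(apartment)
# ranks == {"LAST-PCL-type": 3, "multi-task": 1, "SE-ResNeXt-SSMA": 5,
#           "3DASPP-SCE": 1, "resnet50_scannet": 3}
```

The resulting ranks match those listed in the apartment column for all five methods.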