Presenting the ScanNet200 Benchmark

We present the ScanNet200 benchmark, which studies an order of magnitude more class categories than the previous version of ScanNet. The scene geometry is shared between the two benchmarks, but the parsing of the surface annotations allows for a larger vocabulary and a more realistic setting for in-the-wild 3D understanding methods.

The ScanNet200 benchmark includes both finer-grained categories and a large number of previously unaddressed classes. This yields a much more challenging setting that reflects the diversity of semantic classes seen in the raw ScanNet RGB-D observations, including the class imbalances that occur naturally in the data. The difference in category frequencies between ScanNet and ScanNet200 can be seen in the figure above.

ScanNet200 Benchmark

This table lists the benchmark results for the ScanNet200 3D semantic label scenario.




Columns: Method, Info, avg iou, head iou, common iou, tail iou, then the IoU for each class in this order:
wall, chair, floor, table, door, couch, cabinet, shelf, desk, office chair, bed, pillow, sink, picture, window, toilet, bookshelf, monitor, curtain, book, armchair, coffee table, box, refrigerator, lamp, kitchen cabinet, towel, clothes, tv, nightstand, counter, dresser, stool, cushion, plant, ceiling, bathtub, end table, dining table, keyboard, bag, backpack, toilet paper, printer, tv stand, whiteboard, blanket, shower curtain, trash can, closet, stairs, microwave, stove, shoe, computer tower, bottle, bin, ottoman, bench, board, washing machine, mirror, copier, basket, sofa chair, file cabinet, fan, laptop, shower, paper, person, paper towel dispenser, oven, blinds, rack, plate, blackboard, piano, suitcase, rail, radiator, recycling bin, container, wardrobe, soap dispenser, telephone, bucket, clock, stand, light, laundry basket, pipe, clothes dryer, guitar, toilet paper holder, seat, speaker, column, bicycle, ladder, bathroom stall, shower wall, cup, jacket, storage bin, coffee maker, dishwasher, paper towel roll, machine, mat, windowsill, bar, toaster, bulletin board, ironing board, fireplace, soap dish, kitchen counter, doorframe, toilet paper dispenser, mini fridge, fire extinguisher, ball, hat, shower curtain rod, water cooler, paper cutter, tray, shower door, pillar, ledge, toaster oven, mouse, toilet seat cover dispenser, furniture, cart, storage container, scale, tissue box, light switch, crate, power outlet, decoration, sign, projector, closet door, vacuum cleaner, candle, plunger, stuffed animal, headphones, dish rack, broom, guitar case, range hood, dustpan, hair dryer, water bottle, handicap bar, purse, vent, shower floor, water pitcher, mailbox, bowl, paper bag, alarm clock, music stand, projector screen, divider, laundry detergent, bathroom counter, object, bathroom vanity, closet wall, laundry hamper, bathroom stall door, ceiling light, trash bin, dumbbell, stair rail, tube, bathroom cabinet, cd case, closet rod, coffee kettle, structure, shower head, keyboard piano, case of water bottles, coat rack, storage organizer, folded chair, fire alarm, power strip, calendar, poster, potted plant, luggage, mattress.
In each row below, every score is immediately followed by the method's rank in that column.
Voltpermissive0.416 20.619 20.318 40.269 30.850 40.735 10.958 50.639 10.753 20.773 30.504 30.542 20.631 20.000 140.795 70.686 10.834 20.335 30.721 40.982 20.625 20.884 20.905 50.237 130.653 40.429 50.679 150.462 110.709 30.680 30.475 10.893 10.652 50.000 170.392 90.541 120.000 10.865 40.900 50.952 10.000 170.000 70.700 20.138 40.528 30.501 10.678 40.842 60.357 30.227 20.909 30.719 40.093 80.924 10.614 80.682 60.635 30.696 80.238 80.000 30.143 130.606 40.898 20.430 40.988 20.356 10.136 120.881 10.609 40.583 30.588 10.000 10.624 30.635 110.000 40.087 20.000 10.000 60.904 20.903 20.747 20.696 20.410 80.272 70.737 40.603 40.000 30.097 10.000 10.007 170.000 40.063 110.981 10.000 10.066 50.000 100.891 20.431 90.380 80.261 60.265 40.274 40.069 100.425 90.401 10.151 40.631 20.005 160.324 130.000 30.778 20.251 40.000 150.421 50.499 10.725 30.223 40.277 70.862 10.000 10.728 10.351 140.855 20.000 10.020 60.407 10.218 40.000 70.997 10.329 80.218 40.000 80.000 110.000 10.000 50.930 10.000 10.551 10.000 10.518 20.493 80.000 10.962 10.000 10.000 10.414 30.576 10.934 20.188 30.398 20.000 10.000 90.000 10.040 180.616 20.553 50.438 70.082 80.141 70.437 120.888 10.000 120.000 10.754 10.000 20.000 21.000 10.752 60.000 30.000 10.000 10.142 90.000 70.000 30.791 30.000 1
Kadir Yilmaz, Adrian Kruse, Tristan Höfer, Daan de Geus, Bastian Leibe: Volume Transformer: Revisiting Vanilla Transformers for 3D Scene Understanding.
OA-CNN-L_ScanNet2000.333 120.558 60.269 100.124 140.821 60.703 40.946 70.569 60.662 50.748 100.487 40.455 50.572 80.000 140.789 100.534 100.736 100.271 90.713 50.949 70.498 150.877 40.860 120.332 70.706 10.474 30.788 70.406 140.637 70.495 120.355 120.805 80.592 130.015 130.396 80.602 60.000 10.799 120.876 80.713 140.276 20.000 70.493 140.080 100.448 150.363 60.661 50.833 70.262 70.125 80.823 130.665 100.076 100.720 90.557 110.637 100.517 100.672 110.227 90.000 30.158 120.496 90.843 120.352 110.835 140.000 80.103 150.711 60.527 50.526 70.320 90.000 10.568 70.625 120.067 10.000 100.000 10.001 50.806 70.836 80.621 110.591 90.373 90.314 50.668 110.398 100.003 20.000 80.000 10.016 160.024 20.043 140.906 70.000 10.052 70.000 100.384 130.330 130.342 90.100 130.223 80.183 140.112 70.476 60.313 80.130 100.196 40.112 120.370 110.000 30.234 130.071 100.160 70.403 70.398 140.492 150.197 70.076 140.272 60.000 10.200 170.560 100.735 80.000 10.000 120.000 130.110 90.002 60.021 90.412 50.000 130.000 80.000 110.000 10.000 50.794 120.000 10.445 60.000 10.022 110.509 70.000 10.517 140.000 10.000 10.001 180.245 80.915 60.024 70.089 80.000 10.262 30.000 10.103 110.524 80.392 120.515 40.013 180.251 40.411 140.662 50.001 110.000 10.473 130.000 20.000 20.150 60.699 100.000 30.000 10.000 10.166 60.000 70.024 20.000 120.000 1
CSC-Pretrainpermissive0.249 180.455 180.171 170.079 180.766 180.659 160.930 180.494 150.542 180.700 180.314 180.215 180.430 180.121 10.697 180.441 170.683 170.235 150.609 180.895 170.476 160.816 170.770 180.186 150.634 70.216 180.734 90.340 170.471 170.307 170.293 180.591 180.542 160.076 70.205 170.464 150.000 10.484 180.832 170.766 80.052 160.000 70.413 170.059 150.418 160.222 170.318 180.609 150.206 140.112 100.743 150.625 150.076 100.579 160.548 140.590 150.371 170.552 180.081 170.003 20.142 140.201 170.638 180.233 170.686 180.000 80.142 90.444 180.375 140.247 180.198 150.000 10.128 180.454 180.019 20.097 10.000 10.000 60.553 150.557 160.373 140.545 150.164 150.014 170.547 170.174 160.000 30.002 60.000 10.037 40.000 40.063 110.664 150.000 10.000 110.130 20.170 150.152 170.335 110.079 150.110 160.175 150.098 90.175 180.166 160.045 180.207 30.014 140.465 50.000 30.001 180.001 180.046 120.299 160.327 170.537 130.033 170.012 180.186 110.000 10.205 160.377 130.463 170.000 10.058 30.000 130.055 160.041 10.000 110.105 170.000 130.000 80.000 110.000 10.000 50.398 160.000 10.308 180.000 10.000 130.319 160.000 10.543 130.000 10.000 10.062 160.004 140.862 160.000 100.000 120.000 10.000 90.000 10.123 50.316 170.225 160.250 140.094 30.180 50.332 150.441 120.000 120.000 10.310 170.000 20.000 20.000 120.592 150.000 30.000 10.000 10.203 30.000 70.000 30.000 120.000 1
Ji Hou, Benjamin Graham, Matthias Nießner, Saining Xie: Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts. CVPR 2021
LGroundpermissive0.272 160.485 160.184 160.106 160.778 160.676 120.932 160.479 180.572 160.718 150.399 130.265 160.453 170.085 30.745 160.446 160.726 140.232 160.622 160.901 160.512 120.826 160.786 170.178 170.549 130.277 160.659 160.381 160.518 150.295 180.323 150.777 130.599 110.028 100.321 120.363 170.000 10.708 160.858 150.746 110.063 150.022 50.457 160.077 110.476 120.243 160.402 150.397 180.233 110.077 160.720 180.610 170.103 60.629 130.437 180.626 120.446 150.702 60.190 140.005 10.058 170.322 150.702 170.244 160.768 150.000 80.134 130.552 160.279 170.395 150.147 170.000 10.207 160.612 140.000 40.000 100.000 10.000 60.658 120.566 150.323 160.525 160.229 130.179 90.467 180.154 170.000 30.002 60.000 10.051 10.000 40.127 40.703 130.000 10.000 110.216 10.112 170.358 120.547 20.187 100.092 170.156 180.055 110.296 160.252 100.143 60.000 50.014 140.398 70.000 30.028 170.173 80.000 150.265 170.348 150.415 170.179 90.019 170.218 90.000 10.597 90.274 170.565 140.000 10.012 90.000 130.039 170.022 30.000 110.117 160.000 130.000 80.000 110.000 10.000 50.324 170.000 10.384 100.000 10.000 130.251 180.000 10.566 120.000 10.000 10.066 150.404 50.886 140.199 20.000 120.000 10.059 70.000 10.136 10.540 60.127 180.295 120.085 70.143 60.514 70.413 160.000 120.000 10.498 90.000 20.000 20.000 120.623 130.000 30.000 10.000 10.132 160.000 70.000 30.000 120.000 1
David Rozenberszki, Or Litany, Angela Dai: Language-Grounded Indoor 3D Semantic Segmentation in the Wild. arXiv
AWCS0.305 150.508 150.225 150.142 120.782 150.634 180.937 150.489 160.578 150.721 130.364 160.355 120.515 130.023 90.764 150.523 120.707 150.264 120.633 150.922 150.507 140.886 10.804 160.179 160.436 170.300 130.656 170.529 30.501 160.394 140.296 170.820 60.603 100.131 30.179 180.619 30.000 10.707 170.865 140.773 70.171 70.010 60.484 150.063 140.463 140.254 140.332 170.649 120.220 120.100 120.729 160.613 160.071 140.582 150.628 70.702 40.424 160.749 20.137 160.000 30.142 140.360 140.863 70.305 150.877 110.000 80.173 50.606 130.337 150.478 130.154 160.000 10.253 150.664 80.000 40.000 100.000 10.000 60.626 140.782 110.302 170.602 80.185 140.282 60.651 140.317 140.000 30.000 80.000 10.022 130.000 40.154 20.876 100.000 10.014 100.063 90.029 180.553 70.467 30.084 140.124 150.157 170.049 130.373 140.252 100.097 160.000 50.219 70.542 30.000 30.392 80.172 90.000 150.339 100.417 90.533 140.093 160.115 110.195 100.000 10.516 110.288 160.741 70.000 10.001 110.233 100.056 150.000 70.159 70.334 70.077 100.000 80.000 110.000 10.000 50.749 140.000 10.411 90.000 10.008 120.452 110.000 10.595 110.000 10.000 10.220 110.006 130.894 130.006 90.000 120.000 10.000 90.000 10.112 60.504 90.404 110.551 30.093 40.129 150.484 100.381 180.000 120.000 10.396 150.000 20.000 20.620 40.402 180.000 30.000 10.000 10.142 90.000 70.000 30.512 100.000 1
Long-Tailed 3D Semantic Segmentation with Adaptive Weight Constraint and Sampling. ICRA 2024
CeCo0.340 80.551 100.247 140.181 70.784 140.661 150.939 140.564 70.624 140.721 130.484 60.429 60.575 60.027 80.774 120.503 150.753 60.242 140.656 120.945 100.534 80.865 80.860 120.177 180.616 90.400 60.818 20.579 10.615 120.367 150.408 70.726 160.633 60.162 10.360 100.619 30.000 10.828 100.873 100.924 30.109 130.083 30.564 70.057 160.475 130.266 120.781 20.767 80.257 80.100 120.825 120.663 110.048 160.620 140.551 130.595 140.532 80.692 90.246 60.000 30.213 60.615 20.861 80.376 80.900 90.000 80.102 160.660 90.321 160.547 60.226 140.000 10.311 140.742 50.011 30.006 90.000 10.000 60.546 160.824 90.345 150.665 30.450 60.435 10.683 90.411 90.338 10.000 80.000 10.030 90.000 40.068 90.892 90.000 10.063 60.000 100.257 140.304 140.387 60.079 150.228 70.190 120.000 150.586 10.347 50.133 80.000 50.037 130.377 100.000 30.384 90.006 170.003 130.421 50.410 110.643 60.171 100.121 100.142 130.000 10.510 120.447 110.474 150.000 10.000 120.286 60.083 120.000 70.000 110.603 10.096 80.063 50.000 110.000 10.000 50.898 40.000 10.429 80.000 10.400 30.550 40.000 10.633 70.000 10.000 10.377 60.000 160.916 50.000 100.000 120.000 10.000 90.000 10.102 120.499 100.296 150.463 60.089 50.304 10.740 30.401 170.010 70.000 10.560 50.000 20.000 20.709 30.652 110.000 30.000 10.000 10.143 80.000 70.000 30.609 60.000 1
Zhisheng Zhong, Jiequan Cui, Yibo Yang, Xiaoyang Wu, Xiaojuan Qi, Xiangyu Zhang, Jiaya Jia: Understanding Imbalanced Semantic Segmentation Through Neural Collapse. CVPR 2023
OctFormer ScanNet200permissive0.326 140.539 110.265 110.131 130.806 90.670 130.943 100.535 130.662 50.705 170.423 100.407 70.505 140.003 110.765 140.582 80.686 160.227 170.680 90.943 110.601 30.854 110.892 70.335 60.417 180.357 110.724 100.453 120.632 80.596 60.432 40.783 120.512 170.021 120.244 160.637 20.000 10.787 130.873 100.743 120.000 170.000 70.534 100.110 50.499 70.289 110.626 80.620 130.168 160.204 50.849 110.679 90.117 50.633 120.684 30.650 90.552 60.684 100.312 30.000 30.175 110.429 120.865 60.413 50.837 130.000 80.145 80.626 100.451 90.487 120.513 40.000 10.529 80.613 130.000 40.033 70.000 10.000 60.828 60.871 40.622 100.587 100.411 70.137 110.645 150.343 130.000 30.000 80.000 10.022 130.000 40.026 180.829 120.000 10.022 90.089 60.842 50.253 150.318 120.296 20.178 120.291 30.224 30.584 20.200 150.132 90.000 50.128 110.227 140.000 30.230 140.047 120.149 80.331 110.412 100.618 80.164 110.102 120.522 40.000 10.655 50.378 120.469 160.000 10.000 120.000 130.105 100.000 70.000 110.483 30.000 130.000 80.028 80.000 10.000 50.906 20.000 10.339 160.000 10.000 130.457 100.000 10.612 90.000 10.000 10.408 50.000 160.900 110.000 100.000 120.000 10.029 80.000 10.074 150.455 160.479 70.427 80.079 100.140 90.496 80.414 150.022 60.000 10.471 140.000 20.000 20.000 120.722 80.000 30.000 10.000 10.138 140.000 70.000 30.000 120.000 1
Peng-Shuai Wang: OctFormer: Octree-based Transformers for 3D Point Clouds. SIGGRAPH 2023
PPT-SpUNet-F.T.0.332 130.556 70.270 80.123 150.816 70.682 100.946 70.549 110.657 90.756 60.459 80.376 100.550 120.001 120.807 40.616 50.727 130.267 100.691 60.942 120.530 100.872 60.874 90.330 80.542 150.374 90.792 50.400 150.673 50.572 80.433 30.793 100.623 80.008 160.351 110.594 80.000 10.783 140.876 80.833 50.213 60.000 70.537 90.091 80.519 50.304 90.620 90.942 20.264 60.124 90.855 80.695 60.086 90.646 110.506 170.658 80.535 70.715 40.314 20.000 30.241 40.608 30.897 30.359 90.858 120.000 80.076 180.611 120.392 130.509 80.378 70.000 10.579 50.565 160.000 40.000 100.000 10.000 60.755 80.806 100.661 50.572 140.350 100.181 80.660 130.300 150.000 30.000 80.000 10.023 120.000 40.042 150.930 50.000 10.000 110.077 70.584 100.392 110.339 100.185 110.171 130.308 20.006 140.563 30.256 90.150 50.000 50.002 170.345 120.000 30.045 150.197 60.063 110.323 120.453 50.600 90.163 120.037 160.349 50.000 10.672 40.679 40.753 60.000 10.000 120.000 130.117 70.000 70.000 110.291 90.000 130.000 80.039 70.000 10.000 50.899 30.000 10.374 120.000 10.000 130.545 50.000 10.634 60.000 10.000 10.074 140.223 90.914 70.000 100.021 100.000 10.000 90.000 10.112 60.498 110.649 10.383 110.095 20.135 130.449 110.432 130.008 90.000 10.518 80.000 20.000 20.000 120.796 50.000 30.000 10.000 10.138 140.000 70.000 30.000 120.000 1
Xiaoyang Wu, Zhuotao Tian, Xin Wen, Bohao Peng, Xihui Liu, Kaicheng Yu, Hengshuang Zhao: Towards Large-scale 3D Representation Learning with Multi-dataset Point Prompt Training. CVPR 2024
L3DETR-ScanNet_2000.336 90.533 120.279 70.155 110.801 100.689 50.946 70.539 120.660 80.759 50.380 150.333 150.583 50.000 140.788 110.529 110.740 90.261 130.679 100.940 130.525 110.860 90.883 80.226 140.613 100.397 70.720 110.512 50.565 130.620 40.417 50.775 140.629 70.158 20.298 130.579 110.000 10.835 70.883 70.927 20.114 110.079 40.511 110.073 120.508 60.312 70.629 70.861 50.192 150.098 140.908 40.636 120.032 180.563 180.514 160.664 70.505 110.697 70.225 100.000 30.264 20.411 130.860 90.321 140.960 40.058 70.109 140.776 40.526 60.557 40.303 100.000 10.339 130.712 70.000 40.014 80.000 10.000 60.638 130.856 50.641 80.579 120.107 180.119 120.661 120.416 80.000 30.000 80.000 10.007 170.000 40.067 100.910 60.000 10.000 110.000 100.463 120.448 80.294 150.324 10.293 30.211 90.108 80.448 80.068 180.141 70.000 50.330 30.699 10.000 30.256 120.192 70.000 150.355 90.418 80.209 180.146 130.679 30.101 180.000 10.503 140.687 20.671 90.000 10.000 120.174 120.117 70.000 70.122 80.515 20.104 70.259 20.312 30.000 10.000 50.765 130.000 10.369 130.000 10.183 90.422 120.000 10.646 50.000 10.000 10.565 20.001 150.125 180.010 80.002 110.000 10.487 10.000 10.075 140.548 50.420 100.233 150.082 80.138 120.430 130.427 140.000 120.000 10.549 70.000 20.000 20.074 90.409 170.000 30.000 10.000 10.152 70.051 30.000 30.598 70.000 1
Yanmin Wu, Qiankun Gao, Renrui Zhang, Jian Zhang: Language-Assisted 3D Scene Understanding. arXiv23.12
IMFSegNet0.334 100.532 140.251 120.179 80.799 120.683 90.940 110.555 90.631 130.740 120.406 110.336 140.560 100.062 40.795 70.518 130.733 110.274 60.646 140.947 90.458 180.848 140.862 110.305 100.649 50.284 140.713 130.495 80.626 90.527 100.363 100.820 60.574 140.010 140.411 40.597 70.000 10.842 50.873 100.704 150.246 40.000 70.495 120.041 170.486 100.305 80.444 130.604 160.134 170.055 170.852 100.633 140.076 100.792 50.612 90.573 180.484 130.668 130.216 130.000 30.197 90.518 70.784 140.344 130.908 80.283 50.190 40.599 140.439 110.496 110.569 30.000 10.392 100.776 30.000 40.064 50.000 10.000 60.710 100.756 130.508 120.512 170.159 160.034 150.773 20.363 110.000 30.000 80.000 10.032 70.000 40.029 170.648 170.000 10.000 110.000 100.830 70.595 40.274 160.228 90.206 90.188 130.000 150.425 90.237 120.123 130.000 50.277 60.214 150.003 10.610 30.044 130.124 100.320 150.408 120.594 100.196 80.213 80.139 140.000 10.615 70.618 60.839 40.000 10.014 70.260 70.080 130.025 20.000 110.139 130.135 60.035 70.000 110.000 10.793 20.799 100.000 10.357 140.000 10.369 60.359 140.000 10.512 160.000 10.000 10.120 130.424 30.903 90.027 60.091 70.000 10.245 50.000 10.073 160.457 150.340 130.191 160.021 160.009 180.322 160.608 70.060 30.000 10.494 110.000 20.000 20.068 110.624 120.000 30.000 10.000 10.139 120.047 40.000 30.561 80.000 1
PTv3 ScanNet2000.393 40.592 40.330 20.216 40.851 20.687 70.971 20.586 30.755 10.752 80.505 20.404 80.575 60.000 140.848 20.616 50.761 40.349 10.738 30.978 40.546 70.860 90.926 30.346 40.654 30.384 80.828 10.523 40.699 40.583 70.387 80.822 40.688 20.118 40.474 30.603 50.000 10.832 90.903 20.753 100.140 100.000 70.650 40.109 60.520 40.457 30.497 110.871 40.281 50.192 60.887 50.748 30.168 20.727 80.733 20.740 10.644 20.714 50.190 140.000 30.256 30.449 110.914 10.514 20.759 160.337 20.172 60.692 80.617 30.636 10.325 80.000 10.641 20.782 20.000 40.065 40.000 10.000 60.842 50.903 20.661 50.662 40.612 10.405 20.731 50.566 50.000 30.000 80.000 10.017 150.301 10.088 70.941 40.000 10.077 40.000 100.717 90.790 20.310 130.026 180.264 50.349 10.220 50.397 130.366 30.115 140.000 50.337 10.463 60.000 30.531 60.218 50.593 20.455 20.469 30.708 40.210 50.592 40.108 170.000 10.728 10.682 30.671 90.000 10.000 120.407 10.136 50.022 30.575 20.436 40.259 30.428 10.048 60.000 10.000 50.879 60.000 10.480 30.000 10.133 100.597 20.000 10.690 30.000 10.000 10.009 170.000 160.921 40.000 100.151 60.000 10.000 90.000 10.109 80.494 120.622 20.394 100.073 130.141 70.798 20.528 90.026 50.000 10.551 60.000 20.000 20.134 80.717 90.000 30.000 10.000 10.188 40.000 70.000 30.791 30.000 1
Xiaoyang Wu, Li Jiang, Peng-Shuai Wang, Zhijian Liu, Xihui Liu, Yu Qiao, Wanli Ouyang, Tong He, Hengshuang Zhao: Point Transformer V3: Simpler, Faster, Stronger. CVPR 2024 (Oral)
PonderV2 ScanNet2000.346 70.552 90.270 90.175 100.810 80.682 100.950 60.560 80.641 110.761 40.398 140.357 110.570 90.113 20.804 50.603 70.750 80.283 50.681 80.952 60.548 60.874 50.852 140.290 120.700 20.356 120.792 50.445 130.545 140.436 130.351 130.787 110.611 90.050 80.290 150.519 130.000 10.825 110.888 60.842 40.259 30.100 20.558 80.070 130.497 80.247 150.457 120.889 30.248 100.106 110.817 140.691 70.094 70.729 70.636 60.620 130.503 120.660 140.243 70.000 30.212 70.590 60.860 90.400 60.881 100.000 80.202 20.622 110.408 120.499 90.261 110.000 10.385 110.636 100.000 40.000 100.000 10.000 60.433 170.843 70.660 70.574 130.481 40.336 40.677 100.486 70.000 30.030 40.000 10.034 60.000 40.080 80.869 110.000 10.000 110.000 100.540 110.727 30.232 180.115 120.186 110.193 100.000 150.403 120.326 70.103 150.000 50.290 40.392 90.000 30.346 110.062 110.424 50.375 80.431 70.667 50.115 150.082 130.239 80.000 10.504 130.606 80.584 130.000 10.002 100.186 110.104 110.000 70.394 60.384 60.083 90.000 80.007 90.000 10.000 50.880 50.000 10.377 110.000 10.263 70.565 30.000 10.608 100.000 10.000 10.304 80.009 120.924 30.000 100.000 120.000 10.000 90.000 10.128 30.584 30.475 80.412 90.076 120.269 30.621 60.509 100.010 70.000 10.491 120.063 10.000 20.472 50.880 40.000 30.000 10.000 10.179 50.125 20.000 30.441 110.000 1
Haoyi Zhu, Honghui Yang, Xiaoyang Wu, Di Huang, Sha Zhang, Xianglong He, Tong He, Hengshuang Zhao, Chunhua Shen, Yu Qiao, Wanli Ouyang: PonderV2: Pave the Way for 3D Foundation Model with A Universal Pre-training Paradigm.
ODIN - Sem200permissive0.368 50.562 50.297 50.207 50.800 110.669 140.940 110.575 40.654 100.749 90.487 40.589 10.609 30.001 120.769 130.561 90.752 70.274 60.682 70.926 140.554 50.833 150.921 40.389 20.599 110.591 10.787 80.550 20.657 60.610 50.334 140.803 90.661 40.090 60.408 70.373 160.000 10.912 20.796 180.501 180.169 80.000 70.641 50.196 10.380 180.397 40.641 60.740 100.862 10.213 40.857 70.685 80.216 10.578 170.557 110.685 50.523 90.581 170.312 30.000 30.065 160.000 180.871 40.359 90.988 20.321 30.090 170.704 70.631 20.393 160.246 120.000 10.482 90.565 160.000 40.000 100.000 10.181 10.913 10.468 170.632 90.642 60.259 120.000 180.832 10.663 10.000 30.081 20.000 10.048 20.000 40.376 10.898 80.000 10.157 10.000 100.870 40.000 180.400 50.265 40.242 60.227 70.539 10.370 150.214 140.129 110.000 50.131 100.054 180.000 30.358 100.491 10.462 40.434 30.346 160.454 160.316 20.814 10.828 30.000 10.000 180.220 180.612 120.000 10.000 120.373 30.378 20.000 70.429 50.152 120.077 100.166 40.202 50.000 10.000 50.441 150.000 10.440 70.000 10.000 130.655 10.000 10.626 80.000 10.000 10.228 100.487 20.784 170.000 100.301 40.000 10.426 20.000 10.108 90.460 140.590 40.775 10.088 60.119 160.485 90.791 20.000 120.000 10.256 180.000 20.000 20.000 120.885 30.303 10.000 10.000 10.127 170.000 70.000 30.894 20.000 1
Ayush Jain, Pushkal Katara, Nikolaos Gkanatsios, Adam W. Harley, Gabriel Sarch, Kriti Aggarwal, Vishrav Chaudhary, Katerina Fragkiadaki: ODIN: A Single Model for 2D and 3D Segmentation. CVPR 2024
ALS-MinkowskiNetcopyleft0.414 30.610 30.322 30.271 20.852 10.710 30.973 10.572 50.719 40.795 20.477 70.506 30.601 40.000 140.804 50.646 40.804 30.344 20.777 10.984 10.671 10.879 30.936 10.342 50.632 80.449 40.817 30.475 100.723 20.798 10.376 90.832 30.693 10.031 90.564 10.510 140.000 10.893 30.905 10.672 170.314 10.000 70.718 10.153 30.542 20.397 40.726 30.752 90.252 90.226 30.916 20.800 10.047 170.807 40.769 10.709 30.630 40.769 10.217 110.000 30.285 10.598 50.846 110.535 10.956 50.000 80.137 110.784 30.464 80.463 140.230 130.000 10.598 40.662 90.000 40.087 20.000 10.135 30.900 30.780 120.703 30.741 10.571 20.149 100.697 80.646 20.000 30.076 30.000 10.025 110.000 40.106 60.981 10.000 10.043 80.113 40.888 30.248 160.404 40.252 70.314 10.220 80.245 20.466 70.366 30.159 20.000 50.149 80.690 20.000 30.531 60.253 30.285 60.460 10.440 60.813 10.230 30.283 60.159 120.000 10.728 10.666 50.958 10.000 10.021 50.252 90.118 60.000 70.445 40.223 110.285 10.194 30.390 20.000 10.475 40.842 80.000 10.455 40.000 10.250 80.458 90.000 10.865 20.000 10.000 10.635 10.359 60.972 10.087 40.447 10.000 10.000 90.000 10.129 20.532 70.446 90.503 50.071 140.135 130.699 40.717 30.097 20.000 10.665 20.000 20.000 21.000 10.752 60.000 30.000 10.000 10.142 90.200 10.259 11.000 10.000 1
Guangda Ji, Silvan Weder, Francis Engelmann, Marc Pollefeys, Hermann Blum: ARKit LabelMaker: A New Scale for Indoor 3D Scene Understanding. CVPR 2025
BFANet ScanNet200permissive0.360 60.553 80.293 60.193 60.827 50.689 50.970 30.528 140.661 70.753 70.436 90.378 90.469 160.042 70.810 30.654 30.760 50.266 110.659 110.973 50.574 40.849 120.897 60.382 30.546 140.372 100.698 140.491 90.617 110.526 110.436 20.764 150.476 180.101 50.409 60.585 100.000 10.835 70.901 30.810 60.102 140.000 70.688 30.096 70.483 110.264 130.612 100.591 170.358 20.161 70.863 60.707 50.128 40.814 30.669 40.629 110.563 50.651 150.258 50.000 30.194 100.494 100.806 130.394 70.953 60.000 80.233 10.757 50.508 70.556 50.476 50.000 10.573 60.741 60.000 40.000 100.000 10.000 60.000 180.852 60.678 40.616 70.460 50.338 30.710 60.534 60.000 30.025 50.000 10.043 30.000 40.056 130.493 180.000 10.000 110.109 50.785 80.590 60.298 140.282 30.143 140.262 50.053 120.526 40.337 60.215 10.000 50.135 90.510 40.000 30.596 50.043 150.511 30.321 130.459 40.772 20.124 140.060 150.266 70.000 10.574 100.568 90.653 110.000 10.093 10.298 50.239 30.000 70.516 30.129 150.284 20.000 80.431 10.000 10.000 50.848 70.000 10.492 20.000 10.376 40.522 60.000 10.469 180.000 10.000 10.330 70.151 110.875 150.000 100.254 50.000 10.000 90.000 10.088 130.661 10.481 60.255 130.105 10.139 100.666 50.641 60.000 120.000 10.614 30.000 20.000 20.000 120.921 20.000 30.000 10.000 10.497 10.000 70.000 30.000 120.000 1
Weiguang Zhao, Rui Zhang, Qiufeng Wang, Guangliang Cheng, Kaizhu Huang: BFANet: Revisiting 3D Semantic Segmentation with Boundary Feature Analysis. CVPR 2025
DITR0.449 10.629 10.392 10.289 10.851 20.727 20.969 40.600 20.741 30.805 10.519 10.480 40.636 10.014 100.867 10.680 20.849 10.318 40.753 20.982 20.508 130.871 70.934 20.482 10.596 120.551 20.804 40.508 60.729 10.718 20.417 50.886 20.664 30.000 170.500 20.698 10.000 10.913 10.901 30.766 80.113 120.000 70.617 60.168 20.650 10.477 20.826 10.962 10.348 40.300 10.947 10.776 20.160 30.889 20.651 50.720 20.700 10.728 30.317 10.000 30.238 50.664 10.869 50.514 20.998 10.313 40.138 100.815 20.828 10.622 20.421 60.000 10.823 10.817 10.000 40.000 100.000 10.157 20.866 40.991 10.805 10.660 50.571 20.043 130.709 70.642 30.000 30.000 80.000 10.028 100.018 30.134 30.967 30.000 10.150 20.130 20.949 10.855 10.580 10.262 50.314 10.230 60.222 40.498 50.367 20.153 30.869 10.334 20.397 80.000 30.904 10.486 21.000 10.423 40.484 20.632 70.716 10.733 20.862 10.000 10.433 150.710 10.851 30.000 10.034 40.315 40.385 10.000 70.001 100.268 100.066 120.000 80.278 40.000 10.978 10.839 90.000 10.448 50.000 10.579 10.403 130.000 10.647 40.000 10.000 10.411 40.315 70.904 80.420 10.392 30.000 10.091 60.000 10.128 30.564 40.591 30.568 20.079 100.139 101.000 10.714 40.178 10.000 10.606 40.000 20.000 20.148 70.983 10.000 30.000 10.000 10.374 20.000 70.000 30.662 50.000 1
Karim Abou Zeid, Kadir Yilmaz, Daan de Geus, Alexander Hermans, David Adrian, Timm Linder, Bastian Leibe: DINO in the Room: Leveraging 2D Foundation Models for 3D Segmentation. 3DV 2026
GSTran0.334 110.533 130.250 130.179 90.799 120.684 80.940 110.554 100.633 120.741 110.405 120.337 130.560 100.060 50.794 90.517 140.732 120.274 60.647 130.948 80.459 170.849 120.864 100.306 90.648 60.282 150.717 120.496 70.624 100.533 90.363 100.821 50.573 150.009 150.411 40.593 90.000 10.841 60.873 100.704 150.242 50.000 70.495 120.041 170.487 90.304 90.439 140.613 140.133 180.055 170.853 90.634 130.075 130.791 60.601 100.574 170.483 140.669 120.217 110.000 30.198 80.518 70.782 150.345 120.914 70.273 60.193 30.598 150.440 100.499 90.570 20.000 10.381 120.775 40.000 40.063 60.000 10.000 60.712 90.752 140.507 130.512 170.158 170.036 140.773 20.361 120.000 30.000 80.000 10.032 70.000 40.032 160.651 160.000 10.000 110.000 100.831 60.595 40.273 170.229 80.200 100.191 110.000 150.425 90.233 130.125 120.000 50.279 50.213 160.003 10.608 40.044 130.138 90.321 130.408 120.593 110.198 60.205 90.139 140.000 10.614 80.609 70.838 50.000 10.014 70.260 70.080 130.010 50.000 110.136 140.136 50.047 60.000 110.000 10.787 30.797 110.000 10.354 150.000 10.372 50.357 150.000 10.507 170.000 10.000 10.121 120.423 40.903 90.028 50.089 80.000 10.252 40.000 10.072 170.465 130.340 130.189 170.020 170.011 170.320 170.606 80.060 30.000 10.496 100.000 20.000 20.070 100.618 140.000 30.000 10.000 10.139 120.047 40.000 30.558 90.000 1
Minkowski 34Dpermissive0.253 170.463 170.154 180.102 170.771 170.650 170.932 160.483 170.571 170.710 160.331 170.250 170.492 150.044 60.703 170.419 180.606 180.227 170.621 170.865 180.531 90.771 180.813 150.291 110.484 160.242 170.612 180.282 180.440 180.351 160.299 160.622 170.593 120.027 110.293 140.310 180.000 10.757 150.858 150.737 130.150 90.164 10.368 180.084 90.381 170.142 180.357 160.720 110.214 130.092 150.724 170.596 180.056 150.655 100.525 150.581 160.352 180.594 160.056 180.000 30.014 180.224 160.772 160.205 180.720 170.000 80.159 70.531 170.163 180.294 170.136 180.000 10.169 170.589 150.000 40.000 100.000 10.002 40.663 110.466 180.265 180.582 110.337 110.016 160.559 160.084 180.000 30.000 80.000 10.036 50.000 40.125 50.670 140.000 10.102 30.071 80.164 160.406 100.386 70.046 170.068 180.159 160.117 60.284 170.111 170.094 170.000 50.000 180.197 170.000 30.044 160.013 160.002 140.228 180.307 180.588 120.025 180.545 50.134 160.000 10.655 50.302 150.282 180.000 10.060 20.000 130.035 180.000 70.000 110.097 180.000 130.000 80.005 100.000 10.000 50.096 180.000 10.334 170.000 10.000 130.274 170.000 10.513 150.000 10.000 10.280 90.194 100.897 120.000 100.000 120.000 10.000 90.000 10.108 90.279 180.189 170.141 180.059 150.272 20.307 180.445 110.003 100.000 10.353 160.000 20.026 10.000 120.581 160.001 20.000 10.000 10.093 180.002 60.000 30.000 120.000 1
C. Choy, J. Gwak, S. Savarese: 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks. CVPR 2019
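
The result rows in these tables are flattened: the method name, an optional Info tag, and a long run of score/rank pairs are fused together (for example, `0.416 20.619 2` reads as score 0.416 at rank 2, then score 0.619 at rank 2). The following is a minimal Python sketch for splitting such a row back into fields. It assumes that every score has one digit before the decimal point and three after it, that ranks have at most two digits, and that the only Info tags are the two seen in the rows above; the function name `parse_row` is hypothetical and not part of any ScanNet tooling.

```python
import re

# A score looks like 0.416 or 1.000. The rank after it has one or two digits
# and is matched lazily, so it stops as soon as the next score (or the end of
# the row) follows.
PAIR = re.compile(r"(\d\.\d{3}) (\d{1,2}?)(?=\d\.\d{3}|\s*$)")
FIRST_SCORE = re.compile(r"\d\.\d{3} ")
INFO_TAGS = ("permissive", "copyleft")  # assumption: tags seen in the rows above


def parse_row(line):
    """Split one flattened row into (method name, info tag, [(score, rank), ...])."""
    first = FIRST_SCORE.search(line)
    if first is None:
        raise ValueError("no score found in row")
    prefix, body = line[:first.start()], line[first.start():]

    # Peel an Info tag off the end of the method name, if one is present.
    info = ""
    for tag in INFO_TAGS:
        if prefix.endswith(tag):
            prefix, info = prefix[:-len(tag)], tag
            break

    pairs = [(float(score), int(rank)) for score, rank in PAIR.findall(body)]
    return prefix.strip(), info, pairs


name, info, pairs = parse_row("Voltpermissive0.416 20.619 20.318 40.269 3")
print(name, info, pairs)
```

The parsed pairs come back in column order, so they can be matched to the column names of the corresponding table header (the four summary columns first, then the per-class columns).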


This table lists the benchmark results for the ScanNet200 3D semantic instance scenario.




Columns: Method, Info, avg ap, head ap, common ap, tail ap, then the AP for each class in this order:
chair, table, door, couch, cabinet, shelf, desk, office chair, bed, pillow, sink, picture, window, toilet, bookshelf, monitor, curtain, book, armchair, coffee table, box, refrigerator, lamp, kitchen cabinet, towel, clothes, tv, nightstand, counter, dresser, stool, cushion, plant, ceiling, bathtub, end table, dining table, keyboard, bag, backpack, toilet paper, printer, tv stand, whiteboard, blanket, shower curtain, trash can, closet, stairs, microwave, stove, shoe, computer tower, bottle, bin, ottoman, bench, board, washing machine, mirror, copier, basket, sofa chair, file cabinet, fan, laptop, shower, paper, person, paper towel dispenser, oven, blinds, rack, plate, blackboard, piano, suitcase, rail, radiator, recycling bin, container, wardrobe, soap dispenser, telephone, bucket, clock, stand, light, laundry basket, pipe, clothes dryer, guitar, toilet paper holder, seat, speaker, column, bicycle, ladder, bathroom stall, shower wall, cup, jacket, storage bin, coffee maker, dishwasher, paper towel roll, machine, mat, windowsill, bar, toaster, bulletin board, ironing board, fireplace, soap dish, kitchen counter, doorframe, toilet paper dispenser, mini fridge, fire extinguisher, ball, hat, shower curtain rod, water cooler, paper cutter, tray, shower door, pillar, ledge, toaster oven, mouse, toilet seat cover dispenser, furniture, cart, storage container, scale, tissue box, light switch, crate, power outlet, decoration, sign, projector, closet door, vacuum cleaner, candle, plunger, stuffed animal, headphones, dish rack, broom, guitar case, range hood, dustpan, hair dryer, water bottle, handicap bar, purse, vent, shower floor, water pitcher, mailbox, bowl, paper bag, alarm clock, music stand, projector screen, divider, laundry detergent, bathroom counter, object, bathroom vanity, closet wall, laundry hamper, bathroom stall door, ceiling light, trash bin, dumbbell, stair rail, tube, bathroom cabinet, cd case, closet rod, coffee kettle, structure, shower head, keyboard piano, case of water bottles, coat rack, storage organizer, folded chair, fire alarm, power strip, calendar, poster, potted plant, luggage, mattress.
In each row below, every score is immediately followed by the method's rank in that column.
Volt-SPFormerpermissive0.367 10.475 10.359 10.248 10.678 30.494 20.736 10.689 30.416 30.170 40.484 10.008 40.663 30.575 10.524 10.787 10.418 20.928 20.550 10.684 40.470 40.308 20.685 10.193 20.799 10.565 10.365 20.560 20.144 30.682 20.556 40.052 20.663 10.417 80.000 40.527 30.609 21.000 10.299 10.000 60.831 10.051 40.635 20.524 10.650 31.000 10.442 30.235 10.873 10.817 10.004 40.383 70.693 20.469 20.348 30.682 30.380 20.012 40.400 50.240 50.664 10.284 21.000 10.125 10.329 20.660 10.717 10.318 20.250 20.029 10.340 10.748 10.333 40.407 20.000 10.017 50.556 21.000 10.552 10.549 10.238 10.099 30.821 20.515 30.000 20.000 30.014 10.232 20.111 10.013 40.333 40.002 30.000 50.139 60.389 50.822 10.029 50.551 10.247 30.230 30.000 20.719 10.378 30.500 20.778 10.400 20.117 40.000 20.388 10.439 40.278 20.192 40.241 20.537 40.588 30.466 20.333 10.000 21.000 10.395 31.000 10.000 10.013 30.000 40.254 30.000 30.556 20.710 10.000 40.500 10.304 20.000 30.000 30.864 10.000 10.502 10.000 10.500 30.588 20.000 10.655 20.000 10.000 10.652 20.764 10.112 60.250 20.278 20.000 10.222 30.000 10.050 40.528 20.533 10.345 40.638 10.167 60.066 80.117 20.019 20.000 10.113 30.000 10.000 20.444 10.556 10.000 30.028 20.000 20.156 30.000 30.167 11.000 10.000 1
Kadir Yilmaz, Adrian Kruse, Tristan Höfer, Daan de Geus, Bastian Leibe: Volume Transformer: Revisiting Vanilla Transformers for 3D Scene Understanding.
CompetitorFormer-2000.328 30.439 20.303 30.223 30.771 10.456 40.663 20.673 40.259 50.182 30.455 20.373 10.722 20.504 30.450 40.774 20.469 10.945 10.380 30.820 10.479 30.312 10.641 30.143 40.786 30.346 30.356 30.534 40.120 50.658 30.655 10.049 30.464 20.428 70.014 30.465 60.650 10.850 40.076 60.083 50.808 20.044 50.543 30.271 50.712 21.000 10.454 20.183 20.831 20.730 40.010 30.471 40.575 30.421 30.390 20.663 40.192 50.047 10.820 10.243 40.441 40.303 11.000 10.000 40.277 30.620 20.427 20.312 30.000 70.011 30.123 40.569 40.430 20.562 10.000 10.353 20.083 40.500 20.358 40.396 40.120 40.082 40.868 10.518 20.000 20.004 20.001 40.137 50.000 20.019 30.366 20.000 40.083 20.500 20.444 40.119 60.099 10.110 50.400 10.178 40.000 20.689 20.400 10.125 30.065 20.314 50.384 10.044 10.256 30.484 30.333 10.345 10.243 10.632 30.487 40.013 60.333 10.000 21.000 10.472 20.835 20.000 10.116 20.000 40.500 10.000 30.069 40.237 30.000 40.500 10.267 30.000 30.050 20.452 50.000 10.475 20.000 10.677 20.400 40.000 10.555 40.000 10.000 10.679 10.060 70.171 51.000 10.103 30.000 10.667 10.000 10.088 10.296 40.305 40.444 20.221 40.208 30.192 50.069 30.140 10.000 10.043 50.000 10.043 10.111 30.556 10.000 30.054 10.000 20.322 20.025 20.000 31.000 10.000 1
DINO3D-Scannet200copyleft0.346 20.437 30.353 20.229 20.729 20.536 10.659 30.733 10.431 10.264 10.388 30.001 60.764 10.529 20.462 30.669 30.411 30.925 30.371 50.766 20.545 10.263 30.574 40.257 10.714 40.504 20.325 40.726 10.206 10.618 40.628 20.066 10.297 30.558 20.000 40.732 10.594 30.940 20.199 20.558 20.752 30.174 10.687 10.470 20.921 10.764 70.345 50.142 30.731 60.780 20.138 10.514 30.712 10.556 10.417 10.719 10.407 10.042 20.292 90.456 10.245 80.266 31.000 10.042 30.247 40.446 30.373 30.241 40.049 50.000 40.328 30.536 50.417 30.000 40.000 10.764 10.000 60.500 20.406 20.520 20.045 60.442 10.803 30.681 10.000 20.000 30.000 50.251 10.000 20.027 20.083 60.000 40.303 10.306 30.889 20.551 20.094 20.264 20.361 20.253 20.000 20.611 30.400 10.516 10.000 30.599 10.279 20.000 20.346 20.642 10.111 40.282 20.183 30.664 20.750 10.378 40.333 10.500 10.514 50.593 10.708 30.000 10.238 10.000 40.250 40.111 10.000 60.484 20.000 40.250 40.585 10.000 30.063 10.487 40.000 10.365 30.000 10.772 10.639 10.000 10.769 10.000 10.000 10.545 40.655 20.000 80.250 20.014 50.000 10.222 30.000 10.082 20.618 10.156 60.384 30.436 30.130 70.246 40.049 50.009 30.000 10.192 20.000 10.000 20.000 40.477 40.028 20.000 40.000 20.156 30.000 30.000 31.000 10.000 1
Jinyuan Qu, Hongyang Li, Xingyu Chen, Shilong Liu, Yukai Shi, Tianhe Ren, Ruitao Jing and Lei Zhang: SegDINO3D: 3D Instance Segmentation Empowered by Both Image-Level and Object-Level 2D Features. AAAI 2026
ODIN - Ins200permissive0.265 50.349 50.268 40.163 50.485 90.366 70.549 50.492 90.421 20.229 20.265 60.003 50.609 50.297 50.320 50.327 50.251 60.848 70.314 80.526 60.324 80.138 50.529 50.178 30.440 80.186 90.306 50.546 30.160 20.494 70.476 60.016 50.231 60.594 10.000 40.615 20.357 60.630 70.141 30.167 40.665 40.054 30.360 50.451 40.610 40.769 60.640 10.032 50.746 40.698 50.040 20.389 60.550 50.371 40.257 50.617 70.310 30.000 60.481 40.022 80.463 20.160 51.000 10.125 10.193 60.267 60.253 60.156 60.000 70.000 40.332 20.606 30.444 10.000 40.000 10.281 31.000 10.417 60.344 50.238 90.218 20.000 60.655 60.506 40.000 20.052 10.000 50.091 60.000 20.035 10.370 10.000 40.000 50.250 40.903 10.037 90.031 30.221 30.197 40.285 10.037 10.191 90.200 60.083 40.000 30.200 60.115 50.000 20.250 40.552 20.278 20.077 50.107 50.389 50.674 20.565 10.278 40.000 20.361 90.333 70.361 70.000 10.000 60.438 10.451 20.000 31.000 10.074 50.204 20.250 40.250 40.000 30.000 30.493 30.000 10.083 80.000 10.000 60.317 50.000 10.481 50.000 10.000 10.188 60.333 40.345 20.000 40.333 10.000 10.333 20.000 10.035 60.266 50.478 20.506 10.054 60.205 40.119 70.067 40.000 50.000 10.210 10.000 10.000 20.000 40.389 50.097 10.000 40.000 20.111 60.000 30.000 30.889 50.000 1
TD3D Scannet200permissive0.211 60.332 60.177 60.103 60.662 40.413 50.463 60.705 20.192 70.145 50.266 50.215 20.452 80.209 60.222 90.219 90.315 50.893 40.380 40.617 50.439 50.047 80.646 20.080 60.610 60.253 40.237 60.293 60.135 40.379 90.494 50.048 40.252 50.451 40.184 20.483 40.395 50.852 30.083 50.551 30.278 60.036 60.337 60.266 60.544 50.963 30.079 90.039 40.740 50.604 60.000 60.586 10.283 60.282 60.059 60.633 60.028 60.004 50.559 30.309 30.420 50.028 91.000 10.000 40.456 10.411 40.372 40.060 80.046 60.000 40.040 80.694 20.083 60.000 40.000 10.000 60.000 60.083 80.252 60.260 80.200 30.160 20.669 50.111 60.000 20.000 30.006 30.169 40.000 20.007 50.296 50.032 10.074 30.139 60.000 60.321 40.031 40.108 60.088 60.157 50.000 20.231 80.026 90.000 60.000 30.356 40.052 60.000 20.240 50.147 50.000 50.015 60.046 70.144 70.073 70.414 30.222 80.000 20.806 30.343 60.486 60.000 10.008 40.038 30.083 50.002 20.028 50.074 50.032 30.150 60.039 60.008 10.000 30.250 80.000 10.125 70.000 10.052 50.260 70.000 10.143 90.000 10.000 10.543 50.207 50.404 10.000 40.003 60.000 10.000 60.000 10.037 50.093 80.272 50.342 50.039 80.281 20.249 30.224 10.000 50.000 10.074 40.000 10.000 20.000 40.278 60.000 30.000 40.889 10.323 10.000 30.014 20.000 60.000 1
Maksim Kolodiazhnyi, Anna Vorontsova, Anton Konushin, Danila Rukhovich: Top-Down Beats Bottom-Up in 3D Instance Segmentation. WACV 2024
Mask3D Scannet2000.278 40.383 40.263 50.168 40.661 50.465 30.572 40.665 60.391 40.121 80.304 40.015 30.647 40.349 40.474 20.489 40.321 40.816 90.351 60.722 30.402 70.195 40.515 70.082 50.795 20.215 50.396 10.377 50.082 80.724 10.586 30.015 60.277 40.377 90.201 10.475 50.572 40.778 60.089 40.759 10.556 50.068 20.506 40.467 30.323 70.778 40.427 40.027 60.789 30.744 30.003 50.570 20.561 40.337 50.265 40.711 20.258 40.031 30.569 20.311 20.441 30.179 41.000 10.000 40.233 50.411 50.283 50.380 10.667 10.016 20.048 70.418 60.139 50.173 30.000 10.086 40.014 50.500 20.384 30.497 30.044 70.032 50.752 40.287 50.003 10.000 30.007 20.208 30.000 20.001 60.349 30.008 20.014 40.509 10.500 30.323 30.023 60.176 40.107 50.105 70.000 20.605 40.378 30.016 50.000 30.400 20.192 30.000 20.048 60.037 60.000 50.275 30.119 40.810 10.258 50.006 70.083 90.000 20.568 40.377 50.708 30.000 10.005 50.147 20.014 60.000 30.556 20.085 40.325 10.500 10.083 50.004 20.000 30.590 20.000 10.365 40.000 10.116 40.491 30.000 10.626 30.000 10.000 10.579 30.391 30.050 70.000 40.028 40.000 10.222 30.000 10.063 30.302 30.356 30.149 80.573 20.415 10.013 90.002 80.004 40.000 10.005 80.000 10.000 20.444 10.514 30.000 30.028 20.000 20.156 30.267 10.000 31.000 10.000 1
Jonas Schult, Francis Engelmann, Alexander Hermans, Or Litany, Siyu Tang, Bastian Leibe: Mask3D for 3D Semantic Instance Segmentation. ICRA 2023
Minkowski 34D Inst.permissive0.130 80.246 80.083 80.043 90.547 80.236 80.415 80.672 50.141 90.133 70.067 80.000 70.521 60.114 90.238 80.289 60.232 80.883 50.182 90.373 90.486 20.076 70.488 80.022 80.529 70.199 80.110 80.217 80.100 60.460 80.319 80.000 70.025 90.472 30.000 40.394 70.210 80.537 80.004 80.000 60.083 90.000 90.299 80.061 90.201 90.761 80.084 80.008 70.720 70.557 90.000 60.317 90.280 70.094 90.020 90.564 90.000 80.000 60.400 50.048 70.259 70.101 71.000 10.000 40.190 70.142 90.094 90.137 70.089 40.000 40.101 50.355 90.000 70.000 40.000 10.000 60.000 60.444 50.082 90.384 50.000 90.000 60.334 90.004 90.000 20.000 30.000 50.041 80.000 20.000 70.026 90.000 40.000 50.000 80.000 60.082 80.022 70.000 90.021 80.088 80.000 20.241 70.033 80.000 60.000 30.067 70.000 90.000 20.000 70.000 70.000 50.000 80.026 80.262 60.016 80.000 80.278 40.000 20.500 70.394 40.028 90.000 10.000 60.000 40.000 70.000 30.000 60.019 80.000 40.000 70.000 70.000 30.000 30.156 90.000 10.032 90.000 10.000 60.194 90.000 10.248 80.000 10.000 10.099 80.019 80.308 30.000 40.000 70.000 10.000 60.000 10.007 80.122 60.000 70.175 70.063 50.000 80.271 10.000 90.000 50.000 10.000 90.000 10.000 20.000 40.278 60.000 30.000 40.000 20.111 60.000 30.000 30.000 60.000 1
C. Choy, J. Gwak, S. Savarese: 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks. CVPR 2019
CSC-Pretrain Inst.permissive0.123 90.223 90.082 90.046 80.564 70.152 90.394 90.578 80.235 60.116 90.034 90.000 70.348 90.119 80.297 60.285 70.202 90.838 80.323 70.407 80.184 90.037 90.516 60.013 90.424 90.214 60.093 90.105 90.078 90.542 60.250 90.000 70.064 80.444 50.000 40.224 90.231 70.537 80.001 90.000 60.126 80.004 70.308 70.193 70.244 80.343 90.228 60.000 90.441 80.588 70.000 60.338 80.275 80.189 80.030 80.600 80.000 80.000 60.378 70.000 90.108 90.098 81.000 10.000 40.096 90.172 80.144 70.011 90.125 30.000 40.000 90.376 80.000 70.000 40.000 10.000 60.000 60.042 90.141 80.377 60.051 50.000 60.483 70.017 80.000 20.000 30.000 50.022 90.000 20.000 70.065 70.000 40.000 50.000 80.000 60.094 70.000 90.042 70.000 90.064 90.000 20.259 60.089 70.000 60.000 30.000 80.022 80.000 20.000 70.000 70.000 50.000 80.018 90.111 90.000 90.000 80.278 40.000 20.444 80.333 70.333 80.000 10.000 60.000 40.000 70.000 30.000 60.000 90.000 40.000 70.000 70.000 30.000 30.267 70.000 10.184 60.000 10.000 60.211 80.000 10.378 60.000 10.000 10.063 90.000 90.275 40.000 40.000 70.000 10.000 60.000 10.007 90.105 70.000 70.032 90.045 70.198 50.171 60.028 60.000 50.000 10.006 70.000 10.000 20.000 40.278 60.000 30.000 40.000 20.044 80.000 30.000 30.000 60.000 1
Ji Hou, Benjamin Graham, Matthias Nießner, Saining Xie: Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts. CVPR 2021
LGround Inst.permissive0.154 70.275 70.108 70.060 70.573 60.381 60.434 70.654 70.190 80.141 60.097 70.000 70.503 70.180 70.252 70.242 80.242 70.881 60.448 20.494 70.429 60.078 60.364 90.024 70.654 50.213 70.222 70.239 70.099 70.616 50.363 70.000 70.092 70.444 50.000 40.383 80.209 90.815 50.030 70.000 60.166 70.002 80.295 90.099 80.364 60.778 40.177 70.001 80.427 90.585 80.000 60.470 50.268 90.205 70.045 70.642 50.007 70.000 60.333 80.148 60.407 60.130 61.000 10.000 40.156 80.189 70.097 80.169 50.000 70.000 40.056 60.400 70.000 70.000 40.000 10.000 60.556 20.278 70.203 70.323 70.019 80.000 60.402 80.026 70.000 20.000 30.000 50.044 70.000 20.000 70.037 80.000 40.000 50.181 50.000 60.127 50.006 80.028 80.023 70.115 60.000 20.327 50.267 50.000 60.000 30.000 80.028 70.000 20.000 70.000 70.000 50.003 70.048 60.135 80.222 60.089 50.278 40.000 20.514 50.333 70.611 50.000 10.000 60.000 40.000 70.000 30.000 60.037 70.000 40.000 70.000 70.000 30.000 30.322 60.000 10.209 50.000 10.000 60.278 60.000 10.302 70.000 10.000 10.143 70.148 60.000 80.000 40.000 70.000 10.000 60.000 10.015 70.064 90.000 70.272 60.031 90.000 80.257 20.028 60.000 50.000 10.041 60.000 10.000 20.000 40.222 90.000 30.000 40.000 20.000 90.000 30.000 30.000 60.000 1
David Rozenberszki, Or Litany, Angela Dai: Language-Grounded Indoor 3D Semantic Segmentation in the Wild.


ScanNet Benchmark

This table lists the benchmark results for the 3D semantic label scenario. Each entry gives the IoU score followed by the method's rank in that column.
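As a quick sanity check on how the avg iou column relates to the per-class columns, the sketch below averages the 20 per-class IoU values of the Volt ScanNet row in this table. It assumes that avg iou is the unweighted mean over the 20 classes, which is consistent with that row but is not stated on this page; the names `volt_class_iou` and `mean_iou` are illustrative only.

```python
# Per-class IoU values copied from the Volt ScanNet row, in column order
# (bathtub, bed, bookshelf, ..., wall, window). Names are illustrative.
volt_class_iou = [
    0.932, 0.846, 0.801, 0.775, 0.862, 0.604, 0.955, 0.779, 0.722, 0.980,
    0.635, 0.352, 0.799, 0.941, 0.887, 0.807, 0.748, 0.973, 0.911, 0.798,
]


def mean_iou(per_class_iou):
    """Unweighted mean over classes (assumed definition of avg iou)."""
    return sum(per_class_iou) / len(per_class_iou)


volt_avg_iou = mean_iou(volt_class_iou)
print(f"{volt_avg_iou:.3f}")  # agrees with the listed avg iou of 0.805
```

Under the same assumption, the check can be repeated for any other row by swapping in its 20 per-class values.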


Method | Info | avg iou | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | floor | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | wall | window
Volt ScanNet | permissive | 0.805 1 | 0.932 5 | 0.846 3 | 0.801 49 | 0.775 10 | 0.862 11 | 0.604 1 | 0.955 1 | 0.779 1 | 0.722 4 | 0.980 1 | 0.635 1 | 0.352 12 | 0.799 3 | 0.941 4 | 0.887 1 | 0.807 20 | 0.748 2 | 0.973 3 | 0.911 1 | 0.798 6
Kadir Yilmaz, Adrian Kruse, Tristan Höfer, Daan de Geus, Bastian Leibe: Volume Transformer: Revisiting Vanilla Transformers for 3D Scene Understanding.
PTv3-PPT-ALC | copyleft | 0.798 2 | 0.911 12 | 0.812 24 | 0.854 8 | 0.770 13 | 0.856 16 | 0.555 18 | 0.943 2 | 0.660 27 | 0.735 2 | 0.979 2 | 0.606 8 | 0.492 1 | 0.792 5 | 0.934 5 | 0.841 3 | 0.819 6 | 0.716 10 | 0.947 11 | 0.906 2 | 0.822 1
Guangda Ji, Silvan Weder, Francis Engelmann, Marc Pollefeys, Hermann Blum: ARKit LabelMaker: A New Scale for Indoor 3D Scene Understanding. CVPR 2025
DITR ScanNet | | 0.797 3 | 0.727 78 | 0.869 1 | 0.882 1 | 0.785 6 | 0.868 7 | 0.578 6 | 0.943 2 | 0.744 2 | 0.727 3 | 0.979 2 | 0.627 3 | 0.364 9 | 0.824 1 | 0.949 2 | 0.779 16 | 0.844 1 | 0.757 1 | 0.982 1 | 0.905 3 | 0.802 3
Karim Abou Zeid, Kadir Yilmaz, Daan de Geus, Alexander Hermans, David Adrian, Timm Linder, Bastian Leibe: DINO in the Room: Leveraging 2D Foundation Models for 3D Segmentation. 3DV 2026
PTv3 ScanNet | | 0.794 4 | 0.941 3 | 0.813 23 | 0.851 11 | 0.782 7 | 0.890 2 | 0.597 2 | 0.916 7 | 0.696 12 | 0.713 6 | 0.979 2 | 0.635 1 | 0.384 3 | 0.793 4 | 0.907 11 | 0.821 6 | 0.790 38 | 0.696 15 | 0.967 5 | 0.903 4 | 0.805 2
Xiaoyang Wu, Li Jiang, Peng-Shuai Wang, Zhijian Liu, Xihui Liu, Yu Qiao, Wanli Ouyang, Tong He, Hengshuang Zhao: Point Transformer V3: Simpler, Faster, Stronger. CVPR 2024 (Oral)
PonderV2 | | 0.785 5 | 0.978 1 | 0.800 32 | 0.833 30 | 0.788 4 | 0.853 21 | 0.545 22 | 0.910 10 | 0.713 4 | 0.705 7 | 0.979 2 | 0.596 10 | 0.390 2 | 0.769 16 | 0.832 46 | 0.821 6 | 0.792 37 | 0.730 3 | 0.975 2 | 0.897 7 | 0.785 8
Haoyi Zhu, Honghui Yang, Xiaoyang Wu, Di Huang, Sha Zhang, Xianglong He, Tong He, Hengshuang Zhao, Chunhua Shen, Yu Qiao, Wanli Ouyang: PonderV2: Pave the Way for 3D Foundation Model with A Universal Pre-training Paradigm.
Mix3D | permissive | 0.781 6 | 0.964 2 | 0.855 2 | 0.843 20 | 0.781 8 | 0.858 14 | 0.575 9 | 0.831 41 | 0.685 18 | 0.714 5 | 0.979 2 | 0.594 11 | 0.310 32 | 0.801 2 | 0.892 20 | 0.841 3 | 0.819 6 | 0.723 7 | 0.940 16 | 0.887 9 | 0.725 30
Alexey Nekrasov, Jonas Schult, Or Litany, Bastian Leibe, Francis Engelmann: Mix3D: Out-of-Context Data Augmentation for 3D Scenes. 3DV 2021 (Oral)
Swin3D | permissive | 0.779 7 | 0.861 25 | 0.818 18 | 0.836 27 | 0.790 3 | 0.875 4 | 0.576 8 | 0.905 11 | 0.704 8 | 0.739 1 | 0.969 13 | 0.611 4 | 0.349 13 | 0.756 26 | 0.958 1 | 0.702 53 | 0.805 21 | 0.708 11 | 0.916 40 | 0.898 6 | 0.801 4
TTT-KD | | 0.773 8 | 0.646 99 | 0.818 18 | 0.809 42 | 0.774 11 | 0.878 3 | 0.581 4 | 0.943 2 | 0.687 16 | 0.704 8 | 0.978 7 | 0.607 7 | 0.336 21 | 0.775 12 | 0.912 9 | 0.838 5 | 0.823 4 | 0.694 16 | 0.967 5 | 0.899 5 | 0.794 7
Lisa Weijler, Muhammad Jehanzeb Mirza, Leon Sick, Can Ekkazan, Pedro Hermosilla: TTT-KD: Test-Time Training for 3D Semantic Segmentation through Knowledge Distillation from Foundation Models.
ResLFE_HDS | | 0.772 9 | 0.939 4 | 0.824 8 | 0.854 8 | 0.771 12 | 0.840 36 | 0.564 14 | 0.900 13 | 0.686 17 | 0.677 15 | 0.961 19 | 0.537 37 | 0.348 14 | 0.769 16 | 0.903 13 | 0.785 14 | 0.815 9 | 0.676 27 | 0.939 17 | 0.880 14 | 0.772 12
OctFormer | permissive | 0.766 10 | 0.925 8 | 0.808 28 | 0.849 13 | 0.786 5 | 0.846 31 | 0.566 13 | 0.876 20 | 0.690 14 | 0.674 18 | 0.960 20 | 0.576 23 | 0.226 75 | 0.753 28 | 0.904 12 | 0.777 17 | 0.815 9 | 0.722 8 | 0.923 32 | 0.877 18 | 0.776 11
Peng-Shuai Wang: OctFormer: Octree-based Transformers for 3D Point Clouds. SIGGRAPH 2023
PPT-SpUNet-Joint | | 0.766 10 | 0.932 5 | 0.794 38 | 0.829 32 | 0.751 27 | 0.854 19 | 0.540 26 | 0.903 12 | 0.630 40 | 0.672 19 | 0.963 17 | 0.565 27 | 0.357 10 | 0.788 6 | 0.900 15 | 0.737 32 | 0.802 22 | 0.685 21 | 0.950 9 | 0.887 9 | 0.780 9
Xiaoyang Wu, Zhuotao Tian, Xin Wen, Bohao Peng, Xihui Liu, Kaicheng Yu, Hengshuang Zhao: Towards Large-scale 3D Representation Learning with Multi-dataset Point Prompt Training. CVPR 2024
CU-Hybrid Net | | 0.764 12 | 0.924 9 | 0.819 15 | 0.840 23 | 0.757 22 | 0.853 21 | 0.580 5 | 0.848 33 | 0.709 6 | 0.643 29 | 0.958 25 | 0.587 17 | 0.295 40 | 0.753 28 | 0.884 24 | 0.758 24 | 0.815 9 | 0.725 6 | 0.927 28 | 0.867 29 | 0.743 21
OccuSeg+Semantic | | 0.764 12 | 0.758 63 | 0.796 36 | 0.839 24 | 0.746 31 | 0.907 1 | 0.562 15 | 0.850 32 | 0.680 20 | 0.672 19 | 0.978 7 | 0.610 5 | 0.335 23 | 0.777 10 | 0.819 50 | 0.847 2 | 0.830 3 | 0.691 18 | 0.972 4 | 0.885 11 | 0.727 28
O-CNN | permissive | 0.762 14 | 0.924 9 | 0.823 9 | 0.844 19 | 0.770 13 | 0.852 23 | 0.577 7 | 0.847 35 | 0.711 5 | 0.640 33 | 0.958 25 | 0.592 12 | 0.217 81 | 0.762 21 | 0.888 21 | 0.758 24 | 0.813 13 | 0.726 5 | 0.932 26 | 0.868 28 | 0.744 20
Peng-Shuai Wang, Yang Liu, Yu-Xiao Guo, Chun-Yu Sun, Xin Tong: O-CNN: Octree-based Convolutional Neural Networks for 3D Shape Analysis. SIGGRAPH 2017
DiffSegNet | | 0.758 15 | 0.725 80 | 0.789 43 | 0.843 20 | 0.762 18 | 0.856 16 | 0.562 15 | 0.920 5 | 0.657 30 | 0.658 23 | 0.958 25 | 0.589 15 | 0.337 20 | 0.782 7 | 0.879 25 | 0.787 12 | 0.779 43 | 0.678 23 | 0.926 30 | 0.880 14 | 0.799 5
DTC | | 0.757 16 | 0.843 31 | 0.820 13 | 0.847 16 | 0.791 2 | 0.862 11 | 0.511 40 | 0.870 24 | 0.707 7 | 0.652 25 | 0.954 42 | 0.604 9 | 0.279 51 | 0.760 22 | 0.942 3 | 0.734 33 | 0.766 52 | 0.701 14 | 0.884 63 | 0.874 24 | 0.736 22
OA-CNN-L_ScanNet20 | | 0.756 17 | 0.783 49 | 0.826 7 | 0.858 6 | 0.776 9 | 0.837 41 | 0.548 21 | 0.896 16 | 0.649 32 | 0.675 17 | 0.962 18 | 0.586 18 | 0.335 23 | 0.771 15 | 0.802 55 | 0.770 20 | 0.787 40 | 0.691 18 | 0.936 21 | 0.880 14 | 0.761 15
PNE | | 0.755 18 | 0.786 47 | 0.835 6 | 0.834 29 | 0.758 20 | 0.849 26 | 0.570 11 | 0.836 40 | 0.648 33 | 0.668 21 | 0.978 7 | 0.581 21 | 0.367 7 | 0.683 41 | 0.856 34 | 0.804 9 | 0.801 26 | 0.678 23 | 0.961 7 | 0.889 8 | 0.716 37
P. Hermosilla: Point Neighborhood Embeddings.
LSK3DNet | permissive | 0.755 18 | 0.899 18 | 0.823 9 | 0.843 20 | 0.764 17 | 0.838 39 | 0.584 3 | 0.845 36 | 0.717 3 | 0.638 35 | 0.956 32 | 0.580 22 | 0.229 74 | 0.640 51 | 0.900 15 | 0.750 27 | 0.813 13 | 0.729 4 | 0.920 36 | 0.872 26 | 0.757 16
Tuo Feng, Wenguan Wang, Fan Ma, Yi Yang: LSK3DNet: Towards Effective and Efficient 3D Perception with Large Sparse Kernels. CVPR 2024
ConDaFormer | | 0.755 18 | 0.927 7 | 0.822 11 | 0.836 27 | 0.801 1 | 0.849 26 | 0.516 37 | 0.864 29 | 0.651 31 | 0.680 14 | 0.958 25 | 0.584 20 | 0.282 48 | 0.759 24 | 0.855 36 | 0.728 35 | 0.802 22 | 0.678 23 | 0.880 68 | 0.873 25 | 0.756 18
Lunhao Duan, Shanshan Zhao, Nan Xue, Mingming Gong, Guisong Xia, Dacheng Tao: ConDaFormer: Disassembled Transformer with Local Structure Enhancement for 3D Point Cloud Understanding. NeurIPS 2023
DMF-Net | | 0.752 21 | 0.906 16 | 0.793 40 | 0.802 48 | 0.689 48 | 0.825 54 | 0.556 17 | 0.867 25 | 0.681 19 | 0.602 52 | 0.960 20 | 0.555 33 | 0.365 8 | 0.779 9 | 0.859 31 | 0.747 28 | 0.795 34 | 0.717 9 | 0.917 39 | 0.856 37 | 0.764 14
C.Yang, Y.Yan, W.Zhao, J.Ye, X.Yang, A.Hussain, B.Dong, K.Huang: Towards Deeper and Better Multi-view Feature Fusion for 3D Semantic Segmentation. ICONIP 2023
PointTransformerV2 | | 0.752 21 | 0.742 70 | 0.809 27 | 0.872 2 | 0.758 20 | 0.860 13 | 0.552 19 | 0.891 18 | 0.610 47 | 0.687 9 | 0.960 20 | 0.559 31 | 0.304 35 | 0.766 19 | 0.926 7 | 0.767 21 | 0.797 30 | 0.644 40 | 0.942 14 | 0.876 21 | 0.722 33
Xiaoyang Wu, Yixing Lao, Li Jiang, Xihui Liu, Hengshuang Zhao: Point Transformer V2: Grouped Vector Attention and Partition-based Pooling. NeurIPS 2022
PointConvFormer | | 0.749 23 | 0.793 45 | 0.790 41 | 0.807 44 | 0.750 29 | 0.856 16 | 0.524 33 | 0.881 19 | 0.588 60 | 0.642 32 | 0.977 11 | 0.591 13 | 0.274 54 | 0.781 8 | 0.929 6 | 0.804 9 | 0.796 31 | 0.642 41 | 0.947 11 | 0.885 11 | 0.715 38
Wenxuan Wu, Qi Shan, Li Fuxin: PointConvFormer: Revenge of the Point-based Convolution.
BPNet | copyleft | 0.749 23 | 0.909 14 | 0.818 18 | 0.811 40 | 0.752 25 | 0.839 38 | 0.485 55 | 0.842 37 | 0.673 22 | 0.644 28 | 0.957 30 | 0.528 44 | 0.305 34 | 0.773 13 | 0.859 31 | 0.788 11 | 0.818 8 | 0.693 17 | 0.916 40 | 0.856 37 | 0.723 32
Wenbo Hu, Hengshuang Zhao, Li Jiang, Jiaya Jia, Tien-Tsin Wong: Bidirectional Projection Network for Cross Dimension Scene Understanding. CVPR 2021 (Oral)
MSP | | 0.748 25 | 0.623 102 | 0.804 30 | 0.859 5 | 0.745 32 | 0.824 56 | 0.501 44 | 0.912 9 | 0.690 14 | 0.685 11 | 0.956 32 | 0.567 26 | 0.320 29 | 0.768 18 | 0.918 8 | 0.720 40 | 0.802 22 | 0.676 27 | 0.921 34 | 0.881 13 | 0.779 10
StratifiedFormer | permissive | 0.747 26 | 0.901 17 | 0.803 31 | 0.845 18 | 0.757 22 | 0.846 31 | 0.512 39 | 0.825 44 | 0.696 12 | 0.645 27 | 0.956 32 | 0.576 23 | 0.262 65 | 0.744 34 | 0.861 30 | 0.742 30 | 0.770 50 | 0.705 12 | 0.899 52 | 0.860 34 | 0.734 23
Xin Lai*, Jianhui Liu*, Li Jiang, Liwei Wang, Hengshuang Zhao, Shu Liu, Xiaojuan Qi, Jiaya Jia: Stratified Transformer for 3D Point Cloud Segmentation. CVPR 2022
VMNet | permissive | 0.746 27 | 0.870 23 | 0.838 4 | 0.858 6 | 0.729 37 | 0.850 25 | 0.501 44 | 0.874 21 | 0.587 61 | 0.658 23 | 0.956 32 | 0.564 28 | 0.299 37 | 0.765 20 | 0.900 15 | 0.716 43 | 0.812 15 | 0.631 46 | 0.939 17 | 0.858 35 | 0.709 39
Zeyu Hu, Xuyang Bai, Jiaxiang Shang, Runze Zhang, Jiayu Dong, Xin Wang, Guangyuan Sun, Hongbo Fu, Chiew-Lan Tai: VMNet: Voxel-Mesh Network for Geodesic-Aware 3D Semantic Segmentation. ICCV 2021 (Oral)
Virtual MVFusion | | 0.746 27 | 0.771 57 | 0.819 15 | 0.848 15 | 0.702 44 | 0.865 10 | 0.397 93 | 0.899 14 | 0.699 10 | 0.664 22 | 0.948 64 | 0.588 16 | 0.330 25 | 0.746 33 | 0.851 40 | 0.764 22 | 0.796 31 | 0.704 13 | 0.935 22 | 0.866 30 | 0.728 26
Abhijit Kundu, Xiaoqi Yin, Alireza Fathi, David Ross, Brian Brewington, Thomas Funkhouser, Caroline Pantofaru: Virtual Multi-view Fusion for 3D Semantic Segmentation. ECCV 2020
DiffSeg3D2 | | 0.745 29 | 0.725 80 | 0.814 22 | 0.837 25 | 0.751 27 | 0.831 48 | 0.514 38 | 0.896 16 | 0.674 21 | 0.684 12 | 0.960 20 | 0.564 28 | 0.303 36 | 0.773 13 | 0.820 49 | 0.713 46 | 0.798 29 | 0.690 20 | 0.923 32 | 0.875 22 | 0.757 16
ODIN | permissive | 0.744 30 | 0.658 95 | 0.752 66 | 0.870 3 | 0.714 41 | 0.843 34 | 0.569 12 | 0.919 6 | 0.703 9 | 0.622 42 | 0.949 61 | 0.591 13 | 0.343 16 | 0.736 35 | 0.784 57 | 0.816 8 | 0.838 2 | 0.672 32 | 0.918 38 | 0.854 41 | 0.725 30
Ayush Jain, Pushkal Katara, Nikolaos Gkanatsios, Adam W. Harley, Gabriel Sarch, Kriti Aggarwal, Vishrav Chaudhary, Katerina Fragkiadaki: ODIN: A Single Model for 2D and 3D Segmentation. CVPR 2024
Retro-FPN | | 0.744 30 | 0.842 32 | 0.800 32 | 0.767 63 | 0.740 33 | 0.836 43 | 0.541 24 | 0.914 8 | 0.672 23 | 0.626 39 | 0.958 25 | 0.552 34 | 0.272 56 | 0.777 10 | 0.886 23 | 0.696 54 | 0.801 26 | 0.674 30 | 0.941 15 | 0.858 35 | 0.717 35
Peng Xiang*, Xin Wen*, Yu-Shen Liu, Hui Zhang, Yi Fang, Zhizhong Han: Retrospective Feature Pyramid Network for Point Cloud Semantic Segmentation. ICCV 2023
EQ-Net | | 0.743 32 | 0.620 103 | 0.799 35 | 0.849 13 | 0.730 36 | 0.822 58 | 0.493 52 | 0.897 15 | 0.664 24 | 0.681 13 | 0.955 36 | 0.562 30 | 0.378 4 | 0.760 22 | 0.903 13 | 0.738 31 | 0.801 26 | 0.673 31 | 0.907 44 | 0.877 18 | 0.745 19
Zetong Yang*, Li Jiang*, Yanan Sun, Bernt Schiele, Jiaya Jia: A Unified Query-based Paradigm for Point Cloud Understanding. CVPR 2022
SAT | | 0.742 33 | 0.860 26 | 0.765 57 | 0.819 35 | 0.769 15 | 0.848 28 | 0.533 28 | 0.829 42 | 0.663 25 | 0.631 38 | 0.955 36 | 0.586 18 | 0.274 54 | 0.753 28 | 0.896 18 | 0.729 34 | 0.760 58 | 0.666 34 | 0.921 34 | 0.855 39 | 0.733 24
LRPNet | | 0.742 33 | 0.816 40 | 0.806 29 | 0.807 44 | 0.752 25 | 0.828 52 | 0.575 9 | 0.839 39 | 0.699 10 | 0.637 36 | 0.954 42 | 0.520 48 | 0.320 29 | 0.755 27 | 0.834 44 | 0.760 23 | 0.772 47 | 0.676 27 | 0.915 42 | 0.862 32 | 0.717 35
LargeKernel3D | | 0.739 35 | 0.909 14 | 0.820 13 | 0.806 46 | 0.740 33 | 0.852 23 | 0.545 22 | 0.826 43 | 0.594 59 | 0.643 29 | 0.955 36 | 0.541 36 | 0.263 64 | 0.723 39 | 0.858 33 | 0.775 19 | 0.767 51 | 0.678 23 | 0.933 24 | 0.848 45 | 0.694 44
Yukang Chen*, Jianhui Liu*, Xiangyu Zhang, Xiaojuan Qi, Jiaya Jia: LargeKernel3D: Scaling up Kernels in 3D Sparse CNNs. CVPR 2023
RPN | | 0.736 36 | 0.776 53 | 0.790 41 | 0.851 11 | 0.754 24 | 0.854 19 | 0.491 54 | 0.866 27 | 0.596 58 | 0.686 10 | 0.955 36 | 0.536 38 | 0.342 17 | 0.624 58 | 0.869 27 | 0.787 12 | 0.802 22 | 0.628 47 | 0.927 28 | 0.875 22 | 0.704 41
MinkowskiNet | permissive | 0.736 36 | 0.859 27 | 0.818 18 | 0.832 31 | 0.709 42 | 0.840 36 | 0.521 35 | 0.853 31 | 0.660 27 | 0.643 29 | 0.951 53 | 0.544 35 | 0.286 46 | 0.731 37 | 0.893 19 | 0.675 63 | 0.772 47 | 0.683 22 | 0.874 75 | 0.852 43 | 0.727 28
C. Choy, J. Gwak, S. Savarese: 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks. CVPR 2019
IPCA | | 0.731 38 | 0.890 19 | 0.837 5 | 0.864 4 | 0.726 38 | 0.873 5 | 0.530 32 | 0.824 45 | 0.489 95 | 0.647 26 | 0.978 7 | 0.609 6 | 0.336 21 | 0.624 58 | 0.733 65 | 0.758 24 | 0.776 45 | 0.570 73 | 0.949 10 | 0.877 18 | 0.728 26
MS-SFA-net | | 0.730 39 | 0.910 13 | 0.819 15 | 0.837 25 | 0.698 45 | 0.838 39 | 0.532 30 | 0.872 22 | 0.605 51 | 0.676 16 | 0.959 24 | 0.535 40 | 0.341 18 | 0.649 47 | 0.598 89 | 0.708 48 | 0.810 16 | 0.664 36 | 0.895 55 | 0.879 17 | 0.771 13
online3d | | 0.727 40 | 0.715 85 | 0.777 50 | 0.854 8 | 0.748 30 | 0.858 14 | 0.497 49 | 0.872 22 | 0.572 68 | 0.639 34 | 0.957 30 | 0.523 45 | 0.297 39 | 0.750 31 | 0.803 54 | 0.744 29 | 0.810 16 | 0.587 69 | 0.938 19 | 0.871 27 | 0.719 34
SparseConvNet | | 0.725 41 | 0.647 98 | 0.821 12 | 0.846 17 | 0.721 39 | 0.869 6 | 0.533 28 | 0.754 66 | 0.603 54 | 0.614 44 | 0.955 36 | 0.572 25 | 0.325 27 | 0.710 40 | 0.870 26 | 0.724 38 | 0.823 4 | 0.628 47 | 0.934 23 | 0.865 31 | 0.683 47
PointTransformer++ | | 0.725 41 | 0.727 78 | 0.811 26 | 0.819 35 | 0.765 16 | 0.841 35 | 0.502 43 | 0.814 50 | 0.621 43 | 0.623 41 | 0.955 36 | 0.556 32 | 0.284 47 | 0.620 60 | 0.866 28 | 0.781 15 | 0.757 62 | 0.648 38 | 0.932 26 | 0.862 32 | 0.709 39
MatchingNet | | 0.724 43 | 0.812 42 | 0.812 24 | 0.810 41 | 0.735 35 | 0.834 45 | 0.495 51 | 0.860 30 | 0.572 68 | 0.602 52 | 0.954 42 | 0.512 50 | 0.280 50 | 0.757 25 | 0.845 42 | 0.725 37 | 0.780 42 | 0.606 57 | 0.937 20 | 0.851 44 | 0.700 43
INS-Conv-semantic | | 0.717 44 | 0.751 66 | 0.759 60 | 0.812 39 | 0.704 43 | 0.868 7 | 0.537 27 | 0.842 37 | 0.609 49 | 0.608 48 | 0.953 46 | 0.534 41 | 0.293 41 | 0.616 61 | 0.864 29 | 0.719 42 | 0.793 35 | 0.640 42 | 0.933 24 | 0.845 49 | 0.663 53
PointMetaBase | | 0.714 45 | 0.835 33 | 0.785 45 | 0.821 33 | 0.684 50 | 0.846 31 | 0.531 31 | 0.865 28 | 0.614 44 | 0.596 56 | 0.953 46 | 0.500 53 | 0.246 70 | 0.674 42 | 0.888 21 | 0.692 55 | 0.764 54 | 0.624 49 | 0.849 90 | 0.844 50 | 0.675 49
contrastBoundary | permissive | 0.705 46 | 0.769 60 | 0.775 51 | 0.809 42 | 0.687 49 | 0.820 61 | 0.439 81 | 0.812 51 | 0.661 26 | 0.591 58 | 0.945 72 | 0.515 49 | 0.171 100 | 0.633 55 | 0.856 34 | 0.720 40 | 0.796 31 | 0.668 33 | 0.889 60 | 0.847 46 | 0.689 45
Liyao Tang, Yibing Zhan, Zhe Chen, Baosheng Yu, Dacheng Tao: Contrastive Boundary Learning for Point Cloud Segmentation. CVPR 2022
ClickSeg_Semantic | | 0.703 47 | 0.774 55 | 0.800 32 | 0.793 54 | 0.760 19 | 0.847 30 | 0.471 59 | 0.802 54 | 0.463 102 | 0.634 37 | 0.968 15 | 0.491 56 | 0.271 58 | 0.726 38 | 0.910 10 | 0.706 49 | 0.815 9 | 0.551 85 | 0.878 69 | 0.833 51 | 0.570 85
RFCR | | 0.702 48 | 0.889 20 | 0.745 72 | 0.813 38 | 0.672 53 | 0.818 65 | 0.493 52 | 0.815 49 | 0.623 41 | 0.610 46 | 0.947 66 | 0.470 65 | 0.249 69 | 0.594 65 | 0.848 41 | 0.705 50 | 0.779 43 | 0.646 39 | 0.892 58 | 0.823 57 | 0.611 68
Jingyu Gong, Jiachen Xu, Xin Tan, Haichuan Song, Yanyun Qu, Yuan Xie, Lizhuang Ma: Omni-Supervised Point Cloud Segmentation via Gradual Receptive Field Component Reasoning. CVPR 2021
One Thing One Click | | 0.701 49 | 0.825 37 | 0.796 36 | 0.723 70 | 0.716 40 | 0.832 47 | 0.433 83 | 0.816 47 | 0.634 38 | 0.609 47 | 0.969 13 | 0.418 91 | 0.344 15 | 0.559 77 | 0.833 45 | 0.715 44 | 0.808 19 | 0.560 79 | 0.902 49 | 0.847 46 | 0.680 48
JSENet | permissive | 0.699 50 | 0.881 22 | 0.762 58 | 0.821 33 | 0.667 54 | 0.800 78 | 0.522 34 | 0.792 57 | 0.613 45 | 0.607 49 | 0.935 92 | 0.492 55 | 0.205 87 | 0.576 70 | 0.853 38 | 0.691 57 | 0.758 60 | 0.652 37 | 0.872 78 | 0.828 54 | 0.649 57
Zeyu Hu, Mingmin Zhen, Xuyang Bai, Hongbo Fu, Chiew-Lan Tai: JSENet: Joint Semantic Segmentation and Edge Detection Network for 3D Point Clouds. ECCV 2020
One-Thing-One-Click | | 0.693 51 | 0.743 69 | 0.794 38 | 0.655 93 | 0.684 50 | 0.822 58 | 0.497 49 | 0.719 76 | 0.622 42 | 0.617 43 | 0.977 11 | 0.447 78 | 0.339 19 | 0.750 31 | 0.664 82 | 0.703 52 | 0.790 38 | 0.596 62 | 0.946 13 | 0.855 39 | 0.647 58
Zhengzhe Liu, Xiaojuan Qi, Chi-Wing Fu: One Thing One Click: A Self-Training Approach for Weakly Supervised 3D Semantic Segmentation. CVPR 2021
PicassoNet-II | permissive | 0.692 52 | 0.732 74 | 0.772 52 | 0.786 55 | 0.677 52 | 0.866 9 | 0.517 36 | 0.848 33 | 0.509 88 | 0.626 39 | 0.952 51 | 0.536 38 | 0.225 77 | 0.545 83 | 0.704 72 | 0.689 60 | 0.810 16 | 0.564 78 | 0.903 48 | 0.854 41 | 0.729 25
Huan Lei, Naveed Akhtar, Mubarak Shah, and Ajmal Mian: Geometric feature learning for 3D meshes.
Feature_GeometricNet | permissive | 0.690 53 | 0.884 21 | 0.754 64 | 0.795 52 | 0.647 61 | 0.818 65 | 0.422 85 | 0.802 54 | 0.612 46 | 0.604 50 | 0.945 72 | 0.462 68 | 0.189 95 | 0.563 76 | 0.853 38 | 0.726 36 | 0.765 53 | 0.632 45 | 0.904 46 | 0.821 60 | 0.606 72
Kangcheng Liu, Ben M. Chen: https://arxiv.org/abs/2012.09439. arXiv Preprint
FusionNet | | 0.688 54 | 0.704 87 | 0.741 76 | 0.754 67 | 0.656 56 | 0.829 50 | 0.501 44 | 0.741 71 | 0.609 49 | 0.548 66 | 0.950 57 | 0.522 47 | 0.371 5 | 0.633 55 | 0.756 60 | 0.715 44 | 0.771 49 | 0.623 50 | 0.861 86 | 0.814 63 | 0.658 54
Feihu Zhang, Jin Fang, Benjamin Wah, Philip Torr: Deep FusionNet for Point Cloud Semantic Segmentation. ECCV 2020
Feature-Geometry Net | permissive | 0.685 55 | 0.866 24 | 0.748 69 | 0.819 35 | 0.645 63 | 0.794 81 | 0.450 71 | 0.802 54 | 0.587 61 | 0.604 50 | 0.945 72 | 0.464 67 | 0.201 90 | 0.554 79 | 0.840 43 | 0.723 39 | 0.732 73 | 0.602 60 | 0.907 44 | 0.822 59 | 0.603 75
KP-FCNN | | 0.684 56 | 0.847 30 | 0.758 62 | 0.784 57 | 0.647 61 | 0.814 68 | 0.473 58 | 0.772 60 | 0.605 51 | 0.594 57 | 0.935 92 | 0.450 76 | 0.181 98 | 0.587 66 | 0.805 53 | 0.690 58 | 0.785 41 | 0.614 53 | 0.882 65 | 0.819 61 | 0.632 64
H. Thomas, C. Qi, J. Deschaud, B. Marcotegui, F. Goulette, L. Guibas: KPConv: Flexible and Deformable Convolution for Point Clouds. ICCV 2019
VACNN++ | | 0.684 56 | 0.728 77 | 0.757 63 | 0.776 60 | 0.690 46 | 0.804 76 | 0.464 64 | 0.816 47 | 0.577 67 | 0.587 59 | 0.945 72 | 0.508 52 | 0.276 53 | 0.671 43 | 0.710 70 | 0.663 68 | 0.750 66 | 0.589 67 | 0.881 66 | 0.832 53 | 0.653 56
DGNet | | 0.684 56 | 0.712 86 | 0.784 46 | 0.782 59 | 0.658 55 | 0.835 44 | 0.499 48 | 0.823 46 | 0.641 35 | 0.597 55 | 0.950 57 | 0.487 58 | 0.281 49 | 0.575 71 | 0.619 86 | 0.647 76 | 0.764 54 | 0.620 52 | 0.871 81 | 0.846 48 | 0.688 46
Superpoint Network | | 0.683 59 | 0.851 29 | 0.728 80 | 0.800 51 | 0.653 58 | 0.806 74 | 0.468 61 | 0.804 52 | 0.572 68 | 0.602 52 | 0.946 69 | 0.453 75 | 0.239 73 | 0.519 88 | 0.822 47 | 0.689 60 | 0.762 57 | 0.595 64 | 0.895 55 | 0.827 55 | 0.630 65
PointContrast_LA_SEM | | 0.683 59 | 0.757 64 | 0.784 46 | 0.786 55 | 0.639 65 | 0.824 56 | 0.408 88 | 0.775 59 | 0.604 53 | 0.541 68 | 0.934 96 | 0.532 42 | 0.269 60 | 0.552 80 | 0.777 58 | 0.645 79 | 0.793 35 | 0.640 42 | 0.913 43 | 0.824 56 | 0.671 50
VI-PointConv | | 0.676 61 | 0.770 59 | 0.754 64 | 0.783 58 | 0.621 69 | 0.814 68 | 0.552 19 | 0.758 64 | 0.571 71 | 0.557 64 | 0.954 42 | 0.529 43 | 0.268 62 | 0.530 86 | 0.682 76 | 0.675 63 | 0.719 76 | 0.603 59 | 0.888 61 | 0.833 51 | 0.665 52
Xingyi Li, Wenxuan Wu, Xiaoli Z. Fern, Li Fuxin: The Devils in the Point Clouds: Studying the Robustness of Point Cloud Convolutions.
ROSMRF3D | | 0.673 62 | 0.789 46 | 0.748 69 | 0.763 65 | 0.635 67 | 0.814 68 | 0.407 90 | 0.747 68 | 0.581 65 | 0.573 61 | 0.950 57 | 0.484 59 | 0.271 58 | 0.607 62 | 0.754 61 | 0.649 73 | 0.774 46 | 0.596 62 | 0.883 64 | 0.823 57 | 0.606 72
SALANet | | 0.670 63 | 0.816 40 | 0.770 55 | 0.768 62 | 0.652 59 | 0.807 73 | 0.451 68 | 0.747 68 | 0.659 29 | 0.545 67 | 0.924 102 | 0.473 64 | 0.149 110 | 0.571 73 | 0.811 52 | 0.635 83 | 0.746 67 | 0.623 50 | 0.892 58 | 0.794 77 | 0.570 85
O3DSeg | | 0.668 64 | 0.822 38 | 0.771 54 | 0.496 114 | 0.651 60 | 0.833 46 | 0.541 24 | 0.761 63 | 0.555 77 | 0.611 45 | 0.966 16 | 0.489 57 | 0.370 6 | 0.388 107 | 0.580 90 | 0.776 18 | 0.751 64 | 0.570 73 | 0.956 8 | 0.817 62 | 0.646 59
PointConv | permissive | 0.666 65 | 0.781 50 | 0.759 60 | 0.699 78 | 0.644 64 | 0.822 58 | 0.475 57 | 0.779 58 | 0.564 74 | 0.504 85 | 0.953 46 | 0.428 85 | 0.203 89 | 0.586 68 | 0.754 61 | 0.661 69 | 0.753 63 | 0.588 68 | 0.902 49 | 0.813 65 | 0.642 60
Wenxuan Wu, Zhongang Qi, Li Fuxin: PointConv: Deep Convolutional Networks on 3D Point Clouds. CVPR 2019
PointASNL | permissive | 0.666 65 | 0.703 88 | 0.781 48 | 0.751 69 | 0.655 57 | 0.830 49 | 0.471 59 | 0.769 61 | 0.474 98 | 0.537 70 | 0.951 53 | 0.475 63 | 0.279 51 | 0.635 53 | 0.698 75 | 0.675 63 | 0.751 64 | 0.553 84 | 0.816 97 | 0.806 67 | 0.703 42
Xu Yan, Chaoda Zheng, Zhen Li, Sheng Wang, Shuguang Cui: PointASNL: Robust Point Clouds Processing using Nonlocal Neural Networks with Adaptive Sampling. CVPR 2020
PPCNN++ | permissive | 0.663 67 | 0.746 67 | 0.708 83 | 0.722 71 | 0.638 66 | 0.820 61 | 0.451 68 | 0.566 104 | 0.599 56 | 0.541 68 | 0.950 57 | 0.510 51 | 0.313 31 | 0.648 49 | 0.819 50 | 0.616 88 | 0.682 91 | 0.590 66 | 0.869 82 | 0.810 66 | 0.656 55
Pyunghwan Ahn, Juyoung Yang, Eojindl Yi, Chanho Lee, Junmo Kim: Projection-based Point Convolution for Efficient Point Cloud Segmentation. IEEE Access
DCM-Net | | 0.658 68 | 0.778 51 | 0.702 86 | 0.806 46 | 0.619 70 | 0.813 71 | 0.468 61 | 0.693 84 | 0.494 91 | 0.524 76 | 0.941 84 | 0.449 77 | 0.298 38 | 0.510 90 | 0.821 48 | 0.675 63 | 0.727 75 | 0.568 76 | 0.826 95 | 0.803 70 | 0.637 62
Jonas Schult*, Francis Engelmann*, Theodora Kontogianni, Bastian Leibe: DualConvMesh-Net: Joint Geodesic and Euclidean Convolutions on 3D Meshes. CVPR 2020 (Oral)
MVF-GNN | | 0.658 68 | 0.558 110 | 0.751 67 | 0.655 93 | 0.690 46 | 0.722 103 | 0.453 67 | 0.867 25 | 0.579 66 | 0.576 60 | 0.893 114 | 0.523 45 | 0.293 41 | 0.733 36 | 0.571 92 | 0.692 55 | 0.659 98 | 0.606 57 | 0.875 72 | 0.804 69 | 0.668 51
HPGCNN | | 0.656 70 | 0.698 90 | 0.743 74 | 0.650 95 | 0.564 87 | 0.820 61 | 0.505 42 | 0.758 64 | 0.631 39 | 0.479 89 | 0.945 72 | 0.480 61 | 0.226 75 | 0.572 72 | 0.774 59 | 0.690 58 | 0.735 71 | 0.614 53 | 0.853 89 | 0.776 92 | 0.597 78
Jisheng Dang, Qingyong Hu, Yulan Guo, Jun Yang: HPGCNN.
SAFNet-segpermissive0.654 710.752 650.734 780.664 910.583 820.815 670.399 920.754 660.639 360.535 720.942 820.470 650.309 330.665 440.539 940.650 720.708 810.635 440.857 880.793 790.642 60
Linqing Zhao, Jiwen Lu, Jie Zhou: Similarity-Aware Fusion Network for 3D Semantic Segmentation. IROS 2021
RandLA-Netpermissive0.645 720.778 510.731 790.699 780.577 830.829 500.446 730.736 720.477 970.523 780.945 720.454 720.269 600.484 970.749 640.618 860.738 690.599 610.827 940.792 820.621 67
PointConv-SFPN0.641 730.776 530.703 850.721 720.557 900.826 530.451 680.672 890.563 750.483 880.943 810.425 880.162 1050.644 500.726 660.659 700.709 800.572 720.875 720.786 870.559 91
MVPNetpermissive0.641 730.831 340.715 810.671 880.590 780.781 870.394 940.679 860.642 340.553 650.937 890.462 680.256 660.649 470.406 1070.626 840.691 880.666 340.877 700.792 820.608 71
Maximilian Jaritz, Jiayuan Gu, Hao Su: Multi-view PointNet for 3D Scene Understanding. GMDL Workshop, ICCV 2019
PointMRNet0.640 750.717 840.701 870.692 810.576 840.801 770.467 630.716 770.563 750.459 950.953 460.429 840.169 1020.581 690.854 370.605 890.710 780.550 860.894 570.793 790.575 83
FPConvpermissive0.639 760.785 480.760 590.713 760.603 730.798 790.392 960.534 1090.603 540.524 760.948 640.457 700.250 680.538 840.723 680.598 930.696 860.614 530.872 780.799 720.567 88
Yiqun Lin, Zizheng Yan, Haibin Huang, Dong Du, Ligang Liu, Shuguang Cui, Xiaoguang Han: FPConv: Learning Local Flattening for Point Convolution. CVPR 2020
PD-Net0.638 770.797 440.769 560.641 1000.590 780.820 610.461 650.537 1080.637 370.536 710.947 660.388 980.206 860.656 450.668 800.647 760.732 730.585 700.868 830.793 790.473 111
PointSPNet0.637 780.734 730.692 940.714 750.576 840.797 800.446 730.743 700.598 570.437 1000.942 820.403 940.150 1090.626 570.800 560.649 730.697 850.557 820.846 910.777 910.563 89
SConv0.636 790.830 350.697 900.752 680.572 860.780 890.445 750.716 770.529 810.530 730.951 530.446 790.170 1010.507 920.666 810.636 820.682 910.541 920.886 620.799 720.594 79
Supervoxel-CNN0.635 800.656 960.711 820.719 730.613 710.757 980.444 780.765 620.534 800.566 620.928 1000.478 620.272 560.636 520.531 960.664 670.645 1020.508 1000.864 850.792 820.611 68
joint point-basedpermissive0.634 810.614 1040.778 490.667 900.633 680.825 540.420 860.804 520.467 1000.561 630.951 530.494 540.291 430.566 740.458 1020.579 990.764 540.559 810.838 920.814 630.598 77
Hung-Yueh Chiang, Yen-Liang Lin, Yueh-Cheng Liu, Winston H. Hsu: A Unified Point-Based Framework for 3D Segmentation. 3DV 2019
PointMTL0.632 820.731 750.688 970.675 850.591 770.784 860.444 780.565 1050.610 470.492 860.949 610.456 710.254 670.587 660.706 710.599 920.665 970.612 560.868 830.791 850.579 82
3DSM_DMMF0.631 830.626 1010.745 720.801 490.607 720.751 990.506 410.729 750.565 730.491 870.866 1170.434 800.197 930.595 640.630 850.709 470.705 830.560 790.875 720.740 1020.491 106
APCF-Net0.631 830.742 700.687 990.672 860.557 900.792 840.408 880.665 910.545 780.508 820.952 510.428 850.186 960.634 540.702 730.620 850.706 820.555 830.873 760.798 740.581 81
Haojia, Lin: Adaptive Pyramid Context Fusion for Point Cloud Perception. GRSL
PointNet2-SFPN0.631 830.771 570.692 940.672 860.524 960.837 410.440 800.706 820.538 790.446 970.944 780.421 900.219 800.552 800.751 630.591 950.737 700.543 910.901 510.768 940.557 92
FusionAwareConv0.630 860.604 1060.741 760.766 640.590 780.747 1000.501 440.734 730.503 900.527 740.919 1060.454 720.323 280.550 820.420 1060.678 620.688 890.544 890.896 540.795 760.627 66
Jiazhao Zhang, Chenyang Zhu, Lintao Zheng, Kai Xu: Fusion-Aware Point Convolution for Online Semantic 3D Scene Segmentation. CVPR 2020
DenSeR0.628 870.800 430.625 1090.719 730.545 930.806 740.445 750.597 990.448 1050.519 800.938 880.481 600.328 260.489 960.499 1010.657 710.759 590.592 650.881 660.797 750.634 63
SegGroup_sempermissive0.627 880.818 390.747 710.701 770.602 740.764 950.385 1000.629 960.490 930.508 820.931 990.409 930.201 900.564 750.725 670.618 860.692 870.539 930.873 760.794 770.548 95
An Tao, Yueqi Duan, Yi Wei, Jiwen Lu, Jie Zhou: SegGroup: Seg-Level Supervision for 3D Instance and Semantic Segmentation. TIP 2022
dtc_net0.625 890.703 880.751 670.794 530.535 940.848 280.480 560.676 880.528 820.469 920.944 780.454 720.004 1220.464 990.636 840.704 510.758 600.548 880.924 310.787 860.492 105
SIConv0.625 890.830 350.694 920.757 660.563 880.772 930.448 720.647 940.520 840.509 810.949 610.431 830.191 940.496 940.614 870.647 760.672 950.535 960.876 710.783 880.571 84
Weakly-Openseg v30.625 890.924 90.787 440.620 1020.555 920.811 720.393 950.666 900.382 1130.520 790.953 460.250 1170.208 840.604 630.670 780.644 800.742 680.538 940.919 370.803 700.513 103
HPEIN0.618 920.729 760.668 1000.647 970.597 760.766 940.414 870.680 850.520 840.525 750.946 690.432 810.215 820.493 950.599 880.638 810.617 1070.570 730.897 530.806 670.605 74
Li Jiang, Hengshuang Zhao, Shu Liu, Xiaoyong Shen, Chi-Wing Fu, Jiaya Jia: Hierarchical Point-Edge Interaction Network for Point Cloud Semantic Segmentation. ICCV 2019
SPH3D-GCNpermissive0.610 930.858 280.772 520.489 1150.532 950.792 840.404 910.643 950.570 720.507 840.935 920.414 920.046 1190.510 900.702 730.602 910.705 830.549 870.859 870.773 930.534 98
Huan Lei, Naveed Akhtar, and Ajmal Mian: Spherical Kernel for Efficient Graph Convolution on 3D Point Clouds. TPAMI 2020
AttAN0.609 940.760 620.667 1010.649 960.521 970.793 820.457 660.648 930.528 820.434 1020.947 660.401 950.153 1080.454 1000.721 690.648 750.717 770.536 950.904 460.765 950.485 107
Gege Zhang, Qinghua Ma, Licheng Jiao, Fang Liu and Qigong Sun: AttAN: Attention Adversarial Networks for 3D Point Cloud Semantic Segmentation. IJCAI2020
wsss-transformer0.600 950.634 1000.743 740.697 800.601 750.781 870.437 820.585 1020.493 920.446 970.933 970.394 960.011 1210.654 460.661 830.603 900.733 720.526 970.832 930.761 970.480 108
LAP-D0.594 960.720 820.692 940.637 1010.456 1060.773 920.391 980.730 740.587 610.445 990.940 860.381 990.288 440.434 1030.453 1040.591 950.649 1000.581 710.777 1010.749 1010.610 70
DPC0.592 970.720 820.700 880.602 1060.480 1020.762 970.380 1010.713 800.585 640.437 1000.940 860.369 1010.288 440.434 1030.509 1000.590 970.639 1050.567 770.772 1020.755 990.592 80
Francis Engelmann, Theodora Kontogianni, Bastian Leibe: Dilated Point Convolutions: On the Receptive Field Size of Point Convolutions on 3D Point Clouds. ICRA 2020
CCRFNet0.589 980.766 610.659 1040.683 830.470 1050.740 1020.387 990.620 980.490 930.476 900.922 1040.355 1040.245 710.511 890.511 990.571 1000.643 1030.493 1040.872 780.762 960.600 76
ROSMRF0.580 990.772 560.707 840.681 840.563 880.764 950.362 1030.515 1100.465 1010.465 940.936 910.427 870.207 850.438 1010.577 910.536 1030.675 940.486 1050.723 1080.779 890.524 100
SD-DETR0.576 1000.746 670.609 1130.445 1190.517 980.643 1140.366 1020.714 790.456 1030.468 930.870 1160.432 810.264 630.558 780.674 770.586 980.688 890.482 1060.739 1060.733 1040.537 97
SQN_0.1%0.569 1010.676 920.696 910.657 920.497 990.779 900.424 840.548 1060.515 860.376 1070.902 1130.422 890.357 100.379 1080.456 1030.596 940.659 980.544 890.685 1110.665 1150.556 93
TextureNetpermissive0.566 1020.672 940.664 1020.671 880.494 1000.719 1040.445 750.678 870.411 1110.396 1050.935 920.356 1030.225 770.412 1050.535 950.565 1010.636 1060.464 1080.794 1000.680 1120.568 87
Jingwei Huang, Haotian Zhang, Li Yi, Thomas Funkhouser, Matthias Niessner, Leonidas Guibas: TextureNet: Consistent Local Parametrizations for Learning from High-Resolution Signals on Meshes. CVPR
DVVNet0.562 1030.648 970.700 880.770 610.586 810.687 1080.333 1070.650 920.514 870.475 910.906 1100.359 1020.223 790.340 1100.442 1050.422 1140.668 960.501 1010.708 1090.779 890.534 98
Pointnet++ & Featurepermissive0.557 1040.735 720.661 1030.686 820.491 1010.744 1010.392 960.539 1070.451 1040.375 1080.946 690.376 1000.205 870.403 1060.356 1100.553 1020.643 1030.497 1020.824 960.756 980.515 101
GMLPs0.538 1050.495 1150.693 930.647 970.471 1040.793 820.300 1100.477 1110.505 890.358 1090.903 1120.327 1070.081 1160.472 980.529 970.448 1120.710 780.509 980.746 1040.737 1030.554 94
PanopticFusion-label0.529 1060.491 1160.688 970.604 1050.386 1110.632 1150.225 1210.705 830.434 1080.293 1150.815 1190.348 1050.241 720.499 930.669 790.507 1050.649 1000.442 1140.796 990.602 1190.561 90
Gaku Narita, Takashi Seno, Tomoya Ishikawa, Yohsuke Kaji: PanopticFusion: Online Volumetric Semantic Mapping at the Level of Stuff and Things. IROS 2019 (to appear)
subcloud_weak0.516 1070.676 920.591 1160.609 1030.442 1070.774 910.335 1060.597 990.422 1100.357 1100.932 980.341 1060.094 1150.298 1120.528 980.473 1100.676 930.495 1030.602 1170.721 1070.349 119
Online SegFusion0.515 1080.607 1050.644 1070.579 1080.434 1080.630 1160.353 1040.628 970.440 1060.410 1030.762 1220.307 1090.167 1030.520 870.403 1080.516 1040.565 1100.447 1120.678 1120.701 1090.514 102
Davide Menini, Suryansh Kumar, Martin R. Oswald, Erik Sandstroem, Cristian Sminchisescu, Luc van Gool: A Real-Time Learning Framework for Joint 3D Reconstruction and Semantic Segmentation. Robotics and Automation Letters Submission
3DMV, FTSDF0.501 1090.558 1100.608 1140.424 1210.478 1030.690 1070.246 1170.586 1010.468 990.450 960.911 1080.394 960.160 1060.438 1010.212 1170.432 1130.541 1150.475 1070.742 1050.727 1050.477 109
PCNN0.498 1100.559 1090.644 1070.560 1100.420 1100.711 1060.229 1190.414 1120.436 1070.352 1110.941 840.324 1080.155 1070.238 1170.387 1090.493 1060.529 1160.509 980.813 980.751 1000.504 104
3DMV0.484 1110.484 1170.538 1190.643 990.424 1090.606 1190.310 1080.574 1030.433 1090.378 1060.796 1200.301 1100.214 830.537 850.208 1180.472 1110.507 1190.413 1170.693 1100.602 1190.539 96
Angela Dai, Matthias Niessner: 3DMV: Joint 3D-Multi-View Prediction for 3D Semantic Scene Segmentation. ECCV'18
PointCNN with RGBpermissive0.458 1120.577 1080.611 1120.356 1230.321 1190.715 1050.299 1120.376 1160.328 1190.319 1130.944 780.285 1120.164 1040.216 1200.229 1150.484 1080.545 1140.456 1100.755 1030.709 1080.475 110
Yangyan Li, Rui Bu, Mingchao Sun, Baoquan Chen: PointCNN. NeurIPS 2018
FCPNpermissive0.447 1130.679 910.604 1150.578 1090.380 1120.682 1090.291 1130.106 1230.483 960.258 1210.920 1050.258 1160.025 1200.231 1190.325 1110.480 1090.560 1120.463 1090.725 1070.666 1140.231 123
Dario Rethage, Johanna Wald, Jürgen Sturm, Nassir Navab, Federico Tombari: Fully-Convolutional Point Networks for Large-Scale Point Clouds. ECCV 2018
DGCNN_reproducecopyleft0.446 1140.474 1180.623 1100.463 1170.366 1140.651 1120.310 1080.389 1150.349 1170.330 1120.937 890.271 1140.126 1120.285 1130.224 1160.350 1190.577 1090.445 1130.625 1150.723 1060.394 115
Yue Wang, Yongbin Sun, Ziwei Liu, Sanjay E. Sarma, Michael M. Bronstein, Justin M. Solomon: Dynamic Graph CNN for Learning on Point Clouds. TOG 2019
PNET20.442 1150.548 1120.548 1180.597 1070.363 1150.628 1170.300 1100.292 1180.374 1140.307 1140.881 1150.268 1150.186 960.238 1170.204 1190.407 1150.506 1200.449 1110.667 1130.620 1180.462 113
SurfaceConvPF0.442 1150.505 1140.622 1110.380 1220.342 1170.654 1110.227 1200.397 1140.367 1150.276 1170.924 1020.240 1180.198 920.359 1090.262 1130.366 1160.581 1080.435 1150.640 1140.668 1130.398 114
Hao Pan, Shilin Liu, Yang Liu, Xin Tong: Convolutional Neural Networks on 3D Surfaces Using Parallel Frames.
Tangent Convolutionspermissive0.438 1170.437 1200.646 1060.474 1160.369 1130.645 1130.353 1040.258 1200.282 1220.279 1160.918 1070.298 1110.147 1110.283 1140.294 1120.487 1070.562 1110.427 1160.619 1160.633 1170.352 118
Maxim Tatarchenko, Jaesik Park, Vladlen Koltun, Qian-Yi Zhou: Tangent Convolutions for Dense Prediction in 3D. CVPR 2018
3DWSSS0.425 1180.525 1130.647 1050.522 1110.324 1180.488 1230.077 1240.712 810.353 1160.401 1040.636 1240.281 1130.176 990.340 1100.565 930.175 1230.551 1130.398 1180.370 1240.602 1190.361 117
SPLAT Netcopyleft0.393 1190.472 1190.511 1200.606 1040.311 1200.656 1100.245 1180.405 1130.328 1190.197 1220.927 1010.227 1200.000 1240.001 1250.249 1140.271 1220.510 1170.383 1200.593 1180.699 1100.267 121
Hang Su, Varun Jampani, Deqing Sun, Subhransu Maji, Evangelos Kalogerakis, Ming-Hsuan Yang, Jan Kautz: SPLATNet: Sparse Lattice Networks for Point Cloud Processing. CVPR 2018
ScanNet+FTSDF0.383 1200.297 1220.491 1210.432 1200.358 1160.612 1180.274 1150.116 1220.411 1110.265 1180.904 1110.229 1190.079 1170.250 1150.185 1200.320 1200.510 1170.385 1190.548 1190.597 1220.394 115
PointNet++permissive0.339 1210.584 1070.478 1220.458 1180.256 1220.360 1240.250 1160.247 1210.278 1230.261 1200.677 1230.183 1210.117 1130.212 1210.145 1220.364 1170.346 1240.232 1240.548 1190.523 1230.252 122
Charles R. Qi, Li Yi, Hao Su, Leonidas J. Guibas: PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space.
GrowSP++0.323 1220.114 1240.589 1170.499 1130.147 1240.555 1200.290 1140.336 1170.290 1210.262 1190.865 1180.102 1240.000 1240.037 1230.000 1250.000 1250.462 1210.381 1210.389 1230.664 1160.473 111
SSC-UNetpermissive0.308 1230.353 1210.290 1240.278 1240.166 1230.553 1210.169 1230.286 1190.147 1240.148 1240.908 1090.182 1220.064 1180.023 1240.018 1240.354 1180.363 1220.345 1220.546 1210.685 1110.278 120
ScanNetpermissive0.306 1240.203 1230.366 1230.501 1120.311 1200.524 1220.211 1220.002 1250.342 1180.189 1230.786 1210.145 1230.102 1140.245 1160.152 1210.318 1210.348 1230.300 1230.460 1220.437 1240.182 124
Angela Dai, Angel X. Chang, Manolis Savva, Maciej Halber, Thomas Funkhouser, Matthias Nießner: ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes. CVPR'17
ERROR0.054 1250.000 1250.041 1250.172 1250.030 1250.062 1250.001 1250.035 1240.004 1250.051 1250.143 1250.019 1250.003 1230.041 1220.050 1230.003 1240.054 1250.018 1250.005 1250.264 1250.082 125


This table lists the benchmark results for the 3D semantic instance scenario.




Method | Info | avg ap | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | window
Volt-SPFormerScanNetpermissive0.640 10.920 120.665 120.634 10.582 10.794 50.242 190.559 210.496 10.535 20.646 10.737 20.709 60.731 290.509 40.796 20.566 160.944 100.457 2
Kadir Yilmaz, Adrian Kruse, Tristan Höfer, Daan de Geus, Bastian Leibe: Volume Transformer: Revisiting Vanilla Transformers for 3D Scene Understanding.
PointComp0.629 20.787 260.679 100.574 60.502 40.824 10.378 10.480 400.483 40.480 160.601 20.744 10.682 90.809 80.460 210.819 10.643 20.935 140.449 4
PointRel0.622 30.926 70.710 30.541 120.502 30.772 100.314 50.598 110.425 110.504 120.565 40.650 90.716 20.809 70.476 130.747 70.618 30.963 30.364 22
: Relation3D: Enhancing Relation Modeling for Point Cloud Instance Segmentation. CVPR 2025
Competitor-MAFT0.618 40.866 160.724 10.628 20.484 60.803 30.300 90.509 330.496 20.539 10.547 80.703 30.668 100.708 350.463 180.708 200.595 40.959 50.418 9
SIM3D0.617 50.952 30.629 200.539 130.426 180.768 140.302 80.681 20.425 120.473 180.511 180.701 40.717 10.821 60.467 160.774 30.559 170.914 210.448 5
Spherical Mask(CtoF)0.616 60.946 40.654 150.555 80.434 150.769 130.271 140.604 80.447 60.505 100.549 50.698 50.716 20.775 170.480 100.747 80.575 120.925 160.436 7
EV3D0.615 70.946 40.652 160.555 80.433 160.773 90.271 150.604 80.447 60.506 90.544 90.698 50.716 20.775 170.480 100.747 80.572 140.925 160.435 8
DCD0.614 80.892 130.633 190.434 300.495 50.810 20.292 100.501 340.408 130.525 60.582 30.688 70.625 120.801 90.608 10.672 230.649 10.965 20.476 1
ExtMask3D0.598 90.852 170.692 80.433 330.461 90.791 60.264 160.488 370.493 30.508 80.528 170.594 150.706 70.791 110.483 80.734 120.595 50.911 230.437 6
MAFT0.596 100.889 140.721 20.448 250.460 100.768 150.251 180.558 220.408 140.504 110.539 110.616 120.618 140.858 30.482 90.684 220.551 200.931 150.450 3
MG-Former0.587 110.852 170.639 180.454 240.393 240.758 180.338 30.572 160.480 50.527 40.491 250.671 80.527 240.867 10.485 70.601 340.590 80.938 130.390 14
InsSSM0.586 121.000 10.593 240.440 280.480 70.771 110.345 20.437 430.444 90.495 150.548 70.579 190.621 130.720 310.409 250.712 150.593 60.960 40.395 12
Lei Yao, Yi Wang, Moyun Liu, Lap-Pui Chau: SGIFormer: Semantic-guided and Geometric-enhanced Interleaving Transformer for 3D Instance Segmentation. TCSVT, 2024
Queryformer0.583 130.926 70.702 50.393 390.504 20.733 240.276 130.527 280.373 200.479 170.534 130.533 260.697 80.720 320.436 230.745 100.592 70.958 60.363 23
KmaxOneFormerNetpermissive0.581 140.745 310.692 90.551 100.458 110.798 40.264 170.531 270.369 220.513 70.531 160.632 100.494 270.798 100.567 30.648 270.558 190.950 80.362 25
Competitor-SPFormer0.580 150.721 380.705 40.593 50.444 140.786 80.286 110.564 190.376 190.498 140.534 140.546 240.390 480.785 130.577 20.708 190.579 100.954 70.388 15
VDG-Uni3DSeg0.576 160.833 210.699 60.483 180.412 220.767 160.313 60.461 420.446 80.526 50.498 230.584 160.551 200.743 260.464 170.766 40.538 240.919 190.363 24
PBNetpermissive0.573 170.926 70.575 300.619 30.472 80.736 220.239 210.487 380.383 180.459 220.506 210.533 250.585 160.767 190.404 270.717 140.559 180.969 10.381 18
Weiguang Zhao, Yuyao Yan, Chaolong Yang, Jianan Ye, Xi Yang, Kaizhu Huang: Divide and Conquer: 3D Instance Segmentation With Point-Wise Binarization. ICCV 2023
TST3D0.569 180.778 280.675 110.598 40.451 130.727 250.280 120.476 410.395 150.472 190.457 310.583 170.580 180.777 140.462 200.735 110.547 220.919 200.333 31
Duc Tran Dang Trung, Byeongkeun Kang, Yeejin Lee: MSTA3D: Multi-scale Twin-attention for 3D Instance Segmentation. ACM Multimedia 2024
Mask3D0.566 190.926 70.597 230.408 360.420 200.737 210.239 200.598 110.386 170.458 230.549 50.568 220.716 20.601 480.480 100.646 280.575 120.922 180.364 21
Jonas Schult, Francis Engelmann, Alexander Hermans, Or Litany, Siyu Tang, Bastian Leibe: Mask3D for 3D Semantic Instance Segmentation. ICRA 2023
OneFormer3Dcopyleft0.566 190.781 270.697 70.562 70.431 170.770 120.331 40.400 490.373 210.529 30.504 220.568 210.475 310.732 280.470 140.762 50.550 210.871 380.379 19
Maxim Kolodiazhnyi, Anna Vorontsova, Anton Konushin, Danila Rukhovich: OneFormer3D: One Transformer for Unified Point Cloud Segmentation.
UniPerception0.560 210.815 220.659 130.388 410.453 120.786 70.212 220.526 290.441 100.471 200.539 100.607 130.442 360.671 420.406 260.731 130.577 110.944 110.411 10
ISBNetpermissive0.559 220.939 60.655 140.383 430.426 190.763 170.180 240.534 260.386 160.499 130.509 200.621 110.427 420.704 370.467 150.649 260.571 150.948 90.401 11
Tuan Duc Ngo, Binh-Son Hua, Khoi Nguyen: ISBNet: a 3D Point Cloud Instance Segmentation Network with Instance-aware Sampling and Box-aware Dynamic Convolution. CVPR 2023
GraphCut0.552 231.000 10.611 220.438 290.392 250.714 260.139 280.598 130.327 260.389 260.510 190.598 140.427 430.754 220.463 190.761 60.588 90.903 260.329 33
SPFormerpermissive0.549 240.745 310.640 170.484 170.395 230.739 200.311 70.566 180.335 240.468 210.492 240.555 230.478 300.747 240.436 220.712 160.540 230.893 300.343 30
Sun Jiahao, Qing Chunmei, Tan Junpeng, Xu Xiangmin: Superpoint Transformer for 3D Scene Instance Segmentation. AAAI 2023 [Oral]
DKNet0.532 250.815 220.624 210.517 140.377 270.749 190.107 300.509 320.304 280.437 240.475 260.581 180.539 220.775 160.339 330.640 300.506 270.901 270.385 17
Yizheng Wu, Min Shi, Shuaiyuan Du, Hao Lu, Zhiguo Cao, Weicai Zhong: 3D Instances as 1D Kernels. ECCV 2022
IPCA-Inst0.520 260.889 140.551 340.548 110.418 210.665 360.064 390.585 140.260 360.277 410.471 280.500 270.644 110.785 120.369 290.591 380.511 250.878 350.362 26
SoftGroup++0.513 270.704 400.578 290.398 380.363 330.704 270.061 400.647 50.297 330.378 290.537 120.343 310.614 150.828 50.295 380.710 180.505 290.875 370.394 13
SSTNetpermissive0.506 280.738 350.549 350.497 160.316 390.693 300.178 250.377 530.198 420.330 320.463 300.576 200.515 250.857 40.494 50.637 310.457 330.943 120.290 42
Zhihao Liang, Zhihao Li, Songcen Xu, Mingkui Tan, Kui Jia: Instance Segmentation in 3D Scenes using Semantic Superpoint Tree Networks. ICCV2021
DANCENET0.504 290.926 70.579 260.472 200.367 300.626 460.165 260.432 440.221 380.408 250.449 330.411 290.564 190.746 250.421 240.707 210.438 360.846 460.288 43
SoftGrouppermissive0.504 290.667 470.579 270.372 450.381 260.694 290.072 360.677 30.303 290.387 270.531 150.319 350.582 170.754 210.318 340.643 290.492 300.907 250.388 16
Thang Vu, Kookhoi Kim, Tung M. Luu, Xuan Thanh Nguyen, Chang D. Yoo: SoftGroup for 3D Instance Segmentation on Point Clouds. CVPR 2022 [Oral]
TD3Dpermissive0.489 310.852 170.511 440.434 310.322 380.735 230.101 330.512 310.355 230.349 310.468 290.283 390.514 260.676 410.268 430.671 240.510 260.908 240.329 34
Maksim Kolodiazhnyi, Anna Vorontsova, Anton Konushin, Danila Rukhovich: Top-Down Beats Bottom-Up in 3D Instance Segmentation. WACV 2024
OccuSeg+instance0.486 320.802 250.536 370.428 340.369 290.702 280.205 230.331 580.301 300.379 280.474 270.327 320.437 370.862 20.485 60.601 350.394 440.846 480.273 46
Lei Han, Tian Zheng, Lan Xu, Lu Fang: OccuSeg: Occupancy-aware 3D Instance Segmentation. CVPR2020
TopoSeg0.479 330.704 400.564 310.467 220.366 310.633 440.068 370.554 230.262 350.328 330.447 340.323 330.534 230.722 300.288 400.614 320.482 310.912 220.358 28
DualGroup0.469 340.815 220.552 330.398 370.374 280.683 320.130 290.539 250.310 270.327 340.407 370.276 400.447 350.535 520.342 320.659 250.455 340.900 290.301 38
SSEC0.465 350.667 470.578 280.502 150.362 340.641 430.035 490.605 70.291 340.323 350.451 320.296 370.417 460.677 400.245 470.501 560.506 280.900 280.366 20
ODIN - Inspermissive0.463 360.738 350.589 250.344 490.358 350.560 550.139 270.393 520.331 250.373 300.392 400.496 280.493 280.709 340.377 280.599 360.359 500.752 580.332 32
Ayush Jain, Pushkal Katara, Nikolaos Gkanatsios, Adam W. Harley, Gabriel Sarch, Kriti Aggarwal, Vishrav Chaudhary, Katerina Fragkiadaki: ODIN: A Single Model for 2D and 3D Segmentation. CVPR 2024
HAISpermissive0.457 370.704 400.561 320.457 230.364 320.673 330.046 480.547 240.194 430.308 360.426 350.288 380.454 340.711 330.262 440.563 460.434 380.889 320.344 29
Shaoyu Chen, Jiemin Fang, Qian Zhang, Wenyu Liu, Xinggang Wang: Hierarchical Aggregation for 3D Instance Segmentation. ICCV 2021
DD-UNet+Group0.436 380.630 550.508 470.480 190.310 410.624 480.065 380.638 60.174 440.256 450.384 420.194 520.428 400.759 200.289 390.574 430.400 420.849 450.291 41
H. Liu, R. Liu, K. Yang, J. Zhang, K. Peng, R. Stiefelhagen: HIDA: Towards Holistic Indoor Understanding for the Visually Impaired via Semantic Instance Segmentation with a Wearable Solid-State LiDAR Sensor. ICCVW 2021
INS-Conv-instance0.435 390.716 390.495 490.355 470.331 360.689 310.102 320.394 510.208 410.280 390.395 390.250 430.544 210.741 270.309 360.536 520.391 450.842 510.258 50
Mask-Group0.434 400.778 280.516 420.471 210.330 370.658 370.029 510.526 300.249 370.256 440.400 380.309 360.384 510.296 680.368 300.575 420.425 390.877 360.362 27
Min Zhong, Xinghao Chen, Xiaokang Chen, Gang Zeng, Yunhe Wang: MaskGroup: Hierarchical Point Grouping and Masking for 3D Instance Segmentation. ICME 2022
Box2Mask0.433 410.741 330.463 540.433 320.283 440.625 470.103 310.298 630.125 530.260 430.424 360.322 340.472 320.701 380.363 310.711 170.309 620.882 330.272 48
Julian Chibane, Francis Engelmann, Tuan Anh Tran, Gerard Pons-Moll: Box2Mask: Weakly Supervised 3D Semantic Instance Segmentation Using Bounding Boxes. ECCV 2022
RPGN0.428 420.630 550.508 460.367 460.249 510.658 380.016 590.673 40.131 510.234 480.383 430.270 410.434 380.748 230.274 420.609 330.406 410.842 500.267 49
Shichao Dong, Guosheng Lin, Tzu-Yi Hung: Learning Regional Purity for Instance Segmentation on 3D Point Clouds. ECCV 2022
DENet0.413 430.741 330.520 390.237 590.284 430.523 580.097 340.691 10.138 480.209 580.229 600.238 460.390 490.707 360.310 350.448 630.470 320.892 310.310 36
PointGroup0.407 440.639 540.496 480.415 350.243 530.645 420.021 560.570 170.114 540.211 560.359 450.217 500.428 410.660 430.256 450.562 470.341 540.860 410.291 40
Li Jiang, Hengshuang Zhao, Shaoshuai Shi, Shu Liu, Chi-Wing Fu, Jiaya Jia: PointGroup: Dual-Set Point Grouping for 3D Instance Segmentation. CVPR 2020 [oral]
CSC-Pretrained0.405 450.738 350.465 530.331 520.205 570.655 390.051 440.601 100.092 580.211 570.329 480.198 510.459 330.775 150.195 540.524 540.400 430.878 340.184 59
PE0.396 460.667 470.467 520.446 270.243 520.624 490.022 550.577 150.106 550.219 510.340 460.239 450.487 290.475 590.225 490.541 510.350 520.818 530.273 47
Biao Zhang, Peter Wonka: Point Cloud Instance Segmentation using Probabilistic Embeddings. CVPR 2021
Dyco3Dcopyleft0.395 470.642 530.518 410.447 260.259 500.666 350.050 450.251 680.166 450.231 490.362 440.232 470.331 540.535 510.229 480.587 390.438 370.850 430.317 35
Tong He; Chunhua Shen; Anton van den Hengel: DyCo3D: Robust Instance Segmentation of 3D Point Clouds through Dynamic Convolution. CVPR2021
OSIS0.392 480.778 280.530 380.220 610.278 450.567 540.083 350.330 590.299 310.270 420.310 510.143 580.260 580.624 460.277 410.568 450.361 490.865 400.301 37
AOIA0.387 490.704 400.515 430.385 420.225 560.669 340.005 660.482 390.126 520.181 610.269 570.221 490.426 440.478 580.218 500.592 370.371 470.851 420.242 52
SSEN0.384 500.852 170.494 500.192 620.226 550.648 410.022 540.398 500.299 320.277 400.317 500.231 480.194 650.514 550.196 520.586 400.444 350.843 490.184 58
Dongsu Zhang, Junha Chun, Sang Kyun Cha, Young Min Kim: Spatial Semantic Embedding Network: Fast 3D Instance Segmentation with Deep Metric Learning. arXiv
Mask3D_evaluation0.382 510.593 570.520 400.390 400.314 400.600 500.018 580.287 660.151 470.281 380.387 410.169 560.429 390.654 440.172 580.578 410.384 460.670 650.278 45
PCJC0.375 520.704 400.542 360.284 560.197 590.649 400.006 630.426 450.138 490.242 460.304 520.183 550.388 500.629 450.141 650.546 500.344 530.738 600.283 44
ClickSeg_Instance0.366 530.654 510.375 580.184 630.302 420.592 520.050 460.300 620.093 570.283 370.277 540.249 440.426 450.615 470.299 370.504 550.367 480.832 520.191 57
SphereSeg0.357 540.651 520.411 560.345 480.264 490.630 450.059 410.289 650.212 390.240 470.336 470.158 570.305 550.557 490.159 610.455 620.341 550.726 620.294 39
3D-MPA0.355 550.457 670.484 510.299 540.277 460.591 530.047 470.332 560.212 400.217 520.278 530.193 530.413 470.410 620.195 530.574 440.352 510.849 440.213 55
Francis Engelmann, Martin Bokeloh, Alireza Fathi, Bastian Leibe, Matthias Nießner: 3D-MPA: Multi Proposal Aggregation for 3D Semantic Instance Segmentation. CVPR 2020
NeuralBF0.353 560.593 570.511 450.375 440.264 480.597 510.008 610.332 570.160 460.229 500.274 560.000 790.206 620.678 390.155 620.485 580.422 400.816 540.254 51
Weiwei Sun, Daniel Rebain, Renjie Liao, Vladimir Tankovich, Soroosh Yazdani, Kwang Moo Yi, Andrea Tagliasacchi: NeuralBF: Neural Bilateral Filtering for Top-down Instance Segmentation on Point Clouds. WACV 2023
RWSeg0.348 570.475 640.456 550.320 530.275 470.476 600.020 570.491 360.056 650.212 550.320 490.261 420.302 560.520 530.182 560.557 480.285 640.867 390.197 56
GICN0.341 580.580 590.371 590.344 500.198 580.469 610.052 430.564 200.093 560.212 540.212 620.127 600.347 530.537 500.206 510.525 530.329 570.729 610.241 53
One_Thing_One_Clickpermissive0.326 590.472 650.361 600.232 600.183 600.555 560.000 720.498 350.038 670.195 590.226 610.362 300.168 660.469 600.251 460.553 490.335 560.846 470.117 67
Zhengzhe Liu, Xiaojuan Qi, Chi-Wing Fu: One Thing One Click: A Self-Training Approach for Weakly Supervised 3D Semantic Segmentation. CVPR 2021
Occipital-SCS0.320 600.679 460.352 610.334 510.229 540.436 620.025 520.412 480.058 630.161 660.240 590.085 620.262 570.496 570.187 550.467 600.328 580.775 550.231 54
Sparse R-CNN0.292 610.704 400.213 710.153 650.154 620.551 570.053 420.212 690.132 500.174 630.274 550.070 640.363 520.441 610.176 570.424 650.234 660.758 570.161 63
MTML0.282 620.577 600.380 570.182 640.107 680.430 630.001 690.422 460.057 640.179 620.162 650.070 650.229 600.511 560.161 590.491 570.313 590.650 680.162 61
Jean Lahoud, Bernard Ghanem, Marc Pollefeys, Martin R. Oswald: 3D Instance Segmentation via Multi-task Metric Learning. ICCV 2019 [oral]
SALoss-ResNet0.262 630.667 470.335 620.067 720.123 660.427 640.022 530.280 670.058 620.216 530.211 630.039 680.142 680.519 540.106 690.338 690.310 610.721 630.138 64
Zhidong Liang, Ming Yang, Hao Li, Chunxiang Wang: 3D Instance Embedding Learning With a Structure-Aware Loss Function for Point Cloud Segmentation. IEEE Robotics and Automation Letters (IROS2020)
MASCpermissive0.254 640.463 660.249 700.113 660.167 610.412 660.000 710.374 540.073 590.173 640.243 580.130 590.228 610.368 640.160 600.356 670.208 670.711 640.136 65
Chen Liu, Yasutaka Furukawa: MASC: Multi-scale Affinity with Sparse Convolution for 3D Instance Segmentation.
3D-BoNet0.253 650.519 620.324 650.251 580.137 650.345 710.031 500.419 470.069 600.162 650.131 670.052 660.202 640.338 660.147 640.301 720.303 630.651 670.178 60
Bo Yang, Jianan Wang, Ronald Clark, Qingyong Hu, Sen Wang, Andrew Markham, Niki Trigoni: Learning Object Bounding Boxes for 3D Instance Segmentation on Point Clouds. NeurIPS 2019 Spotlight
SPG_WSIS0.251 660.380 690.274 680.289 550.144 630.413 650.000 720.311 600.065 610.113 680.130 680.029 710.204 630.388 630.108 680.459 610.311 600.769 560.127 66
SegGroup_inspermissive0.246 670.556 610.335 630.062 740.115 670.490 590.000 720.297 640.018 710.186 600.142 660.083 630.233 590.216 700.153 630.469 590.251 650.744 590.083 70
An Tao, Yueqi Duan, Yi Wei, Jiwen Lu, Jie Zhou: SegGroup: Seg-Level Supervision for 3D Instance and Semantic Segmentation. TIP 2022
PanopticFusion-inst0.214 680.250 740.330 640.275 570.103 690.228 770.000 720.345 550.024 690.088 700.203 640.186 540.167 670.367 650.125 660.221 750.112 770.666 660.162 62
Gaku Narita, Takashi Seno, Tomoya Ishikawa, Yohsuke Kaji: PanopticFusion: Online Volumetric Semantic Mapping at the Level of Stuff and Things. IROS 2019 (to appear)
UNet-backbone0.161 690.519 620.259 690.084 680.059 710.325 730.002 670.093 740.009 730.077 720.064 710.045 670.044 750.161 720.045 710.331 700.180 690.566 690.033 79
3D-SISpermissive0.161 690.407 680.155 760.068 710.043 750.346 700.001 680.134 710.005 740.088 690.106 700.037 690.135 700.321 670.028 750.339 680.116 760.466 720.093 69
Ji Hou, Angela Dai, Matthias Niessner: 3D-SIS: 3D Semantic Instance Segmentation of RGB-D Scans. CVPR 2019
R-PointNet0.158 710.356 700.173 740.113 670.140 640.359 670.012 600.023 770.039 660.134 670.123 690.008 750.089 710.149 730.117 670.221 740.128 740.563 700.094 68
Region-18class0.146 720.175 780.321 660.080 690.062 700.357 680.000 720.307 610.002 760.066 730.044 730.000 790.018 770.036 780.054 700.447 640.133 720.472 710.060 74
SemRegionNet-20cls0.121 730.296 720.203 720.071 700.058 720.349 690.000 720.150 700.019 700.054 750.034 760.017 740.052 730.042 770.013 780.209 760.183 680.371 730.057 75
3D-BEVIS0.117 740.250 740.308 670.020 780.009 800.269 760.006 640.008 780.029 680.037 780.014 790.003 770.036 760.147 740.042 730.381 660.118 750.362 740.069 73
Cathrin Elich, Francis Engelmann, Jonas Schult, Theodora Kontogianni, Bastian Leibe: 3D-BEVIS: Birds-Eye-View Instance Segmentation.
Hier3Dcopyleft0.117 740.222 760.161 750.054 760.027 770.289 740.000 720.124 720.001 780.079 710.061 720.027 720.141 690.240 690.005 790.310 710.129 730.153 790.081 71
Tan: HCFS3D: Hierarchical Coupled Feature Selection Network for 3D Semantic and Instance Segmentation.
tmp0.113 760.333 710.151 770.056 750.053 730.344 720.000 720.105 730.016 720.049 760.035 750.020 730.053 720.048 760.013 770.183 780.173 700.344 760.054 76
Sem_Recon_ins0.098 770.295 730.187 730.015 790.036 760.213 780.005 650.038 760.003 750.056 740.037 740.036 700.015 780.051 750.044 720.209 770.098 780.354 750.071 72
ASIS0.085 780.037 790.080 790.066 730.047 740.282 750.000 720.052 750.002 770.047 770.026 770.001 780.046 740.194 710.031 740.264 730.140 710.167 780.047 78
Sgpn_scannet0.049 790.023 800.134 780.031 770.013 790.144 790.006 620.008 790.000 790.028 790.017 780.003 760.009 800.000 790.021 760.122 790.095 790.175 770.054 77
MaskRCNN 2d->3d Proj0.022 800.185 770.000 800.000 800.015 780.000 800.000 700.006 800.000 790.010 800.006 800.107 610.012 790.000 790.002 800.027 800.004 800.022 800.001 80


This table lists the benchmark results for the 2D semantic label scenario.


Each cell shows the score followed by the method's rank for that column in parentheses.

Method | Info | avg iou | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | floor | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | wall | window
Virtual MVFusion (R) |  | 0.745 (1) | 0.861 (1) | 0.839 (1) | 0.881 (1) | 0.672 (2) | 0.512 (1) | 0.422 (19) | 0.898 (1) | 0.723 (1) | 0.714 (1) | 0.954 (2) | 0.454 (1) | 0.509 (1) | 0.773 (1) | 0.895 (1) | 0.756 (1) | 0.820 (1) | 0.653 (1) | 0.935 (1) | 0.891 (1) | 0.728 (1)
Abhijit Kundu, Xiaoqi Yin, Alireza Fathi, David Ross, Brian Brewington, Thomas Funkhouser, Caroline Pantofaru: Virtual Multi-view Fusion for 3D Semantic Segmentation. ECCV 2020
BPNet_2D | copyleft | 0.670 (2) | 0.822 (3) | 0.795 (3) | 0.836 (2) | 0.659 (3) | 0.481 (2) | 0.451 (15) | 0.769 (5) | 0.656 (3) | 0.567 (4) | 0.931 (3) | 0.395 (6) | 0.390 (6) | 0.700 (4) | 0.534 (4) | 0.689 (11) | 0.770 (2) | 0.574 (3) | 0.865 (11) | 0.831 (3) | 0.675 (6)
Wenbo Hu, Hengshuang Zhao, Li Jiang, Jiaya Jia and Tien-Tsin Wong: Bidirectional Projection Network for Cross Dimension Scene Understanding. CVPR 2021 (Oral)
MVF-GNN(2D) |  | 0.636 (3) | 0.606 (16) | 0.794 (4) | 0.434 (17) | 0.688 (1) | 0.337 (8) | 0.464 (14) | 0.798 (4) | 0.632 (5) | 0.589 (3) | 0.908 (9) | 0.420 (2) | 0.329 (14) | 0.743 (2) | 0.594 (2) | 0.738 (2) | 0.676 (5) | 0.527 (4) | 0.906 (2) | 0.818 (6) | 0.715 (3)
CU-Hybrid-2D Net |  | 0.636 (3) | 0.825 (2) | 0.820 (2) | 0.179 (25) | 0.648 (4) | 0.463 (3) | 0.549 (2) | 0.742 (9) | 0.676 (2) | 0.628 (2) | 0.961 (1) | 0.420 (2) | 0.379 (7) | 0.684 (8) | 0.381 (20) | 0.732 (3) | 0.723 (3) | 0.599 (2) | 0.827 (18) | 0.851 (2) | 0.634 (9)
DVEFormer |  | 0.626 (5) | 0.616 (12) | 0.764 (6) | 0.690 (5) | 0.583 (11) | 0.322 (14) | 0.540 (3) | 0.809 (3) | 0.593 (7) | 0.502 (12) | 0.900 (14) | 0.374 (9) | 0.433 (3) | 0.660 (9) | 0.528 (5) | 0.665 (19) | 0.663 (6) | 0.491 (9) | 0.871 (10) | 0.810 (9) | 0.705 (4)
Fischedick, S., Seichter, D., Stephan, B., Schmidt, R., Gross, H.-M.: DVEFormer: Efficient Prediction of Dense Visual Embeddings via Distillation and RGB-D Transformers. IROS 2025
CMX |  | 0.613 (6) | 0.681 (9) | 0.725 (13) | 0.502 (13) | 0.634 (6) | 0.297 (19) | 0.478 (12) | 0.830 (2) | 0.651 (4) | 0.537 (7) | 0.924 (4) | 0.375 (7) | 0.315 (16) | 0.686 (7) | 0.451 (15) | 0.714 (5) | 0.543 (23) | 0.504 (6) | 0.894 (7) | 0.823 (5) | 0.688 (5)
DMMF_3d |  | 0.605 (7) | 0.651 (10) | 0.744 (11) | 0.782 (3) | 0.637 (5) | 0.387 (4) | 0.536 (5) | 0.732 (10) | 0.590 (8) | 0.540 (6) | 0.856 (23) | 0.359 (12) | 0.306 (17) | 0.596 (16) | 0.539 (3) | 0.627 (22) | 0.706 (4) | 0.497 (8) | 0.785 (23) | 0.757 (21) | 0.476 (24)
EMSANet |  | 0.600 (8) | 0.716 (4) | 0.746 (10) | 0.395 (20) | 0.614 (9) | 0.382 (5) | 0.523 (6) | 0.713 (13) | 0.571 (12) | 0.503 (10) | 0.922 (7) | 0.404 (5) | 0.397 (5) | 0.655 (10) | 0.400 (17) | 0.626 (23) | 0.663 (6) | 0.469 (14) | 0.900 (4) | 0.827 (4) | 0.577 (16)
Daniel Seichter, Söhnke Fischedick, Mona Köhler, Horst-Michael Gross: EMSANet: Efficient Multi-Task RGB-D Scene Analysis for Indoor Environments. IJCNN 2022
MCA-Net |  | 0.595 (9) | 0.533 (22) | 0.756 (9) | 0.746 (4) | 0.590 (10) | 0.334 (10) | 0.506 (9) | 0.670 (17) | 0.587 (9) | 0.500 (13) | 0.905 (11) | 0.366 (11) | 0.352 (10) | 0.601 (15) | 0.506 (9) | 0.669 (17) | 0.648 (10) | 0.501 (7) | 0.839 (17) | 0.769 (17) | 0.516 (23)
RFBNet |  | 0.592 (10) | 0.616 (12) | 0.758 (8) | 0.659 (6) | 0.581 (12) | 0.330 (11) | 0.469 (13) | 0.655 (20) | 0.543 (15) | 0.524 (8) | 0.924 (4) | 0.355 (14) | 0.336 (12) | 0.572 (19) | 0.479 (11) | 0.671 (15) | 0.648 (10) | 0.480 (11) | 0.814 (21) | 0.814 (7) | 0.614 (12)
FAN_NV_RVC |  | 0.586 (11) | 0.510 (23) | 0.764 (6) | 0.079 (28) | 0.620 (8) | 0.330 (11) | 0.494 (10) | 0.753 (7) | 0.573 (10) | 0.556 (5) | 0.884 (18) | 0.405 (4) | 0.303 (18) | 0.718 (3) | 0.452 (14) | 0.672 (14) | 0.658 (8) | 0.509 (5) | 0.898 (5) | 0.813 (8) | 0.727 (2)
WSGFormer |  | 0.585 (12) | 0.706 (5) | 0.708 (18) | 0.434 (17) | 0.574 (14) | 0.283 (22) | 0.538 (4) | 0.759 (6) | 0.542 (17) | 0.482 (17) | 0.924 (4) | 0.351 (16) | 0.333 (13) | 0.614 (12) | 0.393 (18) | 0.692 (10) | 0.551 (22) | 0.461 (15) | 0.874 (9) | 0.809 (10) | 0.673 (7)
DCRedNet |  | 0.583 (13) | 0.682 (8) | 0.723 (14) | 0.542 (12) | 0.510 (22) | 0.310 (16) | 0.451 (15) | 0.668 (18) | 0.549 (14) | 0.520 (9) | 0.920 (8) | 0.375 (7) | 0.446 (2) | 0.528 (22) | 0.417 (16) | 0.670 (16) | 0.577 (19) | 0.478 (12) | 0.862 (12) | 0.806 (11) | 0.628 (11)
MIX6D_RVC |  | 0.582 (14) | 0.695 (6) | 0.687 (19) | 0.225 (23) | 0.632 (7) | 0.328 (13) | 0.550 (1) | 0.748 (8) | 0.623 (6) | 0.494 (16) | 0.890 (16) | 0.350 (17) | 0.254 (25) | 0.688 (6) | 0.454 (13) | 0.716 (4) | 0.597 (18) | 0.489 (10) | 0.881 (8) | 0.768 (18) | 0.575 (17)
SSMA | copyleft | 0.577 (15) | 0.695 (6) | 0.716 (16) | 0.439 (15) | 0.563 (16) | 0.314 (15) | 0.444 (17) | 0.719 (11) | 0.551 (13) | 0.503 (10) | 0.887 (17) | 0.346 (18) | 0.348 (11) | 0.603 (14) | 0.353 (22) | 0.709 (6) | 0.600 (16) | 0.457 (16) | 0.901 (3) | 0.786 (13) | 0.599 (15)
Abhinav Valada, Rohit Mohan, Wolfram Burgard: Self-Supervised Model Adaptation for Multimodal Semantic Segmentation. International Journal of Computer Vision, 2019
DMMF |  | 0.567 (16) | 0.623 (11) | 0.767 (5) | 0.238 (22) | 0.571 (15) | 0.347 (6) | 0.413 (21) | 0.719 (11) | 0.472 (22) | 0.418 (24) | 0.895 (15) | 0.357 (13) | 0.260 (24) | 0.696 (5) | 0.523 (8) | 0.666 (18) | 0.642 (12) | 0.437 (20) | 0.895 (6) | 0.793 (12) | 0.603 (14)
UNIV_CNP_RVC_UE |  | 0.566 (17) | 0.569 (21) | 0.686 (21) | 0.435 (16) | 0.524 (19) | 0.294 (20) | 0.421 (20) | 0.712 (14) | 0.543 (15) | 0.463 (19) | 0.872 (19) | 0.320 (19) | 0.363 (9) | 0.611 (13) | 0.477 (12) | 0.686 (12) | 0.627 (13) | 0.443 (19) | 0.862 (12) | 0.775 (16) | 0.639 (8)
EMSAFormer |  | 0.564 (18) | 0.581 (18) | 0.736 (12) | 0.564 (11) | 0.546 (18) | 0.219 (25) | 0.517 (7) | 0.675 (16) | 0.486 (21) | 0.427 (23) | 0.904 (12) | 0.352 (15) | 0.320 (15) | 0.589 (17) | 0.528 (5) | 0.708 (7) | 0.464 (26) | 0.413 (24) | 0.847 (16) | 0.786 (13) | 0.611 (13)
Söhnke Benedikt Fischedick, Daniel Seichter, Robin Schmidt, Leonard Rabes, and Horst-Michael Gross: Efficient Multi-Task Scene Analysis with RGB-D Transformers. IJCNN 2023
SN_RN152pyrx8_RVC | copyleft | 0.546 (19) | 0.572 (19) | 0.663 (23) | 0.638 (8) | 0.518 (20) | 0.298 (18) | 0.366 (26) | 0.633 (23) | 0.510 (19) | 0.446 (21) | 0.864 (21) | 0.296 (22) | 0.267 (21) | 0.542 (21) | 0.346 (23) | 0.704 (8) | 0.575 (20) | 0.431 (21) | 0.853 (15) | 0.766 (19) | 0.630 (10)
UDSSEG_RVC |  | 0.545 (20) | 0.610 (15) | 0.661 (24) | 0.588 (9) | 0.556 (17) | 0.268 (23) | 0.482 (11) | 0.642 (22) | 0.572 (11) | 0.475 (18) | 0.836 (25) | 0.312 (20) | 0.367 (8) | 0.630 (11) | 0.189 (25) | 0.639 (21) | 0.495 (25) | 0.452 (17) | 0.826 (19) | 0.756 (22) | 0.541 (19)
segfomer with 6d |  | 0.542 (21) | 0.594 (17) | 0.687 (19) | 0.146 (26) | 0.579 (13) | 0.308 (17) | 0.515 (8) | 0.703 (15) | 0.472 (22) | 0.498 (14) | 0.868 (20) | 0.369 (10) | 0.282 (19) | 0.589 (17) | 0.390 (19) | 0.701 (9) | 0.556 (21) | 0.416 (23) | 0.860 (14) | 0.759 (20) | 0.539 (21)
FuseNet | permissive | 0.535 (22) | 0.570 (20) | 0.681 (22) | 0.182 (24) | 0.512 (21) | 0.290 (21) | 0.431 (18) | 0.659 (19) | 0.504 (20) | 0.495 (15) | 0.903 (13) | 0.308 (21) | 0.428 (4) | 0.523 (23) | 0.365 (21) | 0.676 (13) | 0.621 (15) | 0.470 (13) | 0.762 (24) | 0.779 (15) | 0.541 (19)
Caner Hazirbas, Lingni Ma, Csaba Domokos, Daniel Cremers: FuseNet: Incorporating Depth into Semantic Segmentation via Fusion-based CNN Architecture. ACCV 2016
AdapNet++ | copyleft | 0.503 (23) | 0.613 (14) | 0.722 (15) | 0.418 (19) | 0.358 (28) | 0.337 (8) | 0.370 (25) | 0.479 (26) | 0.443 (24) | 0.368 (26) | 0.907 (10) | 0.207 (25) | 0.213 (27) | 0.464 (26) | 0.525 (7) | 0.618 (24) | 0.657 (9) | 0.450 (18) | 0.788 (22) | 0.721 (25) | 0.408 (27)
Abhinav Valada, Rohit Mohan, Wolfram Burgard: Self-Supervised Model Adaptation for Multimodal Semantic Segmentation. International Journal of Computer Vision, 2019
3DMV (2d proj) |  | 0.498 (24) | 0.481 (26) | 0.612 (25) | 0.579 (10) | 0.456 (24) | 0.343 (7) | 0.384 (23) | 0.623 (24) | 0.525 (18) | 0.381 (25) | 0.845 (24) | 0.254 (24) | 0.264 (23) | 0.557 (20) | 0.182 (26) | 0.581 (26) | 0.598 (17) | 0.429 (22) | 0.760 (25) | 0.661 (27) | 0.446 (26)
Angela Dai, Matthias Niessner: 3DMV: Joint 3D-Multi-View Prediction for 3D Semantic Scene Segmentation. ECCV'18
MSeg1080_RVC | permissive | 0.485 (25) | 0.505 (24) | 0.709 (17) | 0.092 (27) | 0.427 (25) | 0.241 (24) | 0.411 (22) | 0.654 (21) | 0.385 (28) | 0.457 (20) | 0.861 (22) | 0.053 (28) | 0.279 (20) | 0.503 (24) | 0.481 (10) | 0.645 (20) | 0.626 (14) | 0.365 (26) | 0.748 (26) | 0.725 (24) | 0.529 (22)
John Lambert*, Zhuang Liu*, Ozan Sener, James Hays, Vladlen Koltun: MSeg: A Composite Dataset for Multi-domain Semantic Segmentation. CVPR 2020
ILC-PSPNet |  | 0.475 (26) | 0.490 (25) | 0.581 (26) | 0.289 (21) | 0.507 (23) | 0.067 (28) | 0.379 (24) | 0.610 (25) | 0.417 (26) | 0.435 (22) | 0.822 (27) | 0.278 (23) | 0.267 (21) | 0.503 (24) | 0.228 (24) | 0.616 (25) | 0.533 (24) | 0.375 (25) | 0.820 (20) | 0.729 (23) | 0.560 (18)
Enet (reimpl) |  | 0.376 (27) | 0.264 (28) | 0.452 (28) | 0.452 (14) | 0.365 (26) | 0.181 (26) | 0.143 (28) | 0.456 (27) | 0.409 (27) | 0.346 (27) | 0.769 (28) | 0.164 (26) | 0.218 (26) | 0.359 (27) | 0.123 (28) | 0.403 (28) | 0.381 (28) | 0.313 (28) | 0.571 (27) | 0.685 (26) | 0.472 (25)
Re-implementation of Adam Paszke, Abhishek Chaurasia, Sangpil Kim, Eugenio Culurciello: ENet: A Deep Neural Network Architecture for Real-Time Semantic Segmentation.
ScanNet (2d proj) | permissive | 0.330 (28) | 0.293 (27) | 0.521 (27) | 0.657 (7) | 0.361 (27) | 0.161 (27) | 0.250 (27) | 0.004 (28) | 0.440 (25) | 0.183 (28) | 0.836 (25) | 0.125 (27) | 0.060 (28) | 0.319 (28) | 0.132 (27) | 0.417 (27) | 0.412 (27) | 0.344 (27) | 0.541 (28) | 0.427 (28) | 0.109 (28)
Angela Dai, Angel X. Chang, Manolis Savva, Maciej Halber, Thomas Funkhouser, Matthias Nießner: ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes. CVPR'17


This table lists the benchmark results for the 2D semantic instance scenario.




Each cell shows the score followed by the method's rank for that column in parentheses.

Method | Info | avg ap | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | window
EMSANet (Instance) |  | 0.241 (1) | 0.401 (1) | 0.439 (1) | 0.085 (1) | 0.242 (1) | 0.220 (1) | 0.081 (1) | 0.289 (2) | 0.117 (2) | 0.121 (1) | 0.182 (1) | 0.126 (1) | 0.346 (1) | 0.181 (2) | 0.181 (2) | 0.358 (1) | 0.156 (1) | 0.675 (2) | 0.131 (1)
Daniel Seichter, Söhnke Fischedick, Mona Köhler, Horst-Michael Gross: EMSANet: Efficient Multi-Task RGB-D Scene Analysis for Indoor Environments. IJCNN 2022
UniDet_RVC |  | 0.205 (2) | 0.381 (2) | 0.323 (3) | 0.037 (3) | 0.226 (3) | 0.177 (3) | 0.063 (2) | 0.277 (3) | 0.120 (1) | 0.067 (3) | 0.131 (3) | 0.074 (3) | 0.317 (2) | 0.080 (3) | 0.235 (1) | 0.289 (3) | 0.141 (3) | 0.678 (1) | 0.080 (3)
FKNet |  | 0.204 (3) | 0.334 (3) | 0.358 (2) | 0.038 (2) | 0.234 (2) | 0.184 (2) | 0.025 (3) | 0.318 (1) | 0.042 (4) | 0.088 (2) | 0.141 (2) | 0.053 (4) | 0.300 (3) | 0.207 (1) | 0.171 (3) | 0.292 (2) | 0.149 (2) | 0.636 (3) | 0.109 (2)
MaskRCNN_ScanNet | permissive | 0.119 (4) | 0.129 (4) | 0.212 (4) | 0.002 (4) | 0.112 (4) | 0.148 (4) | 0.014 (4) | 0.205 (4) | 0.044 (3) | 0.066 (4) | 0.078 (4) | 0.095 (2) | 0.142 (4) | 0.030 (4) | 0.128 (4) | 0.139 (4) | 0.080 (4) | 0.459 (4) | 0.057 (4)
Re-implementation of Kaiming He, Georgia Gkioxari, Piotr Dollár, Ross Girshick: Mask R-CNN. ICCV'17


This table lists the benchmark results for the scene type classification scenario.




Each cell shows the score followed by the method's rank for that column in parentheses.

Method | Info | avg recall | apartment | bathroom | bedroom / hotel | bookstore / library | conference room | copy/mail room | hallway | kitchen | laundry room | living room / lounge | misc | office | storage / basement / garage
LAST-PCL-type |  | 0.780 (1) | 0.250 (3) | 1.000 (1) | 1.000 (1) | 1.000 (1) | 1.000 (1) | 1.000 (1) | 0.500 (2) | 1.000 (1) | 0.500 (2) | 0.889 (1) | 0.000 (2) | 1.000 (1) | 1.000 (1)
Yanmin Wu, Qiankun Gao, Renrui Zhang, and Jian Zhang: Language-Assisted 3D Scene Understanding. arxiv23.12
multi-task | permissive | 0.700 (2) | 0.500 (1) | 1.000 (1) | 0.882 (3) | 0.500 (3) | 1.000 (1) | 1.000 (1) | 0.500 (2) | 1.000 (1) | 1.000 (1) | 0.778 (2) | 0.000 (2) | 0.938 (2) | 0.000 (3)
Shengyu Huang, Mikhail Usvyatsov, Konrad Schindler: Indoor Scene Recognition in 3D. IROS 2020
3DASPP-SCE |  | 0.691 (3) | 0.500 (1) | 0.938 (3) | 0.824 (4) | 1.000 (1) | 1.000 (1) | 0.500 (3) | 1.000 (1) | 0.857 (3) | 0.500 (2) | 0.556 (4) | 0.000 (2) | 0.812 (3) | 0.500 (2)
SE-ResNeXt-SSMA |  | 0.498 (4) | 0.000 (5) | 0.812 (4) | 0.941 (2) | 0.500 (3) | 0.500 (4) | 0.500 (3) | 0.500 (2) | 0.429 (5) | 0.500 (2) | 0.667 (3) | 0.500 (1) | 0.625 (4) | 0.000 (3)
Abhinav Valada, Rohit Mohan, Wolfram Burgard: Self-Supervised Model Adaptation for Multimodal Semantic Segmentation. arXiv
resnet50_scannet |  | 0.353 (5) | 0.250 (3) | 0.812 (4) | 0.529 (5) | 0.500 (3) | 0.500 (4) | 0.000 (5) | 0.500 (2) | 0.571 (4) | 0.000 (5) | 0.556 (4) | 0.000 (2) | 0.375 (5) | 0.000 (3)
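
The avg recall values above are consistent with an unweighted mean over the 13 scene type classes. The following is a minimal Python sketch of that check, using the per-class recalls copied from the LAST-PCL-type row. The helper name `mean_recall` is hypothetical, and the assumption that the benchmark averages classes without weighting is inferred from the numbers in the table, not taken from the official evaluation script.

```python
# Per-class recall values copied from the LAST-PCL-type row of the
# scene type classification table.
per_class_recall = {
    "apartment": 0.250,
    "bathroom": 1.000,
    "bedroom / hotel": 1.000,
    "bookstore / library": 1.000,
    "conference room": 1.000,
    "copy/mail room": 1.000,
    "hallway": 0.500,
    "kitchen": 1.000,
    "laundry room": 0.500,
    "living room / lounge": 0.889,
    "misc": 0.000,
    "office": 1.000,
    "storage / basement / garage": 1.000,
}


def mean_recall(recalls):
    """Unweighted mean over classes (assumed averaging scheme, hypothetical helper)."""
    return sum(recalls.values()) / len(recalls)


avg = mean_recall(per_class_recall)
print(f"{avg:.3f}")  # prints 0.780, matching the reported avg recall
```

Because every class counts equally under this scheme, a class with few test scenes (for example misc, where most methods score 0.000) moves the average as much as a frequent class such as office.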