Presenting the ScanNet200 Benchmark

We present the ScanNet200 benchmark, which studies an order of magnitude more class categories than the previous version of ScanNet. The scene geometry is shared between the two benchmarks, but the parsing of the surface annotations allows for a larger vocabulary and a more realistic setting for in-the-wild 3D understanding methods.

The ScanNet200 benchmark includes both finer-grained categories and a large number of previously unaddressed classes. This makes for a much more challenging setting: it covers the diversity of semantic classes naturally observed in the raw ScanNet RGB-D data, and the data also reflects naturally encountered class imbalances. The difference in category frequencies between ScanNet and ScanNet200 can be seen in the figure above.
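The head, common and tail columns in the results tables group categories by how frequently they occur. As a rough illustration of how an imbalanced label distribution can be partitioned this way, the sketch below ranks categories by count and cuts the ranking into thirds. The function name `split_by_frequency`, the equal-thirds boundaries, and the toy labels are all assumptions made for illustration; the benchmark's actual grouping may use different boundaries and counts.

```python
from collections import Counter


def split_by_frequency(labels):
    """Group category names into head/common/tail by frequency rank.

    Hypothetical helper: boundaries are simple equal thirds of the
    ranked category list, not the benchmark's official split.
    """
    counts = Counter(labels)
    # Most frequent first; ties are broken alphabetically for determinism.
    ranked = sorted(counts, key=lambda name: (-counts[name], name))
    n = len(ranked)
    first, second = n // 3, 2 * n // 3
    return {
        "head": ranked[:first],
        "common": ranked[first:second],
        "tail": ranked[second:],
    }


# Toy, made-up annotation labels (not ScanNet data).
toy_labels = (
    ["wall"] * 50 + ["chair"] * 30 + ["door"] * 10
    + ["lamp"] * 5 + ["plunger"] * 2 + ["dustpan"] * 1
)
groups = split_by_frequency(toy_labels)
print(groups)
```

Reporting a separate score per group, as the tables do, keeps rare categories from being hidden by the frequent ones in a single average.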

ScanNet200 Benchmark

This table lists the benchmark results for the ScanNet200 3D semantic label scenario.




Columns: Method, Info, avg iou, head iou, common iou, tail iou, wall, chair, floor, table, door, couch, cabinet, shelf, desk, office chair, bed, pillow, sink, picture, window, toilet, bookshelf, monitor, curtain, book, armchair, coffee table, box, refrigerator, lamp, kitchen cabinet, towel, clothes, tv, nightstand, counter, dresser, stool, cushion, plant, ceiling, bathtub, end table, dining table, keyboard, bag, backpack, toilet paper, printer, tv stand, whiteboard, blanket, shower curtain, trash can, closet, stairs, microwave, stove, shoe, computer tower, bottle, bin, ottoman, bench, board, washing machine, mirror, copier, basket, sofa chair, file cabinet, fan, laptop, shower, paper, person, paper towel dispenser, oven, blinds, rack, plate, blackboard, piano, suitcase, rail, radiator, recycling bin, container, wardrobe, soap dispenser, telephone, bucket, clock, stand, light, laundry basket, pipe, clothes dryer, guitar, toilet paper holder, seat, speaker, column, bicycle, ladder, bathroom stall, shower wall, cup, jacket, storage bin, coffee maker, dishwasher, paper towel roll, machine, mat, windowsill, bar, toaster, bulletin board, ironing board, fireplace, soap dish, kitchen counter, doorframe, toilet paper dispenser, mini fridge, fire extinguisher, ball, hat, shower curtain rod, water cooler, paper cutter, tray, shower door, pillar, ledge, toaster oven, mouse, toilet seat cover dispenser, furniture, cart, storage container, scale, tissue box, light switch, crate, power outlet, decoration, sign, projector, closet door, vacuum cleaner, candle, plunger, stuffed animal, headphones, dish rack, broom, guitar case, range hood, dustpan, hair dryer, water bottle, handicap bar, purse, vent, shower floor, water pitcher, mailbox, bowl, paper bag, alarm clock, music stand, projector screen, divider, laundry detergent, bathroom counter, object, bathroom vanity, closet wall, laundry hamper, bathroom stall door, ceiling light, trash bin, dumbbell, stair rail, tube, bathroom cabinet, cd case, closet rod, coffee kettle, structure, shower head, keyboard piano, case of water bottles, coat rack, storage organizer, folded chair, fire alarm, power strip, calendar, poster, potted plant, luggage, mattress. In each row, every score is followed by the method's rank in that column.
LGroundpermissive0.272 60.485 60.184 60.106 60.778 60.676 30.932 60.479 80.572 60.718 50.399 50.265 60.453 70.085 20.745 60.446 60.726 40.232 60.622 60.901 60.512 50.826 60.786 70.178 70.549 40.277 60.659 60.381 60.518 50.295 80.323 50.777 50.599 40.028 40.321 40.363 70.000 10.708 60.858 60.746 50.063 60.022 30.457 60.077 50.476 30.243 60.402 50.397 80.233 40.077 80.720 80.610 70.103 20.629 50.437 80.626 50.446 50.702 30.190 50.005 10.058 70.322 60.702 70.244 60.768 60.000 10.134 50.552 60.279 70.395 60.147 70.000 10.207 60.612 50.000 40.000 40.000 10.000 30.658 50.566 60.323 60.525 80.229 60.179 50.467 80.154 70.000 30.002 10.000 10.051 10.000 20.127 20.703 60.000 10.000 60.216 10.112 70.358 40.547 10.187 20.092 70.156 80.055 50.296 60.252 40.143 20.000 30.014 50.398 30.000 10.028 70.173 20.000 70.265 70.348 60.415 80.179 20.019 70.218 40.000 10.597 40.274 80.565 40.000 10.012 30.000 30.039 70.022 20.000 30.117 60.000 30.000 20.000 40.000 10.000 10.324 70.000 10.384 40.000 10.000 40.251 80.000 10.566 50.000 10.000 10.066 60.404 10.886 70.199 10.000 30.000 10.059 20.000 10.136 10.540 10.127 80.295 60.085 50.143 50.514 20.413 60.000 60.000 10.498 30.000 10.000 20.000 40.623 50.000 20.000 10.000 10.132 70.000 20.000 20.000 30.000 1
David Rozenberszki, Or Litany, Angela Dai: Language-Grounded Indoor 3D Semantic Segmentation in the Wild. arXiv
OA-CNN-L_ScanNet2000.333 20.558 10.269 20.124 40.821 10.703 10.946 10.569 10.662 10.748 20.487 10.455 10.572 20.000 80.789 20.534 30.736 20.271 10.713 10.949 10.498 70.877 20.860 30.332 20.706 10.474 10.788 30.406 40.637 20.495 30.355 40.805 20.592 60.015 70.396 10.602 40.000 10.799 20.876 10.713 80.276 10.000 50.493 40.080 40.448 60.363 10.661 20.833 20.262 20.125 20.823 40.665 30.076 40.720 10.557 30.637 40.517 40.672 60.227 40.000 30.158 40.496 30.843 50.352 40.835 50.000 10.103 60.711 10.527 10.526 20.320 30.000 10.568 20.625 30.067 10.000 40.000 10.001 20.806 20.836 20.621 30.591 30.373 30.314 20.668 20.398 20.003 20.000 30.000 10.016 80.024 10.043 60.906 20.000 10.052 30.000 70.384 30.330 50.342 50.100 40.223 20.183 40.112 30.476 40.313 20.130 50.196 20.112 30.370 50.000 10.234 30.071 40.160 10.403 20.398 50.492 70.197 10.076 50.272 30.000 10.200 80.560 20.735 30.000 10.000 50.000 30.110 20.002 30.021 20.412 30.000 30.000 20.000 40.000 10.000 10.794 40.000 10.445 10.000 10.022 20.509 30.000 10.517 70.000 10.000 10.001 80.245 20.915 20.024 20.089 10.000 10.262 10.000 10.103 60.524 20.392 40.515 20.013 80.251 30.411 60.662 10.001 50.000 10.473 40.000 10.000 20.150 30.699 30.000 20.000 10.000 10.166 20.000 20.024 10.000 30.000 1
AWCS0.305 50.508 50.225 50.142 20.782 50.634 80.937 50.489 60.578 50.721 30.364 60.355 50.515 40.023 50.764 50.523 40.707 50.264 30.633 50.922 50.507 60.886 10.804 60.179 60.436 70.300 50.656 70.529 20.501 60.394 40.296 70.820 10.603 30.131 20.179 80.619 20.000 10.707 70.865 50.773 30.171 30.010 40.484 50.063 60.463 50.254 50.332 70.649 50.220 50.100 50.729 60.613 60.071 60.582 70.628 20.702 10.424 60.749 10.137 60.000 30.142 50.360 50.863 30.305 50.877 20.000 10.173 10.606 50.337 50.478 50.154 60.000 10.253 50.664 20.000 40.000 40.000 10.000 30.626 60.782 50.302 70.602 20.185 70.282 30.651 40.317 40.000 30.000 30.000 10.022 60.000 20.154 10.876 40.000 10.014 50.063 60.029 80.553 10.467 20.084 50.124 50.157 70.049 60.373 50.252 40.097 60.000 30.219 10.542 10.000 10.392 10.172 30.000 70.339 30.417 20.533 60.093 60.115 30.195 50.000 10.516 50.288 70.741 20.000 10.001 40.233 20.056 50.000 40.159 10.334 40.077 20.000 20.000 40.000 10.000 10.749 50.000 10.411 30.000 10.008 30.452 50.000 10.595 40.000 10.000 10.220 40.006 50.894 60.006 30.000 30.000 10.000 40.000 10.112 30.504 30.404 30.551 10.093 30.129 80.484 40.381 80.000 60.000 10.396 60.000 10.000 20.620 20.402 80.000 20.000 10.000 10.142 40.000 20.000 20.512 20.000 1
PPT-SpUNet-F.T.0.332 30.556 20.270 10.123 50.816 20.682 20.946 10.549 30.657 30.756 10.459 30.376 40.550 30.001 70.807 10.616 10.727 30.267 20.691 20.942 40.530 40.872 30.874 20.330 30.542 50.374 30.792 20.400 50.673 10.572 20.433 10.793 30.623 20.008 80.351 30.594 50.000 10.783 40.876 10.833 20.213 20.000 50.537 20.091 20.519 10.304 20.620 40.942 10.264 10.124 30.855 10.695 10.086 30.646 30.506 70.658 20.535 20.715 20.314 10.000 30.241 10.608 20.897 10.359 30.858 30.000 10.076 80.611 40.392 30.509 30.378 20.000 10.579 10.565 70.000 40.000 40.000 10.000 30.755 30.806 40.661 10.572 60.350 40.181 40.660 30.300 50.000 30.000 30.000 10.023 50.000 20.042 70.930 10.000 10.000 60.077 40.584 20.392 30.339 60.185 30.171 40.308 10.006 70.563 30.256 30.150 10.000 30.002 70.345 60.000 10.045 50.197 10.063 30.323 50.453 10.600 30.163 50.037 60.349 20.000 10.672 10.679 10.753 10.000 10.000 50.000 30.117 10.000 40.000 30.291 50.000 30.000 20.039 10.000 10.000 10.899 20.000 10.374 50.000 10.000 40.545 20.000 10.634 10.000 10.000 10.074 50.223 30.914 30.000 40.021 20.000 10.000 40.000 10.112 30.498 50.649 10.383 50.095 10.135 70.449 50.432 40.008 30.000 10.518 20.000 10.000 20.000 40.796 10.000 20.000 10.000 10.138 50.000 20.000 20.000 30.000 1
Xiaoyang Wu, Zhuotao Tian, Xin Wen, Bohao Peng, Xihui Liu, Kaicheng Yu, Hengshuang Zhao: Towards Large-scale 3D Representation Learning with Multi-dataset Point Prompt Training.
OctFormer ScanNet200permissive0.326 40.539 40.265 30.131 30.806 30.670 40.943 30.535 40.662 10.705 70.423 40.407 30.505 50.003 60.765 40.582 20.686 60.227 70.680 30.943 30.601 10.854 50.892 10.335 10.417 80.357 40.724 50.453 30.632 30.596 10.432 20.783 40.512 80.021 60.244 60.637 10.000 10.787 30.873 30.743 60.000 80.000 50.534 30.110 10.499 20.289 30.626 30.620 60.168 80.204 10.849 20.679 20.117 10.633 40.684 10.650 30.552 10.684 50.312 20.000 30.175 30.429 40.865 20.413 10.837 40.000 10.145 30.626 30.451 20.487 40.513 10.000 10.529 30.613 40.000 40.033 20.000 10.000 30.828 10.871 10.622 20.587 40.411 20.137 60.645 50.343 30.000 30.000 30.000 10.022 60.000 20.026 80.829 50.000 10.022 40.089 30.842 10.253 70.318 80.296 10.178 30.291 20.224 10.584 20.200 60.132 40.000 30.128 20.227 70.000 10.230 40.047 50.149 20.331 40.412 30.618 20.164 40.102 40.522 10.000 10.655 20.378 40.469 60.000 10.000 50.000 30.105 30.000 40.000 30.483 20.000 30.000 20.028 20.000 10.000 10.906 10.000 10.339 60.000 10.000 40.457 40.000 10.612 30.000 10.000 10.408 10.000 70.900 40.000 40.000 30.000 10.029 30.000 10.074 80.455 60.479 20.427 40.079 60.140 60.496 30.414 50.022 10.000 10.471 50.000 10.000 20.000 40.722 20.000 20.000 10.000 10.138 50.000 20.000 20.000 30.000 1
Peng-Shuai Wang: OctFormer: Octree-based Transformers for 3D Point Clouds. SIGGRAPH 2023
CeCo0.340 10.551 30.247 40.181 10.784 40.661 50.939 40.564 20.624 40.721 30.484 20.429 20.575 10.027 40.774 30.503 50.753 10.242 40.656 40.945 20.534 20.865 40.860 30.177 80.616 30.400 20.818 10.579 10.615 40.367 50.408 30.726 60.633 10.162 10.360 20.619 20.000 10.828 10.873 30.924 10.109 50.083 20.564 10.057 80.475 40.266 40.781 10.767 30.257 30.100 50.825 30.663 40.048 80.620 60.551 40.595 60.532 30.692 40.246 30.000 30.213 20.615 10.861 40.376 20.900 10.000 10.102 70.660 20.321 60.547 10.226 40.000 10.311 40.742 10.011 30.006 30.000 10.000 30.546 80.824 30.345 50.665 10.450 10.435 10.683 10.411 10.338 10.000 30.000 10.030 40.000 20.068 40.892 30.000 10.063 20.000 70.257 40.304 60.387 30.079 60.228 10.190 30.000 80.586 10.347 10.133 30.000 30.037 40.377 40.000 10.384 20.006 70.003 50.421 10.410 40.643 10.171 30.121 20.142 70.000 10.510 60.447 30.474 50.000 10.000 50.286 10.083 40.000 40.000 30.603 10.096 10.063 10.000 40.000 10.000 10.898 30.000 10.429 20.000 10.400 10.550 10.000 10.633 20.000 10.000 10.377 20.000 70.916 10.000 40.000 30.000 10.000 40.000 10.102 70.499 40.296 50.463 30.089 40.304 10.740 10.401 70.010 20.000 10.560 10.000 10.000 20.709 10.652 40.000 20.000 10.000 10.143 30.000 20.000 20.609 10.000 1
: Understanding Imbalanced Semantic Segmentation Through Neural Collapse.
CSC-Pretrainpermissive0.249 80.455 80.171 70.079 80.766 80.659 60.930 80.494 50.542 80.700 80.314 80.215 80.430 80.121 10.697 80.441 70.683 70.235 50.609 80.895 70.476 80.816 70.770 80.186 50.634 20.216 80.734 40.340 70.471 70.307 70.293 80.591 80.542 70.076 30.205 70.464 60.000 10.484 80.832 80.766 40.052 70.000 50.413 70.059 70.418 70.222 70.318 80.609 70.206 70.112 40.743 50.625 50.076 40.579 80.548 50.590 70.371 70.552 80.081 70.003 20.142 50.201 80.638 80.233 70.686 80.000 10.142 40.444 80.375 40.247 80.198 50.000 10.128 80.454 80.019 20.097 10.000 10.000 30.553 70.557 70.373 40.545 70.164 80.014 80.547 70.174 60.000 30.002 10.000 10.037 20.000 20.063 50.664 80.000 10.000 60.130 20.170 50.152 80.335 70.079 60.110 60.175 50.098 40.175 80.166 70.045 80.207 10.014 50.465 20.000 10.001 80.001 80.046 40.299 60.327 70.537 50.033 70.012 80.186 60.000 10.205 70.377 50.463 70.000 10.058 20.000 30.055 60.041 10.000 30.105 70.000 30.000 20.000 40.000 10.000 10.398 60.000 10.308 80.000 10.000 40.319 60.000 10.543 60.000 10.000 10.062 70.004 60.862 80.000 40.000 30.000 10.000 40.000 10.123 20.316 70.225 60.250 70.094 20.180 40.332 70.441 30.000 60.000 10.310 80.000 10.000 20.000 40.592 60.000 20.000 10.000 10.203 10.000 20.000 20.000 30.000 1
Ji Hou, Benjamin Graham, Matthias Nießner, Saining Xie: Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts. CVPR 2021
Minkowski 34Dpermissive0.253 70.463 70.154 80.102 70.771 70.650 70.932 60.483 70.571 70.710 60.331 70.250 70.492 60.044 30.703 70.419 80.606 80.227 70.621 70.865 80.531 30.771 80.813 50.291 40.484 60.242 70.612 80.282 80.440 80.351 60.299 60.622 70.593 50.027 50.293 50.310 80.000 10.757 50.858 60.737 70.150 40.164 10.368 80.084 30.381 80.142 80.357 60.720 40.214 60.092 70.724 70.596 80.056 70.655 20.525 60.581 80.352 80.594 70.056 80.000 30.014 80.224 70.772 60.205 80.720 70.000 10.159 20.531 70.163 80.294 70.136 80.000 10.169 70.589 60.000 40.000 40.000 10.002 10.663 40.466 80.265 80.582 50.337 50.016 70.559 60.084 80.000 30.000 30.000 10.036 30.000 20.125 30.670 70.000 10.102 10.071 50.164 60.406 20.386 40.046 80.068 80.159 60.117 20.284 70.111 80.094 70.000 30.000 80.197 80.000 10.044 60.013 60.002 60.228 80.307 80.588 40.025 80.545 10.134 80.000 10.655 20.302 60.282 80.000 10.060 10.000 30.035 80.000 40.000 30.097 80.000 30.000 20.005 30.000 10.000 10.096 80.000 10.334 70.000 10.000 40.274 70.000 10.513 80.000 10.000 10.280 30.194 40.897 50.000 40.000 30.000 10.000 40.000 10.108 50.279 80.189 70.141 80.059 70.272 20.307 80.445 20.003 40.000 10.353 70.000 10.026 10.000 40.581 70.001 10.000 10.000 10.093 80.002 10.000 20.000 30.000 1
C. Choy, J. Gwak, S. Savarese: 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks. CVPR 2019
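
The IoU scores reported above are intersection-over-union values per class, averaged over classes. The following is a minimal sketch of that metric, assuming flat lists of integer ground-truth and predicted labels per point or vertex. The function names and the toy labels are hypothetical, and the official evaluation script may differ in details such as how ignored or unlabeled points are handled.

```python
def per_class_iou(gt, pred, num_classes):
    """Return IoU = TP / (TP + FP + FN) per class, or None if a class never occurs."""
    tp = [0] * num_classes
    fp = [0] * num_classes
    fn = [0] * num_classes
    for g, p in zip(gt, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted class p where it is not present
            fn[g] += 1  # missed an occurrence of class g
    ious = []
    for c in range(num_classes):
        denom = tp[c] + fp[c] + fn[c]
        ious.append(tp[c] / denom if denom else None)
    return ious


def mean_iou(ious):
    """Average the IoU over classes that occur in ground truth or prediction."""
    valid = [v for v in ious if v is not None]
    return sum(valid) / len(valid) if valid else 0.0


# Toy labels for three classes (made up, not benchmark data).
gt = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
ious = per_class_iou(gt, pred, num_classes=3)
miou = mean_iou(ious)
print(ious, miou)
```

Averaging the same per-class values over only the head, common or tail categories would give the corresponding group scores.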


This table lists the benchmark results for the ScanNet200 3D semantic instance scenario.




Columns: Method, Info, avg ap 50%, head ap 50%, common ap 50%, tail ap 50%, chair, table, door, couch, cabinet, shelf, desk, office chair, bed, pillow, sink, picture, window, toilet, bookshelf, monitor, curtain, book, armchair, coffee table, box, refrigerator, lamp, kitchen cabinet, towel, clothes, tv, nightstand, counter, dresser, stool, cushion, plant, ceiling, bathtub, end table, dining table, keyboard, bag, backpack, toilet paper, printer, tv stand, whiteboard, blanket, shower curtain, trash can, closet, stairs, microwave, stove, shoe, computer tower, bottle, bin, ottoman, bench, board, washing machine, mirror, copier, basket, sofa chair, file cabinet, fan, laptop, shower, paper, person, paper towel dispenser, oven, blinds, rack, plate, blackboard, piano, suitcase, rail, radiator, recycling bin, container, wardrobe, soap dispenser, telephone, bucket, clock, stand, light, laundry basket, pipe, clothes dryer, guitar, toilet paper holder, seat, speaker, column, bicycle, ladder, bathroom stall, shower wall, cup, jacket, storage bin, coffee maker, dishwasher, paper towel roll, machine, mat, windowsill, bar, toaster, bulletin board, ironing board, fireplace, soap dish, kitchen counter, doorframe, toilet paper dispenser, mini fridge, fire extinguisher, ball, hat, shower curtain rod, water cooler, paper cutter, tray, shower door, pillar, ledge, toaster oven, mouse, toilet seat cover dispenser, furniture, cart, storage container, scale, tissue box, light switch, crate, power outlet, decoration, sign, projector, closet door, vacuum cleaner, candle, plunger, stuffed animal, headphones, dish rack, broom, guitar case, range hood, dustpan, hair dryer, water bottle, handicap bar, purse, vent, shower floor, water pitcher, mailbox, bowl, paper bag, alarm clock, music stand, projector screen, divider, laundry detergent, bathroom counter, object, bathroom vanity, closet wall, laundry hamper, bathroom stall door, ceiling light, trash bin, dumbbell, stair rail, tube, bathroom cabinet, cd case, closet rod, coffee kettle, structure, shower head, keyboard piano, case of water bottles, coat rack, storage organizer, folded chair, fire alarm, power strip, calendar, poster, potted plant, luggage, mattress. In each row, every score is followed by the method's rank in that column.
TD3D Scannet2000.320 20.501 20.264 20.164 20.841 10.679 10.716 20.879 20.280 30.192 10.634 10.231 10.733 30.459 20.565 30.498 50.560 21.000 10.686 10.890 20.708 10.123 40.820 10.152 20.967 10.456 10.458 20.387 20.194 10.435 50.906 10.077 10.396 20.509 10.217 20.715 10.619 21.000 10.099 20.792 10.513 20.062 20.506 30.549 10.605 11.000 10.123 40.106 11.000 10.744 40.000 21.000 10.504 50.525 20.185 20.790 40.101 20.008 20.587 20.356 10.817 10.083 51.000 10.000 10.621 10.842 10.415 10.268 40.083 40.000 20.098 30.881 10.125 20.000 20.000 10.000 20.000 30.125 40.332 30.448 50.202 20.196 10.798 10.264 20.000 20.000 10.017 20.233 20.000 10.063 10.333 20.038 10.111 10.250 30.000 20.516 10.208 10.470 20.094 30.218 10.000 10.667 20.033 50.000 20.000 10.400 10.156 20.000 10.267 10.226 10.000 10.104 20.159 20.299 50.095 30.458 10.500 10.000 11.000 10.472 10.792 30.000 10.022 10.061 20.250 10.008 10.250 20.333 20.143 20.396 20.049 20.012 10.000 10.283 40.000 10.241 40.000 10.101 20.331 40.000 10.629 30.000 10.000 10.857 20.222 30.677 10.000 10.003 20.000 10.000 20.000 10.076 20.252 30.400 10.431 20.061 30.328 30.331 40.500 10.000 20.000 10.167 10.000 10.000 10.000 20.500 20.000 10.000 21.000 10.542 10.000 20.063 10.000 20.000 1
Mask3D Scannet2000.388 10.542 10.357 10.237 10.808 20.676 20.741 10.832 40.496 10.151 30.628 20.021 20.955 10.578 10.753 10.612 10.591 10.822 50.609 30.926 10.614 30.291 10.725 40.163 10.890 20.380 50.615 10.517 10.130 30.806 10.857 20.024 20.511 10.412 50.226 10.597 20.756 11.000 10.111 10.792 10.736 10.091 10.610 10.527 20.323 41.000 10.504 10.063 21.000 10.853 10.010 10.974 30.839 10.667 10.301 10.883 10.266 10.039 10.640 10.311 20.739 20.463 11.000 10.000 10.287 20.715 20.313 20.600 11.000 10.027 10.076 40.502 50.500 10.409 10.000 10.194 10.125 20.500 10.491 10.748 10.050 40.042 20.776 20.352 10.008 10.000 10.033 10.254 10.000 10.005 20.552 10.008 20.020 20.750 10.500 10.409 20.065 30.511 10.107 10.178 20.000 11.000 10.400 10.016 10.000 10.400 10.571 10.000 10.060 20.044 20.000 10.514 10.278 11.000 10.258 10.017 30.125 50.000 10.792 30.399 31.000 10.000 10.013 20.265 10.018 20.000 21.000 10.335 10.381 10.500 10.250 10.004 20.000 10.727 10.000 10.497 10.000 10.188 10.677 20.000 10.708 20.000 10.000 10.945 10.391 10.123 40.000 10.028 10.000 11.000 10.000 10.099 10.451 10.400 10.668 10.573 10.606 10.077 50.003 40.004 10.000 10.042 30.000 10.000 11.000 11.000 10.000 10.042 10.000 20.200 20.302 10.000 21.000 10.000 1
Jonas Schult, Francis Engelmann, Alexander Hermans, Or Litany, Siyu Tang, Bastian Leibe: Mask3D for 3D Semantic Instance Segmentation. ICRA 2023
Minkowski 34D Inst.permissive0.203 50.369 40.134 50.078 50.706 40.382 40.693 30.845 30.221 50.150 40.158 40.000 30.746 20.369 40.545 40.595 20.387 40.997 30.413 50.720 50.636 20.165 30.732 30.070 40.851 40.402 40.251 40.313 40.123 40.583 40.696 30.000 30.051 50.500 20.000 30.500 40.372 50.667 40.009 40.000 30.307 50.003 40.479 40.107 50.226 50.903 40.109 50.031 30.981 30.726 50.000 20.522 50.669 20.282 50.052 50.778 50.000 40.000 30.400 30.074 40.333 40.218 41.000 10.000 10.250 30.406 50.118 50.317 20.100 30.000 20.191 10.596 20.000 30.000 20.000 10.000 20.000 30.500 10.178 50.701 20.000 50.000 30.522 50.018 50.000 20.000 10.000 30.060 40.000 10.000 30.033 50.000 30.000 30.000 40.000 20.281 30.100 20.000 50.090 40.133 40.000 10.422 50.050 40.000 20.000 10.200 30.000 50.000 10.000 30.000 30.000 10.000 40.123 40.677 20.021 40.000 40.500 10.000 10.500 40.442 20.125 50.000 10.000 30.000 30.000 30.000 20.000 30.056 40.000 30.000 30.000 30.000 30.000 10.200 50.000 10.143 50.000 10.000 30.250 50.000 10.511 40.000 10.000 10.286 30.083 40.396 20.000 10.000 30.000 10.000 20.000 10.025 40.300 20.000 30.371 30.070 20.000 40.385 30.000 50.000 20.000 10.000 50.000 10.000 10.000 20.500 20.000 10.000 20.000 20.200 20.000 20.000 20.000 20.000 1
C. Choy, J. Gwak, S. Savarese: 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks. CVPR 2019
CSC-Pretrain Inst.permissive0.209 40.361 50.157 40.085 40.700 50.248 50.634 50.776 50.322 20.135 50.103 50.000 30.524 50.364 50.618 20.592 30.381 50.997 30.589 40.747 40.340 50.109 50.768 20.059 50.702 50.448 20.188 50.149 50.091 50.636 30.573 50.000 30.246 30.500 20.000 30.450 50.405 30.667 40.006 50.000 30.356 40.007 30.506 20.420 30.340 30.667 50.294 20.004 40.571 40.748 20.000 21.000 10.573 40.502 40.094 40.807 30.000 40.000 30.400 30.000 50.278 50.228 31.000 10.000 10.115 50.432 40.198 30.050 50.125 20.000 20.000 50.573 30.000 30.000 20.000 10.000 20.000 30.125 40.312 40.610 30.221 10.000 30.667 40.050 40.000 20.000 10.000 30.032 50.000 10.000 30.083 30.000 30.000 30.000 40.000 20.220 40.000 50.125 30.000 50.111 50.000 10.667 20.200 30.000 20.000 10.000 40.110 30.000 10.000 30.000 30.000 10.000 40.053 50.500 40.000 50.000 40.500 10.000 10.500 40.333 40.500 40.000 10.000 30.000 30.000 30.000 20.000 30.000 50.000 30.000 30.000 30.000 30.000 10.600 20.000 10.364 20.000 10.000 30.750 10.000 10.833 10.000 10.000 10.143 50.000 50.396 20.000 10.000 30.000 10.000 20.000 10.021 50.221 40.000 30.093 50.055 40.451 20.677 20.125 20.000 20.000 10.028 40.000 10.000 10.000 20.500 20.000 10.000 20.000 20.050 40.000 20.000 20.000 20.000 1
Ji Hou, Benjamin Graham, Matthias Nießner, Saining Xie: Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts. CVPR 2021
LGround Inst.permissive0.246 30.413 30.170 30.130 30.754 30.541 30.682 40.903 10.264 40.164 20.234 30.000 30.681 40.452 30.464 50.541 40.399 31.000 10.637 20.772 30.588 40.190 20.589 50.081 30.857 30.426 30.373 30.318 30.135 20.690 20.653 40.000 30.159 40.500 20.000 30.581 30.387 41.000 10.046 30.000 30.402 30.003 50.455 50.196 40.571 21.000 10.270 30.003 50.530 50.748 30.000 20.744 40.575 30.511 30.112 30.815 20.067 30.000 30.400 30.167 30.667 30.241 21.000 10.000 10.208 40.660 30.125 40.317 20.000 50.000 20.100 20.561 40.000 30.000 20.000 10.000 21.000 10.500 10.344 20.568 40.167 30.000 30.706 30.068 30.000 20.000 10.000 30.063 30.000 10.000 30.056 40.000 30.000 30.500 20.000 20.143 50.017 40.125 30.097 20.164 30.000 10.582 40.400 10.000 20.000 10.000 40.083 40.000 10.000 30.000 30.000 10.025 30.156 30.533 30.250 20.200 20.500 10.000 11.000 10.333 41.000 10.000 10.000 30.000 30.000 30.000 20.000 30.333 20.000 30.000 30.000 30.000 30.000 10.400 30.000 10.364 20.000 10.000 30.500 30.000 10.511 40.000 10.000 10.286 30.333 20.000 50.000 10.000 30.000 10.000 20.000 10.034 30.111 50.000 30.333 40.031 50.000 40.750 10.125 20.000 20.000 10.151 20.000 10.000 10.000 20.500 20.000 10.000 20.000 20.000 50.000 20.000 20.000 20.000 1
David Rozenberszki, Or Litany, Angela Dai: Language-Grounded Indoor 3D Semantic Segmentation in the Wild.


ScanNet Benchmark

This table lists the benchmark results for the 3D semantic label scenario.


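Columns: Method, Info, avg iou, bathtub, bed, bookshelf, cabinet, chair, counter, curtain, desk, door, floor, otherfurniture, picture, refrigerator, shower curtain, sink, sofa, table, toilet, wall, window. In each row, every score is followed by the method's rank in that column.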
Mix3Dpermissive0.781 10.964 10.855 10.843 130.781 40.858 90.575 40.831 260.685 90.714 20.979 10.594 40.310 210.801 10.892 120.841 20.819 30.723 30.940 90.887 30.725 18
Alexey Nekrasov, Jonas Schult, Or Litany, Bastian Leibe, Francis Engelmann: Mix3D: Out-of-Context Data Augmentation for 3D Scenes. 3DV 2021 (Oral)
Swin3Dpermissive0.779 20.861 170.818 120.836 160.790 20.875 20.576 30.905 30.704 30.739 10.969 70.611 10.349 70.756 160.958 10.702 370.805 120.708 60.916 250.898 10.801 1
PPT-SpUNet-Joint0.766 30.932 20.794 270.829 200.751 170.854 110.540 150.903 40.630 280.672 100.963 100.565 170.357 50.788 20.900 80.737 200.802 130.685 120.950 30.887 30.780 2
Xiaoyang Wu, Zhuotao Tian, Xin Wen, Bohao Peng, Xihui Liu, Kaicheng Yu, Hengshuang Zhao: Towards Large-scale 3D Representation Learning with Multi-dataset Point Prompt Training.
OctFormerpermissive0.766 30.925 40.808 180.849 70.786 30.846 210.566 70.876 100.690 70.674 90.960 120.576 130.226 590.753 180.904 60.777 80.815 50.722 40.923 210.877 90.776 4
Peng-Shuai Wang: OctFormer: Octree-based Transformers for 3D Point Clouds. SIGGRAPH 2023
CU-Hybrid Net0.764 50.924 50.819 100.840 140.757 120.853 130.580 10.848 190.709 20.643 180.958 150.587 80.295 270.753 180.884 160.758 140.815 50.725 20.927 190.867 160.743 10
OccuSeg+Semantic0.764 50.758 530.796 250.839 150.746 190.907 10.562 80.850 180.680 110.672 100.978 20.610 20.335 120.777 50.819 390.847 10.830 10.691 100.972 10.885 50.727 16
O-CNNpermissive0.762 70.924 50.823 60.844 120.770 60.852 140.577 20.847 210.711 10.640 220.958 150.592 50.217 650.762 120.888 130.758 140.813 80.726 10.932 170.868 150.744 9
Peng-Shuai Wang, Yang Liu, Yu-Xiao Guo, Chun-Yu Sun, Xin Tong: O-CNN: Octree-based Convolutional Neural Networks for 3D Shape Analysis. SIGGRAPH 2017
OA-CNN-L_ScanNet200.756 80.783 390.826 50.858 40.776 50.837 270.548 120.896 70.649 200.675 80.962 110.586 90.335 120.771 80.802 430.770 100.787 280.691 100.936 120.880 80.761 6
PNE0.755 90.786 370.835 40.834 180.758 100.849 170.570 60.836 250.648 210.668 120.978 20.581 120.367 30.683 280.856 250.804 30.801 170.678 140.961 20.889 20.716 23
ConDaFormer0.755 90.927 30.822 70.836 160.801 10.849 170.516 250.864 150.651 190.680 70.958 150.584 110.282 340.759 140.855 270.728 220.802 130.678 140.880 520.873 140.756 7
Lunhao Duan, Shanshan Zhao, Nan Xue, Mingming Gong, Guisong Xia, Dacheng Tao: ConDaFormer: Disassembled Transformer with Local Structure Enhancement for 3D Point Cloud Understanding. NeurIPS 2023
PointTransformerV20.752 110.742 600.809 170.872 10.758 100.860 80.552 100.891 80.610 350.687 30.960 120.559 200.304 240.766 100.926 30.767 110.797 200.644 270.942 70.876 120.722 20
Xiaoyang Wu, Yixing Lao, Li Jiang, Xihui Liu, Hengshuang Zhao: Point Transformer V2: Grouped Vector Attention and Partition-based Pooling. NeurIPS 2022
DMF-Net0.752 110.906 90.793 290.802 350.689 330.825 380.556 90.867 120.681 100.602 370.960 120.555 220.365 40.779 40.859 220.747 170.795 240.717 50.917 240.856 240.764 5
C.Yang, Y.Yan, W.Zhao, J.Ye, X.Yang, A.Hussain, B.Dong, K.Huang: Towards Deeper and Better Multi-view Feature Fusion for 3D Semantic Segmentation. ICONIP 2023
PointConvFormer0.749 130.793 350.790 300.807 310.750 180.856 100.524 210.881 90.588 470.642 210.977 50.591 60.274 390.781 30.929 20.804 30.796 210.642 280.947 50.885 50.715 24
Wenxuan Wu, Qi Shan, Li Fuxin: PointConvFormer: Revenge of the Point-based Convolution.
BPNetcopyleft0.749 130.909 70.818 120.811 280.752 150.839 260.485 400.842 220.673 120.644 170.957 190.528 310.305 230.773 70.859 220.788 50.818 40.693 90.916 250.856 240.723 19
Wenbo Hu, Hengshuang Zhao, Li Jiang, Jiaya Jia, Tien-Tsin Wong: Bidirectional Projection Network for Cross Dimension Scene Understanding. CVPR 2021 (Oral)
MSP0.748 150.623 860.804 200.859 30.745 200.824 400.501 300.912 20.690 70.685 50.956 200.567 160.320 180.768 90.918 40.720 270.802 130.676 170.921 220.881 70.779 3
StratifiedFormerpermissive0.747 160.901 100.803 210.845 110.757 120.846 210.512 260.825 290.696 60.645 160.956 200.576 130.262 500.744 230.861 210.742 180.770 370.705 70.899 380.860 210.734 11
Xin Lai*, Jianhui Liu*, Li Jiang, Liwei Wang, Hengshuang Zhao, Shu Liu, Xiaojuan Qi, Jiaya Jia: Stratified Transformer for 3D Point Cloud Segmentation. CVPR 2022
VMNetpermissive0.746 170.870 150.838 20.858 40.729 250.850 160.501 300.874 110.587 480.658 140.956 200.564 180.299 250.765 110.900 80.716 300.812 90.631 330.939 100.858 220.709 25
Zeyu Hu, Xuyang Bai, Jiaxiang Shang, Runze Zhang, Jiayu Dong, Xin Wang, Guangyuan Sun, Hongbo Fu, Chiew-Lan Tai: VMNet: Voxel-Mesh Network for Geodesic-Aware 3D Semantic Segmentation. ICCV 2021 (Oral)
Virtual MVFusion0.746 170.771 470.819 100.848 90.702 310.865 70.397 770.899 50.699 40.664 130.948 480.588 70.330 140.746 220.851 310.764 120.796 210.704 80.935 130.866 170.728 14
Abhijit Kundu, Xiaoqi Yin, Alireza Fathi, David Ross, Brian Brewington, Thomas Funkhouser, Caroline Pantofaru: Virtual Multi-view Fusion for 3D Semantic Segmentation. ECCV 2020
Retro-FPN0.744 190.842 230.800 220.767 480.740 210.836 290.541 140.914 10.672 130.626 260.958 150.552 230.272 410.777 50.886 150.696 380.801 170.674 190.941 80.858 220.717 21
Peng Xiang*, Xin Wen*, Yu-Shen Liu, Hui Zhang, Yi Fang, Zhizhong Han: Retrospective Feature Pyramid Network for Point Cloud Semantic Segmentation. ICCV 2023
EQ-Net0.743 200.620 870.799 240.849 70.730 240.822 420.493 370.897 60.664 140.681 60.955 230.562 190.378 10.760 130.903 70.738 190.801 170.673 200.907 300.877 90.745 8
Zetong Yang*, Li Jiang*, Yanan Sun, Bernt Schiele, Jiaya Jia: A Unified Query-based Paradigm for Point Cloud Understanding. CVPR 2022
LRPNet0.742 210.816 300.806 190.807 310.752 150.828 360.575 40.839 240.699 40.637 230.954 290.520 330.320 180.755 170.834 350.760 130.772 340.676 170.915 270.862 190.717 21
SAT0.742 210.860 180.765 420.819 230.769 70.848 190.533 170.829 270.663 150.631 250.955 230.586 90.274 390.753 180.896 100.729 210.760 440.666 220.921 220.855 260.733 12
LargeKernel3D0.739 230.909 70.820 90.806 330.740 210.852 140.545 130.826 280.594 460.643 180.955 230.541 250.263 490.723 260.858 240.775 90.767 380.678 140.933 150.848 310.694 30
Yukang Chen*, Jianhui Liu*, Xiangyu Zhang, Xiaojuan Qi, Jiaya Jia: LargeKernel3D: Scaling up Kernels in 3D Sparse CNNs. CVPR 2023
RPN0.736 240.776 430.790 300.851 60.754 140.854 110.491 390.866 130.596 450.686 40.955 230.536 260.342 90.624 430.869 180.787 60.802 130.628 340.927 190.875 130.704 27
MinkowskiNetpermissive0.736 240.859 190.818 120.832 190.709 290.840 250.521 230.853 170.660 170.643 180.951 380.544 240.286 320.731 240.893 110.675 460.772 340.683 130.874 580.852 290.727 16
C. Choy, J. Gwak, S. Savarese: 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks. CVPR 2019
IPCA0.731 260.890 110.837 30.864 20.726 260.873 30.530 200.824 300.489 790.647 150.978 20.609 30.336 110.624 430.733 520.758 140.776 320.570 580.949 40.877 90.728 14
PointTransformer++0.725 270.727 680.811 160.819 230.765 80.841 240.502 290.814 350.621 310.623 280.955 230.556 210.284 330.620 450.866 190.781 70.757 470.648 250.932 170.862 190.709 25
SparseConvNet0.725 270.647 830.821 80.846 100.721 270.869 40.533 170.754 500.603 410.614 300.955 230.572 150.325 160.710 270.870 170.724 250.823 20.628 340.934 140.865 180.683 33
MatchingNet0.724 290.812 320.812 150.810 290.735 230.834 310.495 360.860 160.572 540.602 370.954 290.512 350.280 360.757 150.845 330.725 240.780 300.606 440.937 110.851 300.700 29
INS-Conv-semantic0.717 300.751 560.759 450.812 270.704 300.868 50.537 160.842 220.609 370.608 330.953 320.534 280.293 280.616 460.864 200.719 290.793 250.640 290.933 150.845 350.663 38
PointMetaBase0.714 310.835 240.785 320.821 210.684 350.846 210.531 190.865 140.614 320.596 410.953 320.500 380.246 550.674 290.888 130.692 390.764 400.624 360.849 730.844 360.675 35
contrastBoundarypermissive0.705 320.769 500.775 370.809 300.687 340.820 450.439 650.812 360.661 160.591 430.945 560.515 340.171 830.633 400.856 250.720 270.796 210.668 210.889 450.847 320.689 31
Liyao Tang, Yibing Zhan, Zhe Chen, Baosheng Yu, Dacheng Tao: Contrastive Boundary Learning for Point Cloud Segmentation. CVPR 2022
ClickSeg_Semantic0.703 330.774 450.800 220.793 390.760 90.847 200.471 430.802 390.463 860.634 240.968 90.491 410.271 430.726 250.910 50.706 340.815 50.551 690.878 530.833 370.570 69
RFCR0.702 340.889 120.745 540.813 260.672 380.818 490.493 370.815 340.623 290.610 310.947 500.470 490.249 540.594 490.848 320.705 350.779 310.646 260.892 430.823 430.611 52
Jingyu Gong, Jiachen Xu, Xin Tan, Haichuan Song, Yanyun Qu, Yuan Xie, Lizhuang Ma: Omni-Supervised Point Cloud Segmentation via Gradual Receptive Field Component Reasoning. CVPR 2021
One Thing One Click0.701 350.825 280.796 250.723 550.716 280.832 320.433 670.816 320.634 260.609 320.969 70.418 740.344 80.559 610.833 360.715 310.808 110.560 630.902 350.847 320.680 34
JSENetpermissive0.699 360.881 140.762 430.821 210.667 390.800 620.522 220.792 420.613 330.607 340.935 760.492 400.205 700.576 540.853 290.691 400.758 460.652 240.872 610.828 400.649 42
Zeyu Hu, Mingmin Zhen, Xuyang Bai, Hongbo Fu, Chiew-Lan Tai: JSENet: Joint Semantic Segmentation and Edge Detection Network for 3D Point Clouds. ECCV 2020
One-Thing-One-Click0.693 370.743 590.794 270.655 790.684 350.822 420.497 350.719 600.622 300.617 290.977 50.447 610.339 100.750 210.664 680.703 360.790 270.596 480.946 60.855 260.647 43
Zhengzhe Liu, Xiaojuan Qi, Chi-Wing Fu: One Thing One Click: A Self-Training Approach for Weakly Supervised 3D Semantic Segmentation. CVPR 2021
PicassoNet-IIpermissive0.692 380.732 640.772 380.786 400.677 370.866 60.517 240.848 190.509 710.626 260.952 360.536 260.225 610.545 670.704 590.689 430.810 100.564 620.903 340.854 280.729 13
Huan Lei, Naveed Akhtar, Mubarak Shah, and Ajmal Mian: Geometric feature learning for 3D meshes.
Feature_GeometricNetpermissive0.690 390.884 130.754 490.795 380.647 450.818 490.422 690.802 390.612 340.604 350.945 560.462 520.189 780.563 600.853 290.726 230.765 390.632 320.904 320.821 460.606 56
Kangcheng Liu, Ben M. Chen: https://arxiv.org/abs/2012.09439. arXiv Preprint
FusionNet0.688 400.704 730.741 580.754 520.656 410.829 340.501 300.741 550.609 370.548 500.950 420.522 320.371 20.633 400.756 470.715 310.771 360.623 370.861 690.814 480.658 39
Feihu Zhang, Jin Fang, Benjamin Wah, Philip Torr: Deep FusionNet for Point Cloud Semantic Segmentation. ECCV 2020
Feature-Geometry Netpermissive0.685 410.866 160.748 510.819 230.645 470.794 650.450 540.802 390.587 480.604 350.945 560.464 510.201 730.554 630.840 340.723 260.732 560.602 460.907 300.822 450.603 59
KP-FCNN0.684 420.847 220.758 470.784 420.647 450.814 520.473 420.772 450.605 390.594 420.935 760.450 590.181 810.587 500.805 420.690 410.785 290.614 400.882 490.819 470.632 48
H. Thomas, C. Qi, J. Deschaud, B. Marcotegui, F. Goulette, L. Guibas: KPConv: Flexible and Deformable Convolution for Point Clouds. ICCV 2019
DGNet0.684 420.712 720.784 330.782 440.658 400.835 300.499 340.823 310.641 230.597 400.950 420.487 420.281 350.575 550.619 710.647 590.764 400.620 390.871 640.846 340.688 32
VACNN++0.684 420.728 670.757 480.776 450.690 320.804 590.464 480.816 320.577 530.587 440.945 560.508 370.276 380.671 300.710 570.663 510.750 500.589 530.881 500.832 390.653 41
Superpoint Network0.683 450.851 210.728 620.800 370.653 430.806 570.468 450.804 370.572 540.602 370.946 530.453 580.239 580.519 720.822 370.689 430.762 430.595 500.895 410.827 410.630 49
PointContrast_LA_SEM0.683 450.757 540.784 330.786 400.639 490.824 400.408 720.775 440.604 400.541 520.934 800.532 290.269 450.552 640.777 450.645 620.793 250.640 290.913 280.824 420.671 36
VI-PointConv0.676 470.770 490.754 490.783 430.621 530.814 520.552 100.758 480.571 560.557 480.954 290.529 300.268 470.530 700.682 630.675 460.719 590.603 450.888 460.833 370.665 37
Xingyi Li, Wenxuan Wu, Xiaoli Z. Fern, Li Fuxin: The Devils in the Point Clouds: Studying the Robustness of Point Cloud Convolutions.
ROSMRF3D0.673 480.789 360.748 510.763 500.635 510.814 520.407 740.747 520.581 520.573 450.950 420.484 430.271 430.607 470.754 480.649 560.774 330.596 480.883 480.823 430.606 56
SALANet0.670 490.816 300.770 400.768 470.652 440.807 560.451 510.747 520.659 180.545 510.924 860.473 480.149 930.571 570.811 410.635 650.746 510.623 370.892 430.794 600.570 69
PointConvpermissive0.666 500.781 400.759 450.699 640.644 480.822 420.475 410.779 430.564 590.504 680.953 320.428 680.203 720.586 520.754 480.661 520.753 480.588 540.902 350.813 500.642 44
Wenxuan Wu, Zhongang Qi, Li Fuxin: PointConv: Deep Convolutional Networks on 3D Point Clouds. CVPR 2019
PointASNLpermissive0.666 500.703 740.781 350.751 540.655 420.830 330.471 430.769 460.474 820.537 540.951 380.475 470.279 370.635 380.698 620.675 460.751 490.553 680.816 800.806 520.703 28
Xu Yan, Chaoda Zheng, Zhen Li, Sheng Wang, Shuguang Cui: PointASNL: Robust Point Clouds Processing using Nonlocal Neural Networks with Adaptive Sampling. CVPR 2020
PPCNN++permissive0.663 520.746 570.708 660.722 560.638 500.820 450.451 510.566 870.599 430.541 520.950 420.510 360.313 200.648 350.819 390.616 700.682 750.590 520.869 650.810 510.656 40
Pyunghwan Ahn, Juyoung Yang, Eojindl Yi, Chanho Lee, Junmo Kim: Projection-based Point Convolution for Efficient Point Cloud Segmentation. IEEE Access
DCM-Net0.658 530.778 410.702 690.806 330.619 540.813 550.468 450.693 680.494 740.524 600.941 670.449 600.298 260.510 740.821 380.675 460.727 580.568 600.826 780.803 540.637 46
Jonas Schult*, Francis Engelmann*, Theodora Kontogianni, Bastian Leibe: DualConvMesh-Net: Joint Geodesic and Euclidean Convolutions on 3D Meshes. CVPR 2020 [Oral]
HPGCNN0.656 540.698 750.743 560.650 800.564 710.820 450.505 280.758 480.631 270.479 730.945 560.480 450.226 590.572 560.774 460.690 410.735 540.614 400.853 720.776 750.597 62
Jisheng Dang, Qingyong Hu, Yulan Guo, Jun Yang: HPGCNN.
SAFNet-segpermissive0.654 550.752 550.734 600.664 770.583 660.815 510.399 760.754 500.639 240.535 560.942 650.470 490.309 220.665 310.539 770.650 550.708 640.635 310.857 710.793 620.642 44
Linqing Zhao, Jiwen Lu, Jie Zhou: Similarity-Aware Fusion Network for 3D Semantic Segmentation. IROS 2021
RandLA-Netpermissive0.645 560.778 410.731 610.699 640.577 670.829 340.446 560.736 560.477 810.523 620.945 560.454 560.269 450.484 810.749 510.618 680.738 520.599 470.827 770.792 650.621 51
MVPNetpermissive0.641 570.831 250.715 640.671 740.590 620.781 710.394 780.679 700.642 220.553 490.937 730.462 520.256 510.649 340.406 900.626 660.691 720.666 220.877 540.792 650.608 55
Maximilian Jaritz, Jiayuan Gu, Hao Su: Multi-view PointNet for 3D Scene Understanding. GMDL Workshop, ICCV 2019
PointConv-SFPN0.641 570.776 430.703 680.721 570.557 740.826 370.451 510.672 720.563 600.483 720.943 640.425 710.162 880.644 360.726 530.659 530.709 630.572 570.875 560.786 700.559 75
PointMRNet0.640 590.717 710.701 700.692 670.576 680.801 610.467 470.716 610.563 600.459 780.953 320.429 670.169 850.581 530.854 280.605 710.710 610.550 700.894 420.793 620.575 67
FPConvpermissive0.639 600.785 380.760 440.713 620.603 570.798 630.392 790.534 920.603 410.524 600.948 480.457 540.250 530.538 680.723 550.598 750.696 700.614 400.872 610.799 550.567 72
Yiqun Lin, Zizheng Yan, Haibin Huang, Dong Du, Ligang Liu, Shuguang Cui, Xiaoguang Han: FPConv: Learning Local Flattening for Point Convolution. CVPR 2020
PD-Net0.638 610.797 340.769 410.641 850.590 620.820 450.461 490.537 910.637 250.536 550.947 500.388 820.206 690.656 320.668 660.647 590.732 560.585 550.868 660.793 620.473 94
PointSPNet0.637 620.734 630.692 770.714 610.576 680.797 640.446 560.743 540.598 440.437 830.942 650.403 780.150 920.626 420.800 440.649 560.697 690.557 660.846 740.777 740.563 73
SConv0.636 630.830 260.697 730.752 530.572 700.780 730.445 580.716 610.529 650.530 570.951 380.446 620.170 840.507 760.666 670.636 640.682 750.541 750.886 470.799 550.594 63
Supervoxel-CNN0.635 640.656 810.711 650.719 580.613 550.757 820.444 610.765 470.534 640.566 460.928 840.478 460.272 410.636 370.531 790.664 500.645 850.508 830.864 680.792 650.611 52
joint point-basedpermissive0.634 650.614 880.778 360.667 760.633 520.825 380.420 700.804 370.467 840.561 470.951 380.494 390.291 290.566 580.458 850.579 820.764 400.559 650.838 750.814 480.598 61
Hung-Yueh Chiang, Yen-Liang Lin, Yueh-Cheng Liu, Winston H. Hsu: A Unified Point-Based Framework for 3D Segmentation. 3DV 2019
PointMTL0.632 660.731 650.688 800.675 710.591 610.784 700.444 610.565 880.610 350.492 700.949 460.456 550.254 520.587 500.706 580.599 740.665 810.612 430.868 660.791 690.579 66
3DSM_DMMF0.631 670.626 850.745 540.801 360.607 560.751 830.506 270.729 590.565 580.491 710.866 1000.434 630.197 760.595 480.630 700.709 330.705 660.560 630.875 560.740 850.491 89
APCF-Net0.631 670.742 600.687 820.672 720.557 740.792 680.408 720.665 730.545 620.508 650.952 360.428 680.186 790.634 390.702 600.620 670.706 650.555 670.873 590.798 570.581 65
Haojia Lin: Adaptive Pyramid Context Fusion for Point Cloud Perception. GRSL
PointNet2-SFPN0.631 670.771 470.692 770.672 720.524 790.837 270.440 640.706 660.538 630.446 800.944 620.421 730.219 640.552 640.751 500.591 780.737 530.543 740.901 370.768 770.557 76
FusionAwareConv0.630 700.604 900.741 580.766 490.590 620.747 840.501 300.734 570.503 730.527 580.919 900.454 560.323 170.550 660.420 890.678 450.688 730.544 720.896 400.795 590.627 50
Jiazhao Zhang, Chenyang Zhu, Lintao Zheng, Kai Xu: Fusion-Aware Point Convolution for Online Semantic 3D Scene Segmentation. CVPR 2020
DenSeR0.628 710.800 330.625 920.719 580.545 770.806 570.445 580.597 820.448 890.519 630.938 720.481 440.328 150.489 800.499 840.657 540.759 450.592 510.881 500.797 580.634 47
SegGroup_sempermissive0.627 720.818 290.747 530.701 630.602 580.764 790.385 830.629 790.490 770.508 650.931 830.409 760.201 730.564 590.725 540.618 680.692 710.539 760.873 590.794 600.548 79
An Tao, Yueqi Duan, Yi Wei, Jiwen Lu, Jie Zhou: SegGroup: Seg-Level Supervision for 3D Instance and Semantic Segmentation. TIP 2022
SIConv0.625 730.830 260.694 750.757 510.563 720.772 770.448 550.647 760.520 670.509 640.949 460.431 660.191 770.496 780.614 720.647 590.672 790.535 780.876 550.783 710.571 68
HPEIN0.618 740.729 660.668 830.647 820.597 600.766 780.414 710.680 690.520 670.525 590.946 530.432 640.215 660.493 790.599 730.638 630.617 900.570 580.897 390.806 520.605 58
Li Jiang, Hengshuang Zhao, Shu Liu, Xiaoyong Shen, Chi-Wing Fu, Jiaya Jia: Hierarchical Point-Edge Interaction Network for Point Cloud Semantic Segmentation. ICCV 2019
SPH3D-GCNpermissive0.610 750.858 200.772 380.489 970.532 780.792 680.404 750.643 780.570 570.507 670.935 760.414 750.046 1020.510 740.702 600.602 730.705 660.549 710.859 700.773 760.534 82
Huan Lei, Naveed Akhtar, and Ajmal Mian: Spherical Kernel for Efficient Graph Convolution on 3D Point Clouds. TPAMI 2020
AttAN0.609 760.760 520.667 840.649 810.521 800.793 660.457 500.648 750.528 660.434 850.947 500.401 790.153 910.454 830.721 560.648 580.717 600.536 770.904 320.765 780.485 90
Gege Zhang, Qinghua Ma, Licheng Jiao, Fang Liu, Qigong Sun: AttAN: Attention Adversarial Networks for 3D Point Cloud Semantic Segmentation. IJCAI 2020
wsss-transformer0.600 770.634 840.743 560.697 660.601 590.781 710.437 660.585 850.493 750.446 800.933 810.394 800.011 1040.654 330.661 690.603 720.733 550.526 790.832 760.761 800.480 91
dtc_net0.596 780.683 760.725 630.715 600.549 760.803 600.444 610.647 760.493 750.495 690.941 670.409 760.000 1060.424 880.544 760.598 750.703 680.522 800.912 290.792 650.520 85
LAP-D0.594 790.720 690.692 770.637 860.456 890.773 760.391 810.730 580.587 480.445 820.940 700.381 830.288 300.434 860.453 870.591 780.649 830.581 560.777 840.749 840.610 54
DPC0.592 800.720 690.700 710.602 900.480 850.762 810.380 840.713 640.585 510.437 830.940 700.369 850.288 300.434 860.509 830.590 800.639 880.567 610.772 850.755 820.592 64
Francis Engelmann, Theodora Kontogianni, Bastian Leibe: Dilated Point Convolutions: On the Receptive Field Size of Point Convolutions on 3D Point Clouds. ICRA 2020
CCRFNet0.589 810.766 510.659 870.683 690.470 880.740 860.387 820.620 810.490 770.476 740.922 880.355 880.245 560.511 730.511 820.571 830.643 860.493 870.872 610.762 790.600 60
ROSMRF0.580 820.772 460.707 670.681 700.563 720.764 790.362 860.515 930.465 850.465 770.936 750.427 700.207 680.438 840.577 740.536 860.675 780.486 880.723 910.779 720.524 84
SD-DETR0.576 830.746 570.609 960.445 1010.517 810.643 970.366 850.714 630.456 870.468 760.870 990.432 640.264 480.558 620.674 640.586 810.688 730.482 890.739 890.733 870.537 81
SQN_0.1%0.569 840.676 780.696 740.657 780.497 820.779 740.424 680.548 890.515 690.376 900.902 970.422 720.357 50.379 910.456 860.596 770.659 820.544 720.685 940.665 980.556 77
TextureNetpermissive0.566 850.672 800.664 850.671 740.494 830.719 870.445 580.678 710.411 950.396 880.935 760.356 870.225 610.412 890.535 780.565 840.636 890.464 910.794 830.680 950.568 71
Jingwei Huang, Haotian Zhang, Li Yi, Thomas Funkhouser, Matthias Niessner, Leonidas Guibas: TextureNet: Consistent Local Parametrizations for Learning from High-Resolution Signals on Meshes. CVPR
DVVNet0.562 860.648 820.700 710.770 460.586 650.687 910.333 900.650 740.514 700.475 750.906 940.359 860.223 630.340 930.442 880.422 970.668 800.501 840.708 920.779 720.534 82
Pointnet++ & Featurepermissive0.557 870.735 620.661 860.686 680.491 840.744 850.392 790.539 900.451 880.375 910.946 530.376 840.205 700.403 900.356 930.553 850.643 860.497 850.824 790.756 810.515 86
GMLPs0.538 880.495 980.693 760.647 820.471 870.793 660.300 930.477 940.505 720.358 920.903 960.327 910.081 990.472 820.529 800.448 950.710 610.509 810.746 870.737 860.554 78
PanopticFusion-label0.529 890.491 990.688 800.604 890.386 940.632 980.225 1030.705 670.434 920.293 980.815 1010.348 890.241 570.499 770.669 650.507 880.649 830.442 970.796 820.602 1010.561 74
Gaku Narita, Takashi Seno, Tomoya Ishikawa, Yohsuke Kaji: PanopticFusion: Online Volumetric Semantic Mapping at the Level of Stuff and Things. IROS 2019 (to appear)
subcloud_weak0.516 900.676 780.591 990.609 870.442 900.774 750.335 890.597 820.422 940.357 930.932 820.341 900.094 980.298 950.528 810.473 930.676 770.495 860.602 1000.721 900.349 101
Online SegFusion0.515 910.607 890.644 900.579 920.434 910.630 990.353 870.628 800.440 900.410 860.762 1040.307 930.167 860.520 710.403 910.516 870.565 930.447 950.678 950.701 920.514 87
Davide Menini, Suryansh Kumar, Martin R. Oswald, Erik Sandstroem, Cristian Sminchisescu, Luc van Gool: A Real-Time Learning Framework for Joint 3D Reconstruction and Semantic Segmentation. Robotics and Automation Letters Submission
3DMV, FTSDF0.501 920.558 940.608 970.424 1030.478 860.690 900.246 990.586 840.468 830.450 790.911 920.394 800.160 890.438 840.212 1000.432 960.541 980.475 900.742 880.727 880.477 92
PCNN0.498 930.559 930.644 900.560 940.420 930.711 890.229 1010.414 950.436 910.352 940.941 670.324 920.155 900.238 1000.387 920.493 890.529 990.509 810.813 810.751 830.504 88
3DMV0.484 940.484 1000.538 1010.643 840.424 920.606 1020.310 910.574 860.433 930.378 890.796 1020.301 940.214 670.537 690.208 1010.472 940.507 1020.413 1000.693 930.602 1010.539 80
Angela Dai, Matthias Niessner: 3DMV: Joint 3D-Multi-View Prediction for 3D Semantic Scene Segmentation. ECCV'18
PointCNN with RGBpermissive0.458 950.577 920.611 950.356 1050.321 1020.715 880.299 950.376 990.328 1020.319 960.944 620.285 960.164 870.216 1030.229 980.484 910.545 970.456 930.755 860.709 910.475 93
Yangyan Li, Rui Bu, Mingchao Sun, Baoquan Chen: PointCNN. NeurIPS 2018
FCPNpermissive0.447 960.679 770.604 980.578 930.380 950.682 920.291 960.106 1050.483 800.258 1030.920 890.258 1000.025 1030.231 1020.325 940.480 920.560 950.463 920.725 900.666 970.231 105
Dario Rethage, Johanna Wald, Jürgen Sturm, Nassir Navab, Federico Tombari: Fully-Convolutional Point Networks for Large-Scale Point Clouds. ECCV 2018
DGCNN_reproducecopyleft0.446 970.474 1010.623 930.463 990.366 970.651 950.310 910.389 980.349 1000.330 950.937 730.271 980.126 950.285 960.224 990.350 1020.577 920.445 960.625 980.723 890.394 97
Yue Wang, Yongbin Sun, Ziwei Liu, Sanjay E. Sarma, Michael M. Bronstein, Justin M. Solomon: Dynamic Graph CNN for Learning on Point Clouds. TOG 2019
PNET20.442 980.548 950.548 1000.597 910.363 980.628 1000.300 930.292 1000.374 970.307 970.881 980.268 990.186 790.238 1000.204 1020.407 980.506 1030.449 940.667 960.620 1000.462 95
SurfaceConvPF0.442 980.505 970.622 940.380 1040.342 1000.654 940.227 1020.397 970.367 980.276 1000.924 860.240 1010.198 750.359 920.262 960.366 990.581 910.435 980.640 970.668 960.398 96
Hao Pan, Shilin Liu, Yang Liu, Xin Tong: Convolutional Neural Networks on 3D Surfaces Using Parallel Frames.
Tangent Convolutionspermissive0.438 1000.437 1030.646 890.474 980.369 960.645 960.353 870.258 1020.282 1040.279 990.918 910.298 950.147 940.283 970.294 950.487 900.562 940.427 990.619 990.633 990.352 100
Maxim Tatarchenko, Jaesik Park, Vladlen Koltun, Qian-Yi Zhou: Tangent Convolutions for Dense Prediction in 3D. CVPR 2018
3DWSSS0.425 1010.525 960.647 880.522 950.324 1010.488 1050.077 1060.712 650.353 990.401 870.636 1060.281 970.176 820.340 930.565 750.175 1060.551 960.398 1010.370 1060.602 1010.361 99
SPLAT Netcopyleft0.393 1020.472 1020.511 1020.606 880.311 1030.656 930.245 1000.405 960.328 1020.197 1040.927 850.227 1030.000 1060.001 1070.249 970.271 1050.510 1000.383 1030.593 1010.699 930.267 103
Hang Su, Varun Jampani, Deqing Sun, Subhransu Maji, Evangelos Kalogerakis, Ming-Hsuan Yang, Jan Kautz: SPLATNet: Sparse Lattice Networks for Point Cloud Processing. CVPR 2018
ScanNet+FTSDF0.383 1030.297 1050.491 1030.432 1020.358 990.612 1010.274 970.116 1040.411 950.265 1010.904 950.229 1020.079 1000.250 980.185 1030.320 1030.510 1000.385 1020.548 1020.597 1040.394 97
PointNet++permissive0.339 1040.584 910.478 1040.458 1000.256 1050.360 1060.250 980.247 1030.278 1050.261 1020.677 1050.183 1040.117 960.212 1040.145 1050.364 1000.346 1060.232 1060.548 1020.523 1050.252 104
Charles R. Qi, Li Yi, Hao Su, Leonidas J. Guibas: PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space.
SSC-UNetpermissive0.308 1050.353 1040.290 1060.278 1060.166 1060.553 1030.169 1050.286 1010.147 1060.148 1060.908 930.182 1050.064 1010.023 1060.018 1070.354 1010.363 1040.345 1040.546 1040.685 940.278 102
ScanNetpermissive0.306 1060.203 1060.366 1050.501 960.311 1030.524 1040.211 1040.002 1070.342 1010.189 1050.786 1030.145 1060.102 970.245 990.152 1040.318 1040.348 1050.300 1050.460 1050.437 1060.182 106
Angela Dai, Angel X. Chang, Manolis Savva, Maciej Halber, Thomas Funkhouser, Matthias Nießner: ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes. CVPR'17
ERROR0.054 1070.000 1070.041 1070.172 1070.030 1070.062 1070.001 1070.035 1060.004 1070.051 1070.143 1070.019 1070.003 1050.041 1050.050 1060.003 1070.054 1070.018 1070.005 1070.264 1070.082 107
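Each row above runs the method name (sometimes fused with a license tag) straight into a sequence of score and rank pairs, so "0.746 170.771 47" reads as score 0.746 at rank 17 followed by score 0.771 at rank 47. The following is a minimal sketch of how such a row could be split programmatically. It assumes that every score has the form 0.xxx or 1.000, that every rank is a plain integer, and that "permissive" and "copyleft" are the only license tags; the function name parse_row is hypothetical and not part of any benchmark tooling.

```python
import re

# Assumption: a score is one digit (0 or 1), a dot, and three decimals,
# followed by a space and an integer rank. The lazy rank match plus the
# lookahead stops the rank exactly where the next score (or line end) begins.
PAIR = re.compile(r"([01]\.\d{3}) (\d+?)(?=[01]\.\d{3} |$)")
FIRST_SCORE = re.compile(r"[01]\.\d{3} ")
LICENSE_TAGS = ("permissive", "copyleft")  # assumed tag vocabulary


def parse_row(line):
    """Split one flattened leaderboard row into (name, license_tag, pairs)."""
    line = line.strip()
    first = FIRST_SCORE.search(line)
    if first is None:
        # No score found, e.g. a citation line: return it unchanged.
        return line, None, []
    name = line[:first.start()]
    tag = None
    for candidate in LICENSE_TAGS:
        if name.endswith(candidate):
            name, tag = name[:-len(candidate)], candidate
            break
    tail = line[first.start():]
    pairs = [(float(score), int(rank)) for score, rank in PAIR.findall(tail)]
    return name, tag, pairs


if __name__ == "__main__":
    print(parse_row("Virtual MVFusion0.746 170.771 470.819 10"))
```

The split is unambiguous under these assumptions because a score always has exactly one digit before the decimal point, so any extra leading digits must belong to the preceding rank. A method name that itself ends in "permissive" or "copyleft" would be mis-split, which is why the tag list is flagged as an assumption.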


This table lists the benchmark results for the 3D semantic instance scenario.




Method | Info | avg ap 50% | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | window
Queryformer0.787 11.000 10.933 10.601 350.754 10.886 50.558 20.661 260.767 30.665 50.716 30.639 120.808 31.000 10.844 10.897 20.804 21.000 10.624 3
MAFT0.786 21.000 10.894 50.807 130.694 60.893 30.486 50.674 220.740 50.786 10.704 50.727 10.739 71.000 10.707 100.849 50.756 111.000 10.685 1
Mask3D0.780 31.000 10.786 280.716 260.696 50.885 60.500 40.714 180.810 20.672 40.715 40.679 80.809 21.000 10.831 20.833 90.787 41.000 10.602 7
Jonas Schult, Francis Engelmann, Alexander Hermans, Or Litany, Siyu Tang, Bastian Leibe: Mask3D for 3D Semantic Instance Segmentation. ICRA 2023
SPFormerpermissive0.770 40.903 410.903 20.806 140.609 180.886 40.568 10.815 60.705 80.711 20.655 70.652 110.685 121.000 10.789 40.809 150.776 71.000 10.583 12
Sun Jiahao, Qing Chunmei, Tan Junpeng, Xu Xiangmin: Superpoint Transformer for 3D Scene Instance Segmentation. AAAI 2023 [Oral]
SoftGroup++0.769 51.000 10.803 210.937 10.684 70.865 80.213 210.870 20.664 100.571 110.758 10.702 50.807 41.000 10.653 170.902 10.792 31.000 10.626 2
ISBNetpermissive0.763 61.000 10.873 60.717 250.666 100.858 120.508 30.667 240.764 40.643 60.676 60.688 70.825 11.000 10.773 50.741 280.777 61.000 10.556 18
Tuan Duc Ngo, Binh-Son Hua, Khoi Nguyen: ISBNet: a 3D Point Cloud Instance Segmentation Network with Instance-aware Sampling and Box-aware Dynamic Convolution. CVPR 2023
SoftGrouppermissive0.761 71.000 10.808 180.845 80.716 20.862 100.243 180.824 40.655 120.620 70.734 20.699 60.791 60.981 260.716 80.844 60.769 81.000 10.594 10
Thang Vu, Kookhoi Kim, Tung M. Luu, Xuan Thanh Nguyen, Chang D. Yoo: SoftGroup for 3D Instance Segmentation on Point Clouds. CVPR 2022 [Oral]
TD3D0.751 81.000 10.774 290.867 70.621 140.934 10.404 80.706 190.812 10.605 90.633 120.626 130.690 111.000 10.640 190.820 120.777 51.000 10.612 5
PBNetpermissive0.747 91.000 10.818 140.837 100.713 30.844 130.457 70.647 290.711 70.614 80.617 140.657 100.650 141.000 10.692 110.822 110.765 101.000 10.595 9
W. Zhao, Y. Yan, C. Yang, J. Ye, X. Yang, K. Huang: Divide and Conquer: 3D Instance Segmentation With Point-Wise Binarization. ICCV 2023
GraphCut0.732 101.000 10.788 260.724 240.642 120.859 110.248 170.787 110.618 150.596 100.653 90.722 30.583 321.000 10.766 60.861 30.825 11.000 10.504 24
IPCA-Inst0.731 111.000 10.788 270.884 60.698 40.788 280.252 160.760 130.646 130.511 190.637 110.665 90.804 51.000 10.644 180.778 180.747 131.000 10.561 16
TopoSeg0.725 121.000 10.806 200.933 20.668 90.758 310.272 150.734 170.630 140.549 150.654 80.606 140.697 100.966 280.612 230.839 70.754 121.000 10.573 13
DKNet0.718 131.000 10.814 150.782 170.619 150.872 70.224 190.751 150.569 190.677 30.585 170.724 20.633 240.981 260.515 330.819 130.736 141.000 10.617 4
Yizheng Wu, Min Shi, Shuaiyuan Du, Hao Lu, Zhiguo Cao, Weicai Zhong: 3D Instances as 1D Kernels. ECCV 2022
SSEC0.707 141.000 10.850 80.924 30.648 110.747 340.162 230.862 30.572 180.520 170.624 130.549 170.649 221.000 10.560 280.706 340.768 91.000 10.591 11
HAISpermissive0.699 151.000 10.849 90.820 110.675 80.808 220.279 130.757 140.465 240.517 180.596 150.559 160.600 261.000 10.654 160.767 200.676 180.994 360.560 17
Shaoyu Chen, Jiemin Fang, Qian Zhang, Wenyu Liu, Xinggang Wang: Hierarchical Aggregation for 3D Instance Segmentation. ICCV 2021
SSTNetpermissive0.698 161.000 10.697 450.888 50.556 240.803 230.387 90.626 310.417 280.556 140.585 180.702 40.600 261.000 10.824 30.720 330.692 161.000 10.509 23
Zhihao Liang, Zhihao Li, Songcen Xu, Mingkui Tan, Kui Jia: Instance Segmentation in 3D Scenes using Semantic Superpoint Tree Networks. ICCV 2021
DualGroup0.694 171.000 10.799 230.811 120.622 130.817 170.376 100.805 90.590 170.487 220.568 210.525 210.650 140.835 400.600 240.829 100.655 201.000 10.526 20
SphereSeg0.680 181.000 10.856 70.744 230.618 160.893 20.151 240.651 280.713 60.537 160.579 200.430 300.651 131.000 10.389 430.744 270.697 150.991 380.601 8
Box2Mask0.677 191.000 10.847 100.771 190.509 330.816 180.277 140.558 380.482 210.562 130.640 100.448 260.700 81.000 10.666 120.852 40.578 320.997 310.488 28
Julian Chibane, Francis Engelmann, Tuan Anh Tran, Gerard Pons-Moll: Box2Mask: Weakly Supervised 3D Semantic Instance Segmentation Using Bounding Boxes. ECCV 2022
OccuSeg+instance0.672 201.000 10.758 370.682 290.576 220.842 140.477 60.504 430.524 200.567 120.585 190.451 250.557 341.000 10.751 70.797 160.563 351.000 10.467 32
Lei Han, Tian Zheng, Lan Xu, Lu Fang: OccuSeg: Occupancy-aware 3D Instance Segmentation. CVPR 2020
Mask-Group0.664 211.000 10.822 130.764 220.616 170.815 190.139 280.694 210.597 160.459 260.566 220.599 150.600 260.516 500.715 90.819 140.635 241.000 10.603 6
Min Zhong, Xinghao Chen, Xiaokang Chen, Gang Zeng, Yunhe Wang: MaskGroup: Hierarchical Point Grouping and Masking for 3D Instance Segmentation. ICME 2022
INS-Conv-instance0.657 221.000 10.760 350.667 310.581 200.863 90.323 110.655 270.477 220.473 240.549 240.432 290.650 141.000 10.655 150.738 290.585 310.944 420.472 31
CSC-Pretrained0.648 231.000 10.810 160.768 200.523 310.813 200.143 270.819 50.389 310.422 350.511 280.443 270.650 141.000 10.624 210.732 300.634 251.000 10.375 39
PE0.645 241.000 10.773 310.798 160.538 260.786 290.088 360.799 100.350 350.435 330.547 250.545 180.646 230.933 300.562 270.761 230.556 400.997 310.501 26
Biao Zhang, Peter Wonka: Point Cloud Instance Segmentation using Probabilistic Embeddings. CVPR 2021
RPGN0.643 251.000 10.758 360.582 410.539 250.826 160.046 400.765 120.372 330.436 320.588 160.539 200.650 141.000 10.577 250.750 250.653 220.997 310.495 27
Shichao Dong, Guosheng Lin, Tzu-Yi Hung: Learning Regional Purity for Instance Segmentation on 3D Point Clouds. ECCV 2022
Dyco3Dcopyleft0.641 261.000 10.841 110.893 40.531 280.802 240.115 330.588 360.448 250.438 300.537 270.430 310.550 350.857 320.534 310.764 220.657 190.987 390.568 14
Tong He, Chunhua Shen, Anton van den Hengel: DyCo3D: Robust Instance Segmentation of 3D Point Clouds through Dynamic Convolution. CVPR 2021
GICN0.638 271.000 10.895 40.800 150.480 370.676 390.144 260.737 160.354 340.447 270.400 410.365 360.700 81.000 10.569 260.836 80.599 271.000 10.473 30
PointGroup0.636 281.000 10.765 320.624 330.505 350.797 250.116 320.696 200.384 320.441 280.559 230.476 230.596 291.000 10.666 120.756 240.556 390.997 310.513 22
Li Jiang, Hengshuang Zhao, Shaoshuai Shi, Shu Liu, Chi-Wing Fu, Jiaya Jia: PointGroup: Dual-Set Point Grouping for 3D Instance Segmentation. CVPR 2020 [oral]
DD-UNet+Group0.635 290.667 420.797 250.714 270.562 230.774 300.146 250.810 80.429 270.476 230.546 260.399 330.633 241.000 10.632 200.722 320.609 261.000 10.514 21
H. Liu, R. Liu, K. Yang, J. Zhang, K. Peng, R. Stiefelhagen: HIDA: Towards Holistic Indoor Understanding for the Visually Impaired via Semantic Instance Segmentation with a Wearable Solid-State LiDAR Sensor. ICCVW 2021
DENet0.629 301.000 10.797 240.608 340.589 190.627 430.219 200.882 10.310 370.402 400.383 430.396 340.650 141.000 10.663 140.543 510.691 171.000 10.568 15
3D-MPA0.611 311.000 10.833 120.765 210.526 300.756 320.136 300.588 360.470 230.438 310.432 370.358 380.650 140.857 320.429 390.765 210.557 381.000 10.430 34
Francis Engelmann, Martin Bokeloh, Alireza Fathi, Bastian Leibe, Matthias Nießner: 3D-MPA: Multi Proposal Aggregation for 3D Semantic Instance Segmentation. CVPR 2020
OSIS0.605 321.000 10.801 220.599 360.535 270.728 360.286 120.436 470.679 90.491 200.433 350.256 400.404 470.857 320.620 220.724 310.510 451.000 10.539 19
AOIA0.601 331.000 10.761 340.687 280.485 360.828 150.008 460.663 250.405 300.405 390.425 380.490 220.596 290.714 430.553 300.779 170.597 280.992 370.424 36
PCJC0.578 341.000 10.810 170.583 400.449 400.813 210.042 410.603 340.341 360.490 210.465 320.410 320.650 140.835 400.264 490.694 380.561 360.889 470.504 25
SSEN0.575 351.000 10.761 330.473 430.477 380.795 260.066 370.529 400.658 110.460 250.461 330.380 350.331 490.859 310.401 420.692 400.653 211.000 10.348 41
Dongsu Zhang, Junha Chun, Sang Kyun Cha, Young Min Kim: Spatial Semantic Embedding Network: Fast 3D Instance Segmentation with Deep Metric Learning. arXiv
RWSeg0.567 360.528 520.708 440.626 320.580 210.745 350.063 380.627 300.240 410.400 410.497 290.464 240.515 361.000 10.475 350.745 260.571 331.000 10.429 35
NeuralBF0.555 370.667 420.896 30.843 90.517 320.751 330.029 420.519 410.414 290.439 290.465 310.000 580.484 380.857 320.287 470.693 390.651 231.000 10.485 29
Weiwei Sun, Daniel Rebain, Renjie Liao, Vladimir Tankovich, Soroosh Yazdani, Kwang Moo Yi, Andrea Tagliasacchi: NeuralBF: Neural Bilateral Filtering for Top-down Instance Segmentation on Point Clouds. WACV 2023
MTML0.549 381.000 10.807 190.588 390.327 450.647 410.004 480.815 70.180 440.418 360.364 450.182 430.445 411.000 10.442 380.688 410.571 341.000 10.396 37
Jean Lahoud, Bernard Ghanem, Marc Pollefeys, Martin R. Oswald: 3D Instance Segmentation via Multi-task Metric Learning. ICCV 2019 [oral]
ClickSeg_Instance0.539 391.000 10.621 480.300 460.530 290.698 370.127 310.533 390.222 420.430 340.400 400.365 360.574 330.938 290.472 360.659 430.543 410.944 420.347 42
One_Thing_One_Clickpermissive0.529 400.667 420.718 400.777 180.399 410.683 380.000 510.669 230.138 470.391 420.374 440.539 190.360 480.641 470.556 290.774 190.593 290.997 310.251 47
Zhengzhe Liu, Xiaojuan Qi, Chi-Wing Fu: One Thing One Click: A Self-Training Approach for Weakly Supervised 3D Semantic Segmentation. CVPR 2021
Sparse R-CNN0.515 411.000 10.538 530.282 470.468 390.790 270.173 220.345 490.429 260.413 380.484 300.176 440.595 310.591 480.522 320.668 420.476 460.986 400.327 43
Occipital-SCS0.512 421.000 10.716 410.509 420.506 340.611 440.092 350.602 350.177 450.346 450.383 420.165 450.442 420.850 390.386 440.618 470.543 420.889 470.389 38
3D-BoNet0.488 431.000 10.672 470.590 380.301 470.484 540.098 340.620 320.306 380.341 460.259 490.125 470.434 440.796 420.402 410.499 530.513 440.909 460.439 33
Bo Yang, Jianan Wang, Ronald Clark, Qingyong Hu, Sen Wang, Andrew Markham, Niki Trigoni: Learning Object Bounding Boxes for 3D Instance Segmentation on Point Clouds. NeurIPS 2019 Spotlight
PanopticFusion-inst0.478 440.667 420.712 430.595 370.259 500.550 500.000 510.613 330.175 460.250 510.434 340.437 280.411 460.857 320.485 340.591 500.267 560.944 420.359 40
Gaku Narita, Takashi Seno, Tomoya Ishikawa, Yohsuke Kaji: PanopticFusion: Online Volumetric Semantic Mapping at the Level of Stuff and Things. IROS 2019 (to appear)
SPG_WSIS0.470 450.667 420.685 460.677 300.372 430.562 480.000 510.482 440.244 400.316 480.298 460.052 530.442 430.857 320.267 480.702 350.559 371.000 10.287 45
SALoss-ResNet0.459 461.000 10.737 390.159 570.259 490.587 460.138 290.475 450.217 430.416 370.408 390.128 460.315 500.714 430.411 400.536 520.590 300.873 500.304 44
Zhidong Liang, Ming Yang, Hao Li, Chunxiang Wang: 3D Instance Embedding Learning With a Structure-Aware Loss Function for Point Cloud Segmentation. IEEE Robotics and Automation Letters (IROS 2020)
MASCpermissive0.447 470.528 520.555 510.381 440.382 420.633 420.002 490.509 420.260 390.361 440.432 360.327 390.451 400.571 490.367 450.639 450.386 470.980 410.276 46
Chen Liu, Yasutaka Furukawa: MASC: Multi-scale Affinity with Sparse Convolution for 3D Instance Segmentation.
SegGroup_inspermissive0.445 480.667 420.773 300.185 540.317 460.656 400.000 510.407 480.134 480.381 430.267 480.217 420.476 390.714 430.452 370.629 460.514 431.000 10.222 50
An Tao, Yueqi Duan, Yi Wei, Jiwen Lu, Jie Zhou: SegGroup: Seg-Level Supervision for 3D Instance and Semantic Segmentation. TIP 2022
3D-SISpermissive0.382 491.000 10.432 550.245 490.190 510.577 470.013 450.263 510.033 540.320 470.240 500.075 490.422 450.857 320.117 530.699 360.271 550.883 490.235 49
Ji Hou, Angela Dai, Matthias Niessner: 3D-SIS: 3D Semantic Instance Segmentation of RGB-D Scans. CVPR 2019
Hier3Dcopyleft0.323 500.667 420.542 520.264 480.157 540.550 490.000 510.205 540.009 550.270 500.218 510.075 490.500 370.688 460.007 590.698 370.301 520.459 560.200 51
Tan: HCFS3D: Hierarchical Coupled Feature Selection Network for 3D Semantic and Instance Segmentation.
UNet-backbone0.319 510.667 420.715 420.233 500.189 520.479 550.008 460.218 520.067 530.201 530.173 520.107 480.123 550.438 510.150 510.615 480.355 480.916 450.093 58
R-PointNet0.306 520.500 540.405 560.311 450.348 440.589 450.054 390.068 570.126 490.283 490.290 470.028 540.219 530.214 540.331 460.396 570.275 530.821 520.245 48
Region-18class0.284 530.250 580.751 380.228 520.270 480.521 510.000 510.468 460.008 570.205 520.127 530.000 580.068 570.070 570.262 500.652 440.323 500.740 530.173 52
SemRegionNet-20cls0.250 540.333 550.613 490.229 510.163 530.493 520.000 510.304 500.107 500.147 550.100 540.052 520.231 510.119 550.039 550.445 550.325 490.654 540.141 54
tmp0.248 550.667 420.437 540.188 530.153 550.491 530.000 510.208 530.094 520.153 540.099 550.057 510.217 540.119 550.039 550.466 540.302 510.640 550.140 55
3D-BEVIS0.248 550.667 420.566 500.076 580.035 590.394 570.027 440.035 580.098 510.099 570.030 580.025 550.098 560.375 530.126 520.604 490.181 570.854 510.171 53
Cathrin Elich, Francis Engelmann, Jonas Schult, Theodora Kontogianni, Bastian Leibe: 3D-BEVIS: Birds-Eye-View Instance Segmentation.
ASIS0.199 570.333 550.253 580.167 560.140 560.438 560.000 510.177 550.008 560.121 560.069 560.004 570.231 520.429 520.036 570.445 560.273 540.333 580.119 57
Sgpn_scannet0.143 580.208 590.390 570.169 550.065 570.275 580.029 430.069 560.000 580.087 580.043 570.014 560.027 590.000 580.112 540.351 580.168 580.438 570.138 56
MaskRCNN 2d->3d Proj0.058 590.333 550.002 590.000 590.053 580.002 590.002 500.021 590.000 580.045 590.024 590.238 410.065 580.000 580.014 580.107 590.020 590.110 590.006 59


This table lists the benchmark results for the 2D semantic label scenario.


Method | Info | avg iou | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | floor | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | wall | window
Virtual MVFusion (R) | | 0.745 (1) | 0.861 (1) | 0.839 (1) | 0.881 (1) | 0.672 (1) | 0.512 (1) | 0.422 (15) | 0.898 (1) | 0.723 (1) | 0.714 (1) | 0.954 (2) | 0.454 (1) | 0.509 (1) | 0.773 (1) | 0.895 (1) | 0.756 (1) | 0.820 (1) | 0.653 (1) | 0.935 (1) | 0.891 (1) | 0.728 (1)
Abhijit Kundu, Xiaoqi Yin, Alireza Fathi, David Ross, Brian Brewington, Thomas Funkhouser, Caroline Pantofaru: Virtual Multi-view Fusion for 3D Semantic Segmentation. ECCV 2020
BPNet_2D | copyleft | 0.670 (2) | 0.822 (3) | 0.795 (3) | 0.836 (2) | 0.659 (2) | 0.481 (2) | 0.451 (11) | 0.769 (3) | 0.656 (3) | 0.567 (3) | 0.931 (3) | 0.395 (4) | 0.390 (4) | 0.700 (3) | 0.534 (3) | 0.689 (9) | 0.770 (2) | 0.574 (3) | 0.865 (6) | 0.831 (3) | 0.675 (4)
Wenbo Hu, Hengshuang Zhao, Li Jiang, Jiaya Jia and Tien-Tsin Wong: Bidirectional Projection Network for Cross Dimension Scene Understanding. CVPR 2021 (Oral)
CU-Hybrid-2D Net | | 0.636 (3) | 0.825 (2) | 0.820 (2) | 0.179 (20) | 0.648 (3) | 0.463 (3) | 0.549 (2) | 0.742 (6) | 0.676 (2) | 0.628 (2) | 0.961 (1) | 0.420 (2) | 0.379 (5) | 0.684 (6) | 0.381 (15) | 0.732 (2) | 0.723 (3) | 0.599 (2) | 0.827 (13) | 0.851 (2) | 0.634 (6)
CMX | | 0.613 (4) | 0.681 (7) | 0.725 (9) | 0.502 (12) | 0.634 (5) | 0.297 (15) | 0.478 (9) | 0.830 (2) | 0.651 (4) | 0.537 (6) | 0.924 (4) | 0.375 (5) | 0.315 (12) | 0.686 (5) | 0.451 (12) | 0.714 (4) | 0.543 (18) | 0.504 (5) | 0.894 (4) | 0.823 (4) | 0.688 (3)
DMMF_3d | | 0.605 (5) | 0.651 (8) | 0.744 (7) | 0.782 (3) | 0.637 (4) | 0.387 (4) | 0.536 (3) | 0.732 (7) | 0.590 (6) | 0.540 (5) | 0.856 (18) | 0.359 (9) | 0.306 (13) | 0.596 (11) | 0.539 (2) | 0.627 (18) | 0.706 (4) | 0.497 (7) | 0.785 (18) | 0.757 (16) | 0.476 (19)
MCA-Net | | 0.595 (6) | 0.533 (17) | 0.756 (6) | 0.746 (4) | 0.590 (8) | 0.334 (7) | 0.506 (6) | 0.670 (12) | 0.587 (7) | 0.500 (10) | 0.905 (8) | 0.366 (8) | 0.352 (8) | 0.601 (10) | 0.506 (6) | 0.669 (15) | 0.648 (7) | 0.501 (6) | 0.839 (12) | 0.769 (12) | 0.516 (18)
RFBNet | | 0.592 (7) | 0.616 (9) | 0.758 (5) | 0.659 (5) | 0.581 (9) | 0.330 (8) | 0.469 (10) | 0.655 (15) | 0.543 (12) | 0.524 (7) | 0.924 (4) | 0.355 (10) | 0.336 (10) | 0.572 (14) | 0.479 (8) | 0.671 (13) | 0.648 (7) | 0.480 (9) | 0.814 (16) | 0.814 (5) | 0.614 (9)
FAN_NV_RVC | | 0.586 (8) | 0.510 (18) | 0.764 (4) | 0.079 (23) | 0.620 (7) | 0.330 (8) | 0.494 (7) | 0.753 (4) | 0.573 (8) | 0.556 (4) | 0.884 (13) | 0.405 (3) | 0.303 (14) | 0.718 (2) | 0.452 (11) | 0.672 (12) | 0.658 (5) | 0.509 (4) | 0.898 (3) | 0.813 (6) | 0.727 (2)
DCRedNet | | 0.583 (9) | 0.682 (6) | 0.723 (10) | 0.542 (11) | 0.510 (17) | 0.310 (12) | 0.451 (11) | 0.668 (13) | 0.549 (11) | 0.520 (8) | 0.920 (6) | 0.375 (5) | 0.446 (2) | 0.528 (17) | 0.417 (13) | 0.670 (14) | 0.577 (15) | 0.478 (10) | 0.862 (7) | 0.806 (7) | 0.628 (8)
MIX6D_RVC | | 0.582 (10) | 0.695 (4) | 0.687 (14) | 0.225 (18) | 0.632 (6) | 0.328 (10) | 0.550 (1) | 0.748 (5) | 0.623 (5) | 0.494 (13) | 0.890 (11) | 0.350 (12) | 0.254 (20) | 0.688 (4) | 0.454 (10) | 0.716 (3) | 0.597 (14) | 0.489 (8) | 0.881 (5) | 0.768 (13) | 0.575 (12)
SSMA | copyleft | 0.577 (11) | 0.695 (4) | 0.716 (12) | 0.439 (14) | 0.563 (11) | 0.314 (11) | 0.444 (13) | 0.719 (8) | 0.551 (10) | 0.503 (9) | 0.887 (12) | 0.346 (13) | 0.348 (9) | 0.603 (9) | 0.353 (17) | 0.709 (5) | 0.600 (12) | 0.457 (12) | 0.901 (2) | 0.786 (8) | 0.599 (11)
Abhinav Valada, Rohit Mohan, Wolfram Burgard: Self-Supervised Model Adaptation for Multimodal Semantic Segmentation. International Journal of Computer Vision, 2019
UNIV_CNP_RVC_UE | | 0.566 (12) | 0.569 (16) | 0.686 (16) | 0.435 (15) | 0.524 (14) | 0.294 (16) | 0.421 (16) | 0.712 (9) | 0.543 (12) | 0.463 (15) | 0.872 (14) | 0.320 (14) | 0.363 (7) | 0.611 (8) | 0.477 (9) | 0.686 (10) | 0.627 (9) | 0.443 (15) | 0.862 (7) | 0.775 (11) | 0.639 (5)
EMSAFormer | | 0.564 (13) | 0.581 (13) | 0.736 (8) | 0.564 (10) | 0.546 (13) | 0.219 (20) | 0.517 (4) | 0.675 (11) | 0.486 (17) | 0.427 (19) | 0.904 (9) | 0.352 (11) | 0.320 (11) | 0.589 (12) | 0.528 (4) | 0.708 (6) | 0.464 (21) | 0.413 (19) | 0.847 (11) | 0.786 (8) | 0.611 (10)
SN_RN152pyrx8_RVC | copyleft | 0.546 (14) | 0.572 (14) | 0.663 (18) | 0.638 (7) | 0.518 (15) | 0.298 (14) | 0.366 (21) | 0.633 (18) | 0.510 (15) | 0.446 (17) | 0.864 (16) | 0.296 (17) | 0.267 (17) | 0.542 (16) | 0.346 (18) | 0.704 (7) | 0.575 (16) | 0.431 (16) | 0.853 (10) | 0.766 (14) | 0.630 (7)
UDSSEG_RVC | | 0.545 (15) | 0.610 (11) | 0.661 (19) | 0.588 (8) | 0.556 (12) | 0.268 (18) | 0.482 (8) | 0.642 (17) | 0.572 (9) | 0.475 (14) | 0.836 (20) | 0.312 (15) | 0.367 (6) | 0.630 (7) | 0.189 (20) | 0.639 (17) | 0.495 (20) | 0.452 (13) | 0.826 (14) | 0.756 (17) | 0.541 (14)
segfomer with 6d | | 0.542 (16) | 0.594 (12) | 0.687 (14) | 0.146 (21) | 0.579 (10) | 0.308 (13) | 0.515 (5) | 0.703 (10) | 0.472 (18) | 0.498 (11) | 0.868 (15) | 0.369 (7) | 0.282 (15) | 0.589 (12) | 0.390 (14) | 0.701 (8) | 0.556 (17) | 0.416 (18) | 0.860 (9) | 0.759 (15) | 0.539 (16)
FuseNet | permissive | 0.535 (17) | 0.570 (15) | 0.681 (17) | 0.182 (19) | 0.512 (16) | 0.290 (17) | 0.431 (14) | 0.659 (14) | 0.504 (16) | 0.495 (12) | 0.903 (10) | 0.308 (16) | 0.428 (3) | 0.523 (18) | 0.365 (16) | 0.676 (11) | 0.621 (11) | 0.470 (11) | 0.762 (19) | 0.779 (10) | 0.541 (14)
Caner Hazirbas, Lingni Ma, Csaba Domokos, Daniel Cremers: FuseNet: Incorporating Depth into Semantic Segmentation via Fusion-based CNN Architecture. ACCV 2016
AdapNet++ | copyleft | 0.503 (18) | 0.613 (10) | 0.722 (11) | 0.418 (16) | 0.358 (23) | 0.337 (6) | 0.370 (20) | 0.479 (21) | 0.443 (19) | 0.368 (21) | 0.907 (7) | 0.207 (20) | 0.213 (22) | 0.464 (21) | 0.525 (5) | 0.618 (19) | 0.657 (6) | 0.450 (14) | 0.788 (17) | 0.721 (20) | 0.408 (22)
Abhinav Valada, Rohit Mohan, Wolfram Burgard: Self-Supervised Model Adaptation for Multimodal Semantic Segmentation. International Journal of Computer Vision, 2019
3DMV (2d proj) | | 0.498 (19) | 0.481 (21) | 0.612 (20) | 0.579 (9) | 0.456 (19) | 0.343 (5) | 0.384 (18) | 0.623 (19) | 0.525 (14) | 0.381 (20) | 0.845 (19) | 0.254 (19) | 0.264 (19) | 0.557 (15) | 0.182 (21) | 0.581 (21) | 0.598 (13) | 0.429 (17) | 0.760 (20) | 0.661 (22) | 0.446 (21)
Angela Dai, Matthias Niessner: 3DMV: Joint 3D-Multi-View Prediction for 3D Semantic Scene Segmentation. ECCV'18
MSeg1080_RVC | permissive | 0.485 (20) | 0.505 (19) | 0.709 (13) | 0.092 (22) | 0.427 (20) | 0.241 (19) | 0.411 (17) | 0.654 (16) | 0.385 (23) | 0.457 (16) | 0.861 (17) | 0.053 (23) | 0.279 (16) | 0.503 (19) | 0.481 (7) | 0.645 (16) | 0.626 (10) | 0.365 (21) | 0.748 (21) | 0.725 (19) | 0.529 (17)
John Lambert*, Zhuang Liu*, Ozan Sener, James Hays, Vladlen Koltun: MSeg: A Composite Dataset for Multi-domain Semantic Segmentation. CVPR 2020
ILC-PSPNet | | 0.475 (21) | 0.490 (20) | 0.581 (21) | 0.289 (17) | 0.507 (18) | 0.067 (23) | 0.379 (19) | 0.610 (20) | 0.417 (21) | 0.435 (18) | 0.822 (22) | 0.278 (18) | 0.267 (17) | 0.503 (19) | 0.228 (19) | 0.616 (20) | 0.533 (19) | 0.375 (20) | 0.820 (15) | 0.729 (18) | 0.560 (13)
Enet (reimpl) | | 0.376 (22) | 0.264 (23) | 0.452 (23) | 0.452 (13) | 0.365 (21) | 0.181 (21) | 0.143 (23) | 0.456 (22) | 0.409 (22) | 0.346 (22) | 0.769 (23) | 0.164 (21) | 0.218 (21) | 0.359 (22) | 0.123 (23) | 0.403 (23) | 0.381 (23) | 0.313 (23) | 0.571 (22) | 0.685 (21) | 0.472 (20)
Re-implementation of Adam Paszke, Abhishek Chaurasia, Sangpil Kim, Eugenio Culurciello: ENet: A Deep Neural Network Architecture for Real-Time Semantic Segmentation.
ScanNet (2d proj) | permissive | 0.330 (23) | 0.293 (22) | 0.521 (22) | 0.657 (6) | 0.361 (22) | 0.161 (22) | 0.250 (22) | 0.004 (23) | 0.440 (20) | 0.183 (23) | 0.836 (20) | 0.125 (22) | 0.060 (23) | 0.319 (23) | 0.132 (22) | 0.417 (22) | 0.412 (22) | 0.344 (22) | 0.541 (23) | 0.427 (23) | 0.109 (23)
Angela Dai, Angel X. Chang, Manolis Savva, Maciej Halber, Thomas Funkhouser, Matthias Nießner: ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes. CVPR'17
DMMF | | 0.003 (24) | 0.000 (24) | 0.005 (24) | 0.000 (24) | 0.000 (24) | 0.037 (24) | 0.001 (24) | 0.000 (24) | 0.001 (24) | 0.005 (24) | 0.003 (24) | 0.000 (24) | 0.000 (24) | 0.000 (24) | 0.000 (24) | 0.000 (24) | 0.002 (24) | 0.001 (24) | 0.000 (24) | 0.006 (24) | 0.000 (24)
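
The avg iou column appears to be the unweighted mean of the 20 per-class IoU columns. The page does not state this, so treat it as an assumption that the listed numbers happen to support. A minimal Python sketch checking it against the Virtual MVFusion (R) row (`mean_score` is a hypothetical helper name, not part of any benchmark tooling):

```python
# Per-class IoU scores copied from the Virtual MVFusion (R) row,
# in column order (bathtub ... window).
virtual_mvfusion_iou = [
    0.861, 0.839, 0.881, 0.672, 0.512, 0.422, 0.898, 0.723, 0.714, 0.954,
    0.454, 0.509, 0.773, 0.895, 0.756, 0.820, 0.653, 0.935, 0.891, 0.728,
]


def mean_score(per_class_scores):
    """Unweighted mean over classes (assumed aggregation, see lead-in)."""
    return sum(per_class_scores) / len(per_class_scores)


avg_iou = mean_score(virtual_mvfusion_iou)
print(f"mean of per-class IoU: {avg_iou:.4f} (table reports 0.745)")
```

The result agrees with the reported value to within the rounding of the three-decimal per-class entries.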


This table lists the benchmark results for the 2D semantic instance scenario.




Method | Info | avg ap | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | window
UniDet_RVC | | 0.205 (1) | 0.381 (1) | 0.323 (1) | 0.037 (1) | 0.226 (1) | 0.177 (1) | 0.063 (1) | 0.277 (1) | 0.120 (1) | 0.067 (1) | 0.131 (1) | 0.074 (2) | 0.317 (1) | 0.080 (1) | 0.235 (1) | 0.289 (1) | 0.141 (1) | 0.678 (1) | 0.080 (1)
MaskRCNN_ScanNet | permissive | 0.119 (2) | 0.129 (2) | 0.212 (2) | 0.002 (2) | 0.112 (2) | 0.148 (2) | 0.014 (2) | 0.205 (2) | 0.044 (2) | 0.066 (2) | 0.078 (2) | 0.095 (1) | 0.142 (2) | 0.030 (2) | 0.128 (2) | 0.139 (2) | 0.080 (2) | 0.459 (2) | 0.057 (2)
Re-implementation of Kaiming He, Georgia Gkioxari, Piotr Dollár, Ross Girshick: Mask R-CNN. ICCV'17


This table lists the benchmark results for the scene type classification scenario.




Method | Info | avg recall | apartment | bathroom | bedroom / hotel | bookstore / library | conference room | copy/mail room | hallway | kitchen | laundry room | living room / lounge | misc | office | storage / basement / garage
multi-task | permissive | 0.700 (1) | 0.500 (1) | 1.000 (1) | 0.882 (2) | 0.500 (2) | 1.000 (1) | 1.000 (1) | 0.500 (2) | 1.000 (1) | 1.000 (1) | 0.778 (1) | 0.000 (2) | 0.938 (1) | 0.000 (2)
Shengyu Huang, Mikhail Usvyatsov, Konrad Schindler: Indoor Scene Recognition in 3D. IROS 2020
3DASPP-SCE | | 0.691 (2) | 0.500 (1) | 0.938 (2) | 0.824 (3) | 1.000 (1) | 1.000 (1) | 0.500 (2) | 1.000 (1) | 0.857 (2) | 0.500 (2) | 0.556 (3) | 0.000 (2) | 0.812 (2) | 0.500 (1)
SE-ResNeXt-SSMA | | 0.498 (3) | 0.000 (4) | 0.812 (3) | 0.941 (1) | 0.500 (2) | 0.500 (3) | 0.500 (2) | 0.500 (2) | 0.429 (4) | 0.500 (2) | 0.667 (2) | 0.500 (1) | 0.625 (3) | 0.000 (2)
Abhinav Valada, Rohit Mohan, Wolfram Burgard: Self-Supervised Model Adaptation for Multimodal Semantic Segmentation. arXiv
resnet50_scannet | | 0.353 (4) | 0.250 (3) | 0.812 (3) | 0.529 (4) | 0.500 (2) | 0.500 (3) | 0.000 (4) | 0.500 (2) | 0.571 (3) | 0.000 (4) | 0.556 (3) | 0.000 (2) | 0.375 (4) | 0.000 (2)
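
The number after each score is the method's rank in that column. Tied scores seem to share the best available rank, with the following ranks skipped (standard competition ranking). This is inferred from the listed values rather than documented on this page. A short Python sketch that reproduces the ranks of the apartment column under that assumption (`competition_ranks` is a hypothetical helper name):

```python
def competition_ranks(scores):
    """Rank scores in descending order.

    A score's rank is one plus the number of strictly higher scores, so
    tied scores share a rank and the ranks after a tie are skipped.
    """
    return [1 + sum(other > score for other in scores) for score in scores]


# Apartment recall per method, copied from the scene type table.
apartment = {
    "multi-task": 0.500,
    "3DASPP-SCE": 0.500,
    "SE-ResNeXt-SSMA": 0.000,
    "resnet50_scannet": 0.250,
}

ranks = dict(zip(apartment, competition_ranks(list(apartment.values()))))
for method, rank in ranks.items():
    print(f"{method}: {apartment[method]:.3f} rank {rank}")
```

The two methods tied at 0.500 both receive rank 1, and the next method is ranked 3, which matches the table.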