Presenting the ScanNet200 Benchmark

We present the ScanNet200 benchmark, which studies an order of magnitude more class categories than the previous version of ScanNet. The scene geometry is shared between the two benchmarks, but the parsing of the surface annotations allows for a larger vocabulary and a more realistic setting for in-the-wild 3D understanding methods.

The ScanNet200 benchmark includes both finer-grained categories and a large number of previously unaddressed classes. This makes for a much more challenging setting: it covers the diversity of semantic classes naturally observed in the raw ScanNet RGB-D data, and the data also reflects naturally encountered class imbalances. The difference in category frequencies between ScanNet and ScanNet200 can be seen in the figure above.
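As a rough illustration of what such an imbalance looks like, the sketch below ranks categories by how often they occur and cuts the ranking into three equally sized groups. The toy label list, the function name `split_by_frequency`, and the equal-thirds rule are assumptions made for this example; they are not the benchmark's official category split.

```python
from collections import Counter


def split_by_frequency(labels, group_names=("head", "common", "tail")):
    """Rank categories by occurrence count and cut the ranking into equal chunks.

    Hypothetical helper for illustration only: the equal-sized chunks are an
    assumption, not the official ScanNet200 grouping.
    """
    counts = Counter(labels)
    # Most frequent first; ties are broken alphabetically for determinism.
    ranked = sorted(counts, key=lambda name: (-counts[name], name))
    chunk = -(-len(ranked) // len(group_names))  # ceiling division
    return {
        group: ranked[i * chunk:(i + 1) * chunk]
        for i, group in enumerate(group_names)
    }


# Toy, made-up label stream with a strongly skewed distribution.
toy_labels = (
    ["wall"] * 50 + ["chair"] * 30 + ["door"] * 12
    + ["lamp"] * 6 + ["plunger"] * 2 + ["dumbbell"] * 1
)
groups = split_by_frequency(toy_labels)
for group, names in groups.items():
    print(group, names)
```

On this toy data the most frequent categories (wall, chair) land in the first group and the rarest (plunger, dumbbell) in the last, which mirrors the long-tailed shape the paragraph above describes.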

ScanNet200 Benchmark

This table lists the benchmark results for the ScanNet200 3D semantic label scenario. In the tables below, each entry shows a score followed by the method's rank in that column.




Method | Info | avg iou | head iou | common iou | tail iou | wall | chair | floor | table | door | couch | cabinet | shelf | desk | office chair | bed | pillow | sink | picture | window | toilet | bookshelf | monitor | curtain | book | armchair | coffee table | box | refrigerator | lamp | kitchen cabinet | towel | clothes | tv | nightstand | counter | dresser | stool | cushion | plant | ceiling | bathtub | end table | dining table | keyboard | bag | backpack | toilet paper | printer | tv stand | whiteboard | blanket | shower curtain | trash can | closet | stairs | microwave | stove | shoe | computer tower | bottle | bin | ottoman | bench | board | washing machine | mirror | copier | basket | sofa chair | file cabinet | fan | laptop | shower | paper | person | paper towel dispenser | oven | blinds | rack | plate | blackboard | piano | suitcase | rail | radiator | recycling bin | container | wardrobe | soap dispenser | telephone | bucket | clock | stand | light | laundry basket | pipe | clothes dryer | guitar | toilet paper holder | seat | speaker | column | bicycle | ladder | bathroom stall | shower wall | cup | jacket | storage bin | coffee maker | dishwasher | paper towel roll | machine | mat | windowsill | bar | toaster | bulletin board | ironing board | fireplace | soap dish | kitchen counter | doorframe | toilet paper dispenser | mini fridge | fire extinguisher | ball | hat | shower curtain rod | water cooler | paper cutter | tray | shower door | pillar | ledge | toaster oven | mouse | toilet seat cover dispenser | furniture | cart | storage container | scale | tissue box | light switch | crate | power outlet | decoration | sign | projector | closet door | vacuum cleaner | candle | plunger | stuffed animal | headphones | dish rack | broom | guitar case | range hood | dustpan | hair dryer | water bottle | handicap bar | purse | vent | shower floor | water pitcher | mailbox | bowl | paper bag | alarm clock | music stand | projector screen | divider | laundry detergent | bathroom counter | object | bathroom vanity | closet wall | laundry hamper | bathroom stall door | ceiling light | trash bin | dumbbell | stair rail | tube | bathroom cabinet | cd case | closet rod | coffee kettle | structure | shower head | keyboard piano | case of water bottles | coat rack | storage organizer | folded chair | fire alarm | power strip | calendar | poster | potted plant | luggage | mattress
OctFormer ScanNet200permissive0.326 40.539 40.265 30.131 30.806 30.670 40.943 30.535 40.662 10.705 70.423 40.407 30.505 50.003 60.765 40.582 20.686 60.227 70.680 30.943 30.601 10.854 50.892 10.335 10.417 80.357 40.724 50.453 30.632 30.596 10.432 20.783 40.512 80.021 60.244 60.637 10.000 10.787 30.873 30.743 60.000 80.000 50.534 30.110 10.499 20.289 30.626 30.620 60.168 80.204 10.849 20.679 20.117 10.633 40.684 10.650 30.552 10.684 50.312 20.000 30.175 30.429 40.865 20.413 10.837 40.000 10.145 30.626 30.451 20.487 40.513 10.000 10.529 30.613 40.000 40.033 20.000 10.000 30.828 10.871 10.622 20.587 40.411 20.137 60.645 50.343 30.000 30.000 30.000 10.022 60.000 20.026 80.829 50.000 10.022 40.089 30.842 10.253 70.318 80.296 10.178 30.291 20.224 10.584 20.200 60.132 40.000 30.128 20.227 70.000 10.230 40.047 50.149 20.331 40.412 30.618 20.164 40.102 40.522 10.000 10.655 20.378 40.469 60.000 10.000 50.000 30.105 30.000 40.000 30.483 20.000 30.000 20.028 20.000 10.000 10.906 10.000 10.339 60.000 10.000 40.457 40.000 10.612 30.000 10.000 10.408 10.000 70.900 40.000 40.000 30.000 10.029 30.000 10.074 80.455 60.479 20.427 40.079 60.140 60.496 30.414 50.022 10.000 10.471 50.000 10.000 20.000 40.722 20.000 20.000 10.000 10.138 50.000 20.000 20.000 30.000 1
Peng-Shuai Wang: OctFormer: Octree-based Transformers for 3D Point Clouds. SIGGRAPH 2023
OA-CNN-L_ScanNet2000.333 20.558 10.269 20.124 40.821 10.703 10.946 10.569 10.662 10.748 20.487 10.455 10.572 20.000 80.789 20.534 30.736 20.271 10.713 10.949 10.498 70.877 20.860 30.332 20.706 10.474 10.788 30.406 40.637 20.495 30.355 40.805 20.592 60.015 70.396 10.602 40.000 10.799 20.876 10.713 80.276 10.000 50.493 40.080 40.448 60.363 10.661 20.833 20.262 20.125 20.823 40.665 30.076 40.720 10.557 30.637 40.517 40.672 60.227 40.000 30.158 40.496 30.843 50.352 40.835 50.000 10.103 60.711 10.527 10.526 20.320 30.000 10.568 20.625 30.067 10.000 40.000 10.001 20.806 20.836 20.621 30.591 30.373 30.314 20.668 20.398 20.003 20.000 30.000 10.016 80.024 10.043 60.906 20.000 10.052 30.000 70.384 30.330 50.342 50.100 40.223 20.183 40.112 30.476 40.313 20.130 50.196 20.112 30.370 50.000 10.234 30.071 40.160 10.403 20.398 50.492 70.197 10.076 50.272 30.000 10.200 80.560 20.735 30.000 10.000 50.000 30.110 20.002 30.021 20.412 30.000 30.000 20.000 40.000 10.000 10.794 40.000 10.445 10.000 10.022 20.509 30.000 10.517 70.000 10.000 10.001 80.245 20.915 20.024 20.089 10.000 10.262 10.000 10.103 60.524 20.392 40.515 20.013 80.251 30.411 60.662 10.001 50.000 10.473 40.000 10.000 20.150 30.699 30.000 20.000 10.000 10.166 20.000 20.024 10.000 30.000 1
PPT-SpUNet-F.T.0.332 30.556 20.270 10.123 50.816 20.682 20.946 10.549 30.657 30.756 10.459 30.376 40.550 30.001 70.807 10.616 10.727 30.267 20.691 20.942 40.530 40.872 30.874 20.330 30.542 50.374 30.792 20.400 50.673 10.572 20.433 10.793 30.623 20.008 80.351 30.594 50.000 10.783 40.876 10.833 20.213 20.000 50.537 20.091 20.519 10.304 20.620 40.942 10.264 10.124 30.855 10.695 10.086 30.646 30.506 70.658 20.535 20.715 20.314 10.000 30.241 10.608 20.897 10.359 30.858 30.000 10.076 80.611 40.392 30.509 30.378 20.000 10.579 10.565 70.000 40.000 40.000 10.000 30.755 30.806 40.661 10.572 60.350 40.181 40.660 30.300 50.000 30.000 30.000 10.023 50.000 20.042 70.930 10.000 10.000 60.077 40.584 20.392 30.339 60.185 30.171 40.308 10.006 70.563 30.256 30.150 10.000 30.002 70.345 60.000 10.045 50.197 10.063 30.323 50.453 10.600 30.163 50.037 60.349 20.000 10.672 10.679 10.753 10.000 10.000 50.000 30.117 10.000 40.000 30.291 50.000 30.000 20.039 10.000 10.000 10.899 20.000 10.374 50.000 10.000 40.545 20.000 10.634 10.000 10.000 10.074 50.223 30.914 30.000 40.021 20.000 10.000 40.000 10.112 30.498 50.649 10.383 50.095 10.135 70.449 50.432 40.008 30.000 10.518 20.000 10.000 20.000 40.796 10.000 20.000 10.000 10.138 50.000 20.000 20.000 30.000 1
Minkowski 34Dpermissive0.253 70.463 70.154 80.102 70.771 70.650 70.932 60.483 70.571 70.710 60.331 70.250 70.492 60.044 30.703 70.419 80.606 80.227 70.621 70.865 80.531 30.771 80.813 50.291 40.484 60.242 70.612 80.282 80.440 80.351 60.299 60.622 70.593 50.027 50.293 50.310 80.000 10.757 50.858 60.737 70.150 40.164 10.368 80.084 30.381 80.142 80.357 60.720 40.214 60.092 70.724 70.596 80.056 70.655 20.525 60.581 80.352 80.594 70.056 80.000 30.014 80.224 70.772 60.205 80.720 70.000 10.159 20.531 70.163 80.294 70.136 80.000 10.169 70.589 60.000 40.000 40.000 10.002 10.663 40.466 80.265 80.582 50.337 50.016 70.559 60.084 80.000 30.000 30.000 10.036 30.000 20.125 30.670 70.000 10.102 10.071 50.164 60.406 20.386 40.046 80.068 80.159 60.117 20.284 70.111 80.094 70.000 30.000 80.197 80.000 10.044 60.013 60.002 60.228 80.307 80.588 40.025 80.545 10.134 80.000 10.655 20.302 60.282 80.000 10.060 10.000 30.035 80.000 40.000 30.097 80.000 30.000 20.005 30.000 10.000 10.096 80.000 10.334 70.000 10.000 40.274 70.000 10.513 80.000 10.000 10.280 30.194 40.897 50.000 40.000 30.000 10.000 40.000 10.108 50.279 80.189 70.141 80.059 70.272 20.307 80.445 20.003 40.000 10.353 70.000 10.026 10.000 40.581 70.001 10.000 10.000 10.093 80.002 10.000 20.000 30.000 1
C. Choy, J. Gwak, S. Savarese: 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks. CVPR 2019
CSC-Pretrainpermissive0.249 80.455 80.171 70.079 80.766 80.659 60.930 80.494 50.542 80.700 80.314 80.215 80.430 80.121 10.697 80.441 70.683 70.235 50.609 80.895 70.476 80.816 70.770 80.186 50.634 20.216 80.734 40.340 70.471 70.307 70.293 80.591 80.542 70.076 30.205 70.464 60.000 10.484 80.832 80.766 40.052 70.000 50.413 70.059 70.418 70.222 70.318 80.609 70.206 70.112 40.743 50.625 50.076 40.579 80.548 50.590 70.371 70.552 80.081 70.003 20.142 50.201 80.638 80.233 70.686 80.000 10.142 40.444 80.375 40.247 80.198 50.000 10.128 80.454 80.019 20.097 10.000 10.000 30.553 70.557 70.373 40.545 70.164 80.014 80.547 70.174 60.000 30.002 10.000 10.037 20.000 20.063 50.664 80.000 10.000 60.130 20.170 50.152 80.335 70.079 60.110 60.175 50.098 40.175 80.166 70.045 80.207 10.014 50.465 20.000 10.001 80.001 80.046 40.299 60.327 70.537 50.033 70.012 80.186 60.000 10.205 70.377 50.463 70.000 10.058 20.000 30.055 60.041 10.000 30.105 70.000 30.000 20.000 40.000 10.000 10.398 60.000 10.308 80.000 10.000 40.319 60.000 10.543 60.000 10.000 10.062 70.004 60.862 80.000 40.000 30.000 10.000 40.000 10.123 20.316 70.225 60.250 70.094 20.180 40.332 70.441 30.000 60.000 10.310 80.000 10.000 20.000 40.592 60.000 20.000 10.000 10.203 10.000 20.000 20.000 30.000 1
Ji Hou, Benjamin Graham, Matthias Nießner, Saining Xie: Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts. CVPR 2021
AWCS0.305 50.508 50.225 50.142 20.782 50.634 80.937 50.489 60.578 50.721 30.364 60.355 50.515 40.023 50.764 50.523 40.707 50.264 30.633 50.922 50.507 60.886 10.804 60.179 60.436 70.300 50.656 70.529 20.501 60.394 40.296 70.820 10.603 30.131 20.179 80.619 20.000 10.707 70.865 50.773 30.171 30.010 40.484 50.063 60.463 50.254 50.332 70.649 50.220 50.100 50.729 60.613 60.071 60.582 70.628 20.702 10.424 60.749 10.137 60.000 30.142 50.360 50.863 30.305 50.877 20.000 10.173 10.606 50.337 50.478 50.154 60.000 10.253 50.664 20.000 40.000 40.000 10.000 30.626 60.782 50.302 70.602 20.185 70.282 30.651 40.317 40.000 30.000 30.000 10.022 60.000 20.154 10.876 40.000 10.014 50.063 60.029 80.553 10.467 20.084 50.124 50.157 70.049 60.373 50.252 40.097 60.000 30.219 10.542 10.000 10.392 10.172 30.000 70.339 30.417 20.533 60.093 60.115 30.195 50.000 10.516 50.288 70.741 20.000 10.001 40.233 20.056 50.000 40.159 10.334 40.077 20.000 20.000 40.000 10.000 10.749 50.000 10.411 30.000 10.008 30.452 50.000 10.595 40.000 10.000 10.220 40.006 50.894 60.006 30.000 30.000 10.000 40.000 10.112 30.504 30.404 30.551 10.093 30.129 80.484 40.381 80.000 60.000 10.396 60.000 10.000 20.620 20.402 80.000 20.000 10.000 10.142 40.000 20.000 20.512 20.000 1
LGroundpermissive0.272 60.485 60.184 60.106 60.778 60.676 30.932 60.479 80.572 60.718 50.399 50.265 60.453 70.085 20.745 60.446 60.726 40.232 60.622 60.901 60.512 50.826 60.786 70.178 70.549 40.277 60.659 60.381 60.518 50.295 80.323 50.777 50.599 40.028 40.321 40.363 70.000 10.708 60.858 60.746 50.063 60.022 30.457 60.077 50.476 30.243 60.402 50.397 80.233 40.077 80.720 80.610 70.103 20.629 50.437 80.626 50.446 50.702 30.190 50.005 10.058 70.322 60.702 70.244 60.768 60.000 10.134 50.552 60.279 70.395 60.147 70.000 10.207 60.612 50.000 40.000 40.000 10.000 30.658 50.566 60.323 60.525 80.229 60.179 50.467 80.154 70.000 30.002 10.000 10.051 10.000 20.127 20.703 60.000 10.000 60.216 10.112 70.358 40.547 10.187 20.092 70.156 80.055 50.296 60.252 40.143 20.000 30.014 50.398 30.000 10.028 70.173 20.000 70.265 70.348 60.415 80.179 20.019 70.218 40.000 10.597 40.274 80.565 40.000 10.012 30.000 30.039 70.022 20.000 30.117 60.000 30.000 20.000 40.000 10.000 10.324 70.000 10.384 40.000 10.000 40.251 80.000 10.566 50.000 10.000 10.066 60.404 10.886 70.199 10.000 30.000 10.059 20.000 10.136 10.540 10.127 80.295 60.085 50.143 50.514 20.413 60.000 60.000 10.498 30.000 10.000 20.000 40.623 50.000 20.000 10.000 10.132 70.000 20.000 20.000 30.000 1
David Rozenberszki, Or Litany, Angela Dai: Language-Grounded Indoor 3D Semantic Segmentation in the Wild. arXiv
CeCo0.340 10.551 30.247 40.181 10.784 40.661 50.939 40.564 20.624 40.721 30.484 20.429 20.575 10.027 40.774 30.503 50.753 10.242 40.656 40.945 20.534 20.865 40.860 30.177 80.616 30.400 20.818 10.579 10.615 40.367 50.408 30.726 60.633 10.162 10.360 20.619 20.000 10.828 10.873 30.924 10.109 50.083 20.564 10.057 80.475 40.266 40.781 10.767 30.257 30.100 50.825 30.663 40.048 80.620 60.551 40.595 60.532 30.692 40.246 30.000 30.213 20.615 10.861 40.376 20.900 10.000 10.102 70.660 20.321 60.547 10.226 40.000 10.311 40.742 10.011 30.006 30.000 10.000 30.546 80.824 30.345 50.665 10.450 10.435 10.683 10.411 10.338 10.000 30.000 10.030 40.000 20.068 40.892 30.000 10.063 20.000 70.257 40.304 60.387 30.079 60.228 10.190 30.000 80.586 10.347 10.133 30.000 30.037 40.377 40.000 10.384 20.006 70.003 50.421 10.410 40.643 10.171 30.121 20.142 70.000 10.510 60.447 30.474 50.000 10.000 50.286 10.083 40.000 40.000 30.603 10.096 10.063 10.000 40.000 10.000 10.898 30.000 10.429 20.000 10.400 10.550 10.000 10.633 20.000 10.000 10.377 20.000 70.916 10.000 40.000 30.000 10.000 40.000 10.102 70.499 40.296 50.463 30.089 40.304 10.740 10.401 70.010 20.000 10.560 10.000 10.000 20.709 10.652 40.000 20.000 10.000 10.143 30.000 20.000 20.609 10.000 1
Understanding Imbalanced Semantic Segmentation Through Neural Collapse.
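The avg, head, common, and tail IoU columns above can be read as averages of per-class IoU over different sets of classes. The following is a minimal sketch, assuming per-class IoU = TP / (TP + FP + FN), an unweighted mean, and a hypothetical assignment of classes to groups. The function names and toy labels are invented for illustration, and the official evaluation script may differ in details such as how absent classes are handled.

```python
def per_class_iou(gt, pred, class_names):
    """Per-class intersection over union from two parallel label sequences."""
    tp = {c: 0 for c in class_names}
    fp = {c: 0 for c in class_names}
    fn = {c: 0 for c in class_names}
    for g, p in zip(gt, pred):
        if g == p:
            tp[g] += 1
        else:
            fn[g] += 1  # ground-truth class was missed
            fp[p] += 1  # predicted class was wrongly assigned
    iou = {}
    for c in class_names:
        denom = tp[c] + fp[c] + fn[c]
        if denom > 0:  # skip classes that never appear (an assumption)
            iou[c] = tp[c] / denom
    return iou


def mean_iou(iou, classes):
    """Unweighted mean IoU over the given classes that have a defined IoU."""
    vals = [iou[c] for c in classes if c in iou]
    return sum(vals) / len(vals) if vals else float("nan")


# Toy per-point labels; the grouping below is hypothetical.
class_names = ["wall", "chair", "lamp"]
gt = ["wall", "wall", "wall", "chair", "chair", "lamp"]
pred = ["wall", "wall", "chair", "chair", "chair", "wall"]
toy_groups = {"head": ["wall", "chair"], "tail": ["lamp"]}

iou = per_class_iou(gt, pred, class_names)
avg_iou = mean_iou(iou, class_names)
group_iou = {name: mean_iou(iou, members) for name, members in toy_groups.items()}
print(iou, avg_iou, group_iou)
```

Because the mean is unweighted, a rare class that is never predicted correctly pulls the average down as much as a frequent one, which is why the tail column in the table is so much lower than the head column.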


This table lists the benchmark results for the ScanNet200 3D semantic instance scenario.




Method | Info | avg ap | head ap | common ap | tail ap | chair | table | door | couch | cabinet | shelf | desk | office chair | bed | pillow | sink | picture | window | toilet | bookshelf | monitor | curtain | book | armchair | coffee table | box | refrigerator | lamp | kitchen cabinet | towel | clothes | tv | nightstand | counter | dresser | stool | cushion | plant | ceiling | bathtub | end table | dining table | keyboard | bag | backpack | toilet paper | printer | tv stand | whiteboard | blanket | shower curtain | trash can | closet | stairs | microwave | stove | shoe | computer tower | bottle | bin | ottoman | bench | board | washing machine | mirror | copier | basket | sofa chair | file cabinet | fan | laptop | shower | paper | person | paper towel dispenser | oven | blinds | rack | plate | blackboard | piano | suitcase | rail | radiator | recycling bin | container | wardrobe | soap dispenser | telephone | bucket | clock | stand | light | laundry basket | pipe | clothes dryer | guitar | toilet paper holder | seat | speaker | column | bicycle | ladder | bathroom stall | shower wall | cup | jacket | storage bin | coffee maker | dishwasher | paper towel roll | machine | mat | windowsill | bar | toaster | bulletin board | ironing board | fireplace | soap dish | kitchen counter | doorframe | toilet paper dispenser | mini fridge | fire extinguisher | ball | hat | shower curtain rod | water cooler | paper cutter | tray | shower door | pillar | ledge | toaster oven | mouse | toilet seat cover dispenser | furniture | cart | storage container | scale | tissue box | light switch | crate | power outlet | decoration | sign | projector | closet door | vacuum cleaner | candle | plunger | stuffed animal | headphones | dish rack | broom | guitar case | range hood | dustpan | hair dryer | water bottle | handicap bar | purse | vent | shower floor | water pitcher | mailbox | bowl | paper bag | alarm clock | music stand | projector screen | divider | laundry detergent | bathroom counter | object | bathroom vanity | closet wall | laundry hamper | bathroom stall door | ceiling light | trash bin | dumbbell | stair rail | tube | bathroom cabinet | cd case | closet rod | coffee kettle | structure | shower head | keyboard piano | case of water bottles | coat rack | storage organizer | folded chair | fire alarm | power strip | calendar | poster | potted plant | luggage | mattress
Mask3D Scannet2000.278 10.383 10.263 10.168 10.661 20.465 10.572 10.665 30.391 10.121 40.304 10.015 20.647 10.349 10.474 10.489 10.321 10.816 50.351 30.722 10.402 40.195 10.515 30.082 10.795 10.215 20.396 10.377 10.082 40.724 10.586 10.015 20.277 10.377 50.201 10.475 20.572 10.778 30.089 10.759 10.556 10.068 10.506 10.467 10.323 30.778 20.427 10.027 20.789 10.744 10.003 10.570 20.561 10.337 10.265 10.711 10.258 10.031 10.569 10.311 10.441 10.179 11.000 10.000 10.233 20.411 20.283 20.380 10.667 10.016 10.048 30.418 20.139 10.173 10.000 10.086 10.014 20.500 10.384 10.497 10.044 30.032 20.752 10.287 10.003 10.000 10.007 10.208 10.000 10.001 20.349 10.008 20.014 20.509 10.500 10.323 10.023 20.176 10.107 10.105 30.000 10.605 10.378 10.016 10.000 10.400 10.192 10.000 10.048 20.037 20.000 10.275 10.119 10.810 10.258 10.006 30.083 50.000 10.568 20.377 20.708 10.000 10.005 20.147 10.014 20.000 20.556 10.085 10.325 10.500 10.083 10.004 20.000 10.590 10.000 10.365 10.000 10.116 10.491 10.000 10.626 10.000 10.000 10.579 10.391 10.050 40.000 10.028 10.000 10.222 10.000 10.063 10.302 10.356 10.149 40.573 10.415 10.013 50.002 40.004 10.000 10.005 40.000 10.000 10.444 10.514 10.000 10.028 10.000 20.156 20.267 10.000 21.000 10.000 1
Jonas Schult, Francis Engelmann, Alexander Hermans, Or Litany, Siyu Tang, Bastian Leibe: Mask3D for 3D Semantic Instance Segmentation. ICRA 2023
LGround Inst.permissive0.154 30.275 30.108 30.060 30.573 30.381 30.434 30.654 40.190 40.141 20.097 30.000 30.503 30.180 30.252 30.242 40.242 30.881 30.448 10.494 30.429 30.078 20.364 50.024 30.654 20.213 40.222 30.239 30.099 30.616 20.363 30.000 30.092 30.444 30.000 30.383 40.209 50.815 20.030 30.000 30.166 30.002 40.295 50.099 40.364 20.778 20.177 30.001 40.427 50.585 40.000 20.470 30.268 50.205 30.045 30.642 20.007 30.000 30.333 50.148 30.407 30.130 21.000 10.000 10.156 40.189 30.097 40.169 20.000 50.000 20.056 20.400 30.000 30.000 20.000 10.000 20.556 10.278 30.203 30.323 40.019 40.000 30.402 40.026 30.000 20.000 10.000 30.044 30.000 10.000 30.037 40.000 30.000 30.181 20.000 20.127 30.006 40.028 40.023 30.115 20.000 10.327 20.267 20.000 20.000 10.000 40.028 30.000 10.000 30.000 30.000 10.003 30.048 20.135 40.222 20.089 20.278 10.000 10.514 30.333 40.611 20.000 10.000 30.000 30.000 30.000 20.000 30.037 30.000 30.000 30.000 30.000 30.000 10.322 20.000 10.209 20.000 10.000 30.278 20.000 10.302 30.000 10.000 10.143 30.148 30.000 50.000 10.000 30.000 10.000 20.000 10.015 30.064 50.000 30.272 20.031 50.000 40.257 20.028 20.000 20.000 10.041 20.000 10.000 10.000 20.222 50.000 10.000 20.000 20.000 50.000 20.000 20.000 20.000 1
David Rozenberszki, Or Litany, Angela Dai: Language-Grounded Indoor 3D Semantic Segmentation in the Wild.
Minkowski 34D Inst.permissive0.130 40.246 40.083 40.043 50.547 50.236 40.415 40.672 20.141 50.133 30.067 40.000 30.521 20.114 50.238 40.289 20.232 40.883 20.182 50.373 50.486 10.076 30.488 40.022 40.529 40.199 50.110 40.217 40.100 20.460 40.319 40.000 30.025 50.472 10.000 30.394 30.210 40.537 40.004 40.000 30.083 50.000 50.299 40.061 50.201 50.761 40.084 40.008 30.720 30.557 50.000 20.317 50.280 30.094 50.020 50.564 50.000 40.000 30.400 30.048 40.259 40.101 31.000 10.000 10.190 30.142 50.094 50.137 30.089 30.000 20.101 10.355 50.000 30.000 20.000 10.000 20.000 30.444 20.082 50.384 20.000 50.000 30.334 50.004 50.000 20.000 10.000 30.041 40.000 10.000 30.026 50.000 30.000 30.000 40.000 20.082 50.022 30.000 50.021 40.088 40.000 10.241 40.033 40.000 20.000 10.067 30.000 50.000 10.000 30.000 30.000 10.000 40.026 40.262 20.016 40.000 40.278 10.000 10.500 40.394 10.028 50.000 10.000 30.000 30.000 30.000 20.000 30.019 40.000 30.000 30.000 30.000 30.000 10.156 50.000 10.032 50.000 10.000 30.194 50.000 10.248 40.000 10.000 10.099 40.019 40.308 20.000 10.000 30.000 10.000 20.000 10.007 40.122 20.000 30.175 30.063 20.000 40.271 10.000 50.000 20.000 10.000 50.000 10.000 10.000 20.278 20.000 10.000 20.000 20.111 30.000 20.000 20.000 20.000 1
C. Choy, J. Gwak, S. Savarese: 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks. CVPR 2019
TD3D Scannet2000.211 20.332 20.177 20.103 20.662 10.413 20.463 20.705 10.192 30.145 10.266 20.215 10.452 40.209 20.222 50.219 50.315 20.893 10.380 20.617 20.439 20.047 40.646 10.080 20.610 30.253 10.237 20.293 20.135 10.379 50.494 20.048 10.252 20.451 20.184 20.483 10.395 20.852 10.083 20.551 20.278 20.036 20.337 20.266 20.544 10.963 10.079 50.039 10.740 20.604 20.000 20.586 10.283 20.282 20.059 20.633 30.028 20.004 20.559 20.309 20.420 20.028 51.000 10.000 10.456 10.411 10.372 10.060 40.046 40.000 20.040 40.694 10.083 20.000 20.000 10.000 20.000 30.083 40.252 20.260 50.200 10.160 10.669 20.111 20.000 20.000 10.006 20.169 20.000 10.007 10.296 20.032 10.074 10.139 30.000 20.321 20.031 10.108 20.088 20.157 10.000 10.231 50.026 50.000 20.000 10.356 20.052 20.000 10.240 10.147 10.000 10.015 20.046 30.144 30.073 30.414 10.222 40.000 10.806 10.343 30.486 30.000 10.008 10.038 20.083 10.002 10.028 20.074 20.032 20.150 20.039 20.008 10.000 10.250 40.000 10.125 40.000 10.052 20.260 30.000 10.143 50.000 10.000 10.543 20.207 20.404 10.000 10.003 20.000 10.000 20.000 10.037 20.093 40.272 20.342 10.039 40.281 20.249 30.224 10.000 20.000 10.074 10.000 10.000 10.000 20.278 20.000 10.000 20.889 10.323 10.000 20.014 10.000 20.000 1
CSC-Pretrain Inst.permissive0.123 50.223 50.082 50.046 40.564 40.152 50.394 50.578 50.235 20.116 50.034 50.000 30.348 50.119 40.297 20.285 30.202 50.838 40.323 40.407 40.184 50.037 50.516 20.013 50.424 50.214 30.093 50.105 50.078 50.542 30.250 50.000 30.064 40.444 30.000 30.224 50.231 30.537 40.001 50.000 30.126 40.004 30.308 30.193 30.244 40.343 50.228 20.000 50.441 40.588 30.000 20.338 40.275 40.189 40.030 40.600 40.000 40.000 30.378 40.000 50.108 50.098 41.000 10.000 10.096 50.172 40.144 30.011 50.125 20.000 20.000 50.376 40.000 30.000 20.000 10.000 20.000 30.042 50.141 40.377 30.051 20.000 30.483 30.017 40.000 20.000 10.000 30.022 50.000 10.000 30.065 30.000 30.000 30.000 40.000 20.094 40.000 50.042 30.000 50.064 50.000 10.259 30.089 30.000 20.000 10.000 40.022 40.000 10.000 30.000 30.000 10.000 40.018 50.111 50.000 50.000 40.278 10.000 10.444 50.333 40.333 40.000 10.000 30.000 30.000 30.000 20.000 30.000 50.000 30.000 30.000 30.000 30.000 10.267 30.000 10.184 30.000 10.000 30.211 40.000 10.378 20.000 10.000 10.063 50.000 50.275 30.000 10.000 30.000 10.000 20.000 10.007 50.105 30.000 30.032 50.045 30.198 30.171 40.028 20.000 20.000 10.006 30.000 10.000 10.000 20.278 20.000 10.000 20.000 20.044 40.000 20.000 20.000 20.000 1
Ji Hou, Benjamin Graham, Matthias Nießner, Saining Xie: Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts. CVPR 2021


ScanNet Benchmark

This table lists the benchmark results for the 3D semantic label scenario.


Method | Info | avg iou | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | floor | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | wall | window
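Mix3D | permissive | 0.781 (1) | 0.964 (1) | 0.855 (1) | 0.843 (12) | 0.781 (3) | 0.858 (9) | 0.575 (4) | 0.831 (24) | 0.685 (9) | 0.714 (2) | 0.979 (1) | 0.594 (4) | 0.310 (20) | 0.801 (1) | 0.892 (11) | 0.841 (2) | 0.819 (3) | 0.723 (3) | 0.940 (9) | 0.887 (2) | 0.725 (17)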
Alexey Nekrasov, Jonas Schult, Or Litany, Bastian Leibe, Francis Engelmann: Mix3D: Out-of-Context Data Augmentation for 3D Scenes. 3DV 2021 (Oral)
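Swin3D | permissive | 0.779 (2) | 0.861 (16) | 0.818 (11) | 0.836 (15) | 0.790 (1) | 0.875 (2) | 0.576 (3) | 0.905 (3) | 0.704 (3) | 0.739 (1) | 0.969 (7) | 0.611 (1) | 0.349 (6) | 0.756 (15) | 0.958 (1) | 0.702 (34) | 0.805 (12) | 0.708 (6) | 0.916 (24) | 0.898 (1) | 0.801 (1)
PPT-SpUNet-Joint | | 0.766 (3) | 0.932 (2) | 0.794 (25) | 0.829 (17) | 0.751 (13) | 0.854 (11) | 0.540 (14) | 0.903 (4) | 0.630 (26) | 0.672 (8) | 0.963 (9) | 0.565 (15) | 0.357 (4) | 0.788 (2) | 0.900 (7) | 0.737 (19) | 0.802 (13) | 0.685 (12) | 0.950 (2) | 0.887 (2) | 0.780 (2)
OctFormer | permissive | 0.766 (3) | 0.925 (3) | 0.808 (17) | 0.849 (6) | 0.786 (2) | 0.846 (17) | 0.566 (6) | 0.876 (11) | 0.690 (7) | 0.674 (7) | 0.960 (11) | 0.576 (11) | 0.226 (56) | 0.753 (17) | 0.904 (5) | 0.777 (6) | 0.815 (6) | 0.722 (4) | 0.923 (20) | 0.877 (8) | 0.776 (4)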
Peng-Shuai Wang: OctFormer: Octree-based Transformers for 3D Point Clouds. SIGGRAPH 2023
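CU-Hybrid Net | | 0.764 (5) | 0.924 (4) | 0.819 (9) | 0.840 (13) | 0.757 (9) | 0.853 (12) | 0.580 (1) | 0.848 (18) | 0.709 (2) | 0.643 (16) | 0.958 (14) | 0.587 (8) | 0.295 (26) | 0.753 (17) | 0.884 (15) | 0.758 (13) | 0.815 (6) | 0.725 (2) | 0.927 (19) | 0.867 (14) | 0.743 (9)
OccuSeg+Semantic | | 0.764 (5) | 0.758 (50) | 0.796 (23) | 0.839 (14) | 0.746 (15) | 0.907 (1) | 0.562 (7) | 0.850 (17) | 0.680 (11) | 0.672 (8) | 0.978 (2) | 0.610 (2) | 0.335 (10) | 0.777 (5) | 0.819 (35) | 0.847 (1) | 0.830 (1) | 0.691 (10) | 0.972 (1) | 0.885 (4) | 0.727 (15)
O-CNN | permissive | 0.762 (7) | 0.924 (4) | 0.823 (6) | 0.844 (11) | 0.770 (5) | 0.852 (13) | 0.577 (2) | 0.847 (20) | 0.711 (1) | 0.640 (20) | 0.958 (14) | 0.592 (5) | 0.217 (62) | 0.762 (12) | 0.888 (12) | 0.758 (13) | 0.813 (8) | 0.726 (1) | 0.932 (17) | 0.868 (13) | 0.744 (8)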
Peng-Shuai Wang, Yang Liu, Yu-Xiao Guo, Chun-Yu Sun, Xin Tong: O-CNN: Octree-based Convolutional Neural Networks for 3D Shape Analysis. SIGGRAPH 2017
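OA-CNN-L_ScanNet20 | | 0.756 (8) | 0.783 (38) | 0.826 (5) | 0.858 (4) | 0.776 (4) | 0.837 (24) | 0.548 (11) | 0.896 (7) | 0.649 (19) | 0.675 (6) | 0.962 (10) | 0.586 (9) | 0.335 (10) | 0.771 (8) | 0.802 (39) | 0.770 (9) | 0.787 (25) | 0.691 (10) | 0.936 (12) | 0.880 (7) | 0.761 (6)
PointTransformerV2 | | 0.752 (9) | 0.742 (57) | 0.809 (16) | 0.872 (1) | 0.758 (8) | 0.860 (8) | 0.552 (9) | 0.891 (8) | 0.610 (34) | 0.687 (3) | 0.960 (11) | 0.559 (18) | 0.304 (23) | 0.766 (10) | 0.926 (3) | 0.767 (10) | 0.797 (17) | 0.644 (26) | 0.942 (7) | 0.876 (11) | 0.722 (19)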
Xiaoyang Wu, Yixing Lao, Li Jiang, Xihui Liu, Hengshuang Zhao: Point Transformer V2: Grouped Vector Attention and Partition-based Pooling. NeurIPS 2022
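DMF-Net | | 0.752 (9) | 0.906 (8) | 0.793 (27) | 0.802 (33) | 0.689 (30) | 0.825 (35) | 0.556 (8) | 0.867 (13) | 0.681 (10) | 0.602 (34) | 0.960 (11) | 0.555 (20) | 0.365 (3) | 0.779 (4) | 0.859 (20) | 0.747 (16) | 0.795 (21) | 0.717 (5) | 0.917 (23) | 0.856 (22) | 0.764 (5)
PointConvFormer | | 0.749 (11) | 0.793 (35) | 0.790 (28) | 0.807 (28) | 0.750 (14) | 0.856 (10) | 0.524 (20) | 0.881 (10) | 0.588 (45) | 0.642 (19) | 0.977 (4) | 0.591 (6) | 0.274 (37) | 0.781 (3) | 0.929 (2) | 0.804 (3) | 0.796 (18) | 0.642 (27) | 0.947 (4) | 0.885 (4) | 0.715 (22)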
Wenxuan Wu, Qi Shan, Li Fuxin: PointConvFormer: Revenge of the Point-based Convolution.
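BPNet | copyleft | 0.749 (11) | 0.909 (6) | 0.818 (11) | 0.811 (25) | 0.752 (11) | 0.839 (23) | 0.485 (37) | 0.842 (21) | 0.673 (12) | 0.644 (15) | 0.957 (17) | 0.528 (29) | 0.305 (22) | 0.773 (7) | 0.859 (20) | 0.788 (4) | 0.818 (5) | 0.693 (9) | 0.916 (24) | 0.856 (22) | 0.723 (18)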
Wenbo Hu, Hengshuang Zhao, Li Jiang, Jiaya Jia, Tien-Tsin Wong: Bidirectional Projection Network for Cross Dimension Scene Understanding. CVPR 2021 (Oral)
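MSP | | 0.748 (13) | 0.623 (83) | 0.804 (19) | 0.859 (3) | 0.745 (16) | 0.824 (37) | 0.501 (28) | 0.912 (2) | 0.690 (7) | 0.685 (4) | 0.956 (18) | 0.567 (14) | 0.320 (16) | 0.768 (9) | 0.918 (4) | 0.720 (25) | 0.802 (13) | 0.676 (16) | 0.921 (21) | 0.881 (6) | 0.779 (3)
StratifiedFormer | permissive | 0.747 (14) | 0.901 (9) | 0.803 (20) | 0.845 (10) | 0.757 (9) | 0.846 (17) | 0.512 (24) | 0.825 (27) | 0.696 (6) | 0.645 (14) | 0.956 (18) | 0.576 (11) | 0.262 (47) | 0.744 (22) | 0.861 (19) | 0.742 (17) | 0.770 (34) | 0.705 (7) | 0.899 (37) | 0.860 (19) | 0.734 (10)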
Xin Lai*, Jianhui Liu*, Li Jiang, Liwei Wang, Hengshuang Zhao, Shu Liu, Xiaojuan Qi, Jiaya Jia: Stratified Transformer for 3D Point Cloud Segmentation. CVPR 2022
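VMNet | permissive | 0.746 (15) | 0.870 (14) | 0.838 (2) | 0.858 (4) | 0.729 (21) | 0.850 (15) | 0.501 (28) | 0.874 (12) | 0.587 (46) | 0.658 (12) | 0.956 (18) | 0.564 (16) | 0.299 (24) | 0.765 (11) | 0.900 (7) | 0.716 (28) | 0.812 (9) | 0.631 (32) | 0.939 (10) | 0.858 (20) | 0.709 (23)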
Zeyu HU, Xuyang Bai, Jiaxiang Shang, Runze Zhang, Jiayu Dong, Xin Wang, Guangyuan Sun, Hongbo Fu, Chiew-Lan Tai: VMNet: Voxel-Mesh Network for Geodesic-Aware 3D Semantic Segmentation. ICCV 2021 (Oral)
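Virtual MVFusion | | 0.746 (15) | 0.771 (44) | 0.819 (9) | 0.848 (8) | 0.702 (28) | 0.865 (7) | 0.397 (74) | 0.899 (5) | 0.699 (4) | 0.664 (11) | 0.948 (45) | 0.588 (7) | 0.330 (12) | 0.746 (21) | 0.851 (27) | 0.764 (11) | 0.796 (18) | 0.704 (8) | 0.935 (13) | 0.866 (15) | 0.728 (13)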
Abhijit Kundu, Xiaoqi Yin, Alireza Fathi, David Ross, Brian Brewington, Thomas Funkhouser, Caroline Pantofaru: Virtual Multi-view Fusion for 3D Semantic Segmentation. ECCV 2020
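Retro-FPN | | 0.744 (17) | 0.842 (22) | 0.800 (21) | 0.767 (45) | 0.740 (17) | 0.836 (26) | 0.541 (13) | 0.914 (1) | 0.672 (13) | 0.626 (23) | 0.958 (14) | 0.552 (21) | 0.272 (39) | 0.777 (5) | 0.886 (14) | 0.696 (35) | 0.801 (15) | 0.674 (18) | 0.941 (8) | 0.858 (20) | 0.717 (20)
EQ-Net | | 0.743 (18) | 0.620 (84) | 0.799 (22) | 0.849 (6) | 0.730 (20) | 0.822 (39) | 0.493 (35) | 0.897 (6) | 0.664 (14) | 0.681 (5) | 0.955 (21) | 0.562 (17) | 0.378 (1) | 0.760 (13) | 0.903 (6) | 0.738 (18) | 0.801 (15) | 0.673 (19) | 0.907 (29) | 0.877 (8) | 0.745 (7)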
Zetong Yang*, Li Jiang*, Yanan Sun, Bernt Schiele, Jiaya Jia: A Unified Query-based Paradigm for Point Cloud Understanding. CVPR 2022
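SAT | | 0.742 (19) | 0.860 (17) | 0.765 (39) | 0.819 (20) | 0.769 (6) | 0.848 (16) | 0.533 (16) | 0.829 (25) | 0.663 (15) | 0.631 (22) | 0.955 (21) | 0.586 (9) | 0.274 (37) | 0.753 (17) | 0.896 (9) | 0.729 (20) | 0.760 (41) | 0.666 (21) | 0.921 (21) | 0.855 (24) | 0.733 (11)
LRPNet | | 0.742 (19) | 0.816 (30) | 0.806 (18) | 0.807 (28) | 0.752 (11) | 0.828 (33) | 0.575 (4) | 0.839 (23) | 0.699 (4) | 0.637 (21) | 0.954 (26) | 0.520 (31) | 0.320 (16) | 0.755 (16) | 0.834 (31) | 0.760 (12) | 0.772 (31) | 0.676 (16) | 0.915 (26) | 0.862 (17) | 0.717 (20)
TXC | | 0.740 (21) | 0.842 (22) | 0.832 (4) | 0.805 (32) | 0.715 (25) | 0.846 (17) | 0.473 (39) | 0.885 (9) | 0.615 (30) | 0.671 (10) | 0.971 (6) | 0.547 (22) | 0.320 (16) | 0.697 (26) | 0.799 (41) | 0.777 (6) | 0.819 (3) | 0.682 (14) | 0.946 (5) | 0.871 (12) | 0.696 (27)
LargeKernel3D | | 0.739 (22) | 0.909 (6) | 0.820 (8) | 0.806 (30) | 0.740 (17) | 0.852 (13) | 0.545 (12) | 0.826 (26) | 0.594 (44) | 0.643 (16) | 0.955 (21) | 0.541 (24) | 0.263 (46) | 0.723 (24) | 0.858 (22) | 0.775 (8) | 0.767 (35) | 0.678 (15) | 0.933 (15) | 0.848 (29) | 0.694 (28)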
Yukang Chen*, Jianhui Liu*, Xiangyu Zhang, Xiaojuan Qi, Jiaya Jia: LargeKernel3D: Scaling up Kernels in 3D Sparse CNNs. CVPR 2023
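MinkowskiNet | permissive | 0.736 (23) | 0.859 (18) | 0.818 (11) | 0.832 (16) | 0.709 (26) | 0.840 (22) | 0.521 (22) | 0.853 (16) | 0.660 (17) | 0.643 (16) | 0.951 (35) | 0.544 (23) | 0.286 (31) | 0.731 (23) | 0.893 (10) | 0.675 (43) | 0.772 (31) | 0.683 (13) | 0.874 (55) | 0.852 (27) | 0.727 (15)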
C. Choy, J. Gwak, S. Savarese: 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks. CVPR 2019
IPCA0.731 240.890 100.837 30.864 20.726 220.873 30.530 190.824 280.489 770.647 130.978 20.609 30.336 90.624 410.733 490.758 130.776 290.570 560.949 30.877 80.728 13
PointTransformer++0.725 250.727 650.811 150.819 200.765 70.841 210.502 270.814 330.621 290.623 250.955 210.556 190.284 320.620 420.866 170.781 50.757 440.648 240.932 170.862 170.709 23
SparseConvNet0.725 250.647 800.821 70.846 90.721 230.869 40.533 160.754 470.603 400.614 270.955 210.572 130.325 140.710 250.870 160.724 230.823 20.628 330.934 140.865 160.683 31
MatchingNet0.724 270.812 320.812 140.810 260.735 190.834 280.495 340.860 150.572 520.602 340.954 260.512 330.280 340.757 140.845 290.725 220.780 270.606 420.937 110.851 280.700 26
INS-Conv-semantic0.717 280.751 530.759 420.812 240.704 270.868 50.537 150.842 210.609 360.608 300.953 290.534 260.293 270.616 430.864 180.719 270.793 220.640 280.933 150.845 330.663 36
PointMetaBase0.714 290.835 240.785 290.821 180.684 320.846 170.531 180.865 140.614 310.596 380.953 290.500 360.246 520.674 270.888 120.692 360.764 370.624 340.849 700.844 340.675 33
contrastBoundarypermissive0.705 300.769 470.775 340.809 270.687 310.820 420.439 620.812 340.661 160.591 400.945 530.515 320.171 800.633 380.856 230.720 250.796 180.668 200.889 440.847 300.689 29
Liyao Tang, Yibing Zhan, Zhe Chen, Baosheng Yu, Dacheng Tao: Contrastive Boundary Learning for Point Cloud Segmentation. CVPR 2022
RFCR0.702 310.889 110.745 510.813 230.672 350.818 460.493 350.815 320.623 270.610 280.947 470.470 460.249 510.594 460.848 280.705 320.779 280.646 250.892 420.823 400.611 50
Jingyu Gong, Jiachen Xu, Xin Tan, Haichuan Song, Yanyun Qu, Yuan Xie, Lizhuang Ma: Omni-Supervised Point Cloud Segmentation via Gradual Receptive Field Component Reasoning. CVPR 2021
One Thing One Click0.701 320.825 280.796 230.723 520.716 240.832 290.433 640.816 300.634 240.609 290.969 70.418 710.344 70.559 580.833 320.715 290.808 110.560 610.902 340.847 300.680 32
JSENetpermissive0.699 330.881 130.762 400.821 180.667 360.800 590.522 210.792 390.613 320.607 310.935 730.492 380.205 670.576 510.853 250.691 370.758 430.652 230.872 580.828 370.649 40
Zeyu Hu, Mingmin Zhen, Xuyang Bai, Hongbo Fu, Chiew-lan Tai: JSENet: Joint Semantic Segmentation and Edge Detection Network for 3D Point Clouds. ECCV 2020
One-Thing-One-Click0.693 340.743 560.794 250.655 760.684 320.822 390.497 330.719 570.622 280.617 260.977 40.447 580.339 80.750 200.664 650.703 330.790 240.596 460.946 50.855 240.647 41
Zhengzhe Liu, Xiaojuan Qi, Chi-Wing Fu: One Thing One Click: A Self-Training Approach for Weakly Supervised 3D Semantic Segmentation. CVPR 2021
PicassoNet-IIpermissive0.692 350.732 610.772 350.786 370.677 340.866 60.517 230.848 180.509 690.626 230.952 330.536 250.225 580.545 640.704 560.689 400.810 100.564 600.903 330.854 260.729 12
Huan Lei, Naveed Akhtar, Mubarak Shah, and Ajmal Mian: Geometric feature learning for 3D meshes.
Feature_GeometricNetpermissive0.690 360.884 120.754 460.795 360.647 420.818 460.422 660.802 370.612 330.604 320.945 530.462 490.189 750.563 570.853 250.726 210.765 360.632 310.904 310.821 430.606 54
Kangcheng Liu, Ben M. Chen: https://arxiv.org/abs/2012.09439. arXiv Preprint
FusionNet0.688 370.704 700.741 550.754 490.656 380.829 310.501 280.741 520.609 360.548 470.950 390.522 300.371 20.633 380.756 440.715 290.771 330.623 350.861 660.814 450.658 37
Feihu Zhang, Jin Fang, Benjamin Wah, Philip Torr: Deep FusionNet for Point Cloud Semantic Segmentation. ECCV 2020
Feature-Geometry Netpermissive0.685 380.866 150.748 480.819 200.645 440.794 620.450 510.802 370.587 460.604 320.945 530.464 480.201 700.554 600.840 300.723 240.732 530.602 440.907 290.822 420.603 57
KP-FCNN0.684 390.847 210.758 440.784 390.647 420.814 490.473 390.772 420.605 380.594 390.935 730.450 560.181 780.587 470.805 380.690 380.785 260.614 380.882 480.819 440.632 46
H. Thomas, C. Qi, J. Deschaud, B. Marcotegui, F. Goulette, L. Guibas: KPConv: Flexible and Deformable Convolution for Point Clouds. ICCV 2019
DGNet0.684 390.712 690.784 300.782 410.658 370.835 270.499 320.823 290.641 210.597 370.950 390.487 390.281 330.575 520.619 680.647 560.764 370.620 370.871 610.846 320.688 30
VACNN++0.684 390.728 640.757 450.776 420.690 290.804 560.464 450.816 300.577 510.587 410.945 530.508 350.276 360.671 280.710 540.663 480.750 470.589 510.881 490.832 360.653 39
Superpoint Network0.683 420.851 200.728 590.800 350.653 400.806 540.468 420.804 350.572 520.602 340.946 500.453 550.239 550.519 690.822 330.689 400.762 400.595 480.895 400.827 380.630 47
PointContrast_LA_SEM0.683 420.757 510.784 300.786 370.639 460.824 370.408 690.775 410.604 390.541 490.934 770.532 270.269 420.552 610.777 420.645 590.793 220.640 280.913 270.824 390.671 34
VI-PointConv0.676 440.770 460.754 460.783 400.621 500.814 490.552 90.758 450.571 540.557 450.954 260.529 280.268 440.530 670.682 600.675 430.719 560.603 430.888 450.833 350.665 35
Xingyi Li, Wenxuan Wu, Xiaoli Z. Fern, Li Fuxin: The Devils in the Point Clouds: Studying the Robustness of Point Cloud Convolutions.
ROSMRF3D0.673 450.789 360.748 480.763 470.635 480.814 490.407 710.747 490.581 500.573 420.950 390.484 400.271 410.607 440.754 450.649 530.774 300.596 460.883 470.823 400.606 54
SALANet0.670 460.816 300.770 370.768 440.652 410.807 530.451 480.747 490.659 180.545 480.924 830.473 450.149 900.571 540.811 370.635 620.746 480.623 350.892 420.794 570.570 67
PointConvpermissive0.666 470.781 390.759 420.699 610.644 450.822 390.475 380.779 400.564 570.504 650.953 290.428 650.203 690.586 490.754 450.661 490.753 450.588 520.902 340.813 470.642 42
Wenxuan Wu, Zhongang Qi, Li Fuxin: PointConv: Deep Convolutional Networks on 3D Point Clouds. CVPR 2019
PointASNLpermissive0.666 470.703 710.781 320.751 510.655 390.830 300.471 410.769 430.474 800.537 510.951 350.475 440.279 350.635 360.698 590.675 430.751 460.553 660.816 770.806 490.703 25
Xu Yan, Chaoda Zheng, Zhen Li, Sheng Wang, Shuguang Cui: PointASNL: Robust Point Clouds Processing using Nonlocal Neural Networks with Adaptive Sampling. CVPR 2020
PPCNN++permissive0.663 490.746 540.708 630.722 530.638 470.820 420.451 480.566 840.599 420.541 490.950 390.510 340.313 190.648 330.819 350.616 670.682 720.590 500.869 620.810 480.656 38
Pyunghwan Ahn, Juyoung Yang, Eojindl Yi, Chanho Lee, Junmo Kim: Projection-based Point Convolution for Efficient Point Cloud Segmentation. IEEE Access
DCM-Net0.658 500.778 400.702 660.806 300.619 510.813 520.468 420.693 650.494 720.524 570.941 640.449 570.298 250.510 710.821 340.675 430.727 550.568 580.826 750.803 510.637 44
Jonas Schult*, Francis Engelmann*, Theodora Kontogianni, Bastian Leibe: DualConvMesh-Net: Joint Geodesic and Euclidean Convolutions on 3D Meshes. CVPR 2020 [Oral]
HPGCNN0.656 510.698 720.743 530.650 770.564 680.820 420.505 260.758 450.631 250.479 700.945 530.480 420.226 560.572 530.774 430.690 380.735 510.614 380.853 690.776 720.597 60
Jisheng Dang, Qingyong Hu, Yulan Guo, Jun Yang: HPGCNN.
SAFNet-segpermissive0.654 520.752 520.734 570.664 740.583 630.815 480.399 730.754 470.639 220.535 530.942 620.470 460.309 210.665 290.539 740.650 520.708 610.635 300.857 680.793 590.642 42
Linqing Zhao, Jiwen Lu, Jie Zhou: Similarity-Aware Fusion Network for 3D Semantic Segmentation. IROS 2021
RandLA-Netpermissive0.645 530.778 400.731 580.699 610.577 640.829 310.446 530.736 530.477 790.523 590.945 530.454 530.269 420.484 780.749 480.618 650.738 490.599 450.827 740.792 620.621 49
MVPNetpermissive0.641 540.831 250.715 610.671 710.590 590.781 680.394 750.679 670.642 200.553 460.937 700.462 490.256 480.649 320.406 870.626 630.691 690.666 210.877 510.792 620.608 53
Maximilian Jaritz, Jiayuan Gu, Hao Su: Multi-view PointNet for 3D Scene Understanding. GMDL Workshop, ICCV 2019
PointConv-SFPN0.641 540.776 420.703 650.721 540.557 710.826 340.451 480.672 690.563 580.483 690.943 610.425 680.162 850.644 340.726 500.659 500.709 600.572 550.875 530.786 670.559 72
PointMRNet0.640 560.717 680.701 670.692 640.576 650.801 580.467 440.716 580.563 580.459 750.953 290.429 640.169 820.581 500.854 240.605 680.710 580.550 670.894 410.793 590.575 65
FPConvpermissive0.639 570.785 370.760 410.713 590.603 540.798 600.392 760.534 890.603 400.524 570.948 450.457 510.250 500.538 650.723 520.598 720.696 670.614 380.872 580.799 520.567 69
Yiqun Lin, Zizheng Yan, Haibin Huang, Dong Du, Ligang Liu, Shuguang Cui, Xiaoguang Han: FPConv: Learning Local Flattening for Point Convolution. CVPR 2020
PD-Net0.638 580.797 340.769 380.641 820.590 590.820 420.461 460.537 880.637 230.536 520.947 470.388 790.206 660.656 300.668 630.647 560.732 530.585 530.868 630.793 590.473 91
PointSPNet0.637 590.734 600.692 740.714 580.576 650.797 610.446 530.743 510.598 430.437 800.942 620.403 750.150 890.626 400.800 400.649 530.697 660.557 640.846 710.777 710.563 70
SConv0.636 600.830 260.697 700.752 500.572 670.780 700.445 550.716 580.529 630.530 540.951 350.446 590.170 810.507 730.666 640.636 610.682 720.541 720.886 460.799 520.594 61
Supervoxel-CNN0.635 610.656 780.711 620.719 550.613 520.757 790.444 580.765 440.534 620.566 430.928 810.478 430.272 390.636 350.531 760.664 470.645 820.508 800.864 650.792 620.611 50
joint point-basedpermissive0.634 620.614 850.778 330.667 730.633 490.825 350.420 670.804 350.467 820.561 440.951 350.494 370.291 280.566 550.458 820.579 790.764 370.559 630.838 720.814 450.598 59
Hung-Yueh Chiang, Yen-Liang Lin, Yueh-Cheng Liu, Winston H. Hsu: A Unified Point-Based Framework for 3D Segmentation. 3DV 2019
PointMTL0.632 630.731 620.688 770.675 680.591 580.784 670.444 580.565 850.610 340.492 670.949 430.456 520.254 490.587 470.706 550.599 710.665 780.612 410.868 630.791 660.579 64
APCF-Net0.631 640.742 570.687 790.672 690.557 710.792 650.408 690.665 700.545 600.508 620.952 330.428 650.186 760.634 370.702 570.620 640.706 620.555 650.873 560.798 540.581 63
Haojia Lin: Adaptive Pyramid Context Fusion for Point Cloud Perception. GRSL
3DSM_DMMF0.631 640.626 820.745 510.801 340.607 530.751 800.506 250.729 560.565 560.491 680.866 970.434 600.197 730.595 450.630 670.709 310.705 630.560 610.875 530.740 820.491 86
PointNet2-SFPN0.631 640.771 440.692 740.672 690.524 760.837 240.440 610.706 630.538 610.446 770.944 590.421 700.219 610.552 610.751 470.591 750.737 500.543 710.901 360.768 740.557 73
FusionAwareConv0.630 670.604 870.741 550.766 460.590 590.747 810.501 280.734 540.503 710.527 550.919 870.454 530.323 150.550 630.420 860.678 420.688 700.544 690.896 390.795 560.627 48
Jiazhao Zhang, Chenyang Zhu, Lintao Zheng, Kai Xu: Fusion-Aware Point Convolution for Online Semantic 3D Scene Segmentation. CVPR 2020
DenSeR0.628 680.800 330.625 890.719 550.545 740.806 540.445 550.597 790.448 860.519 600.938 690.481 410.328 130.489 770.499 810.657 510.759 420.592 490.881 490.797 550.634 45
SegGroup_sempermissive0.627 690.818 290.747 500.701 600.602 550.764 760.385 800.629 760.490 750.508 620.931 800.409 730.201 700.564 560.725 510.618 650.692 680.539 730.873 560.794 570.548 76
An Tao, Yueqi Duan, Yi Wei, Jiwen Lu, Jie Zhou: SegGroup: Seg-Level Supervision for 3D Instance and Semantic Segmentation. TIP 2022
SIConv0.625 700.830 260.694 720.757 480.563 690.772 740.448 520.647 730.520 650.509 610.949 430.431 630.191 740.496 750.614 690.647 560.672 760.535 750.876 520.783 680.571 66
HPEIN0.618 710.729 630.668 800.647 790.597 570.766 750.414 680.680 660.520 650.525 560.946 500.432 610.215 630.493 760.599 700.638 600.617 870.570 560.897 380.806 490.605 56
Li Jiang, Hengshuang Zhao, Shu Liu, Xiaoyong Shen, Chi-Wing Fu, Jiaya Jia: Hierarchical Point-Edge Interaction Network for Point Cloud Semantic Segmentation. ICCV 2019
SPH3D-GCNpermissive0.610 720.858 190.772 350.489 940.532 750.792 650.404 720.643 750.570 550.507 640.935 730.414 720.046 990.510 710.702 570.602 700.705 630.549 680.859 670.773 730.534 79
Huan Lei, Naveed Akhtar, and Ajmal Mian: Spherical Kernel for Efficient Graph Convolution on 3D Point Clouds. TPAMI 2020
AttAN0.609 730.760 490.667 810.649 780.521 770.793 630.457 470.648 720.528 640.434 820.947 470.401 760.153 880.454 800.721 530.648 550.717 570.536 740.904 310.765 750.485 87
Gege Zhang, Qinghua Ma, Licheng Jiao, Fang Liu and Qigong Sun: AttAN: Attention Adversarial Networks for 3D Point Cloud Semantic Segmentation. IJCAI 2020
wsss-transformer0.600 740.634 810.743 530.697 630.601 560.781 680.437 630.585 820.493 730.446 770.933 780.394 770.011 1010.654 310.661 660.603 690.733 520.526 760.832 730.761 770.480 88
dtc_net0.596 750.683 730.725 600.715 570.549 730.803 570.444 580.647 730.493 730.495 660.941 640.409 730.000 1030.424 850.544 730.598 720.703 650.522 770.912 280.792 620.520 82
LAP-D0.594 760.720 660.692 740.637 830.456 860.773 730.391 780.730 550.587 460.445 790.940 670.381 800.288 290.434 830.453 840.591 750.649 800.581 540.777 810.749 810.610 52
DPC0.592 770.720 660.700 680.602 870.480 820.762 780.380 810.713 610.585 490.437 800.940 670.369 820.288 290.434 830.509 800.590 770.639 850.567 590.772 820.755 790.592 62
Francis Engelmann, Theodora Kontogianni, Bastian Leibe: Dilated Point Convolutions: On the Receptive Field Size of Point Convolutions on 3D Point Clouds. ICRA 2020
CCRFNet0.589 780.766 480.659 840.683 660.470 850.740 830.387 790.620 780.490 750.476 710.922 850.355 850.245 530.511 700.511 790.571 800.643 830.493 840.872 580.762 760.600 58
ROSMRF0.580 790.772 430.707 640.681 670.563 690.764 760.362 830.515 900.465 830.465 740.936 720.427 670.207 650.438 810.577 710.536 830.675 750.486 850.723 880.779 690.524 81
SD-DETR0.576 800.746 540.609 930.445 980.517 780.643 940.366 820.714 600.456 840.468 730.870 960.432 610.264 450.558 590.674 610.586 780.688 700.482 860.739 860.733 840.537 78
SQN_0.1%0.569 810.676 750.696 710.657 750.497 790.779 710.424 650.548 860.515 670.376 870.902 940.422 690.357 40.379 880.456 830.596 740.659 790.544 690.685 910.665 950.556 74
TextureNetpermissive0.566 820.672 770.664 820.671 710.494 800.719 840.445 550.678 680.411 920.396 850.935 730.356 840.225 580.412 860.535 750.565 810.636 860.464 880.794 800.680 920.568 68
Jingwei Huang, Haotian Zhang, Li Yi, Thomas Funkhouser, Matthias Niessner, Leonidas Guibas: TextureNet: Consistent Local Parametrizations for Learning from High-Resolution Signals on Meshes. CVPR
DVVNet0.562 830.648 790.700 680.770 430.586 620.687 880.333 870.650 710.514 680.475 720.906 910.359 830.223 600.340 900.442 850.422 940.668 770.501 810.708 890.779 690.534 79
Pointnet++ & Featurepermissive0.557 840.735 590.661 830.686 650.491 810.744 820.392 760.539 870.451 850.375 880.946 500.376 810.205 670.403 870.356 900.553 820.643 830.497 820.824 760.756 780.515 83
GMLPs0.538 850.495 950.693 730.647 790.471 840.793 630.300 900.477 910.505 700.358 890.903 930.327 880.081 960.472 790.529 770.448 920.710 580.509 780.746 840.737 830.554 75
PanopticFusion-label0.529 860.491 960.688 770.604 860.386 910.632 950.225 1000.705 640.434 890.293 950.815 980.348 860.241 540.499 740.669 620.507 850.649 800.442 940.796 790.602 980.561 71
Gaku Narita, Takashi Seno, Tomoya Ishikawa, Yohsuke Kaji: PanopticFusion: Online Volumetric Semantic Mapping at the Level of Stuff and Things. IROS 2019 (to appear)
subcloud_weak0.516 870.676 750.591 960.609 840.442 870.774 720.335 860.597 790.422 910.357 900.932 790.341 870.094 950.298 920.528 780.473 900.676 740.495 830.602 970.721 870.349 98
Online SegFusion0.515 880.607 860.644 870.579 890.434 880.630 960.353 840.628 770.440 870.410 830.762 1010.307 900.167 830.520 680.403 880.516 840.565 900.447 920.678 920.701 890.514 84
Davide Menini, Suryansh Kumar, Martin R. Oswald, Erik Sandstroem, Cristian Sminchisescu, Luc van Gool: A Real-Time Learning Framework for Joint 3D Reconstruction and Semantic Segmentation. Robotics and Automation Letters Submission
3DMV, FTSDF0.501 890.558 910.608 940.424 1000.478 830.690 870.246 960.586 810.468 810.450 760.911 890.394 770.160 860.438 810.212 970.432 930.541 950.475 870.742 850.727 850.477 89
PCNN0.498 900.559 900.644 870.560 910.420 900.711 860.229 980.414 920.436 880.352 910.941 640.324 890.155 870.238 970.387 890.493 860.529 960.509 780.813 780.751 800.504 85
3DMV0.484 910.484 970.538 980.643 810.424 890.606 990.310 880.574 830.433 900.378 860.796 990.301 910.214 640.537 660.208 980.472 910.507 990.413 970.693 900.602 980.539 77
Angela Dai, Matthias Niessner: 3DMV: Joint 3D-Multi-View Prediction for 3D Semantic Scene Segmentation. ECCV'18
PointCNN with RGBpermissive0.458 920.577 890.611 920.356 1020.321 990.715 850.299 920.376 960.328 990.319 930.944 590.285 930.164 840.216 1000.229 950.484 880.545 940.456 900.755 830.709 880.475 90
Yangyan Li, Rui Bu, Mingchao Sun, Baoquan Chen: PointCNN. NeurIPS 2018
FCPNpermissive0.447 930.679 740.604 950.578 900.380 920.682 890.291 930.106 1020.483 780.258 1000.920 860.258 970.025 1000.231 990.325 910.480 890.560 920.463 890.725 870.666 940.231 102
Dario Rethage, Johanna Wald, Jürgen Sturm, Nassir Navab, Federico Tombari: Fully-Convolutional Point Networks for Large-Scale Point Clouds. ECCV 2018
DGCNN_reproducecopyleft0.446 940.474 980.623 900.463 960.366 940.651 920.310 880.389 950.349 970.330 920.937 700.271 950.126 920.285 930.224 960.350 990.577 890.445 930.625 950.723 860.394 94
Yue Wang, Yongbin Sun, Ziwei Liu, Sanjay E. Sarma, Michael M. Bronstein, Justin M. Solomon: Dynamic Graph CNN for Learning on Point Clouds. TOG 2019
PNET20.442 950.548 920.548 970.597 880.363 950.628 970.300 900.292 970.374 940.307 940.881 950.268 960.186 760.238 970.204 990.407 950.506 1000.449 910.667 930.620 970.462 92
SurfaceConvPF0.442 950.505 940.622 910.380 1010.342 970.654 910.227 990.397 940.367 950.276 970.924 830.240 980.198 720.359 890.262 930.366 960.581 880.435 950.640 940.668 930.398 93
Hao Pan, Shilin Liu, Yang Liu, Xin Tong: Convolutional Neural Networks on 3D Surfaces Using Parallel Frames.
Tangent Convolutionspermissive0.438 970.437 1000.646 860.474 950.369 930.645 930.353 840.258 990.282 1010.279 960.918 880.298 920.147 910.283 940.294 920.487 870.562 910.427 960.619 960.633 960.352 97
Maxim Tatarchenko, Jaesik Park, Vladlen Koltun, Qian-Yi Zhou: Tangent Convolutions for Dense Prediction in 3D. CVPR 2018
3DWSSS0.425 980.525 930.647 850.522 920.324 980.488 1020.077 1030.712 620.353 960.401 840.636 1030.281 940.176 790.340 900.565 720.175 1030.551 930.398 980.370 1030.602 980.361 96
SPLAT Netcopyleft0.393 990.472 990.511 990.606 850.311 1000.656 900.245 970.405 930.328 990.197 1010.927 820.227 1000.000 1030.001 1040.249 940.271 1020.510 970.383 1000.593 980.699 900.267 100
Hang Su, Varun Jampani, Deqing Sun, Subhransu Maji, Evangelos Kalogerakis, Ming-Hsuan Yang, Jan Kautz: SPLATNet: Sparse Lattice Networks for Point Cloud Processing. CVPR 2018
ScanNet+FTSDF0.383 1000.297 1020.491 1000.432 990.358 960.612 980.274 940.116 1010.411 920.265 980.904 920.229 990.079 970.250 950.185 1000.320 1000.510 970.385 990.548 990.597 1010.394 94
PointNet++permissive0.339 1010.584 880.478 1010.458 970.256 1020.360 1030.250 950.247 1000.278 1020.261 990.677 1020.183 1010.117 930.212 1010.145 1020.364 970.346 1030.232 1030.548 990.523 1020.252 101
Charles R. Qi, Li Yi, Hao Su, Leonidas J. Guibas: PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space.
SSC-UNetpermissive0.308 1020.353 1010.290 1030.278 1030.166 1030.553 1000.169 1020.286 980.147 1030.148 1030.908 900.182 1020.064 980.023 1030.018 1040.354 980.363 1010.345 1010.546 1010.685 910.278 99
ScanNetpermissive0.306 1030.203 1030.366 1020.501 930.311 1000.524 1010.211 1010.002 1040.342 980.189 1020.786 1000.145 1030.102 940.245 960.152 1010.318 1010.348 1020.300 1020.460 1020.437 1030.182 103
Angela Dai, Angel X. Chang, Manolis Savva, Maciej Halber, Thomas Funkhouser, Matthias Nießner: ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes. CVPR'17
ERROR0.054 1040.000 1040.041 1040.172 1040.030 1040.062 1040.001 1040.035 1030.004 1040.051 1040.143 1040.019 1040.003 1020.041 1020.050 1030.003 1040.054 1040.018 1040.005 1040.264 1040.082 104


This table lists the benchmark results for the 3D semantic instance scenario. Each entry gives a method's score followed by its rank in that column.




Method | Info | avg ap | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | window
Queryformer0.583 10.926 20.702 10.393 210.504 10.733 60.276 20.527 190.373 40.479 10.534 30.533 80.697 30.720 150.436 70.745 20.592 10.958 30.363 7
PBNetpermissive0.573 20.926 20.575 100.619 10.472 20.736 40.239 50.487 250.383 20.459 30.506 60.533 70.585 60.767 70.404 80.717 30.559 50.969 20.381 4
Mask3D0.566 30.926 20.597 50.408 180.420 40.737 30.239 40.598 80.386 10.458 40.549 10.568 50.716 20.601 250.480 30.646 90.575 30.922 50.364 6
Jonas Schult, Francis Engelmann, Alexander Hermans, Or Litany, Siyu Tang, Bastian Leibe: Mask3D for 3D Semantic Instance Segmentation. ICRA 2023
ISBNetpermissive0.559 40.926 20.597 60.390 220.436 30.722 70.276 30.556 150.380 30.450 50.505 70.583 20.730 10.575 260.455 50.603 150.573 40.979 10.332 13
Tuan Duc Ngo, Binh-Son Hua, Khoi Nguyen: ISBNet: a 3D Point Cloud Instance Segmentation Network with Instance-aware Sampling and Box-aware Dynamic Convolution. CVPR 2023
GraphCut0.552 51.000 10.611 40.438 130.392 70.714 80.139 80.598 90.327 70.389 70.510 50.598 10.427 230.754 100.463 40.761 10.588 20.903 90.329 14
SPFormerpermissive0.549 60.745 140.640 20.484 60.395 60.739 20.311 10.566 130.335 60.468 20.492 80.555 60.478 140.747 120.436 60.712 40.540 60.893 130.343 12
Sun Jiahao, Qing Chunmei, Tan Junpeng, Xu Xiangmin: Superpoint Transformer for 3D Scene Instance Segmentation. AAAI 2023 [Oral]
DKNet0.532 70.815 90.624 30.517 30.377 90.749 10.107 100.509 220.304 90.437 60.475 90.581 30.539 90.775 60.339 130.640 110.506 90.901 100.385 3
Yizheng Wu, Min Shi, Shuaiyuan Du, Hao Lu, Zhiguo Cao, Weicai Zhong: 3D Instances as 1D Kernels. ECCV 2022
IPCA-Inst0.520 80.889 60.551 140.548 20.418 50.665 180.064 190.585 100.260 170.277 190.471 110.500 90.644 40.785 40.369 90.591 180.511 70.878 180.362 8
SoftGroup++0.513 90.704 200.578 90.398 200.363 140.704 90.061 200.647 40.297 140.378 100.537 20.343 110.614 50.828 30.295 170.710 60.505 110.875 200.394 1
SSTNetpermissive0.506 100.738 170.549 150.497 50.316 190.693 120.178 70.377 330.198 220.330 120.463 130.576 40.515 110.857 20.494 10.637 120.457 150.943 40.290 23
Zhihao Liang, Zhihao Li, Songcen Xu, Mingkui Tan, Kui Jia: Instance Segmentation in 3D Scenes using Semantic Superpoint Tree Networks. ICCV 2021
SoftGrouppermissive0.504 110.667 270.579 70.372 250.381 80.694 110.072 160.677 20.303 100.387 80.531 40.319 150.582 70.754 90.318 140.643 100.492 120.907 80.388 2
Thang Vu, Kookhoi Kim, Tung M. Luu, Xuan Thanh Nguyen, Chang D. Yoo: SoftGroup for 3D Instance Segmentation on Point Clouds. CVPR 2022 [Oral]
TD3D0.489 120.852 70.511 230.434 140.322 180.735 50.101 130.512 210.355 50.349 110.468 120.283 190.514 120.676 210.268 220.671 70.510 80.908 70.329 15
OccuSeg+instance0.486 130.802 110.536 170.428 160.369 110.702 100.205 60.331 380.301 110.379 90.474 100.327 120.437 190.862 10.485 20.601 160.394 250.846 290.273 25
Lei Han, Tian Zheng, Lan Xu, Lu Fang: OccuSeg: Occupancy-aware 3D Instance Segmentation. CVPR 2020
TopoSeg0.479 140.704 200.564 110.467 90.366 120.633 260.068 170.554 160.262 160.328 130.447 150.323 130.534 100.722 140.288 190.614 130.482 130.912 60.358 10
DualGroup0.469 150.815 90.552 130.398 190.374 100.683 140.130 90.539 180.310 80.327 140.407 180.276 200.447 180.535 300.342 120.659 80.455 160.900 120.301 19
SSEC0.465 160.667 270.578 80.502 40.362 150.641 250.035 280.605 60.291 150.323 150.451 140.296 170.417 250.677 200.245 260.501 340.506 100.900 110.366 5
HAISpermissive0.457 170.704 200.561 120.457 100.364 130.673 150.046 270.547 170.194 230.308 160.426 160.288 180.454 170.711 160.262 230.563 250.434 190.889 150.344 11
Shaoyu Chen, Jiemin Fang, Qian Zhang, Wenyu Liu, Xinggang Wang: Hierarchical Aggregation for 3D Instance Segmentation. ICCV 2021
DD-UNet+Group0.436 180.630 340.508 260.480 70.310 200.624 290.065 180.638 50.174 240.256 230.384 210.194 310.428 210.759 80.289 180.574 220.400 230.849 270.291 22
H. Liu, R. Liu, K. Yang, J. Zhang, K. Peng, R. Stiefelhagen: HIDA: Towards Holistic Indoor Understanding for the Visually Impaired via Semantic Instance Segmentation with a Wearable Solid-State LiDAR Sensor. ICCVW 2021
INS-Conv-instance0.435 190.716 190.495 280.355 270.331 160.689 130.102 120.394 320.208 210.280 170.395 200.250 230.544 80.741 130.309 160.536 310.391 260.842 320.258 29
Mask-Group0.434 200.778 120.516 210.471 80.330 170.658 190.029 300.526 200.249 180.256 220.400 190.309 160.384 290.296 460.368 100.575 210.425 200.877 190.362 9
Min Zhong, Xinghao Chen, Xiaokang Chen, Gang Zeng, Yunhe Wang: MaskGroup: Hierarchical Point Grouping and Masking for 3D Instance Segmentation. ICME 2022
Box2Mask0.433 210.741 150.463 330.433 150.283 220.625 280.103 110.298 420.125 320.260 210.424 170.322 140.472 150.701 180.363 110.711 50.309 400.882 160.272 27
Julian Chibane, Francis Engelmann, Tuan Anh Tran, Gerard Pons-Moll: Box2Mask: Weakly Supervised 3D Semantic Instance Segmentation Using Bounding Boxes. ECCV 2022
RPGN0.428 220.630 340.508 250.367 260.249 290.658 200.016 370.673 30.131 300.234 260.383 220.270 210.434 200.748 110.274 210.609 140.406 220.842 310.267 28
Shichao Dong, Guosheng Lin, Tzu-Yi Hung: Learning Regional Purity for Instance Segmentation on 3D Point Clouds. ECCV 2022
DENet0.413 230.741 150.520 190.237 380.284 210.523 360.097 140.691 10.138 270.209 360.229 380.238 250.390 270.707 170.310 150.448 410.470 140.892 140.310 17
PointGroup0.407 240.639 330.496 270.415 170.243 310.645 240.021 350.570 120.114 330.211 340.359 240.217 290.428 220.660 220.256 240.562 260.341 320.860 230.291 21
Li Jiang, Hengshuang Zhao, Shaoshuai Shi, Shu Liu, Chi-Wing Fu, Jiaya Jia: PointGroup: Dual-Set Point Grouping for 3D Instance Segmentation. CVPR 2020 [oral]
CSC-Pretrained0.405 250.738 170.465 320.331 310.205 350.655 210.051 240.601 70.092 360.211 350.329 270.198 300.459 160.775 50.195 330.524 330.400 240.878 170.184 37
PE0.396 260.667 270.467 310.446 120.243 300.624 300.022 340.577 110.106 340.219 290.340 250.239 240.487 130.475 370.225 280.541 300.350 300.818 330.273 26
Biao Zhang, Peter Wonka: Point Cloud Instance Segmentation using Probabilistic Embeddings. CVPR 2021
Dyco3Dcopyleft0.395 270.642 320.518 200.447 110.259 280.666 170.050 250.251 460.166 250.231 270.362 230.232 260.331 320.535 290.229 270.587 190.438 180.850 250.317 16
Tong He, Chunhua Shen, Anton van den Hengel: DyCo3D: Robust Instance Segmentation of 3D Point Clouds through Dynamic Convolution. CVPR 2021
OSIS0.392 280.778 120.530 180.220 400.278 230.567 330.083 150.330 390.299 120.270 200.310 300.143 360.260 360.624 240.277 200.568 240.361 280.865 220.301 18
AOIA0.387 290.704 200.515 220.385 230.225 340.669 160.005 430.482 260.126 310.181 390.269 350.221 280.426 240.478 360.218 290.592 170.371 270.851 240.242 31
SSEN0.384 300.852 70.494 290.192 410.226 330.648 230.022 330.398 310.299 130.277 180.317 290.231 270.194 430.514 330.196 310.586 200.444 170.843 300.184 36
Dongsu Zhang, Junha Chun, Sang Kyun Cha, Young Min Kim: Spatial Semantic Embedding Network: Fast 3D Instance Segmentation with Deep Metric Learning. arXiv
PCJC0.375 310.704 200.542 160.284 350.197 370.649 220.006 410.426 270.138 280.242 240.304 310.183 340.388 280.629 230.141 430.546 290.344 310.738 390.283 24
SphereSeg0.357 320.651 310.411 350.345 280.264 270.630 270.059 210.289 440.212 190.240 250.336 260.158 350.305 330.557 270.159 390.455 400.341 330.726 410.294 20
3D-MPA0.355 330.457 450.484 300.299 330.277 240.591 320.047 260.332 360.212 200.217 300.278 320.193 320.413 260.410 400.195 320.574 230.352 290.849 260.213 34
Francis Engelmann, Martin Bokeloh, Alireza Fathi, Bastian Leibe, Matthias Nießner: 3D-MPA: Multi Proposal Aggregation for 3D Semantic Instance Segmentation. CVPR 2020
NeuralBF0.353 340.593 360.511 240.375 240.264 260.597 310.008 390.332 370.160 260.229 280.274 340.000 560.206 400.678 190.155 400.485 360.422 210.816 340.254 30
Weiwei Sun, Daniel Rebain, Renjie Liao, Vladimir Tankovich, Soroosh Yazdani, Kwang Moo Yi, Andrea Tagliasacchi: NeuralBF: Neural Bilateral Filtering for Top-down Instance Segmentation on Point Clouds. WACV 2023
RWSeg0.348 350.475 420.456 340.320 320.275 250.476 380.020 360.491 240.056 430.212 330.320 280.261 220.302 340.520 310.182 350.557 270.285 420.867 210.197 35
GICN0.341 360.580 370.371 370.344 290.198 360.469 390.052 230.564 140.093 350.212 320.212 400.127 380.347 310.537 280.206 300.525 320.329 350.729 400.241 32
One_Thing_One_Clickpermissive0.326 370.472 430.361 380.232 390.183 380.555 340.000 490.498 230.038 450.195 370.226 390.362 100.168 440.469 380.251 250.553 280.335 340.846 280.117 45
Zhengzhe Liu, Xiaojuan Qi, Chi-Wing Fu: One Thing One Click: A Self-Training Approach for Weakly Supervised 3D Semantic Segmentation. CVPR 2021
Occipital-SCS0.320 380.679 260.352 390.334 300.229 320.436 400.025 310.412 300.058 410.161 440.240 370.085 400.262 350.496 350.187 340.467 380.328 360.775 350.231 33
Sparse R-CNN0.292 390.704 200.213 490.153 430.154 400.551 350.053 220.212 470.132 290.174 410.274 330.070 420.363 300.441 390.176 360.424 430.234 440.758 370.161 41
MTML0.282 400.577 380.380 360.182 420.107 460.430 410.001 460.422 280.057 420.179 400.162 430.070 430.229 380.511 340.161 370.491 350.313 370.650 460.162 39
Jean Lahoud, Bernard Ghanem, Marc Pollefeys, Martin R. Oswald: 3D Instance Segmentation via Multi-task Metric Learning. ICCV 2019 [oral]
SALoss-ResNet0.262 410.667 270.335 400.067 500.123 440.427 420.022 320.280 450.058 400.216 310.211 410.039 460.142 460.519 320.106 470.338 470.310 390.721 420.138 42
Zhidong Liang, Ming Yang, Hao Li, Chunxiang Wang: 3D Instance Embedding Learning With a Structure-Aware Loss Function for Point Cloud Segmentation. IEEE Robotics and Automation Letters (IROS 2020)
MASCpermissive0.254 420.463 440.249 480.113 440.167 390.412 440.000 480.374 340.073 370.173 420.243 360.130 370.228 390.368 420.160 380.356 450.208 450.711 430.136 43
Chen Liu, Yasutaka Furukawa: MASC: Multi-scale Affinity with Sparse Convolution for 3D Instance Segmentation.
3D-BoNet0.253 430.519 400.324 430.251 370.137 430.345 490.031 290.419 290.069 380.162 430.131 450.052 440.202 420.338 440.147 420.301 500.303 410.651 450.178 38
Bo Yang, Jianan Wang, Ronald Clark, Qingyong Hu, Sen Wang, Andrew Markham, Niki Trigoni: Learning Object Bounding Boxes for 3D Instance Segmentation on Point Clouds. NeurIPS 2019 Spotlight
SPG_WSIS0.251 440.380 470.274 460.289 340.144 410.413 430.000 490.311 400.065 390.113 460.130 460.029 480.204 410.388 410.108 460.459 390.311 380.769 360.127 44
SegGroup_inspermissive0.246 450.556 390.335 410.062 520.115 450.490 370.000 490.297 430.018 490.186 380.142 440.083 410.233 370.216 480.153 410.469 370.251 430.744 380.083 48
An Tao, Yueqi Duan, Yi Wei, Jiwen Lu, Jie Zhou: SegGroup: Seg-Level Supervision for 3D Instance and Semantic Segmentation. TIP 2022
PanopticFusion-inst0.214 460.250 510.330 420.275 360.103 470.228 550.000 490.345 350.024 470.088 480.203 420.186 330.167 450.367 430.125 440.221 530.112 550.666 440.162 40
Gaku Narita, Takashi Seno, Tomoya Ishikawa, Yohsuke Kaji: PanopticFusion: Online Volumetric Semantic Mapping at the Level of Stuff and Things. IROS 2019 (to appear)
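UNet-backbone | - | 0.161 (47) | 0.519 (40) | 0.259 (47) | 0.084 (46) | 0.059 (49) | 0.325 (51) | 0.002 (44) | 0.093 (52) | 0.009 (51) | 0.077 (50) | 0.064 (49) | 0.045 (45) | 0.044 (53) | 0.161 (50) | 0.045 (49) | 0.331 (48) | 0.180 (47) | 0.566 (47) | 0.033 (56)
3D-SIS | permissive | 0.161 (47) | 0.407 (46) | 0.155 (53) | 0.068 (49) | 0.043 (53) | 0.346 (48) | 0.001 (45) | 0.134 (49) | 0.005 (52) | 0.088 (47) | 0.106 (48) | 0.037 (47) | 0.135 (48) | 0.321 (45) | 0.028 (52) | 0.339 (46) | 0.116 (54) | 0.466 (50) | 0.093 (47)
Ji Hou, Angela Dai, Matthias Niessner: 3D-SIS: 3D Semantic Instance Segmentation of RGB-D Scans. CVPR 2019
R-PointNet | - | 0.158 (49) | 0.356 (48) | 0.173 (51) | 0.113 (45) | 0.140 (42) | 0.359 (45) | 0.012 (38) | 0.023 (54) | 0.039 (44) | 0.134 (45) | 0.123 (47) | 0.008 (52) | 0.089 (49) | 0.149 (51) | 0.117 (45) | 0.221 (52) | 0.128 (52) | 0.563 (48) | 0.094 (46)
Region-18class | - | 0.146 (50) | 0.175 (55) | 0.321 (44) | 0.080 (47) | 0.062 (48) | 0.357 (46) | 0.000 (49) | 0.307 (41) | 0.002 (53) | 0.066 (51) | 0.044 (51) | 0.000 (56) | 0.018 (55) | 0.036 (55) | 0.054 (48) | 0.447 (42) | 0.133 (50) | 0.472 (49) | 0.060 (51)
SemRegionNet-20cls | - | 0.121 (51) | 0.296 (50) | 0.203 (50) | 0.071 (48) | 0.058 (50) | 0.349 (47) | 0.000 (49) | 0.150 (48) | 0.019 (48) | 0.054 (52) | 0.034 (53) | 0.017 (51) | 0.052 (51) | 0.042 (54) | 0.013 (55) | 0.209 (54) | 0.183 (46) | 0.371 (51) | 0.057 (52)
3D-BEVIS | - | 0.117 (52) | 0.250 (51) | 0.308 (45) | 0.020 (56) | 0.009 (57) | 0.269 (54) | 0.006 (42) | 0.008 (55) | 0.029 (46) | 0.037 (55) | 0.014 (56) | 0.003 (54) | 0.036 (54) | 0.147 (52) | 0.042 (50) | 0.381 (44) | 0.118 (53) | 0.362 (52) | 0.069 (50)
Cathrin Elich, Francis Engelmann, Jonas Schult, Theodora Kontogianni, Bastian Leibe: 3D-BEVIS: Birds-Eye-View Instance Segmentation.
Hier3D | copyleft | 0.117 (52) | 0.222 (53) | 0.161 (52) | 0.054 (54) | 0.027 (54) | 0.289 (52) | 0.000 (49) | 0.124 (50) | 0.001 (55) | 0.079 (49) | 0.061 (50) | 0.027 (49) | 0.141 (47) | 0.240 (47) | 0.005 (56) | 0.310 (49) | 0.129 (51) | 0.153 (56) | 0.081 (49)
Tan: HCFS3D: Hierarchical Coupled Feature Selection Network for 3D Semantic and Instance Segmentation.
tmp | - | 0.113 (54) | 0.333 (49) | 0.151 (54) | 0.056 (53) | 0.053 (51) | 0.344 (50) | 0.000 (49) | 0.105 (51) | 0.016 (50) | 0.049 (53) | 0.035 (52) | 0.020 (50) | 0.053 (50) | 0.048 (53) | 0.013 (54) | 0.183 (55) | 0.173 (48) | 0.344 (53) | 0.054 (53)
ASIS | - | 0.085 (55) | 0.037 (56) | 0.080 (56) | 0.066 (51) | 0.047 (52) | 0.282 (53) | 0.000 (49) | 0.052 (53) | 0.002 (54) | 0.047 (54) | 0.026 (54) | 0.001 (55) | 0.046 (52) | 0.194 (49) | 0.031 (51) | 0.264 (51) | 0.140 (49) | 0.167 (55) | 0.047 (55)
Sgpn_scannet | - | 0.049 (56) | 0.023 (57) | 0.134 (55) | 0.031 (55) | 0.013 (56) | 0.144 (56) | 0.006 (40) | 0.008 (56) | 0.000 (56) | 0.028 (56) | 0.017 (55) | 0.003 (53) | 0.009 (57) | 0.000 (56) | 0.021 (53) | 0.122 (56) | 0.095 (56) | 0.175 (54) | 0.054 (54)
MaskRCNN 2d->3d Proj | - | 0.022 (57) | 0.185 (54) | 0.000 (57) | 0.000 (57) | 0.015 (55) | 0.000 (57) | 0.000 (47) | 0.006 (57) | 0.000 (56) | 0.010 (57) | 0.006 (57) | 0.107 (39) | 0.012 (56) | 0.000 (56) | 0.002 (57) | 0.027 (57) | 0.004 (57) | 0.022 (57) | 0.001 (57)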


This table lists the benchmark results for the 2D semantic label scenario.


Method | Info | avg iou | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | floor | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | wall | window
Virtual MVFusion (R) | - | 0.745 (1) | 0.861 (1) | 0.839 (1) | 0.881 (1) | 0.672 (1) | 0.512 (1) | 0.422 (15) | 0.898 (1) | 0.723 (1) | 0.714 (1) | 0.954 (2) | 0.454 (1) | 0.509 (1) | 0.773 (1) | 0.895 (1) | 0.756 (1) | 0.820 (1) | 0.653 (1) | 0.935 (1) | 0.891 (1) | 0.728 (1)
Abhijit Kundu, Xiaoqi Yin, Alireza Fathi, David Ross, Brian Brewington, Thomas Funkhouser, Caroline Pantofaru: Virtual Multi-view Fusion for 3D Semantic Segmentation. ECCV 2020
BPNet_2D | copyleft | 0.670 (2) | 0.822 (3) | 0.795 (3) | 0.836 (2) | 0.659 (2) | 0.481 (2) | 0.451 (11) | 0.769 (3) | 0.656 (3) | 0.567 (3) | 0.931 (3) | 0.395 (4) | 0.390 (4) | 0.700 (3) | 0.534 (3) | 0.689 (9) | 0.770 (2) | 0.574 (3) | 0.865 (6) | 0.831 (3) | 0.675 (4)
Wenbo Hu, Hengshuang Zhao, Li Jiang, Jiaya Jia and Tien-Tsin Wong: Bidirectional Projection Network for Cross Dimension Scene Understanding. CVPR 2021 (Oral)
CU-Hybrid-2D Net | - | 0.636 (3) | 0.825 (2) | 0.820 (2) | 0.179 (20) | 0.648 (3) | 0.463 (3) | 0.549 (2) | 0.742 (6) | 0.676 (2) | 0.628 (2) | 0.961 (1) | 0.420 (2) | 0.379 (5) | 0.684 (6) | 0.381 (15) | 0.732 (2) | 0.723 (3) | 0.599 (2) | 0.827 (13) | 0.851 (2) | 0.634 (6)
CMX | - | 0.613 (4) | 0.681 (7) | 0.725 (9) | 0.502 (12) | 0.634 (5) | 0.297 (15) | 0.478 (9) | 0.830 (2) | 0.651 (4) | 0.537 (6) | 0.924 (4) | 0.375 (5) | 0.315 (12) | 0.686 (5) | 0.451 (12) | 0.714 (4) | 0.543 (18) | 0.504 (5) | 0.894 (4) | 0.823 (4) | 0.688 (3)
DMMF_3d | - | 0.605 (5) | 0.651 (8) | 0.744 (7) | 0.782 (3) | 0.637 (4) | 0.387 (4) | 0.536 (3) | 0.732 (7) | 0.590 (6) | 0.540 (5) | 0.856 (18) | 0.359 (9) | 0.306 (13) | 0.596 (11) | 0.539 (2) | 0.627 (18) | 0.706 (4) | 0.497 (7) | 0.785 (18) | 0.757 (16) | 0.476 (19)
MCA-Net | - | 0.595 (6) | 0.533 (17) | 0.756 (6) | 0.746 (4) | 0.590 (8) | 0.334 (7) | 0.506 (6) | 0.670 (12) | 0.587 (7) | 0.500 (10) | 0.905 (8) | 0.366 (8) | 0.352 (8) | 0.601 (10) | 0.506 (6) | 0.669 (15) | 0.648 (7) | 0.501 (6) | 0.839 (12) | 0.769 (12) | 0.516 (18)
RFBNet | - | 0.592 (7) | 0.616 (9) | 0.758 (5) | 0.659 (5) | 0.581 (9) | 0.330 (8) | 0.469 (10) | 0.655 (15) | 0.543 (12) | 0.524 (7) | 0.924 (4) | 0.355 (10) | 0.336 (10) | 0.572 (14) | 0.479 (8) | 0.671 (13) | 0.648 (7) | 0.480 (9) | 0.814 (16) | 0.814 (5) | 0.614 (9)
FAN_NV_RVC | - | 0.586 (8) | 0.510 (18) | 0.764 (4) | 0.079 (23) | 0.620 (7) | 0.330 (8) | 0.494 (7) | 0.753 (4) | 0.573 (8) | 0.556 (4) | 0.884 (13) | 0.405 (3) | 0.303 (14) | 0.718 (2) | 0.452 (11) | 0.672 (12) | 0.658 (5) | 0.509 (4) | 0.898 (3) | 0.813 (6) | 0.727 (2)
DCRedNet | - | 0.583 (9) | 0.682 (6) | 0.723 (10) | 0.542 (11) | 0.510 (17) | 0.310 (12) | 0.451 (11) | 0.668 (13) | 0.549 (11) | 0.520 (8) | 0.920 (6) | 0.375 (5) | 0.446 (2) | 0.528 (17) | 0.417 (13) | 0.670 (14) | 0.577 (15) | 0.478 (10) | 0.862 (7) | 0.806 (7) | 0.628 (8)
MIX6D_RVC | - | 0.582 (10) | 0.695 (4) | 0.687 (14) | 0.225 (18) | 0.632 (6) | 0.328 (10) | 0.550 (1) | 0.748 (5) | 0.623 (5) | 0.494 (13) | 0.890 (11) | 0.350 (12) | 0.254 (20) | 0.688 (4) | 0.454 (10) | 0.716 (3) | 0.597 (14) | 0.489 (8) | 0.881 (5) | 0.768 (13) | 0.575 (12)
SSMA | copyleft | 0.577 (11) | 0.695 (4) | 0.716 (12) | 0.439 (14) | 0.563 (11) | 0.314 (11) | 0.444 (13) | 0.719 (8) | 0.551 (10) | 0.503 (9) | 0.887 (12) | 0.346 (13) | 0.348 (9) | 0.603 (9) | 0.353 (17) | 0.709 (5) | 0.600 (12) | 0.457 (12) | 0.901 (2) | 0.786 (8) | 0.599 (11)
Abhinav Valada, Rohit Mohan, Wolfram Burgard: Self-Supervised Model Adaptation for Multimodal Semantic Segmentation. International Journal of Computer Vision, 2019
UNIV_CNP_RVC_UE | - | 0.566 (12) | 0.569 (16) | 0.686 (16) | 0.435 (15) | 0.524 (14) | 0.294 (16) | 0.421 (16) | 0.712 (9) | 0.543 (12) | 0.463 (15) | 0.872 (14) | 0.320 (14) | 0.363 (7) | 0.611 (8) | 0.477 (9) | 0.686 (10) | 0.627 (9) | 0.443 (15) | 0.862 (7) | 0.775 (11) | 0.639 (5)
EMSAFormer | - | 0.564 (13) | 0.581 (13) | 0.736 (8) | 0.564 (10) | 0.546 (13) | 0.219 (20) | 0.517 (4) | 0.675 (11) | 0.486 (17) | 0.427 (19) | 0.904 (9) | 0.352 (11) | 0.320 (11) | 0.589 (12) | 0.528 (4) | 0.708 (6) | 0.464 (21) | 0.413 (19) | 0.847 (11) | 0.786 (8) | 0.611 (10)
SN_RN152pyrx8_RVC | copyleft | 0.546 (14) | 0.572 (14) | 0.663 (18) | 0.638 (7) | 0.518 (15) | 0.298 (14) | 0.366 (21) | 0.633 (18) | 0.510 (15) | 0.446 (17) | 0.864 (16) | 0.296 (17) | 0.267 (17) | 0.542 (16) | 0.346 (18) | 0.704 (7) | 0.575 (16) | 0.431 (16) | 0.853 (10) | 0.766 (14) | 0.630 (7)
UDSSEG_RVC | - | 0.545 (15) | 0.610 (11) | 0.661 (19) | 0.588 (8) | 0.556 (12) | 0.268 (18) | 0.482 (8) | 0.642 (17) | 0.572 (9) | 0.475 (14) | 0.836 (20) | 0.312 (15) | 0.367 (6) | 0.630 (7) | 0.189 (20) | 0.639 (17) | 0.495 (20) | 0.452 (13) | 0.826 (14) | 0.756 (17) | 0.541 (14)
segfomer with 6d | - | 0.542 (16) | 0.594 (12) | 0.687 (14) | 0.146 (21) | 0.579 (10) | 0.308 (13) | 0.515 (5) | 0.703 (10) | 0.472 (18) | 0.498 (11) | 0.868 (15) | 0.369 (7) | 0.282 (15) | 0.589 (12) | 0.390 (14) | 0.701 (8) | 0.556 (17) | 0.416 (18) | 0.860 (9) | 0.759 (15) | 0.539 (16)
FuseNet | permissive | 0.535 (17) | 0.570 (15) | 0.681 (17) | 0.182 (19) | 0.512 (16) | 0.290 (17) | 0.431 (14) | 0.659 (14) | 0.504 (16) | 0.495 (12) | 0.903 (10) | 0.308 (16) | 0.428 (3) | 0.523 (18) | 0.365 (16) | 0.676 (11) | 0.621 (11) | 0.470 (11) | 0.762 (19) | 0.779 (10) | 0.541 (14)
Caner Hazirbas, Lingni Ma, Csaba Domokos, Daniel Cremers: FuseNet: Incorporating Depth into Semantic Segmentation via Fusion-based CNN Architecture. ACCV 2016
AdapNet++ | copyleft | 0.503 (18) | 0.613 (10) | 0.722 (11) | 0.418 (16) | 0.358 (23) | 0.337 (6) | 0.370 (20) | 0.479 (21) | 0.443 (19) | 0.368 (21) | 0.907 (7) | 0.207 (20) | 0.213 (22) | 0.464 (21) | 0.525 (5) | 0.618 (19) | 0.657 (6) | 0.450 (14) | 0.788 (17) | 0.721 (20) | 0.408 (22)
Abhinav Valada, Rohit Mohan, Wolfram Burgard: Self-Supervised Model Adaptation for Multimodal Semantic Segmentation. International Journal of Computer Vision, 2019
3DMV (2d proj) | - | 0.498 (19) | 0.481 (21) | 0.612 (20) | 0.579 (9) | 0.456 (19) | 0.343 (5) | 0.384 (18) | 0.623 (19) | 0.525 (14) | 0.381 (20) | 0.845 (19) | 0.254 (19) | 0.264 (19) | 0.557 (15) | 0.182 (21) | 0.581 (21) | 0.598 (13) | 0.429 (17) | 0.760 (20) | 0.661 (22) | 0.446 (21)
Angela Dai, Matthias Niessner: 3DMV: Joint 3D-Multi-View Prediction for 3D Semantic Scene Segmentation. ECCV'18
MSeg1080_RVC | permissive | 0.485 (20) | 0.505 (19) | 0.709 (13) | 0.092 (22) | 0.427 (20) | 0.241 (19) | 0.411 (17) | 0.654 (16) | 0.385 (23) | 0.457 (16) | 0.861 (17) | 0.053 (23) | 0.279 (16) | 0.503 (19) | 0.481 (7) | 0.645 (16) | 0.626 (10) | 0.365 (21) | 0.748 (21) | 0.725 (19) | 0.529 (17)
John Lambert*, Zhuang Liu*, Ozan Sener, James Hays, Vladlen Koltun: MSeg: A Composite Dataset for Multi-domain Semantic Segmentation. CVPR 2020
ILC-PSPNet | - | 0.475 (21) | 0.490 (20) | 0.581 (21) | 0.289 (17) | 0.507 (18) | 0.067 (23) | 0.379 (19) | 0.610 (20) | 0.417 (21) | 0.435 (18) | 0.822 (22) | 0.278 (18) | 0.267 (17) | 0.503 (19) | 0.228 (19) | 0.616 (20) | 0.533 (19) | 0.375 (20) | 0.820 (15) | 0.729 (18) | 0.560 (13)
Enet (reimpl) | - | 0.376 (22) | 0.264 (23) | 0.452 (23) | 0.452 (13) | 0.365 (21) | 0.181 (21) | 0.143 (23) | 0.456 (22) | 0.409 (22) | 0.346 (22) | 0.769 (23) | 0.164 (21) | 0.218 (21) | 0.359 (22) | 0.123 (23) | 0.403 (23) | 0.381 (23) | 0.313 (23) | 0.571 (22) | 0.685 (21) | 0.472 (20)
Re-implementation of Adam Paszke, Abhishek Chaurasia, Sangpil Kim, Eugenio Culurciello: ENet: A Deep Neural Network Architecture for Real-Time Semantic Segmentation.
ScanNet (2d proj) | permissive | 0.330 (23) | 0.293 (22) | 0.521 (22) | 0.657 (6) | 0.361 (22) | 0.161 (22) | 0.250 (22) | 0.004 (23) | 0.440 (20) | 0.183 (23) | 0.836 (20) | 0.125 (22) | 0.060 (23) | 0.319 (23) | 0.132 (22) | 0.417 (22) | 0.412 (22) | 0.344 (22) | 0.541 (23) | 0.427 (23) | 0.109 (23)
Angela Dai, Angel X. Chang, Manolis Savva, Maciej Halber, Thomas Funkhouser, Matthias Nießner: ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes. CVPR'17
DMMF | - | 0.003 (24) | 0.000 (24) | 0.005 (24) | 0.000 (24) | 0.000 (24) | 0.037 (24) | 0.001 (24) | 0.000 (24) | 0.001 (24) | 0.005 (24) | 0.003 (24) | 0.000 (24) | 0.000 (24) | 0.000 (24) | 0.000 (24) | 0.000 (24) | 0.002 (24) | 0.001 (24) | 0.000 (24) | 0.006 (24) | 0.000 (24)
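
The rank shown next to each score appears consistent with a per-column competition ranking, where tied scores share the best rank and the following rank is skipped. The page does not state the scheme, so the Python sketch below is an assumption checked against the floor column of the 2D semantic label table; `competition_ranks` and `floor_iou` are illustrative names, not part of any benchmark tooling.

```python
def competition_ranks(scores):
    """Rank scores in descending order; ties share the best rank (1, 2, 2, 4)."""
    # Assumed scheme: rank = 1 + number of strictly greater scores in the column.
    return [1 + sum(other > s for other in scores) for s in scores]


# 'floor' IoU values copied from the 2D semantic label table, in row order
# (Virtual MVFusion (R) first, DMMF last).
floor_iou = [
    0.954, 0.931, 0.961, 0.924, 0.856, 0.905, 0.924, 0.884,
    0.920, 0.890, 0.887, 0.872, 0.904, 0.864, 0.836, 0.868,
    0.903, 0.907, 0.845, 0.861, 0.822, 0.769, 0.836, 0.003,
]

floor_ranks = competition_ranks(floor_iou)
print(floor_ranks)
```

Under this assumption the two methods tied at 0.924 both receive rank 4 and the next score (0.920) receives rank 6, which matches the ranks listed in the table.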


This table lists the benchmark results for the 2D semantic instance scenario.




Method | Info | avg ap | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | window
UniDet_RVC | - | 0.205 (1) | 0.381 (1) | 0.323 (1) | 0.037 (1) | 0.226 (1) | 0.177 (1) | 0.063 (1) | 0.277 (1) | 0.120 (1) | 0.067 (1) | 0.131 (1) | 0.074 (2) | 0.317 (1) | 0.080 (1) | 0.235 (1) | 0.289 (1) | 0.141 (1) | 0.678 (1) | 0.080 (1)
MaskRCNN_ScanNet | permissive | 0.119 (2) | 0.129 (2) | 0.212 (2) | 0.002 (2) | 0.112 (2) | 0.148 (2) | 0.014 (2) | 0.205 (2) | 0.044 (2) | 0.066 (2) | 0.078 (2) | 0.095 (1) | 0.142 (2) | 0.030 (2) | 0.128 (2) | 0.139 (2) | 0.080 (2) | 0.459 (2) | 0.057 (2)
Re-implementation of Kaiming He, Georgia Gkioxari, Piotr Dollár, Ross Girshick: Mask R-CNN. ICCV'17


This table lists the benchmark results for the scene type classification scenario.




Method | Info | avg recall | apartment | bathroom | bedroom / hotel | bookstore / library | conference room | copy/mail room | hallway | kitchen | laundry room | living room / lounge | misc | office | storage / basement / garage
multi-task | permissive | 0.700 (1) | 0.500 (1) | 1.000 (1) | 0.882 (2) | 0.500 (2) | 1.000 (1) | 1.000 (1) | 0.500 (2) | 1.000 (1) | 1.000 (1) | 0.778 (1) | 0.000 (2) | 0.938 (1) | 0.000 (2)
Shengyu Huang, Mikhail Usvyatsov, Konrad Schindler: Indoor Scene Recognition in 3D. IROS 2020
3DASPP-SCE | - | 0.691 (2) | 0.500 (1) | 0.938 (2) | 0.824 (3) | 1.000 (1) | 1.000 (1) | 0.500 (2) | 1.000 (1) | 0.857 (2) | 0.500 (2) | 0.556 (3) | 0.000 (2) | 0.812 (2) | 0.500 (1)
SE-ResNeXt-SSMA | - | 0.498 (3) | 0.000 (4) | 0.812 (3) | 0.941 (1) | 0.500 (2) | 0.500 (3) | 0.500 (2) | 0.500 (2) | 0.429 (4) | 0.500 (2) | 0.667 (2) | 0.500 (1) | 0.625 (3) | 0.000 (2)
Abhinav Valada, Rohit Mohan, Wolfram Burgard: Self-Supervised Model Adaptation for Multimodal Semantic Segmentation. arXiv
resnet50_scannet | - | 0.353 (4) | 0.250 (3) | 0.812 (3) | 0.529 (4) | 0.500 (2) | 0.500 (3) | 0.000 (4) | 0.500 (2) | 0.571 (3) | 0.000 (4) | 0.556 (3) | 0.000 (2) | 0.375 (4) | 0.000 (2)
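
The avg recall column is consistent with an unweighted mean over the 13 scene types. The page does not give the formula, so the Python sketch below treats that as an assumption and recomputes the average from the per-class values in the scene type classification table; `rows` and `mean_recall` are illustrative names.

```python
def mean_recall(per_class):
    """Unweighted mean of per-class recall (assumed definition of avg recall)."""
    return sum(per_class) / len(per_class)


# method -> (reported avg recall, per-class recall in table column order:
# apartment, bathroom, bedroom / hotel, bookstore / library, conference room,
# copy/mail room, hallway, kitchen, laundry room, living room / lounge,
# misc, office, storage / basement / garage)
rows = {
    "multi-task": (0.700, [0.500, 1.000, 0.882, 0.500, 1.000, 1.000, 0.500,
                           1.000, 1.000, 0.778, 0.000, 0.938, 0.000]),
    "3DASPP-SCE": (0.691, [0.500, 0.938, 0.824, 1.000, 1.000, 0.500, 1.000,
                           0.857, 0.500, 0.556, 0.000, 0.812, 0.500]),
    "SE-ResNeXt-SSMA": (0.498, [0.000, 0.812, 0.941, 0.500, 0.500, 0.500, 0.500,
                                0.429, 0.500, 0.667, 0.500, 0.625, 0.000]),
    "resnet50_scannet": (0.353, [0.250, 0.812, 0.529, 0.500, 0.500, 0.000, 0.500,
                                 0.571, 0.000, 0.556, 0.000, 0.375, 0.000]),
}

for name, (reported, per_class) in rows.items():
    recomputed = mean_recall(per_class)
    print(f"{name}: reported {reported:.3f}, recomputed {recomputed:.3f}")
```

Because the mean is unweighted, a scene type with few test scans (for example misc, where three of the four methods score 0.000) moves the average as much as a frequent one such as office.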