Presenting the ScanNet200 Benchmark

We present the ScanNet200 benchmark, which studies an order of magnitude more class categories than the previous version of ScanNet. The scene geometry is shared between the two benchmarks, but a finer parsing of the surface annotations allows for a larger vocabulary and a more realistic setting for in-the-wild 3D understanding methods.

The ScanNet200 benchmark includes both finer-grained categories and a large number of previously unaddressed classes. This yields a much more challenging setting that covers the diversity of semantic classes naturally observed in the raw ScanNet RGB-D data, including the class imbalances that occur in practice. The difference in category frequencies between ScanNet and ScanNet200 can be seen in the figure above.

ScanNet200 Benchmark

This table lists the benchmark results for the ScanNet200 3D semantic label scenario.




Method | Info | avg iou | head iou | common iou | tail iou | wall | chair | floor | table | door | couch | cabinet | shelf | desk | office chair | bed | pillow | sink | picture | window | toilet | bookshelf | monitor | curtain | book | armchair | coffee table | box | refrigerator | lamp | kitchen cabinet | towel | clothes | tv | nightstand | counter | dresser | stool | cushion | plant | ceiling | bathtub | end table | dining table | keyboard | bag | backpack | toilet paper | printer | tv stand | whiteboard | blanket | shower curtain | trash can | closet | stairs | microwave | stove | shoe | computer tower | bottle | bin | ottoman | bench | board | washing machine | mirror | copier | basket | sofa chair | file cabinet | fan | laptop | shower | paper | person | paper towel dispenser | oven | blinds | rack | plate | blackboard | piano | suitcase | rail | radiator | recycling bin | container | wardrobe | soap dispenser | telephone | bucket | clock | stand | light | laundry basket | pipe | clothes dryer | guitar | toilet paper holder | seat | speaker | column | bicycle | ladder | bathroom stall | shower wall | cup | jacket | storage bin | coffee maker | dishwasher | paper towel roll | machine | mat | windowsill | bar | toaster | bulletin board | ironing board | fireplace | soap dish | kitchen counter | doorframe | toilet paper dispenser | mini fridge | fire extinguisher | ball | hat | shower curtain rod | water cooler | paper cutter | tray | shower door | pillar | ledge | toaster oven | mouse | toilet seat cover dispenser | furniture | cart | storage container | scale | tissue box | light switch | crate | power outlet | decoration | sign | projector | closet door | vacuum cleaner | candle | plunger | stuffed animal | headphones | dish rack | broom | guitar case | range hood | dustpan | hair dryer | water bottle | handicap bar | purse | vent | shower floor | water pitcher | mailbox | bowl | paper bag | alarm clock | music stand | projector screen | divider | laundry detergent | bathroom counter | object | bathroom vanity | closet wall | laundry hamper | bathroom stall door | ceiling light | trash bin | dumbbell | stair rail | tube | bathroom cabinet | cd case | closet rod | coffee kettle | structure | shower head | keyboard piano | case of water bottles | coat rack | storage organizer | folded chair | fire alarm | power strip | calendar | poster | potted plant | luggage | mattress
CeCo0.340 10.551 10.247 10.181 10.784 10.661 20.939 10.564 10.624 10.721 10.484 10.429 10.575 10.027 40.774 10.503 20.753 10.242 20.656 10.945 10.534 10.865 20.860 10.177 50.616 20.400 10.818 10.579 10.615 10.367 20.408 10.726 30.633 10.162 10.360 10.619 10.000 10.828 10.873 10.924 10.109 30.083 20.564 10.057 50.475 20.266 10.781 10.767 10.257 10.100 20.825 10.663 10.048 50.620 30.551 20.595 30.532 10.692 30.246 10.000 30.213 10.615 10.861 20.376 10.900 10.000 10.102 50.660 10.321 30.547 10.226 10.000 10.311 10.742 10.011 20.006 20.000 10.000 20.546 50.824 10.345 20.665 10.450 10.435 10.683 10.411 10.338 10.000 30.000 10.030 40.000 10.068 40.892 10.000 10.063 20.000 50.257 10.304 40.387 30.079 30.228 10.190 10.000 50.586 10.347 10.133 20.000 20.037 20.377 40.000 10.384 20.006 40.003 20.421 10.410 20.643 10.171 20.121 20.142 40.000 10.510 40.447 10.474 30.000 10.000 50.286 10.083 10.000 30.000 20.603 10.096 10.063 10.000 20.000 10.000 10.898 10.000 10.429 10.000 10.400 10.550 10.000 10.633 10.000 10.000 10.377 10.000 50.916 10.000 30.000 10.000 10.000 20.000 10.102 50.499 30.296 20.463 20.089 30.304 10.740 10.401 40.010 10.000 10.560 10.000 10.000 20.709 10.652 10.000 20.000 10.000 10.143 20.000 20.000 10.609 10.000 1
Understanding Imbalanced Semantic Segmentation Through Neural Collapse.
Minkowski 34Dpermissive0.253 40.463 40.154 50.102 40.771 40.650 40.932 30.483 40.571 40.710 40.331 40.250 40.492 30.044 30.703 40.419 50.606 50.227 50.621 40.865 50.531 20.771 50.813 20.291 10.484 40.242 40.612 50.282 50.440 50.351 30.299 30.622 40.593 40.027 50.293 30.310 50.000 10.757 20.858 30.737 50.150 20.164 10.368 50.084 10.381 50.142 50.357 30.720 20.214 40.092 40.724 40.596 50.056 40.655 10.525 40.581 50.352 50.594 40.056 50.000 30.014 50.224 40.772 30.205 50.720 40.000 10.159 20.531 40.163 50.294 40.136 50.000 10.169 40.589 40.000 30.000 30.000 10.002 10.663 10.466 50.265 50.582 30.337 20.016 40.559 30.084 50.000 20.000 30.000 10.036 30.000 10.125 30.670 40.000 10.102 10.071 30.164 30.406 20.386 40.046 50.068 50.159 30.117 10.284 40.111 50.094 40.000 20.000 50.197 50.000 10.044 30.013 30.002 30.228 50.307 50.588 20.025 50.545 10.134 50.000 10.655 10.302 30.282 50.000 10.060 10.000 30.035 50.000 30.000 20.097 50.000 30.000 20.005 10.000 10.000 10.096 50.000 10.334 40.000 10.000 30.274 40.000 10.513 50.000 10.000 10.280 20.194 20.897 20.000 30.000 10.000 10.000 20.000 10.108 40.279 50.189 40.141 50.059 50.272 20.307 50.445 10.003 20.000 10.353 40.000 10.026 10.000 30.581 40.001 10.000 10.000 10.093 50.002 10.000 10.000 30.000 1
C. Choy, J. Gwak, S. Savarese: 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks. CVPR 2019
AWCS0.305 20.508 20.225 20.142 20.782 20.634 50.937 20.489 30.578 20.721 10.364 30.355 20.515 20.023 50.764 20.523 10.707 30.264 10.633 20.922 20.507 40.886 10.804 30.179 30.436 50.300 20.656 40.529 20.501 30.394 10.296 40.820 10.603 20.131 20.179 50.619 10.000 10.707 40.865 20.773 20.171 10.010 40.484 20.063 30.463 30.254 20.332 40.649 30.220 30.100 20.729 30.613 30.071 30.582 40.628 10.702 10.424 30.749 10.137 30.000 30.142 20.360 20.863 10.305 20.877 20.000 10.173 10.606 20.337 20.478 20.154 30.000 10.253 20.664 20.000 30.000 30.000 10.000 20.626 30.782 20.302 40.602 20.185 40.282 20.651 20.317 20.000 20.000 30.000 10.022 50.000 10.154 10.876 20.000 10.014 30.063 40.029 50.553 10.467 20.084 20.124 20.157 40.049 40.373 20.252 20.097 30.000 20.219 10.542 10.000 10.392 10.172 20.000 40.339 20.417 10.533 40.093 30.115 30.195 20.000 10.516 30.288 40.741 10.000 10.001 40.233 20.056 20.000 30.159 10.334 20.077 20.000 20.000 20.000 10.000 10.749 20.000 10.411 20.000 10.008 20.452 20.000 10.595 20.000 10.000 10.220 30.006 30.894 30.006 20.000 10.000 10.000 20.000 10.112 30.504 20.404 10.551 10.093 20.129 50.484 30.381 50.000 30.000 10.396 30.000 10.000 20.620 20.402 50.000 20.000 10.000 10.142 30.000 20.000 10.512 20.000 1
LGroundpermissive0.272 30.485 30.184 30.106 30.778 30.676 10.932 30.479 50.572 30.718 30.399 20.265 30.453 40.085 20.745 30.446 30.726 20.232 40.622 30.901 30.512 30.826 30.786 40.178 40.549 30.277 30.659 30.381 30.518 20.295 50.323 20.777 20.599 30.028 40.321 20.363 40.000 10.708 30.858 30.746 40.063 40.022 30.457 30.077 20.476 10.243 30.402 20.397 50.233 20.077 50.720 50.610 40.103 10.629 20.437 50.626 20.446 20.702 20.190 20.005 10.058 40.322 30.702 40.244 30.768 30.000 10.134 40.552 30.279 40.395 30.147 40.000 10.207 30.612 30.000 30.000 30.000 10.000 20.658 20.566 30.323 30.525 50.229 30.179 30.467 50.154 40.000 20.002 10.000 10.051 10.000 10.127 20.703 30.000 10.000 40.216 10.112 40.358 30.547 10.187 10.092 40.156 50.055 30.296 30.252 20.143 10.000 20.014 30.398 30.000 10.028 40.173 10.000 40.265 40.348 30.415 50.179 10.019 40.218 10.000 10.597 20.274 50.565 20.000 10.012 30.000 30.039 40.022 20.000 20.117 30.000 30.000 20.000 20.000 10.000 10.324 40.000 10.384 30.000 10.000 30.251 50.000 10.566 30.000 10.000 10.066 40.404 10.886 40.199 10.000 10.000 10.059 10.000 10.136 10.540 10.127 50.295 30.085 40.143 40.514 20.413 30.000 30.000 10.498 20.000 10.000 20.000 30.623 20.000 20.000 10.000 10.132 40.000 20.000 10.000 30.000 1
David Rozenberszki, Or Litany, Angela Dai: Language-Grounded Indoor 3D Semantic Segmentation in the Wild. arXiv
CSC-Pretrainpermissive0.249 50.455 50.171 40.079 50.766 50.659 30.930 50.494 20.542 50.700 50.314 50.215 50.430 50.121 10.697 50.441 40.683 40.235 30.609 50.895 40.476 50.816 40.770 50.186 20.634 10.216 50.734 20.340 40.471 40.307 40.293 50.591 50.542 50.076 30.205 40.464 30.000 10.484 50.832 50.766 30.052 50.000 50.413 40.059 40.418 40.222 40.318 50.609 40.206 50.112 10.743 20.625 20.076 20.579 50.548 30.590 40.371 40.552 50.081 40.003 20.142 20.201 50.638 50.233 40.686 50.000 10.142 30.444 50.375 10.247 50.198 20.000 10.128 50.454 50.019 10.097 10.000 10.000 20.553 40.557 40.373 10.545 40.164 50.014 50.547 40.174 30.000 20.002 10.000 10.037 20.000 10.063 50.664 50.000 10.000 40.130 20.170 20.152 50.335 50.079 30.110 30.175 20.098 20.175 50.166 40.045 50.207 10.014 30.465 20.000 10.001 50.001 50.046 10.299 30.327 40.537 30.033 40.012 50.186 30.000 10.205 50.377 20.463 40.000 10.058 20.000 30.055 30.041 10.000 20.105 40.000 30.000 20.000 20.000 10.000 10.398 30.000 10.308 50.000 10.000 30.319 30.000 10.543 40.000 10.000 10.062 50.004 40.862 50.000 30.000 10.000 10.000 20.000 10.123 20.316 40.225 30.250 40.094 10.180 30.332 40.441 20.000 30.000 10.310 50.000 10.000 20.000 30.592 30.000 20.000 10.000 10.203 10.000 20.000 10.000 30.000 1
Ji Hou, Benjamin Graham, Matthias Nießner, Saining Xie: Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts. CVPR 2021
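
The avg, head, common, and tail IoU columns above are averages of per-class intersection-over-union scores. As a rough illustration only, the following sketch computes per-class IoU as TP / (TP + FP + FN) from flat label lists and averages it over class groups. The function names, the toy labels, and the three-class head/common/tail grouping are hypothetical, and skipping classes that never occur is an assumption here, not necessarily what the official evaluation does.

```python
def per_class_iou(gt, pred, num_classes):
    """Return IoU per class, or None for classes absent from both gt and pred."""
    tp = [0] * num_classes
    fp = [0] * num_classes
    fn = [0] * num_classes
    for g, p in zip(gt, pred):
        if g == p:
            tp[g] += 1
        else:
            fn[g] += 1  # the true class was missed
            fp[p] += 1  # the predicted class was wrongly assigned
    ious = []
    for c in range(num_classes):
        denom = tp[c] + fp[c] + fn[c]
        ious.append(tp[c] / denom if denom else None)
    return ious


def mean_iou(ious, class_ids):
    """Average IoU over the given class ids, skipping undefined entries (assumption)."""
    vals = [ious[c] for c in class_ids if ious[c] is not None]
    return sum(vals) / len(vals) if vals else float("nan")


# Toy example: six labeled points, three classes.
gt = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
ious = per_class_iou(gt, pred, num_classes=3)

# Hypothetical grouping of class ids by frequency.
groups = {"head": [0], "common": [1], "tail": [2]}
avg_iou = mean_iou(ious, range(3))
group_iou = {name: mean_iou(ious, ids) for name, ids in groups.items()}
```

With several classes per group, `group_iou` would show how rarely seen tail classes can pull the overall average down even when head classes score well.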


This table lists the benchmark results for the ScanNet200 3D semantic instance scenario.




Method | Info | avg ap 25% | head ap 25% | common ap 25% | tail ap 25% | chair | table | door | couch | cabinet | shelf | desk | office chair | bed | pillow | sink | picture | window | toilet | bookshelf | monitor | curtain | book | armchair | coffee table | box | refrigerator | lamp | kitchen cabinet | towel | clothes | tv | nightstand | counter | dresser | stool | cushion | plant | ceiling | bathtub | end table | dining table | keyboard | bag | backpack | toilet paper | printer | tv stand | whiteboard | blanket | shower curtain | trash can | closet | stairs | microwave | stove | shoe | computer tower | bottle | bin | ottoman | bench | board | washing machine | mirror | copier | basket | sofa chair | file cabinet | fan | laptop | shower | paper | person | paper towel dispenser | oven | blinds | rack | plate | blackboard | piano | suitcase | rail | radiator | recycling bin | container | wardrobe | soap dispenser | telephone | bucket | clock | stand | light | laundry basket | pipe | clothes dryer | guitar | toilet paper holder | seat | speaker | column | bicycle | ladder | bathroom stall | shower wall | cup | jacket | storage bin | coffee maker | dishwasher | paper towel roll | machine | mat | windowsill | bar | toaster | bulletin board | ironing board | fireplace | soap dish | kitchen counter | doorframe | toilet paper dispenser | mini fridge | fire extinguisher | ball | hat | shower curtain rod | water cooler | paper cutter | tray | shower door | pillar | ledge | toaster oven | mouse | toilet seat cover dispenser | furniture | cart | storage container | scale | tissue box | light switch | crate | power outlet | decoration | sign | projector | closet door | vacuum cleaner | candle | plunger | stuffed animal | headphones | dish rack | broom | guitar case | range hood | dustpan | hair dryer | water bottle | handicap bar | purse | vent | shower floor | water pitcher | mailbox | bowl | paper bag | alarm clock | music stand | projector screen | divider | laundry detergent | bathroom counter | object | bathroom vanity | closet wall | laundry hamper | bathroom stall door | ceiling light | trash bin | dumbbell | stair rail | tube | bathroom cabinet | cd case | closet rod | coffee kettle | structure | shower head | keyboard piano | case of water bottles | coat rack | storage organizer | folded chair | fire alarm | power strip | calendar | poster | potted plant | luggage | mattress
Mask3D Scannet2000.445 10.653 10.392 10.254 10.844 20.746 20.818 10.888 40.556 10.262 10.890 10.025 21.000 10.608 10.930 10.694 30.721 10.930 50.686 30.966 10.615 40.440 10.725 40.201 10.890 30.414 40.827 10.552 10.158 50.806 10.924 10.042 30.512 20.412 50.226 10.604 30.830 11.000 10.125 10.792 10.815 10.097 10.648 10.551 20.354 41.000 10.630 10.241 21.000 10.853 10.204 10.974 40.841 10.778 10.358 20.927 10.300 10.045 10.640 10.363 10.745 20.710 11.000 10.000 10.330 20.943 10.315 20.600 11.000 10.027 10.080 50.556 50.500 10.409 10.000 10.194 11.000 10.500 10.493 20.761 20.053 40.042 30.780 20.454 10.009 10.333 10.050 10.321 10.000 10.084 10.552 20.008 20.027 20.750 10.500 10.442 30.657 10.765 20.120 20.183 30.021 21.000 10.510 20.016 10.000 10.400 10.619 10.000 10.396 10.290 10.000 10.741 10.699 11.000 10.260 10.017 30.125 50.000 10.792 40.399 41.000 10.000 10.049 30.265 10.063 30.000 31.000 10.335 20.381 10.500 10.250 10.004 20.000 10.727 20.000 10.538 30.000 10.188 10.677 20.000 10.930 10.000 10.000 10.966 10.391 10.908 20.000 10.028 10.000 11.000 10.000 10.152 10.451 20.458 10.971 10.573 10.606 10.167 50.625 10.004 10.000 10.058 50.000 10.000 11.000 11.000 10.000 10.056 10.000 20.200 30.309 10.000 21.000 10.000 1
Jonas Schult, Francis Engelmann, Alexander Hermans, Or Litany, Siyu Tang, Bastian Leibe: Mask3D for 3D Semantic Instance Segmentation. ICRA 2023
TD3D Scannet2000.379 20.603 20.306 20.190 20.885 10.755 10.800 20.958 10.390 20.260 20.866 20.232 10.979 20.523 30.869 30.559 50.689 21.000 10.795 10.905 20.748 10.173 50.825 10.173 20.970 10.457 10.615 20.456 20.200 10.621 40.906 20.553 10.517 10.510 10.220 20.715 10.706 21.000 10.113 20.792 10.717 20.073 20.635 20.557 10.638 11.000 10.205 50.146 31.000 10.769 50.186 21.000 10.710 50.778 10.415 10.834 40.226 20.021 20.590 20.356 20.817 10.477 51.000 10.000 10.635 10.843 20.427 10.270 40.125 20.000 20.102 31.000 10.125 20.000 20.000 10.000 20.000 30.125 40.370 30.622 50.221 10.196 20.836 10.288 20.000 20.093 20.020 20.294 20.000 10.075 20.667 10.038 10.111 10.250 40.000 40.526 20.495 30.908 10.111 30.259 10.003 30.667 20.045 50.000 20.000 10.400 10.274 30.000 10.274 20.226 20.000 10.520 20.302 50.731 20.103 30.458 10.500 10.000 11.000 10.472 10.792 30.000 10.088 20.061 20.250 10.009 20.250 20.333 30.181 20.396 20.051 20.012 10.000 10.458 40.000 10.424 50.000 10.101 20.390 50.000 10.833 20.000 10.000 10.857 20.222 31.000 10.000 10.003 20.000 10.000 20.000 10.102 20.275 50.400 20.735 20.061 30.433 30.533 30.625 10.000 20.000 10.259 40.000 10.000 10.000 20.500 20.000 10.000 21.000 10.600 10.000 20.250 10.000 20.000 1
LGround Inst.permissive0.314 30.529 30.225 30.155 30.810 30.625 30.798 30.940 20.372 30.217 30.484 30.000 30.927 30.528 20.826 50.694 20.605 31.000 10.731 20.846 30.716 30.350 20.589 50.123 40.857 40.457 20.578 30.376 40.183 20.765 30.800 30.000 40.278 40.500 20.000 30.659 20.569 41.000 10.093 30.000 30.539 30.010 30.578 50.378 40.571 21.000 10.337 30.252 10.530 50.814 30.000 40.744 50.743 30.746 30.346 30.863 30.067 30.000 30.400 30.167 30.667 30.488 41.000 10.000 10.208 40.783 30.166 40.375 20.071 50.000 20.200 10.607 40.000 30.000 20.000 10.000 21.000 10.500 10.517 10.716 40.221 20.000 40.706 30.085 50.000 20.000 30.000 30.077 40.000 10.063 30.278 30.000 30.000 30.500 20.083 30.181 50.515 20.286 30.144 10.219 20.042 10.582 40.400 30.000 20.000 10.000 50.305 20.000 10.000 40.036 30.000 10.413 30.500 20.533 50.250 20.200 20.500 10.000 11.000 10.472 11.000 10.000 10.000 40.000 30.250 10.000 30.000 30.333 30.000 30.000 30.000 30.000 30.000 10.600 30.000 10.594 20.000 10.000 30.500 30.000 10.647 50.000 10.000 10.429 30.333 20.500 50.000 10.000 30.000 10.000 20.000 10.069 30.696 10.050 50.556 30.031 50.042 50.750 10.250 40.000 20.000 10.630 10.000 10.000 10.000 20.500 20.000 10.000 20.000 20.400 20.000 20.000 20.000 20.000 1
David Rozenberszki, Or Litany, Angela Dai: Language-Grounded Indoor 3D Semantic Segmentation in the Wild.
Minkowski 34D Inst.permissive0.280 40.488 40.192 50.124 40.804 40.518 40.772 50.904 30.337 50.191 40.443 40.000 30.861 40.502 40.868 40.669 40.587 40.997 30.467 50.828 50.732 20.342 30.745 30.119 50.918 20.404 50.419 40.398 30.172 30.618 50.743 40.167 20.077 50.500 20.000 30.568 40.506 51.000 10.044 40.000 30.502 40.010 40.593 40.284 50.305 50.903 50.213 40.142 40.981 30.790 40.000 41.000 10.715 40.538 50.346 40.830 50.067 30.000 30.400 30.074 40.333 40.551 21.000 10.000 10.292 30.777 40.118 50.317 30.100 40.000 20.191 20.648 30.000 30.000 20.000 10.000 20.000 30.500 10.213 50.825 10.021 50.333 10.648 50.098 40.000 20.000 30.000 30.077 30.000 10.000 50.150 50.000 30.000 30.000 50.225 20.281 40.447 40.000 50.090 40.148 40.000 40.479 50.542 10.000 20.000 10.200 30.131 50.000 10.250 30.000 40.000 10.159 50.396 40.677 30.021 40.000 40.500 10.000 11.000 10.442 30.125 50.000 10.000 40.000 30.000 40.333 10.000 30.528 10.000 30.000 30.000 30.000 30.000 10.200 50.000 10.516 40.000 10.000 30.500 30.000 10.833 20.000 10.000 10.286 40.083 40.750 30.000 10.000 30.000 10.000 20.000 10.059 50.445 30.200 30.535 40.070 20.167 40.385 40.375 30.000 20.000 10.333 30.000 10.000 10.000 20.500 20.000 10.000 20.000 20.200 30.000 20.000 20.000 20.000 1
C. Choy, J. Gwak, S. Savarese: 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks. CVPR 2019
CSC-Pretrain Inst.permissive0.275 50.466 50.218 40.110 50.783 50.383 50.783 40.829 50.367 40.168 50.305 50.000 30.661 50.413 50.869 20.719 10.546 50.997 30.685 40.841 40.555 50.277 40.768 20.132 30.779 50.448 30.364 50.212 50.161 40.768 20.692 50.000 40.395 30.500 20.000 30.450 50.591 31.000 10.020 50.000 30.423 50.007 50.625 30.420 30.505 31.000 10.353 20.119 50.571 40.819 20.014 31.000 10.774 20.689 40.311 50.866 20.067 30.000 30.400 30.000 50.278 50.501 31.000 10.000 10.162 50.584 50.286 30.206 50.125 20.000 20.084 40.649 20.000 30.000 20.000 10.000 20.000 30.125 40.312 40.727 30.221 20.000 40.667 40.114 30.000 20.000 30.000 30.065 50.000 10.004 40.278 30.000 30.000 30.500 20.000 40.571 10.000 50.250 40.019 50.145 50.000 40.667 20.200 40.000 20.000 10.200 30.258 40.000 10.000 40.000 40.000 10.369 40.429 30.613 40.000 50.000 40.500 10.000 10.500 50.333 50.500 40.000 10.106 10.000 30.000 40.000 30.000 30.333 30.000 30.000 30.000 30.000 30.000 10.918 10.000 10.638 10.000 10.000 30.750 10.000 10.833 20.000 10.000 10.143 50.000 50.750 30.000 10.000 30.000 10.000 20.000 10.063 40.377 40.200 30.222 50.055 40.500 20.677 20.250 40.000 20.000 10.500 20.000 10.000 10.000 20.500 20.000 10.000 20.000 20.115 50.000 20.000 20.000 20.000 1
Ji Hou, Benjamin Graham, Matthias Nießner, Saining Xie: Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts. CVPR 2021
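
The "ap 25%" columns above report average precision where a predicted instance counts as correct if it overlaps a ground-truth instance with IoU of at least 0.25. The sketch below is a simplified, single-class illustration under stated assumptions: instances are represented as sets of point indices, matching is greedy in order of confidence, and AP is the non-interpolated mean of precision at each true positive. The function names and toy data are hypothetical, and the official evaluation script may integrate the precision-recall curve differently.

```python
def mask_iou(a, b):
    """IoU between two instances given as sets of point indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def average_precision(preds, gts, iou_thresh=0.25):
    """preds: list of (confidence, point_set); gts: list of point_set."""
    if not gts:
        return float("nan")
    matched = set()  # indices of ground-truth instances already claimed
    hits = []        # True for a true positive, False for a false positive
    for _, mask in sorted(preds, key=lambda p: p[0], reverse=True):
        best_iou, best_j = 0.0, None
        for j, gt in enumerate(gts):
            if j in matched:
                continue
            iou = mask_iou(mask, gt)
            if iou > best_iou:
                best_iou, best_j = iou, j
        if best_j is not None and best_iou >= iou_thresh:
            matched.add(best_j)
            hits.append(True)
        else:
            hits.append(False)
    tp, ap = 0, 0.0
    for rank, hit in enumerate(hits, start=1):
        if hit:
            tp += 1
            ap += tp / rank  # precision at this true positive
    return ap / len(gts)


# Toy example: two ground-truth instances, three predictions.
gts = [{0, 1, 2, 3}, {10, 11, 12, 13}]
preds = [
    (0.9, {0, 1, 2}),                 # IoU 0.75 with the first instance
    (0.8, {20, 21}),                  # overlaps nothing
    (0.7, set(range(10, 20)) | {30, 31}),  # IoU 4/12 with the second instance
]
ap_25 = average_precision(preds, gts, iou_thresh=0.25)
ap_50 = average_precision(preds, gts, iou_thresh=0.5)
```

In this toy case the loose third prediction is accepted at the 25% threshold but rejected at 50%, which is why scores at the lower threshold tend to be more forgiving of imprecise masks.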


ScanNet Benchmark

This table lists the benchmark results for the 3D semantic label scenario.


Method | Info | avg iou | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | floor | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | wall | window
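Mix3D | permissive | 0.781 (1) | 0.964 (1) | 0.855 (1) | 0.843 (10) | 0.781 (1) | 0.858 (7) | 0.575 (3) | 0.831 (19) | 0.685 (7) | 0.714 (1) | 0.979 (1) | 0.594 (3) | 0.310 (17) | 0.801 (1) | 0.892 (8) | 0.841 (2) | 0.819 (3) | 0.723 (3) | 0.940 (8) | 0.887 (1) | 0.725 (12)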
Alexey Nekrasov, Jonas Schult, Or Litany, Bastian Leibe, Francis Engelmann: Mix3D: Out-of-Context Data Augmentation for 3D Scenes. 3DV 2021 (Oral)
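OccuSeg+Semantic | - | 0.764 (2) | 0.758 (46) | 0.796 (20) | 0.839 (12) | 0.746 (11) | 0.907 (1) | 0.562 (5) | 0.850 (13) | 0.680 (9) | 0.672 (5) | 0.978 (2) | 0.610 (1) | 0.335 (8) | 0.777 (4) | 0.819 (32) | 0.847 (1) | 0.830 (1) | 0.691 (8) | 0.972 (1) | 0.885 (2) | 0.727 (10)
CU-Hybrid Net | - | 0.764 (2) | 0.924 (2) | 0.819 (8) | 0.840 (11) | 0.757 (6) | 0.853 (9) | 0.580 (1) | 0.848 (14) | 0.709 (2) | 0.643 (12) | 0.958 (10) | 0.587 (7) | 0.295 (23) | 0.753 (14) | 0.884 (12) | 0.758 (11) | 0.815 (6) | 0.725 (2) | 0.927 (17) | 0.867 (10) | 0.743 (5)
O-CNN | permissive | 0.762 (4) | 0.924 (2) | 0.823 (5) | 0.844 (9) | 0.770 (2) | 0.852 (10) | 0.577 (2) | 0.847 (15) | 0.711 (1) | 0.640 (16) | 0.958 (10) | 0.592 (4) | 0.217 (58) | 0.762 (10) | 0.888 (9) | 0.758 (11) | 0.813 (7) | 0.726 (1) | 0.932 (15) | 0.868 (9) | 0.744 (4)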
Peng-Shuai Wang, Yang Liu, Yu-Xiao Guo, Chun-Yu Sun, Xin Tong: O-CNN: Octree-based Convolutional Neural Networks for 3D Shape Analysis. SIGGRAPH 2017
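PointTransformerV2 | - | 0.752 (5) | 0.742 (53) | 0.809 (14) | 0.872 (1) | 0.758 (5) | 0.860 (6) | 0.552 (7) | 0.891 (5) | 0.610 (30) | 0.687 (2) | 0.960 (8) | 0.559 (14) | 0.304 (20) | 0.766 (8) | 0.926 (2) | 0.767 (8) | 0.797 (14) | 0.644 (22) | 0.942 (6) | 0.876 (7) | 0.722 (14)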
Xiaoyang Wu, Yixing Lao, Li Jiang, Xihui Liu, Hengshuang Zhao: Point Transformer V2: Grouped Vector Attention and Partition-based Pooling. NeurIPS 2022
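DMF-Net | - | 0.752 (5) | 0.906 (6) | 0.793 (23) | 0.802 (29) | 0.689 (27) | 0.825 (31) | 0.556 (6) | 0.867 (9) | 0.681 (8) | 0.602 (30) | 0.960 (8) | 0.555 (16) | 0.365 (3) | 0.779 (3) | 0.859 (17) | 0.747 (14) | 0.795 (18) | 0.717 (4) | 0.917 (20) | 0.856 (18) | 0.764 (2)
PointConvFormer | - | 0.749 (7) | 0.793 (32) | 0.790 (24) | 0.807 (24) | 0.750 (10) | 0.856 (8) | 0.524 (16) | 0.881 (7) | 0.588 (41) | 0.642 (15) | 0.977 (4) | 0.591 (5) | 0.274 (34) | 0.781 (2) | 0.929 (1) | 0.804 (3) | 0.796 (15) | 0.642 (23) | 0.947 (3) | 0.885 (2) | 0.715 (17)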
Wenxuan Wu, Qi Shan, Li Fuxin: PointConvFormer: Revenge of the Point-based Convolution.
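BPNet | copyleft | 0.749 (7) | 0.909 (4) | 0.818 (10) | 0.811 (21) | 0.752 (8) | 0.839 (19) | 0.485 (32) | 0.842 (16) | 0.673 (10) | 0.644 (11) | 0.957 (13) | 0.528 (25) | 0.305 (19) | 0.773 (6) | 0.859 (17) | 0.788 (4) | 0.818 (5) | 0.693 (7) | 0.916 (21) | 0.856 (18) | 0.723 (13)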
Wenbo Hu, Hengshuang Zhao, Li Jiang, Jiaya Jia, Tien-Tsin Wong: Bidirectional Projection Network for Cross Dimension Scene Understanding. CVPR 2021 (Oral)
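MSP | - | 0.748 (9) | 0.623 (79) | 0.804 (16) | 0.859 (3) | 0.745 (12) | 0.824 (33) | 0.501 (23) | 0.912 (2) | 0.690 (6) | 0.685 (3) | 0.956 (14) | 0.567 (11) | 0.320 (13) | 0.768 (7) | 0.918 (3) | 0.720 (22) | 0.802 (10) | 0.676 (12) | 0.921 (18) | 0.881 (4) | 0.779 (1)
StratifiedFormer | permissive | 0.747 (10) | 0.901 (7) | 0.803 (17) | 0.845 (8) | 0.757 (6) | 0.846 (14) | 0.512 (19) | 0.825 (22) | 0.696 (5) | 0.645 (10) | 0.956 (14) | 0.576 (9) | 0.262 (44) | 0.744 (18) | 0.861 (16) | 0.742 (15) | 0.770 (30) | 0.705 (5) | 0.899 (33) | 0.860 (15) | 0.734 (6)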
Xin Lai*, Jianhui Liu*, Li Jiang, Liwei Wang, Hengshuang Zhao, Shu Liu, Xiaojuan Qi, Jiaya Jia: Stratified Transformer for 3D Point Cloud Segmentation. CVPR 2022
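VMNet | permissive | 0.746 (11) | 0.870 (12) | 0.838 (2) | 0.858 (4) | 0.729 (17) | 0.850 (12) | 0.501 (23) | 0.874 (8) | 0.587 (42) | 0.658 (8) | 0.956 (14) | 0.564 (12) | 0.299 (21) | 0.765 (9) | 0.900 (5) | 0.716 (25) | 0.812 (8) | 0.631 (28) | 0.939 (9) | 0.858 (16) | 0.709 (18)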
Zeyu Hu, Xuyang Bai, Jiaxiang Shang, Runze Zhang, Jiayu Dong, Xin Wang, Guangyuan Sun, Hongbo Fu, Chiew-Lan Tai: VMNet: Voxel-Mesh Network for Geodesic-Aware 3D Semantic Segmentation. ICCV 2021 (Oral)
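Virtual MVFusion | - | 0.746 (11) | 0.771 (40) | 0.819 (8) | 0.848 (6) | 0.702 (25) | 0.865 (5) | 0.397 (70) | 0.899 (3) | 0.699 (3) | 0.664 (7) | 0.948 (41) | 0.588 (6) | 0.330 (9) | 0.746 (17) | 0.851 (24) | 0.764 (9) | 0.796 (15) | 0.704 (6) | 0.935 (11) | 0.866 (11) | 0.728 (8)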
Abhijit Kundu, Xiaoqi Yin, Alireza Fathi, David Ross, Brian Brewington, Thomas Funkhouser, Caroline Pantofaru: Virtual Multi-view Fusion for 3D Semantic Segmentation. ECCV 2020
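Retro-FPN | - | 0.744 (13) | 0.842 (19) | 0.800 (18) | 0.767 (41) | 0.740 (13) | 0.836 (22) | 0.541 (10) | 0.914 (1) | 0.672 (11) | 0.626 (19) | 0.958 (10) | 0.552 (17) | 0.272 (36) | 0.777 (4) | 0.886 (11) | 0.696 (32) | 0.801 (11) | 0.674 (14) | 0.941 (7) | 0.858 (16) | 0.717 (15)
EQ-Net | - | 0.743 (14) | 0.620 (80) | 0.799 (19) | 0.849 (5) | 0.730 (16) | 0.822 (35) | 0.493 (30) | 0.897 (4) | 0.664 (12) | 0.681 (4) | 0.955 (18) | 0.562 (13) | 0.378 (1) | 0.760 (11) | 0.903 (4) | 0.738 (16) | 0.801 (11) | 0.673 (15) | 0.907 (26) | 0.877 (5) | 0.745 (3)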
Zetong Yang*, Li Jiang*, Yanan Sun, Bernt Schiele, Jiaya Jia: A Unified Query-based Paradigm for Point Cloud Understanding. CVPR 2022
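LRPNet | - | 0.742 (15) | 0.816 (27) | 0.806 (15) | 0.807 (24) | 0.752 (8) | 0.828 (29) | 0.575 (3) | 0.839 (18) | 0.699 (3) | 0.637 (17) | 0.954 (23) | 0.520 (27) | 0.320 (13) | 0.755 (13) | 0.834 (28) | 0.760 (10) | 0.772 (27) | 0.676 (12) | 0.915 (22) | 0.862 (13) | 0.717 (15)
SAT | - | 0.742 (15) | 0.860 (14) | 0.765 (35) | 0.819 (16) | 0.769 (3) | 0.848 (13) | 0.533 (12) | 0.829 (20) | 0.663 (13) | 0.631 (18) | 0.955 (18) | 0.586 (8) | 0.274 (34) | 0.753 (14) | 0.896 (6) | 0.729 (17) | 0.760 (37) | 0.666 (17) | 0.921 (18) | 0.855 (20) | 0.733 (7)
TXC | - | 0.740 (17) | 0.842 (19) | 0.832 (4) | 0.805 (28) | 0.715 (21) | 0.846 (14) | 0.473 (34) | 0.885 (6) | 0.615 (26) | 0.671 (6) | 0.971 (6) | 0.547 (18) | 0.320 (13) | 0.697 (22) | 0.799 (37) | 0.777 (6) | 0.819 (3) | 0.682 (10) | 0.946 (4) | 0.871 (8) | 0.696 (23)
LargeKernel3D | - | 0.739 (18) | 0.909 (4) | 0.820 (7) | 0.806 (26) | 0.740 (13) | 0.852 (10) | 0.545 (9) | 0.826 (21) | 0.594 (40) | 0.643 (12) | 0.955 (18) | 0.541 (20) | 0.263 (43) | 0.723 (20) | 0.858 (19) | 0.775 (7) | 0.767 (31) | 0.678 (11) | 0.933 (13) | 0.848 (24) | 0.694 (24)
MinkowskiNet | permissive | 0.736 (19) | 0.859 (15) | 0.818 (10) | 0.832 (13) | 0.709 (22) | 0.840 (18) | 0.521 (18) | 0.853 (12) | 0.660 (15) | 0.643 (12) | 0.951 (31) | 0.544 (19) | 0.286 (28) | 0.731 (19) | 0.893 (7) | 0.675 (39) | 0.772 (27) | 0.683 (9) | 0.874 (51) | 0.852 (22) | 0.727 (10)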
C. Choy, J. Gwak, S. Savarese: 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks. CVPR 2019
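IPCA | - | 0.731 (20) | 0.890 (8) | 0.837 (3) | 0.864 (2) | 0.726 (18) | 0.873 (2) | 0.530 (15) | 0.824 (23) | 0.489 (73) | 0.647 (9) | 0.978 (2) | 0.609 (2) | 0.336 (7) | 0.624 (37) | 0.733 (46) | 0.758 (11) | 0.776 (25) | 0.570 (53) | 0.949 (2) | 0.877 (5) | 0.728 (8)
SparseConvNet | - | 0.725 (21) | 0.647 (76) | 0.821 (6) | 0.846 (7) | 0.721 (19) | 0.869 (3) | 0.533 (12) | 0.754 (43) | 0.603 (36) | 0.614 (23) | 0.955 (18) | 0.572 (10) | 0.325 (11) | 0.710 (21) | 0.870 (13) | 0.724 (20) | 0.823 (2) | 0.628 (29) | 0.934 (12) | 0.865 (12) | 0.683 (27)
PointTransformer++ | - | 0.725 (21) | 0.727 (60) | 0.811 (13) | 0.819 (16) | 0.765 (4) | 0.841 (17) | 0.502 (22) | 0.814 (29) | 0.621 (25) | 0.623 (20) | 0.955 (18) | 0.556 (15) | 0.284 (29) | 0.620 (38) | 0.866 (14) | 0.781 (5) | 0.757 (40) | 0.648 (20) | 0.932 (15) | 0.862 (13) | 0.709 (18)
MatchingNet | - | 0.724 (23) | 0.812 (29) | 0.812 (12) | 0.810 (22) | 0.735 (15) | 0.834 (24) | 0.495 (29) | 0.860 (11) | 0.572 (48) | 0.602 (30) | 0.954 (23) | 0.512 (29) | 0.280 (31) | 0.757 (12) | 0.845 (26) | 0.725 (19) | 0.780 (23) | 0.606 (39) | 0.937 (10) | 0.851 (23) | 0.700 (21)
INS-Conv-semantic | - | 0.717 (24) | 0.751 (49) | 0.759 (38) | 0.812 (20) | 0.704 (24) | 0.868 (4) | 0.537 (11) | 0.842 (16) | 0.609 (32) | 0.608 (26) | 0.953 (26) | 0.534 (21) | 0.293 (24) | 0.616 (39) | 0.864 (15) | 0.719 (24) | 0.793 (19) | 0.640 (24) | 0.933 (13) | 0.845 (29) | 0.663 (32)
PointMetaBase | - | 0.714 (25) | 0.835 (21) | 0.785 (26) | 0.821 (14) | 0.684 (29) | 0.846 (14) | 0.531 (14) | 0.865 (10) | 0.614 (27) | 0.596 (34) | 0.953 (26) | 0.500 (32) | 0.246 (50) | 0.674 (23) | 0.888 (9) | 0.692 (33) | 0.764 (33) | 0.624 (30) | 0.849 (66) | 0.844 (30) | 0.675 (29)
contrastBoundary | permissive | 0.705 (26) | 0.769 (43) | 0.775 (31) | 0.809 (23) | 0.687 (28) | 0.820 (38) | 0.439 (58) | 0.812 (30) | 0.661 (14) | 0.591 (36) | 0.945 (49) | 0.515 (28) | 0.171 (76) | 0.633 (34) | 0.856 (20) | 0.720 (22) | 0.796 (15) | 0.668 (16) | 0.889 (40) | 0.847 (26) | 0.689 (25)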
Liyao Tang, Yibing Zhan, Zhe Chen, Baosheng Yu, Dacheng Tao: Contrastive Boundary Learning for Point Cloud Segmentation. CVPR 2022
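RFCR | - | 0.702 (27) | 0.889 (9) | 0.745 (47) | 0.813 (19) | 0.672 (31) | 0.818 (42) | 0.493 (30) | 0.815 (27) | 0.623 (23) | 0.610 (24) | 0.947 (43) | 0.470 (42) | 0.249 (49) | 0.594 (42) | 0.848 (25) | 0.705 (29) | 0.779 (24) | 0.646 (21) | 0.892 (38) | 0.823 (36) | 0.611 (46)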
Jingyu Gong, Jiachen Xu, Xin Tan, Haichuan Song, Yanyun Qu, Yuan Xie, Lizhuang Ma: Omni-Supervised Point Cloud Segmentation via Gradual Receptive Field Component Reasoning. CVPR 2021
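One Thing One Click | - | 0.701 (28) | 0.825 (25) | 0.796 (20) | 0.723 (48) | 0.716 (20) | 0.832 (25) | 0.433 (60) | 0.816 (25) | 0.634 (21) | 0.609 (25) | 0.969 (7) | 0.418 (67) | 0.344 (5) | 0.559 (54) | 0.833 (29) | 0.715 (26) | 0.808 (9) | 0.560 (57) | 0.902 (30) | 0.847 (26) | 0.680 (28)
JSENet | permissive | 0.699 (29) | 0.881 (11) | 0.762 (36) | 0.821 (14) | 0.667 (32) | 0.800 (55) | 0.522 (17) | 0.792 (35) | 0.613 (28) | 0.607 (27) | 0.935 (69) | 0.492 (34) | 0.205 (63) | 0.576 (47) | 0.853 (22) | 0.691 (34) | 0.758 (39) | 0.652 (19) | 0.872 (54) | 0.828 (33) | 0.649 (36)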
Zeyu Hu, Mingmin Zhen, Xuyang Bai, Hongbo Fu, Chiew-Lan Tai: JSENet: Joint Semantic Segmentation and Edge Detection Network for 3D Point Clouds. ECCV 2020
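PicassoNet-II | permissive | 0.696 (30) | 0.704 (65) | 0.790 (24) | 0.787 (33) | 0.709 (22) | 0.837 (20) | 0.459 (42) | 0.815 (27) | 0.543 (57) | 0.615 (22) | 0.956 (14) | 0.529 (23) | 0.250 (47) | 0.551 (59) | 0.790 (38) | 0.703 (30) | 0.799 (13) | 0.619 (34) | 0.908 (25) | 0.848 (24) | 0.700 (21)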
Huan Lei, Naveed Akhtar, Mubarak Shah, and Ajmal Mian: Geometric feature learning for 3D meshes.
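One-Thing-One-Click | - | 0.693 (31) | 0.743 (52) | 0.794 (22) | 0.655 (72) | 0.684 (29) | 0.822 (35) | 0.497 (28) | 0.719 (53) | 0.622 (24) | 0.617 (21) | 0.977 (4) | 0.447 (54) | 0.339 (6) | 0.750 (16) | 0.664 (61) | 0.703 (30) | 0.790 (21) | 0.596 (43) | 0.946 (4) | 0.855 (20) | 0.647 (37)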
Zhengzhe Liu, Xiaojuan Qi, Chi-Wing Fu: One Thing One Click: A Self-Training Approach for Weakly Supervised 3D Semantic Segmentation. CVPR 2021
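Feature_GeometricNet | permissive | 0.690 (32) | 0.884 (10) | 0.754 (42) | 0.795 (32) | 0.647 (38) | 0.818 (42) | 0.422 (62) | 0.802 (33) | 0.612 (29) | 0.604 (28) | 0.945 (49) | 0.462 (45) | 0.189 (71) | 0.563 (53) | 0.853 (22) | 0.726 (18) | 0.765 (32) | 0.632 (27) | 0.904 (28) | 0.821 (39) | 0.606 (50)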
Kangcheng Liu, Ben M. Chen: https://arxiv.org/abs/2012.09439. arXiv Preprint
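FusionNet | - | 0.688 (33) | 0.704 (65) | 0.741 (51) | 0.754 (45) | 0.656 (34) | 0.829 (27) | 0.501 (23) | 0.741 (48) | 0.609 (32) | 0.548 (43) | 0.950 (35) | 0.522 (26) | 0.371 (2) | 0.633 (34) | 0.756 (41) | 0.715 (26) | 0.771 (29) | 0.623 (31) | 0.861 (62) | 0.814 (41) | 0.658 (33)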
Feihu Zhang, Jin Fang, Benjamin Wah, Philip Torr: Deep FusionNet for Point Cloud Semantic Segmentation. ECCV 2020
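Feature-Geometry Net | permissive | 0.685 (34) | 0.866 (13) | 0.748 (44) | 0.819 (16) | 0.645 (40) | 0.794 (58) | 0.450 (47) | 0.802 (33) | 0.587 (42) | 0.604 (28) | 0.945 (49) | 0.464 (44) | 0.201 (66) | 0.554 (56) | 0.840 (27) | 0.723 (21) | 0.732 (49) | 0.602 (41) | 0.907 (26) | 0.822 (38) | 0.603 (53)
KP-FCNN | - | 0.684 (35) | 0.847 (18) | 0.758 (40) | 0.784 (35) | 0.647 (38) | 0.814 (45) | 0.473 (34) | 0.772 (38) | 0.605 (34) | 0.594 (35) | 0.935 (69) | 0.450 (52) | 0.181 (74) | 0.587 (43) | 0.805 (35) | 0.690 (35) | 0.785 (22) | 0.614 (35) | 0.882 (44) | 0.819 (40) | 0.632 (42)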
H. Thomas, C. Qi, J. Deschaud, B. Marcotegui, F. Goulette, L. Guibas.: KPConv: Flexible and Deformable Convolution for Point Clouds. ICCV 2019
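DGNet | - | 0.684 (35) | 0.712 (64) | 0.784 (27) | 0.782 (37) | 0.658 (33) | 0.835 (23) | 0.499 (27) | 0.823 (24) | 0.641 (18) | 0.597 (33) | 0.950 (35) | 0.487 (35) | 0.281 (30) | 0.575 (48) | 0.619 (64) | 0.647 (52) | 0.764 (33) | 0.620 (33) | 0.871 (57) | 0.846 (28) | 0.688 (26)
VACNN++ | - | 0.684 (35) | 0.728 (59) | 0.757 (41) | 0.776 (38) | 0.690 (26) | 0.804 (52) | 0.464 (40) | 0.816 (25) | 0.577 (47) | 0.587 (37) | 0.945 (49) | 0.508 (31) | 0.276 (33) | 0.671 (24) | 0.710 (51) | 0.663 (44) | 0.750 (43) | 0.589 (48) | 0.881 (45) | 0.832 (32) | 0.653 (35)
Superpoint Network | - | 0.683 (38) | 0.851 (17) | 0.728 (55) | 0.800 (31) | 0.653 (36) | 0.806 (50) | 0.468 (37) | 0.804 (31) | 0.572 (48) | 0.602 (30) | 0.946 (46) | 0.453 (51) | 0.239 (53) | 0.519 (65) | 0.822 (30) | 0.689 (37) | 0.762 (36) | 0.595 (45) | 0.895 (36) | 0.827 (34) | 0.630 (43)
PointContrast_LA_SEM | - | 0.683 (38) | 0.757 (47) | 0.784 (27) | 0.786 (34) | 0.639 (42) | 0.824 (33) | 0.408 (65) | 0.775 (37) | 0.604 (35) | 0.541 (45) | 0.934 (73) | 0.532 (22) | 0.269 (39) | 0.552 (57) | 0.777 (39) | 0.645 (55) | 0.793 (19) | 0.640 (24) | 0.913 (23) | 0.824 (35) | 0.671 (30)
VI-PointConv | - | 0.676 (40) | 0.770 (42) | 0.754 (42) | 0.783 (36) | 0.621 (46) | 0.814 (45) | 0.552 (7) | 0.758 (41) | 0.571 (50) | 0.557 (41) | 0.954 (23) | 0.529 (23) | 0.268 (41) | 0.530 (63) | 0.682 (56) | 0.675 (39) | 0.719 (52) | 0.603 (40) | 0.888 (41) | 0.833 (31) | 0.665 (31)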
Xingyi Li, Wenxuan Wu, Xiaoli Z. Fern, Li Fuxin: The Devils in the Point Clouds: Studying the Robustness of Point Cloud Convolutions.
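ROSMRF3D | - | 0.673 (41) | 0.789 (33) | 0.748 (44) | 0.763 (43) | 0.635 (44) | 0.814 (45) | 0.407 (67) | 0.747 (45) | 0.581 (46) | 0.573 (38) | 0.950 (35) | 0.484 (36) | 0.271 (38) | 0.607 (40) | 0.754 (42) | 0.649 (49) | 0.774 (26) | 0.596 (43) | 0.883 (43) | 0.823 (36) | 0.606 (50)
SALANet | - | 0.670 (42) | 0.816 (27) | 0.770 (33) | 0.768 (40) | 0.652 (37) | 0.807 (49) | 0.451 (44) | 0.747 (45) | 0.659 (16) | 0.545 (44) | 0.924 (79) | 0.473 (41) | 0.149 (86) | 0.571 (50) | 0.811 (34) | 0.635 (58) | 0.746 (44) | 0.623 (31) | 0.892 (38) | 0.794 (53) | 0.570 (63)
PointASNL | permissive | 0.666 (43) | 0.703 (67) | 0.781 (29) | 0.751 (47) | 0.655 (35) | 0.830 (26) | 0.471 (36) | 0.769 (39) | 0.474 (76) | 0.537 (47) | 0.951 (31) | 0.475 (40) | 0.279 (32) | 0.635 (32) | 0.698 (55) | 0.675 (39) | 0.751 (42) | 0.553 (62) | 0.816 (73) | 0.806 (45) | 0.703 (20)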
Xu Yan, Chaoda Zheng, Zhen Li, Sheng Wang, Shuguang Cui: PointASNL: Robust Point Clouds Processing using Nonlocal Neural Networks with Adaptive Sampling. CVPR 2020
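PointConv | permissive | 0.666 (43) | 0.781 (35) | 0.759 (38) | 0.699 (57) | 0.644 (41) | 0.822 (35) | 0.475 (33) | 0.779 (36) | 0.564 (53) | 0.504 (61) | 0.953 (26) | 0.428 (61) | 0.203 (65) | 0.586 (45) | 0.754 (42) | 0.661 (45) | 0.753 (41) | 0.588 (49) | 0.902 (30) | 0.813 (43) | 0.642 (38)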
Wenxuan Wu, Zhongang Qi, Li Fuxin: PointConv: Deep Convolutional Networks on 3D Point Clouds. CVPR 2019
PPCNN++permissive0.663 450.746 500.708 590.722 490.638 430.820 380.451 440.566 800.599 380.541 450.950 350.510 300.313 160.648 290.819 320.616 630.682 680.590 470.869 580.810 440.656 34
Pyunghwan Ahn, Juyoung Yang, Eojindl Yi, Chanho Lee, Junmo Kim: Projection-based Point Convolution for Efficient Point Cloud Segmentation. IEEE Access
DCM-Net0.658 460.778 360.702 620.806 260.619 470.813 480.468 370.693 610.494 680.524 530.941 600.449 530.298 220.510 670.821 310.675 390.727 510.568 550.826 710.803 470.637 40
Jonas Schult*, Francis Engelmann*, Theodora Kontogianni, Bastian Leibe: DualConvMesh-Net: Joint Geodesic and Euclidean Convolutions on 3D Meshes. CVPR 2020 [Oral]
HPGCNN0.656 470.698 680.743 490.650 730.564 640.820 380.505 210.758 410.631 220.479 660.945 490.480 380.226 540.572 490.774 400.690 350.735 470.614 350.853 650.776 680.597 56
Jisheng Dang, Qingyong Hu, Yulan Guo, Jun Yang: HPGCNN.
SAFNet-segpermissive0.654 480.752 480.734 530.664 700.583 590.815 440.399 690.754 430.639 190.535 490.942 580.470 420.309 180.665 250.539 700.650 480.708 570.635 260.857 640.793 550.642 38
Linqing Zhao, Jiwen Lu, Jie Zhou: Similarity-Aware Fusion Network for 3D Semantic Segmentation. IROS 2021
RandLA-Netpermissive0.645 490.778 360.731 540.699 570.577 600.829 270.446 490.736 490.477 750.523 550.945 490.454 490.269 390.484 740.749 450.618 610.738 450.599 420.827 700.792 580.621 45
MVPNetpermissive0.641 500.831 220.715 570.671 670.590 550.781 640.394 710.679 630.642 170.553 420.937 660.462 450.256 450.649 280.406 830.626 590.691 650.666 170.877 470.792 580.608 49
Maximilian Jaritz, Jiayuan Gu, Hao Su: Multi-view PointNet for 3D Scene Understanding. GMDL Workshop, ICCV 2019
PointConv-SFPN0.641 500.776 380.703 610.721 500.557 670.826 300.451 440.672 650.563 540.483 650.943 570.425 640.162 810.644 300.726 470.659 460.709 560.572 520.875 490.786 630.559 68
PointMRNet0.640 520.717 630.701 630.692 600.576 610.801 540.467 390.716 540.563 540.459 710.953 260.429 600.169 780.581 460.854 210.605 640.710 540.550 630.894 370.793 550.575 61
FPConvpermissive0.639 530.785 340.760 370.713 550.603 500.798 560.392 720.534 850.603 360.524 530.948 410.457 470.250 470.538 610.723 490.598 680.696 630.614 350.872 540.799 480.567 65
Yiqun Lin, Zizheng Yan, Haibin Huang, Dong Du, Ligang Liu, Shuguang Cui, Xiaoguang Han: FPConv: Learning Local Flattening for Point Convolution. CVPR 2020
PD-Net0.638 540.797 310.769 340.641 780.590 550.820 380.461 410.537 840.637 200.536 480.947 430.388 750.206 620.656 260.668 590.647 520.732 490.585 500.868 590.793 550.473 87
PointSPNet0.637 550.734 560.692 700.714 540.576 610.797 570.446 490.743 470.598 390.437 760.942 580.403 710.150 850.626 360.800 360.649 490.697 620.557 600.846 670.777 670.563 66
SConv0.636 560.830 230.697 660.752 460.572 630.780 660.445 510.716 540.529 600.530 500.951 310.446 550.170 770.507 690.666 600.636 570.682 680.541 680.886 420.799 480.594 57
Supervoxel-CNN0.635 570.656 740.711 580.719 510.613 480.757 750.444 540.765 400.534 590.566 390.928 770.478 390.272 360.636 310.531 720.664 430.645 780.508 760.864 610.792 580.611 46
joint point-basedpermissive0.634 580.614 810.778 300.667 690.633 450.825 310.420 630.804 310.467 780.561 400.951 310.494 330.291 250.566 510.458 780.579 750.764 330.559 590.838 680.814 410.598 55
Hung-Yueh Chiang, Yen-Liang Lin, Yueh-Cheng Liu, Winston H. Hsu: A Unified Point-Based Framework for 3D Segmentation. 3DV 2019
PointMTL0.632 590.731 570.688 730.675 640.591 540.784 630.444 540.565 810.610 300.492 630.949 390.456 480.254 460.587 430.706 520.599 670.665 740.612 380.868 590.791 620.579 60
3DSM_DMMF0.631 600.626 780.745 470.801 300.607 490.751 760.506 200.729 520.565 520.491 640.866 930.434 560.197 690.595 410.630 630.709 280.705 590.560 570.875 490.740 780.491 82
PointNet2-SFPN0.631 600.771 400.692 700.672 650.524 720.837 200.440 570.706 590.538 580.446 730.944 550.421 660.219 570.552 570.751 440.591 710.737 460.543 670.901 320.768 700.557 69
APCF-Net0.631 600.742 530.687 750.672 650.557 670.792 610.408 650.665 660.545 560.508 580.952 300.428 610.186 720.634 330.702 530.620 600.706 580.555 610.873 520.798 500.581 59
Haojia Lin: Adaptive Pyramid Context Fusion for Point Cloud Perception. GRSL
FusionAwareConv0.630 630.604 830.741 510.766 420.590 550.747 770.501 230.734 500.503 670.527 510.919 830.454 490.323 120.550 600.420 820.678 380.688 660.544 650.896 350.795 520.627 44
Jiazhao Zhang, Chenyang Zhu, Lintao Zheng, Kai Xu: Fusion-Aware Point Convolution for Online Semantic 3D Scene Segmentation. CVPR 2020
DenSeR0.628 640.800 300.625 850.719 510.545 700.806 500.445 510.597 750.448 820.519 560.938 650.481 370.328 100.489 730.499 770.657 470.759 380.592 460.881 450.797 510.634 41
SegGroup_sempermissive0.627 650.818 260.747 460.701 560.602 510.764 720.385 760.629 720.490 710.508 580.931 760.409 690.201 660.564 520.725 480.618 610.692 640.539 690.873 520.794 530.548 72
An Tao, Yueqi Duan, Yi Wei, Jiwen Lu, Jie Zhou: SegGroup: Seg-Level Supervision for 3D Instance and Semantic Segmentation. TIP 2022
SIConv0.625 660.830 230.694 680.757 440.563 650.772 700.448 480.647 690.520 620.509 570.949 390.431 590.191 700.496 710.614 650.647 520.672 720.535 710.876 480.783 640.571 62
HPEIN0.618 670.729 580.668 760.647 750.597 530.766 710.414 640.680 620.520 620.525 520.946 460.432 570.215 590.493 720.599 660.638 560.617 830.570 530.897 340.806 450.605 52
Li Jiang, Hengshuang Zhao, Shu Liu, Xiaoyong Shen, Chi-Wing Fu, Jiaya Jia: Hierarchical Point-Edge Interaction Network for Point Cloud Semantic Segmentation. ICCV 2019
SPH3D-GCNpermissive0.610 680.858 160.772 320.489 900.532 710.792 610.404 680.643 710.570 510.507 600.935 690.414 680.046 950.510 670.702 530.602 660.705 590.549 640.859 630.773 690.534 75
Huan Lei, Naveed Akhtar, and Ajmal Mian: Spherical Kernel for Efficient Graph Convolution on 3D Point Clouds. TPAMI 2020
AttAN0.609 690.760 450.667 770.649 740.521 730.793 590.457 430.648 680.528 610.434 780.947 430.401 720.153 840.454 760.721 500.648 510.717 530.536 700.904 280.765 710.485 83
Gege Zhang, Qinghua Ma, Licheng Jiao, Fang Liu, Qigong Sun: AttAN: Attention Adversarial Networks for 3D Point Cloud Semantic Segmentation. IJCAI 2020
wsss-transformer0.600 700.634 770.743 490.697 590.601 520.781 640.437 590.585 780.493 690.446 730.933 740.394 730.011 970.654 270.661 620.603 650.733 480.526 720.832 690.761 730.480 84
dtc_net0.596 710.683 690.725 560.715 530.549 690.803 530.444 540.647 690.493 690.495 620.941 600.409 690.000 990.424 810.544 690.598 680.703 610.522 730.912 240.792 580.520 78
LAP-D0.594 720.720 610.692 700.637 790.456 820.773 690.391 740.730 510.587 420.445 750.940 630.381 760.288 260.434 790.453 800.591 710.649 760.581 510.777 770.749 770.610 48
DPC0.592 730.720 610.700 640.602 830.480 780.762 740.380 770.713 570.585 450.437 760.940 630.369 780.288 260.434 790.509 760.590 730.639 810.567 560.772 780.755 750.592 58
Francis Engelmann, Theodora Kontogianni, Bastian Leibe: Dilated Point Convolutions: On the Receptive Field Size of Point Convolutions on 3D Point Clouds. ICRA 2020
CCRFNet0.589 740.766 440.659 800.683 620.470 810.740 790.387 750.620 740.490 710.476 670.922 810.355 810.245 510.511 660.511 750.571 760.643 790.493 800.872 540.762 720.600 54
ROSMRF0.580 750.772 390.707 600.681 630.563 650.764 720.362 790.515 860.465 790.465 700.936 680.427 630.207 610.438 770.577 670.536 790.675 710.486 810.723 840.779 650.524 77
SD-DETR0.576 760.746 500.609 890.445 940.517 740.643 900.366 780.714 560.456 800.468 690.870 920.432 570.264 420.558 550.674 570.586 740.688 660.482 820.739 820.733 800.537 74
SQN_0.1%0.569 770.676 710.696 670.657 710.497 750.779 670.424 610.548 820.515 640.376 830.902 900.422 650.357 40.379 840.456 790.596 700.659 750.544 650.685 870.665 910.556 70
TextureNetpermissive0.566 780.672 730.664 780.671 670.494 760.719 800.445 510.678 640.411 880.396 810.935 690.356 800.225 550.412 820.535 710.565 770.636 820.464 840.794 760.680 880.568 64
Jingwei Huang, Haotian Zhang, Li Yi, Thomas Funkhouser, Matthias Niessner, Leonidas Guibas: TextureNet: Consistent Local Parametrizations for Learning from High-Resolution Signals on Meshes. CVPR
DVVNet0.562 790.648 750.700 640.770 390.586 580.687 840.333 830.650 670.514 650.475 680.906 870.359 790.223 560.340 860.442 810.422 900.668 730.501 770.708 850.779 650.534 75
Pointnet++ & Featurepermissive0.557 800.735 550.661 790.686 610.491 770.744 780.392 720.539 830.451 810.375 840.946 460.376 770.205 630.403 830.356 860.553 780.643 790.497 780.824 720.756 740.515 79
GMLPs0.538 810.495 910.693 690.647 750.471 800.793 590.300 860.477 870.505 660.358 850.903 890.327 840.081 920.472 750.529 730.448 880.710 540.509 740.746 800.737 790.554 71
PanopticFusion-label0.529 820.491 920.688 730.604 820.386 870.632 910.225 960.705 600.434 850.293 910.815 940.348 820.241 520.499 700.669 580.507 810.649 760.442 900.796 750.602 940.561 67
Gaku Narita, Takashi Seno, Tomoya Ishikawa, Yohsuke Kaji: PanopticFusion: Online Volumetric Semantic Mapping at the Level of Stuff and Things. IROS 2019 (to appear)
subcloud_weak0.516 830.676 710.591 920.609 800.442 830.774 680.335 820.597 750.422 870.357 860.932 750.341 830.094 910.298 880.528 740.473 860.676 700.495 790.602 930.721 830.349 94
Online SegFusion0.515 840.607 820.644 830.579 850.434 840.630 920.353 800.628 730.440 830.410 790.762 970.307 860.167 790.520 640.403 840.516 800.565 860.447 880.678 880.701 850.514 80
Davide Menini, Suryansh Kumar, Martin R. Oswald, Erik Sandstroem, Cristian Sminchisescu, Luc van Gool: A Real-Time Learning Framework for Joint 3D Reconstruction and Semantic Segmentation. Robotics and Automation Letters Submission
3DMV, FTSDF0.501 850.558 870.608 900.424 960.478 790.690 830.246 920.586 770.468 770.450 720.911 850.394 730.160 820.438 770.212 930.432 890.541 910.475 830.742 810.727 810.477 85
PCNN0.498 860.559 860.644 830.560 870.420 860.711 820.229 940.414 880.436 840.352 870.941 600.324 850.155 830.238 930.387 850.493 820.529 920.509 740.813 740.751 760.504 81
3DMV0.484 870.484 930.538 940.643 770.424 850.606 950.310 840.574 790.433 860.378 820.796 950.301 870.214 600.537 620.208 940.472 870.507 950.413 930.693 860.602 940.539 73
Angela Dai, Matthias Niessner: 3DMV: Joint 3D-Multi-View Prediction for 3D Semantic Scene Segmentation. ECCV'18
PointCNN with RGBpermissive0.458 880.577 850.611 880.356 980.321 950.715 810.299 880.376 920.328 950.319 890.944 550.285 890.164 800.216 960.229 910.484 840.545 900.456 860.755 790.709 840.475 86
Yangyan Li, Rui Bu, Mingchao Sun, Baoquan Chen: PointCNN. NeurIPS 2018
FCPNpermissive0.447 890.679 700.604 910.578 860.380 880.682 850.291 890.106 980.483 740.258 960.920 820.258 930.025 960.231 950.325 870.480 850.560 880.463 850.725 830.666 900.231 98
Dario Rethage, Johanna Wald, Jürgen Sturm, Nassir Navab, Federico Tombari: Fully-Convolutional Point Networks for Large-Scale Point Clouds. ECCV 2018
DGCNN_reproducecopyleft0.446 900.474 940.623 860.463 920.366 900.651 880.310 840.389 910.349 930.330 880.937 660.271 910.126 880.285 890.224 920.350 950.577 850.445 890.625 910.723 820.394 90
Yue Wang, Yongbin Sun, Ziwei Liu, Sanjay E. Sarma, Michael M. Bronstein, Justin M. Solomon: Dynamic Graph CNN for Learning on Point Clouds. TOG 2019
PNET20.442 910.548 880.548 930.597 840.363 910.628 930.300 860.292 930.374 900.307 900.881 910.268 920.186 720.238 930.204 950.407 910.506 960.449 870.667 890.620 930.462 88
SurfaceConvPF0.442 910.505 900.622 870.380 970.342 930.654 870.227 950.397 900.367 910.276 930.924 790.240 940.198 680.359 850.262 890.366 920.581 840.435 910.640 900.668 890.398 89
Hao Pan, Shilin Liu, Yang Liu, Xin Tong: Convolutional Neural Networks on 3D Surfaces Using Parallel Frames.
Tangent Convolutionspermissive0.438 930.437 960.646 820.474 910.369 890.645 890.353 800.258 950.282 970.279 920.918 840.298 880.147 870.283 900.294 880.487 830.562 870.427 920.619 920.633 920.352 93
Maxim Tatarchenko, Jaesik Park, Vladlen Koltun, Qian-Yi Zhou: Tangent Convolutions for Dense Prediction in 3D. CVPR 2018
3DWSSS0.425 940.525 890.647 810.522 880.324 940.488 980.077 990.712 580.353 920.401 800.636 990.281 900.176 750.340 860.565 680.175 990.551 890.398 940.370 990.602 940.361 92
SPLAT Netcopyleft0.393 950.472 950.511 950.606 810.311 960.656 860.245 930.405 890.328 950.197 970.927 780.227 960.000 990.001 1000.249 900.271 980.510 930.383 960.593 940.699 860.267 96
Hang Su, Varun Jampani, Deqing Sun, Subhransu Maji, Evangelos Kalogerakis, Ming-Hsuan Yang, Jan Kautz: SPLATNet: Sparse Lattice Networks for Point Cloud Processing. CVPR 2018
ScanNet+FTSDF0.383 960.297 980.491 960.432 950.358 920.612 940.274 900.116 970.411 880.265 940.904 880.229 950.079 930.250 910.185 960.320 960.510 930.385 950.548 950.597 970.394 90
PointNet++permissive0.339 970.584 840.478 970.458 930.256 980.360 990.250 910.247 960.278 980.261 950.677 980.183 970.117 890.212 970.145 980.364 930.346 990.232 990.548 950.523 980.252 97
Charles R. Qi, Li Yi, Hao Su, Leonidas J. Guibas: PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space.
SSC-UNetpermissive0.308 980.353 970.290 990.278 990.166 990.553 960.169 980.286 940.147 990.148 990.908 860.182 980.064 940.023 990.018 1000.354 940.363 970.345 970.546 970.685 870.278 95
ScanNetpermissive0.306 990.203 990.366 980.501 890.311 960.524 970.211 970.002 1000.342 940.189 980.786 960.145 990.102 900.245 920.152 970.318 970.348 980.300 980.460 980.437 990.182 99
Angela Dai, Angel X. Chang, Manolis Savva, Maciej Halber, Thomas Funkhouser, Matthias Nießner: ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes. CVPR'17
ERROR0.054 1000.000 1000.041 1000.172 1000.030 1000.062 1000.001 1000.035 990.004 1000.051 1000.143 1000.019 1000.003 980.041 980.050 990.003 1000.054 1000.018 1000.005 1000.264 1000.082 100


This table lists the benchmark results for the 3D semantic instance scenario.




Method | Info | avg ap 25% | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | window
TD3D0.875 11.000 10.976 120.877 90.783 140.970 10.889 10.828 120.945 30.803 60.713 100.720 100.709 81.000 10.936 60.934 30.873 71.000 10.791 5
Queryformer0.874 21.000 10.978 100.809 240.876 10.936 50.702 80.716 260.920 50.875 30.766 40.772 20.818 41.000 10.995 10.916 40.892 11.000 10.767 8
SoftGroup++0.874 21.000 10.972 130.947 10.839 50.898 120.556 230.913 20.881 110.756 80.828 20.748 60.821 21.000 10.937 50.937 10.887 21.000 10.821 2
Mask3D0.870 41.000 10.985 60.782 320.818 80.938 40.760 40.749 230.923 40.877 20.760 50.785 10.820 31.000 10.912 90.864 230.878 50.983 390.825 1
Jonas Schult, Francis Engelmann, Alexander Hermans, Or Litany, Siyu Tang, Bastian Leibe: Mask3D for 3D Semantic Instance Segmentation. ICRA 2023
SoftGrouppermissive0.865 51.000 10.969 140.860 120.860 20.913 80.558 210.899 30.911 60.760 70.828 10.736 70.802 60.981 290.919 80.875 140.877 61.000 10.820 3
Thang Vu, Kookhoi Kim, Tung M. Luu, Xuan Thanh Nguyen, Chang D. Yoo: SoftGroup for 3D Instance Segmentation on Point Clouds. CVPR 2022 [Oral]
IPCA-Inst0.851 61.000 10.968 150.884 80.842 40.862 240.693 100.812 170.888 90.677 200.783 30.698 110.807 51.000 10.911 130.865 220.865 91.000 10.757 10
SPFormerpermissive0.851 61.000 10.994 20.806 250.774 160.942 30.637 130.849 100.859 140.889 10.720 80.730 80.665 131.000 10.911 130.868 210.873 81.000 10.796 4
Sun Jiahao, Qing Chunmei, Tan Junpeng, Xu Xiangmin: Superpoint Transformer for 3D Scene Instance Segmentation. AAAI 2023 [Oral]
ISBNetpermissive0.845 81.000 10.976 110.798 260.794 110.916 60.757 50.667 330.882 100.842 40.715 90.757 40.832 11.000 10.905 160.803 430.843 121.000 10.715 17
Tuan Duc Ngo, Binh-Son Hua, Khoi Nguyen: ISBNet: a 3D Point Cloud Instance Segmentation Network with Instance-aware Sampling and Box-aware Dynamic Convolution. CVPR 2023
SphereSeg0.835 91.000 10.963 180.891 60.794 100.954 20.822 30.710 270.961 20.721 120.693 160.530 310.653 151.000 10.867 230.857 260.859 100.991 360.771 7
TopoSeg0.832 101.000 10.981 80.933 20.819 70.826 320.524 290.841 110.811 190.681 190.759 60.687 120.727 70.981 290.911 130.883 100.853 111.000 10.756 11
GraphCut0.832 101.000 10.922 320.724 410.798 90.902 110.701 90.856 80.859 130.715 130.706 110.748 50.640 261.000 10.934 70.862 240.880 31.000 10.729 13
PBNetpermissive0.825 121.000 10.963 170.837 160.843 30.865 190.822 20.647 350.878 120.733 100.639 240.683 130.650 161.000 10.853 240.870 180.820 131.000 10.744 12
SSEC0.820 131.000 10.983 70.924 30.826 60.817 350.415 380.899 40.793 230.673 210.731 70.636 180.653 141.000 10.939 40.804 410.878 41.000 10.780 6
DKNet0.815 141.000 10.930 240.844 140.765 200.915 70.534 270.805 190.805 210.807 50.654 180.763 30.650 161.000 10.794 360.881 110.766 171.000 10.758 9
Yizheng Wu, Min Shi, Shuaiyuan Du, Hao Lu, Zhiguo Cao, Weicai Zhong: 3D Instances as 1D Kernels. ECCV 2022
RPGN0.806 151.000 10.992 40.789 280.723 320.891 130.650 120.810 180.832 160.665 230.699 140.658 140.700 91.000 10.881 180.832 330.774 150.997 290.613 34
Shichao Dong, Guosheng Lin, Tzu-Yi Hung: Learning Regional Purity for Instance Segmentation on 3D Point Clouds. ECCV 2022
HAISpermissive0.803 161.000 10.994 20.820 200.759 210.855 250.554 240.882 50.827 180.615 290.676 170.638 170.646 241.000 10.912 90.797 450.767 160.994 340.726 14
Shaoyu Chen, Jiemin Fang, Qian Zhang, Wenyu Liu, Xinggang Wang: Hierarchical Aggregation for 3D Instance Segmentation. ICCV 2021
Box2Mask0.803 161.000 10.962 190.874 100.707 350.887 160.686 110.598 390.961 10.715 140.694 150.469 360.700 91.000 10.912 90.902 50.753 220.997 290.637 28
Julian Chibane, Francis Engelmann, Tuan Anh Tran, Gerard Pons-Moll: Box2Mask: Weakly Supervised 3D Semantic Instance Segmentation Using Bounding Boxes. ECCV 2022
Mask-Group0.792 181.000 10.968 160.812 210.766 190.864 200.460 320.815 160.888 80.598 320.651 210.639 160.600 300.918 330.941 20.896 60.721 291.000 10.723 15
Min Zhong, Xinghao Chen, Xiaokang Chen, Gang Zeng, Yunhe Wang: MaskGroup: Hierarchical Point Grouping and Masking for 3D Instance Segmentation. ICME 2022
CSC-Pretrained0.791 191.000 10.996 10.829 190.767 180.889 150.600 160.819 150.770 280.594 330.620 270.541 280.700 91.000 10.941 20.889 80.763 181.000 10.526 43
SSTNetpermissive0.789 201.000 10.840 460.888 70.717 330.835 280.717 70.684 320.627 420.724 110.652 200.727 90.600 301.000 10.912 90.822 360.757 211.000 10.691 22
Zhihao Liang, Zhihao Li, Songcen Xu, Mingkui Tan, Kui Jia: Instance Segmentation in 3D Scenes using Semantic Superpoint Tree Networks. ICCV 2021
GICN0.788 211.000 10.978 90.867 110.781 150.833 290.527 280.824 130.806 200.549 410.596 300.551 240.700 91.000 10.853 240.935 20.733 261.000 10.651 25
DENet0.786 221.000 10.929 250.736 390.750 270.720 470.755 60.934 10.794 220.590 340.561 360.537 290.650 161.000 10.882 170.804 420.789 141.000 10.719 16
DualGroup0.782 231.000 10.927 260.811 220.772 170.853 260.631 150.805 190.773 250.613 300.611 280.610 200.650 160.835 440.881 180.879 130.750 241.000 10.675 23
PointGroup0.778 241.000 10.900 360.798 270.715 340.863 210.493 300.706 280.895 70.569 390.701 120.576 220.639 271.000 10.880 200.851 280.719 300.997 290.709 19
Li Jiang, Hengshuang Zhao, Shaoshuai Shi, Shu Liu, Chi-Wing Fu, Jiaya Jia: PointGroup: Dual-Set Point Grouping for 3D Instance Segmentation. CVPR 2020 [oral]
PE0.776 251.000 10.900 370.860 120.728 310.869 170.400 390.857 70.774 240.568 400.701 130.602 210.646 240.933 320.843 270.890 70.691 370.997 290.709 18
Biao Zhang, Peter Wonka: Point Cloud Instance Segmentation using Probabilistic Embeddings. CVPR 2021
AOIA0.767 261.000 10.937 210.810 230.740 290.906 90.550 250.800 210.706 340.577 380.624 260.544 270.596 350.857 360.879 220.880 120.750 230.992 350.658 24
DD-UNet+Group0.764 271.000 10.897 390.837 150.753 240.830 310.459 340.824 130.699 360.629 270.653 190.438 390.650 161.000 10.880 200.858 250.690 381.000 10.650 26
H. Liu, R. Liu, K. Yang, J. Zhang, K. Peng, R. Stiefelhagen: HIDA: Towards Holistic Indoor Understanding for the Visually Impaired via Semantic Instance Segmentation with a Wearable Solid-State LiDAR Sensor. ICCVW 2021
INS-Conv-instance0.762 281.000 10.923 290.765 350.785 130.905 100.600 160.655 340.646 410.683 180.647 220.530 300.650 161.000 10.824 290.830 340.693 360.944 430.644 27
Dyco3Dcopyleft0.761 291.000 10.935 220.893 50.752 260.863 220.600 160.588 400.742 310.641 250.633 250.546 260.550 370.857 360.789 380.853 270.762 190.987 370.699 20
Tong He, Chunhua Shen, Anton van den Hengel: DyCo3D: Robust Instance Segmentation of 3D Point Clouds through Dynamic Convolution. CVPR 2021
OccuSeg+instance0.742 301.000 10.923 290.785 290.745 280.867 180.557 220.578 430.729 320.670 220.644 230.488 340.577 361.000 10.794 360.830 340.620 451.000 10.550 39
Lei Han, Tian Zheng, Lan Xu, Lu Fang: OccuSeg: Occupancy-aware 3D Instance Segmentation. CVPR 2020
RWSeg0.739 311.000 10.899 380.759 370.753 250.823 330.282 430.691 310.658 390.582 370.594 310.547 250.628 281.000 10.795 350.868 200.728 281.000 10.692 21
3D-MPA0.737 321.000 10.933 230.785 290.794 120.831 300.279 450.588 400.695 370.616 280.559 370.556 230.650 161.000 10.809 330.875 150.696 341.000 10.608 36
Francis Engelmann, Martin Bokeloh, Alireza Fathi, Bastian Leibe, Matthias Nießner: 3D-MPA: Multi Proposal Aggregation for 3D Semantic Instance Segmentation. CVPR 2020
MTML0.731 331.000 10.992 40.779 340.609 440.746 420.308 420.867 60.601 450.607 310.539 400.519 320.550 371.000 10.824 290.869 190.729 271.000 10.616 32
Jean Lahoud, Bernard Ghanem, Marc Pollefeys, Martin R. Oswald: 3D Instance Segmentation via Multi-task Metric Learning. ICCV 2019 [oral]
OSIS0.725 341.000 10.885 420.653 470.657 410.801 360.576 200.695 300.828 170.698 160.534 410.457 380.500 440.857 360.831 280.841 310.627 441.000 10.619 31
SSEN0.724 351.000 10.926 270.781 330.661 390.845 270.596 190.529 450.764 300.653 240.489 460.461 370.500 440.859 350.765 390.872 170.761 201.000 10.577 37
Dongsu Zhang, Junha Chun, Sang Kyun Cha, Young Min Kim: Spatial Semantic Embedding Network: Fast 3D Instance Segmentation with Deep Metric Learning. arXiv
NeuralBF0.718 361.000 10.945 200.901 40.754 230.817 340.460 320.700 290.772 260.688 170.568 350.000 560.500 440.981 290.606 470.872 160.740 251.000 10.614 33
Weiwei Sun, Daniel Rebain, Renjie Liao, Vladimir Tankovich, Soroosh Yazdani, Kwang Moo Yi, Andrea Tagliasacchi: NeuralBF: Neural Bilateral Filtering for Top-down Instance Segmentation on Point Clouds. WACV 2023
Sparse R-CNN0.714 371.000 10.926 280.694 420.699 370.890 140.636 140.516 460.693 380.743 90.588 320.369 420.601 290.594 490.800 340.886 90.676 390.986 380.546 40
SALoss-ResNet0.695 381.000 10.855 440.579 510.589 460.735 450.484 310.588 400.856 150.634 260.571 340.298 430.500 441.000 10.824 290.818 370.702 330.935 470.545 41
Zhidong Liang, Ming Yang, Hao Li, Chunxiang Wang: 3D Instance Embedding Learning With a Structure-Aware Loss Function for Point Cloud Segmentation. IEEE Robotics and Automation Letters (IROS2020)
PanopticFusion-inst0.693 391.000 10.852 450.655 460.616 430.788 370.334 410.763 220.771 270.457 510.555 380.652 150.518 410.857 360.765 390.732 510.631 420.944 430.577 38
Gaku Narita, Takashi Seno, Tomoya Ishikawa, Yohsuke Kaji: PanopticFusion: Online Volumetric Semantic Mapping at the Level of Stuff and Things. IROS 2019 (to appear)
Occipital-SCS0.688 401.000 10.913 330.730 400.737 300.743 440.442 350.855 90.655 400.546 420.546 390.263 450.508 430.889 340.568 480.771 480.705 320.889 500.625 30
3D-BoNet0.687 411.000 10.887 410.836 170.587 470.643 540.550 250.620 360.724 330.522 460.501 440.243 460.512 421.000 10.751 410.807 400.661 410.909 490.612 35
Bo Yang, Jianan Wang, Ronald Clark, Qingyong Hu, Sen Wang, Andrew Markham, Niki Trigoni: Learning Object Bounding Boxes for 3D Instance Segmentation on Point Clouds. NeurIPS 2019 Spotlight
PCJC0.684 421.000 10.895 400.757 380.659 400.862 230.189 520.739 240.606 440.712 150.581 330.515 330.650 160.857 360.357 530.785 460.631 430.889 500.635 29
SPG_WSIS0.678 431.000 10.880 430.836 170.701 360.727 460.273 470.607 380.706 350.541 440.515 430.174 480.600 300.857 360.716 420.846 300.711 311.000 10.506 44
One_Thing_One_Clickpermissive0.675 441.000 10.823 470.782 310.621 420.766 390.211 490.736 250.560 480.586 350.522 420.636 190.453 480.641 480.853 240.850 290.694 350.997 290.411 48
Zhengzhe Liu, Xiaojuan Qi, Chi-Wing Fu: One Thing One Click: A Self-Training Approach for Weakly Supervised 3D Semantic Segmentation. CVPR 2021
SegGroup_inspermissive0.637 451.000 10.923 310.593 500.561 480.746 430.143 540.504 470.766 290.485 490.442 470.372 410.530 400.714 450.815 320.775 470.673 401.000 10.431 47
An Tao, Yueqi Duan, Yi Wei, Jiwen Lu, Jie Zhou: SegGroup: Seg-Level Supervision for 3D Instance and Semantic Segmentation. TIP 2022
MASCpermissive0.615 460.711 520.802 480.540 520.757 220.777 380.029 550.577 440.588 470.521 470.600 290.436 400.534 390.697 460.616 460.838 320.526 470.980 400.534 42
Chen Liu, Yasutaka Furukawa: MASC: Multi-scale Affinity with Sparse Convolution for 3D Instance Segmentation.
UNet-backbone0.605 471.000 10.909 340.764 360.603 450.704 480.415 370.301 520.548 490.461 500.394 480.267 440.386 500.857 360.649 450.817 380.504 480.959 410.356 51
3D-SISpermissive0.558 481.000 10.773 490.614 490.503 500.691 500.200 500.412 480.498 520.546 430.311 530.103 520.600 300.857 360.382 500.799 440.445 540.938 460.371 49
Ji Hou, Angela Dai, Matthias Niessner: 3D-SIS: 3D Semantic Instance Segmentation of RGB-D Scans. CVPR 2019
R-PointNet0.544 490.500 550.655 550.661 450.663 380.765 400.432 360.214 540.612 430.584 360.499 450.204 470.286 540.429 520.655 440.650 560.539 460.950 420.499 45
Hier3Dcopyleft0.540 501.000 10.727 500.626 480.467 530.693 490.200 500.412 480.480 530.528 450.318 520.077 550.600 300.688 470.382 500.768 490.472 500.941 450.350 52
Tan: HCFS3D: Hierarchical Coupled Feature Selection Network for 3D Semantic and Instance Segmentation.
Region-18class0.497 510.250 570.902 350.689 430.540 490.747 410.276 460.610 370.268 560.489 480.348 490.000 560.243 560.220 550.663 430.814 390.459 520.928 480.496 46
tmp0.474 521.000 10.727 500.433 550.481 520.673 520.022 570.380 500.517 510.436 530.338 510.128 500.343 520.429 520.291 550.728 520.473 490.833 530.300 54
SemRegionNet-20cls0.470 531.000 10.727 500.447 540.481 510.678 510.024 560.380 500.518 500.440 520.339 500.128 500.350 510.429 520.212 560.711 530.465 510.833 530.290 55
ASIS0.422 540.333 560.707 530.676 440.401 540.650 530.350 400.177 550.594 460.376 540.202 540.077 540.404 490.571 500.197 570.674 550.447 530.500 560.260 56
3D-BEVIS0.401 550.667 530.687 540.419 560.137 570.587 550.188 530.235 530.359 550.211 560.093 570.080 530.311 530.571 500.382 500.754 500.300 560.874 520.357 50
Cathrin Elich, Francis Engelmann, Jonas Schult, Theodora Kontogianni, Bastian Leibe: 3D-BEVIS: Birds-Eye-View Instance Segmentation.
Sgpn_scannet0.390 560.556 540.636 560.493 530.353 550.539 560.271 480.160 560.450 540.359 550.178 550.146 490.250 550.143 560.347 540.698 540.436 550.667 550.331 53
MaskRCNN 2d->3d Proj0.261 570.903 510.081 570.008 570.233 560.175 570.280 440.106 570.150 570.203 570.175 560.480 350.218 570.143 560.542 490.404 570.153 570.393 570.049 57
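
The rank printed after each score appears to follow standard competition ranking: methods with the same displayed score share a rank, and the following rank is skipped (for example, 0.874 appears twice at rank 2 and is followed by 0.870 at rank 4). The Python sketch below illustrates that scheme. It assumes ties are decided on the displayed three-decimal values, and the function name `competition_ranks` is illustrative rather than part of the benchmark's own code.

```python
def competition_ranks(scores):
    """Return 1-based ranks, highest score first; tied scores share the best rank."""
    # Indices of the scores, sorted from highest to lowest score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranks = [0] * len(scores)
    for position, idx in enumerate(order, start=1):
        prev = order[position - 2] if position > 1 else None
        if prev is not None and scores[idx] == scores[prev]:
            ranks[idx] = ranks[prev]  # tie: reuse the rank of the previous entry
        else:
            ranks[idx] = position     # no tie: rank equals the sorted position
    return ranks


# avg ap 25% of the first four rows of the table above.
avg_ap25 = {"TD3D": 0.875, "Queryformer": 0.874, "SoftGroup++": 0.874, "Mask3D": 0.870}
ranks = competition_ranks(list(avg_ap25.values()))
print(dict(zip(avg_ap25, ranks)))
```

For these four rows the sketch reproduces the ranks listed in the table (1, 2, 2, 4).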


This table lists the benchmark results for the 2D semantic label scenario.


Method | Info | avg iou | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | floor | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | wall | window
Virtual MVFusion (R)0.745 10.861 10.839 10.881 10.672 10.512 10.422 150.898 10.723 10.714 10.954 20.454 10.509 10.773 10.895 10.756 10.820 10.653 10.935 10.891 10.728 1
Abhijit Kundu, Xiaoqi Yin, Alireza Fathi, David Ross, Brian Brewington, Thomas Funkhouser, Caroline Pantofaru: Virtual Multi-view Fusion for 3D Semantic Segmentation. ECCV 2020
BPNet_2Dcopyleft0.670 20.822 30.795 30.836 20.659 20.481 20.451 110.769 30.656 30.567 30.931 30.395 40.390 40.700 30.534 30.689 90.770 20.574 30.865 60.831 30.675 4
Wenbo Hu, Hengshuang Zhao, Li Jiang, Jiaya Jia and Tien-Tsin Wong: Bidirectional Projection Network for Cross Dimension Scene Understanding. CVPR 2021 (Oral)
CU-Hybrid-2D Net0.636 30.825 20.820 20.179 200.648 30.463 30.549 20.742 60.676 20.628 20.961 10.420 20.379 50.684 60.381 150.732 20.723 30.599 20.827 130.851 20.634 6
CMX0.613 40.681 70.725 90.502 120.634 50.297 150.478 90.830 20.651 40.537 60.924 40.375 50.315 120.686 50.451 120.714 40.543 180.504 50.894 40.823 40.688 3
DMMF_3d0.605 50.651 80.744 70.782 30.637 40.387 40.536 30.732 70.590 60.540 50.856 180.359 90.306 130.596 110.539 20.627 180.706 40.497 70.785 180.757 160.476 19
MCA-Net0.595 60.533 170.756 60.746 40.590 80.334 70.506 60.670 120.587 70.500 100.905 80.366 80.352 80.601 100.506 60.669 150.648 70.501 60.839 120.769 120.516 18
RFBNet0.592 70.616 90.758 50.659 50.581 90.330 80.469 100.655 150.543 120.524 70.924 40.355 100.336 100.572 140.479 80.671 130.648 70.480 90.814 160.814 50.614 9
FAN_NV_RVC0.586 80.510 180.764 40.079 230.620 70.330 80.494 70.753 40.573 80.556 40.884 130.405 30.303 140.718 20.452 110.672 120.658 50.509 40.898 30.813 60.727 2
DCRedNet0.583 90.682 60.723 100.542 110.510 170.310 120.451 110.668 130.549 110.520 80.920 60.375 50.446 20.528 170.417 130.670 140.577 150.478 100.862 70.806 70.628 8
MIX6D_RVC0.582 100.695 40.687 140.225 180.632 60.328 100.550 10.748 50.623 50.494 130.890 110.350 120.254 200.688 40.454 100.716 30.597 140.489 80.881 50.768 130.575 12
SSMAcopyleft0.577 110.695 40.716 120.439 140.563 110.314 110.444 130.719 80.551 100.503 90.887 120.346 130.348 90.603 90.353 170.709 50.600 120.457 120.901 20.786 80.599 11
Abhinav Valada, Rohit Mohan, Wolfram Burgard: Self-Supervised Model Adaptation for Multimodal Semantic Segmentation. International Journal of Computer Vision, 2019
UNIV_CNP_RVC_UE |  | 0.566 (12) | 0.569 (16) | 0.686 (16) | 0.435 (15) | 0.524 (14) | 0.294 (16) | 0.421 (16) | 0.712 (9) | 0.543 (12) | 0.463 (15) | 0.872 (14) | 0.320 (14) | 0.363 (7) | 0.611 (8) | 0.477 (9) | 0.686 (10) | 0.627 (9) | 0.443 (15) | 0.862 (7) | 0.775 (11) | 0.639 (5)
EMSAFormer |  | 0.564 (13) | 0.581 (13) | 0.736 (8) | 0.564 (10) | 0.546 (13) | 0.219 (20) | 0.517 (4) | 0.675 (11) | 0.486 (17) | 0.427 (19) | 0.904 (9) | 0.352 (11) | 0.320 (11) | 0.589 (12) | 0.528 (4) | 0.708 (6) | 0.464 (21) | 0.413 (19) | 0.847 (11) | 0.786 (8) | 0.611 (10)
SN_RN152pyrx8_RVC | copyleft | 0.546 (14) | 0.572 (14) | 0.663 (18) | 0.638 (7) | 0.518 (15) | 0.298 (14) | 0.366 (21) | 0.633 (18) | 0.510 (15) | 0.446 (17) | 0.864 (16) | 0.296 (17) | 0.267 (17) | 0.542 (16) | 0.346 (18) | 0.704 (7) | 0.575 (16) | 0.431 (16) | 0.853 (10) | 0.766 (14) | 0.630 (7)
UDSSEG_RVC |  | 0.545 (15) | 0.610 (11) | 0.661 (19) | 0.588 (8) | 0.556 (12) | 0.268 (18) | 0.482 (8) | 0.642 (17) | 0.572 (9) | 0.475 (14) | 0.836 (20) | 0.312 (15) | 0.367 (6) | 0.630 (7) | 0.189 (20) | 0.639 (17) | 0.495 (20) | 0.452 (13) | 0.826 (14) | 0.756 (17) | 0.541 (14)
segfomer with 6d |  | 0.542 (16) | 0.594 (12) | 0.687 (14) | 0.146 (21) | 0.579 (10) | 0.308 (13) | 0.515 (5) | 0.703 (10) | 0.472 (18) | 0.498 (11) | 0.868 (15) | 0.369 (7) | 0.282 (15) | 0.589 (12) | 0.390 (14) | 0.701 (8) | 0.556 (17) | 0.416 (18) | 0.860 (9) | 0.759 (15) | 0.539 (16)
FuseNet | permissive | 0.535 (17) | 0.570 (15) | 0.681 (17) | 0.182 (19) | 0.512 (16) | 0.290 (17) | 0.431 (14) | 0.659 (14) | 0.504 (16) | 0.495 (12) | 0.903 (10) | 0.308 (16) | 0.428 (3) | 0.523 (18) | 0.365 (16) | 0.676 (11) | 0.621 (11) | 0.470 (11) | 0.762 (19) | 0.779 (10) | 0.541 (14)
Caner Hazirbas, Lingni Ma, Csaba Domokos, Daniel Cremers: FuseNet: Incorporating Depth into Semantic Segmentation via Fusion-based CNN Architecture. ACCV 2016
AdapNet++ | copyleft | 0.503 (18) | 0.613 (10) | 0.722 (11) | 0.418 (16) | 0.358 (23) | 0.337 (6) | 0.370 (20) | 0.479 (21) | 0.443 (19) | 0.368 (21) | 0.907 (7) | 0.207 (20) | 0.213 (22) | 0.464 (21) | 0.525 (5) | 0.618 (19) | 0.657 (6) | 0.450 (14) | 0.788 (17) | 0.721 (20) | 0.408 (22)
Abhinav Valada, Rohit Mohan, Wolfram Burgard: Self-Supervised Model Adaptation for Multimodal Semantic Segmentation. International Journal of Computer Vision, 2019
3DMV (2d proj) |  | 0.498 (19) | 0.481 (21) | 0.612 (20) | 0.579 (9) | 0.456 (19) | 0.343 (5) | 0.384 (18) | 0.623 (19) | 0.525 (14) | 0.381 (20) | 0.845 (19) | 0.254 (19) | 0.264 (19) | 0.557 (15) | 0.182 (21) | 0.581 (21) | 0.598 (13) | 0.429 (17) | 0.760 (20) | 0.661 (22) | 0.446 (21)
Angela Dai, Matthias Niessner: 3DMV: Joint 3D-Multi-View Prediction for 3D Semantic Scene Segmentation. ECCV'18
MSeg1080_RVC | permissive | 0.485 (20) | 0.505 (19) | 0.709 (13) | 0.092 (22) | 0.427 (20) | 0.241 (19) | 0.411 (17) | 0.654 (16) | 0.385 (23) | 0.457 (16) | 0.861 (17) | 0.053 (23) | 0.279 (16) | 0.503 (19) | 0.481 (7) | 0.645 (16) | 0.626 (10) | 0.365 (21) | 0.748 (21) | 0.725 (19) | 0.529 (17)
John Lambert*, Zhuang Liu*, Ozan Sener, James Hays, Vladlen Koltun: MSeg: A Composite Dataset for Multi-domain Semantic Segmentation. CVPR 2020
ILC-PSPNet |  | 0.475 (21) | 0.490 (20) | 0.581 (21) | 0.289 (17) | 0.507 (18) | 0.067 (23) | 0.379 (19) | 0.610 (20) | 0.417 (21) | 0.435 (18) | 0.822 (22) | 0.278 (18) | 0.267 (17) | 0.503 (19) | 0.228 (19) | 0.616 (20) | 0.533 (19) | 0.375 (20) | 0.820 (15) | 0.729 (18) | 0.560 (13)
Enet (reimpl) |  | 0.376 (22) | 0.264 (23) | 0.452 (23) | 0.452 (13) | 0.365 (21) | 0.181 (21) | 0.143 (23) | 0.456 (22) | 0.409 (22) | 0.346 (22) | 0.769 (23) | 0.164 (21) | 0.218 (21) | 0.359 (22) | 0.123 (23) | 0.403 (23) | 0.381 (23) | 0.313 (23) | 0.571 (22) | 0.685 (21) | 0.472 (20)
Re-implementation of Adam Paszke, Abhishek Chaurasia, Sangpil Kim, Eugenio Culurciello: ENet: A Deep Neural Network Architecture for Real-Time Semantic Segmentation.
ScanNet (2d proj) | permissive | 0.330 (23) | 0.293 (22) | 0.521 (22) | 0.657 (6) | 0.361 (22) | 0.161 (22) | 0.250 (22) | 0.004 (23) | 0.440 (20) | 0.183 (23) | 0.836 (20) | 0.125 (22) | 0.060 (23) | 0.319 (23) | 0.132 (22) | 0.417 (22) | 0.412 (22) | 0.344 (22) | 0.541 (23) | 0.427 (23) | 0.109 (23)
Angela Dai, Angel X. Chang, Manolis Savva, Maciej Halber, Thomas Funkhouser, Matthias Nießner: ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes. CVPR'17
DMMF |  | 0.003 (24) | 0.000 (24) | 0.005 (24) | 0.000 (24) | 0.000 (24) | 0.037 (24) | 0.001 (24) | 0.000 (24) | 0.001 (24) | 0.005 (24) | 0.003 (24) | 0.000 (24) | 0.000 (24) | 0.000 (24) | 0.000 (24) | 0.000 (24) | 0.002 (24) | 0.001 (24) | 0.000 (24) | 0.006 (24) | 0.000 (24)


This table lists the benchmark results for the 2D semantic instance scenario.




Each cell gives the score followed by the method's rank in parentheses.

Method | Info | avg ap | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | window
UniDet_RVC |  | 0.205 (1) | 0.381 (1) | 0.323 (1) | 0.037 (1) | 0.226 (1) | 0.177 (1) | 0.063 (1) | 0.277 (1) | 0.120 (1) | 0.067 (1) | 0.131 (1) | 0.074 (2) | 0.317 (1) | 0.080 (1) | 0.235 (1) | 0.289 (1) | 0.141 (1) | 0.678 (1) | 0.080 (1)
MaskRCNN_ScanNet | permissive | 0.119 (2) | 0.129 (2) | 0.212 (2) | 0.002 (2) | 0.112 (2) | 0.148 (2) | 0.014 (2) | 0.205 (2) | 0.044 (2) | 0.066 (2) | 0.078 (2) | 0.095 (1) | 0.142 (2) | 0.030 (2) | 0.128 (2) | 0.139 (2) | 0.080 (2) | 0.459 (2) | 0.057 (2)
Re-implementation of Kaiming He, Georgia Gkioxari, Piotr Dollár, Ross Girshick: Mask R-CNN. ICCV'17


This table lists the benchmark results for the scene type classification scenario.




Each cell gives the recall followed by the method's rank in parentheses.

Method | Info | avg recall | apartment | bathroom | bedroom / hotel | bookstore / library | conference room | copy/mail room | hallway | kitchen | laundry room | living room / lounge | misc | office | storage / basement / garage
multi-task | permissive | 0.700 (1) | 0.500 (1) | 1.000 (1) | 0.882 (2) | 0.500 (2) | 1.000 (1) | 1.000 (1) | 0.500 (2) | 1.000 (1) | 1.000 (1) | 0.778 (1) | 0.000 (2) | 0.938 (1) | 0.000 (2)
Shengyu Huang, Mikhail Usvyatsov, Konrad Schindler: Indoor Scene Recognition in 3D. IROS 2020
3DASPP-SCE |  | 0.691 (2) | 0.500 (1) | 0.938 (2) | 0.824 (3) | 1.000 (1) | 1.000 (1) | 0.500 (2) | 1.000 (1) | 0.857 (2) | 0.500 (2) | 0.556 (3) | 0.000 (2) | 0.812 (2) | 0.500 (1)
SE-ResNeXt-SSMA |  | 0.498 (3) | 0.000 (4) | 0.812 (3) | 0.941 (1) | 0.500 (2) | 0.500 (3) | 0.500 (2) | 0.500 (2) | 0.429 (4) | 0.500 (2) | 0.667 (2) | 0.500 (1) | 0.625 (3) | 0.000 (2)
Abhinav Valada, Rohit Mohan, Wolfram Burgard: Self-Supervised Model Adaptation for Multimodal Semantic Segmentation. arXiv
resnet50_scannet |  | 0.353 (4) | 0.250 (3) | 0.812 (3) | 0.529 (4) | 0.500 (2) | 0.500 (3) | 0.000 (4) | 0.500 (2) | 0.571 (3) | 0.000 (4) | 0.556 (3) | 0.000 (2) | 0.375 (4) | 0.000 (2)
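
As a quick sanity check on the scene type table above, the avg recall column appears to be the unweighted mean of the 13 per-class recalls. This is inferred from the listed numbers rather than stated by the benchmark, and the variable names below are illustrative only. A minimal Python sketch using the multi-task row:

```python
from statistics import mean

# Per-class recalls for the "multi-task" row, copied from the table above.
multi_task_recall = {
    "apartment": 0.500,
    "bathroom": 1.000,
    "bedroom / hotel": 0.882,
    "bookstore / library": 0.500,
    "conference room": 1.000,
    "copy/mail room": 1.000,
    "hallway": 0.500,
    "kitchen": 1.000,
    "laundry room": 1.000,
    "living room / lounge": 0.778,
    "misc": 0.000,
    "office": 0.938,
    "storage / basement / garage": 0.000,
}

# Assumption: every scene type carries equal weight, regardless of how many
# test scenes it contains.
avg_recall = mean(list(multi_task_recall.values()))
print(f"avg recall = {avg_recall:.3f}")  # prints "avg recall = 0.700"
```

The same calculation on the 3DASPP-SCE row gives 0.691, which also matches its listed average. Under this reading, a class with few test scenes (such as misc) moves the average as much as a frequent one (such as office).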