Presenting the ScanNet200 Benchmark

We present the ScanNet200 benchmark, which studies an order of magnitude more class categories than the previous version of ScanNet. The scene geometry is shared between the two tasks, but the parsing of the surface annotations allows for a larger vocabulary and a more realistic setting for in-the-wild 3D understanding methods.

The ScanNet200 benchmark includes both finer-grained categories and a large number of previously unaddressed classes. This makes for a much more challenging setting: it reflects the diversity of semantic classes naturally observed in the raw ScanNet RGB-D observations, including naturally encountered class imbalances. The difference in category frequencies between ScanNet and ScanNet200 can be seen in the Figure above.

ScanNet200 Benchmark

This table lists the benchmark results for the ScanNet200 3D semantic label scenario. Each cell gives the IoU score followed by the method's rank in that column.




Method | Info | avg iou | head iou | common iou | tail iou | alarm clock | armchair | backpack | bag | ball | bar | basket | bathroom cabinet | bathroom counter | bathroom stall | bathroom stall door | bathroom vanity | bathtub | bed | bench | bicycle | bin | blackboard | blanket | blinds | board | book | bookshelf | bottle | bowl | box | broom | bucket | bulletin board | cabinet | calendar | candle | cart | case of water bottles | cd case | ceiling | ceiling light | chair | clock | closet | closet door | closet rod | closet wall | clothes | clothes dryer | coat rack | coffee kettle | coffee maker | coffee table | column | computer tower | container | copier | couch | counter | crate | cup | curtain | cushion | decoration | desk | dining table | dish rack | dishwasher | divider | door | doorframe | dresser | dumbbell | dustpan | end table | fan | file cabinet | fire alarm | fire extinguisher | fireplace | floor | folded chair | furniture | guitar | guitar case | hair dryer | handicap bar | hat | headphones | ironing board | jacket | keyboard | keyboard piano | kitchen cabinet | kitchen counter | ladder | lamp | laptop | laundry basket | laundry detergent | laundry hamper | ledge | light | light switch | luggage | machine | mailbox | mat | mattress | microwave | mini fridge | mirror | monitor | mouse | music stand | nightstand | object | office chair | ottoman | oven | paper | paper bag | paper cutter | paper towel dispenser | paper towel roll | person | piano | picture | pillar | pillow | pipe | plant | plate | plunger | poster | potted plant | power outlet | power strip | printer | projector | projector screen | purse | rack | radiator | rail | range hood | recycling bin | refrigerator | scale | seat | shelf | shoe | shower | shower curtain | shower curtain rod | shower door | shower floor | shower head | shower wall | sign | sink | soap dish | soap dispenser | sofa chair | speaker | stair rail | stairs | stand | stool | storage bin | storage container | storage organizer | stove | structure | stuffed animal | suitcase | table | telephone | tissue box | toaster | toaster oven | toilet | toilet paper | toilet paper dispenser | toilet paper holder | toilet seat cover dispenser | towel | trash bin | trash can | tray | tube | tv | tv stand | vacuum cleaner | vent | wall | wardrobe | washing machine | water bottle | water cooler | water pitcher | whiteboard | window | windowsill
CeCo0.340 10.551 20.247 20.181 10.475 20.057 60.142 50.000 10.000 10.000 10.387 30.463 30.499 40.924 10.774 20.213 10.257 20.000 30.546 60.100 30.006 20.615 10.177 60.534 10.246 10.000 40.400 20.000 10.338 10.006 50.484 20.609 10.000 10.083 20.000 20.873 20.089 30.661 30.000 30.048 60.560 10.408 10.892 20.000 10.000 10.586 10.616 30.000 50.692 30.900 10.721 20.162 10.228 10.860 10.000 10.000 20.575 10.083 20.550 10.347 10.624 20.410 20.360 20.740 10.109 40.321 40.660 20.000 20.121 20.939 20.143 30.000 10.400 10.003 30.190 10.564 10.652 20.615 20.421 10.304 50.579 10.547 10.000 20.000 10.296 30.000 50.030 40.096 10.000 30.916 10.037 30.551 30.171 30.376 10.865 30.286 10.000 10.633 10.102 60.027 40.011 30.000 10.000 20.474 40.742 10.133 20.311 20.824 20.242 30.503 30.068 40.828 10.000 30.429 20.000 10.063 10.000 20.781 10.000 10.000 30.000 10.665 10.633 10.450 10.818 10.000 30.000 10.429 20.532 10.226 20.825 10.510 40.377 10.709 10.079 40.000 10.753 10.683 10.102 60.063 20.401 50.620 40.000 10.619 10.000 60.000 40.000 10.595 40.000 20.000 10.345 30.564 20.411 10.603 10.384 20.945 20.266 20.643 10.367 30.304 10.663 20.000 10.010 10.726 40.767 20.898 10.000 10.784 20.435 10.861 20.000 10.447 20.000 60.257 20.656 20.377 4
Understanding Imbalanced Semantic Segmentation Through Neural Collapse.
OA-CNN-L_ScanNet2000.333 20.558 10.269 10.124 30.448 40.080 20.272 10.000 10.000 10.000 10.342 50.515 20.524 20.713 60.789 10.158 20.384 10.000 30.806 10.125 10.000 30.496 20.332 10.498 50.227 20.024 20.474 10.000 10.003 20.071 30.487 10.000 30.000 10.110 10.000 20.876 10.013 60.703 10.000 30.076 20.473 30.355 20.906 10.000 10.000 10.476 20.706 10.000 50.672 40.835 30.748 10.015 60.223 20.860 10.000 10.000 20.572 20.000 50.509 20.313 20.662 10.398 30.396 10.411 40.276 10.527 10.711 10.000 20.076 40.946 10.166 20.000 10.022 20.160 10.183 20.493 20.699 10.637 10.403 20.330 40.406 30.526 20.024 10.000 10.392 20.000 50.016 60.000 30.196 20.915 20.112 20.557 20.197 10.352 20.877 20.000 30.000 10.592 50.103 50.000 60.067 10.000 10.089 10.735 20.625 30.130 30.568 10.836 10.271 10.534 10.043 60.799 20.001 20.445 10.000 10.000 20.024 10.661 20.000 10.262 10.000 10.591 30.517 50.373 20.788 20.021 20.000 10.455 10.517 20.320 10.823 20.200 60.001 60.150 30.100 20.000 10.736 20.668 20.103 50.052 30.662 10.720 10.000 10.602 30.112 20.002 30.000 10.637 20.000 20.000 10.621 10.569 10.398 20.412 20.234 30.949 10.363 10.492 50.495 10.251 30.665 10.000 10.001 30.805 20.833 10.794 20.000 10.821 10.314 20.843 30.000 10.560 10.245 20.262 10.713 10.370 5
LGroundpermissive0.272 40.485 40.184 40.106 40.476 10.077 30.218 20.000 10.000 10.000 10.547 10.295 40.540 10.746 40.745 40.058 50.112 50.005 10.658 30.077 60.000 30.322 40.178 50.512 30.190 30.199 10.277 40.000 10.000 30.173 10.399 30.000 30.000 10.039 50.000 20.858 40.085 40.676 20.002 10.103 10.498 20.323 30.703 40.000 10.000 10.296 40.549 40.216 10.702 20.768 40.718 40.028 40.092 50.786 50.000 10.000 20.453 50.022 30.251 60.252 30.572 40.348 40.321 30.514 20.063 50.279 50.552 40.000 20.019 50.932 40.132 50.000 10.000 40.000 50.156 60.457 40.623 30.518 30.265 50.358 30.381 40.395 40.000 20.000 10.127 60.012 30.051 10.000 30.000 30.886 50.014 40.437 60.179 20.244 40.826 40.000 30.000 10.599 30.136 10.085 20.000 40.000 10.000 20.565 30.612 40.143 10.207 40.566 40.232 50.446 40.127 20.708 40.000 30.384 40.000 10.000 20.000 20.402 30.000 10.059 20.000 10.525 60.566 30.229 40.659 40.000 30.000 10.265 40.446 30.147 50.720 60.597 20.066 40.000 40.187 10.000 10.726 30.467 60.134 40.000 50.413 40.629 30.000 10.363 50.055 40.022 20.000 10.626 30.000 20.000 10.323 40.479 60.154 50.117 40.028 50.901 40.243 40.415 60.295 60.143 50.610 50.000 10.000 40.777 30.397 60.324 50.000 10.778 40.179 40.702 50.000 10.274 60.404 10.233 30.622 40.398 3
David Rozenberszki, Or Litany, Angela Dai: Language-Grounded Indoor 3D Semantic Segmentation in the Wild. arXiv
Minkowski 34Dpermissive0.253 50.463 50.154 60.102 50.381 60.084 10.134 60.000 10.000 10.000 10.386 40.141 60.279 60.737 50.703 50.014 60.164 40.000 30.663 20.092 50.000 30.224 50.291 20.531 20.056 60.000 40.242 50.000 10.000 30.013 40.331 50.000 30.000 10.035 60.001 10.858 40.059 50.650 50.000 30.056 50.353 50.299 40.670 50.000 10.000 10.284 50.484 50.071 30.594 50.720 50.710 50.027 50.068 60.813 30.000 10.005 10.492 40.164 10.274 50.111 60.571 50.307 60.293 40.307 60.150 30.163 60.531 50.002 10.545 10.932 40.093 60.000 10.000 40.002 40.159 40.368 60.581 50.440 60.228 60.406 20.282 60.294 50.000 20.000 10.189 50.060 10.036 30.000 30.000 30.897 30.000 60.525 50.025 60.205 60.771 60.000 30.000 10.593 40.108 40.044 30.000 40.000 10.000 20.282 60.589 50.094 50.169 50.466 60.227 60.419 60.125 30.757 30.002 10.334 50.000 10.000 20.000 20.357 40.000 10.000 30.000 10.582 40.513 60.337 30.612 60.000 30.000 10.250 50.352 60.136 60.724 50.655 10.280 20.000 40.046 60.000 10.606 60.559 40.159 20.102 10.445 20.655 20.000 10.310 60.117 10.000 40.000 10.581 60.026 10.000 10.265 60.483 50.084 60.097 60.044 40.865 60.142 60.588 20.351 40.272 20.596 60.000 10.003 20.622 50.720 30.096 60.000 10.771 50.016 50.772 40.000 10.302 40.194 30.214 50.621 50.197 6
C. Choy, J. Gwak, S. Savarese: 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks. CVPR 2019
AWCS0.305 30.508 30.225 30.142 20.463 30.063 40.195 30.000 10.000 10.000 10.467 20.551 10.504 30.773 20.764 30.142 30.029 60.000 30.626 40.100 30.000 30.360 30.179 40.507 40.137 40.006 30.300 30.000 10.000 30.172 20.364 40.512 20.000 10.056 30.000 20.865 30.093 20.634 60.000 30.071 40.396 40.296 50.876 30.000 10.000 10.373 30.436 60.063 40.749 10.877 20.721 20.131 20.124 30.804 40.000 10.000 20.515 30.010 40.452 30.252 30.578 30.417 10.179 60.484 30.171 20.337 30.606 30.000 20.115 30.937 30.142 40.000 10.008 30.000 50.157 50.484 30.402 60.501 40.339 30.553 10.529 20.478 30.000 20.000 10.404 10.001 40.022 50.077 20.000 30.894 40.219 10.628 10.093 40.305 30.886 10.233 20.000 10.603 20.112 30.023 50.000 40.000 10.000 20.741 10.664 20.097 40.253 30.782 30.264 20.523 20.154 10.707 50.000 30.411 30.000 10.000 20.000 20.332 50.000 10.000 30.000 10.602 20.595 20.185 50.656 50.159 10.000 10.355 30.424 40.154 40.729 40.516 30.220 30.620 20.084 30.000 10.707 40.651 30.173 10.014 40.381 60.582 50.000 10.619 10.049 50.000 40.000 10.702 10.000 20.000 10.302 50.489 40.317 30.334 30.392 10.922 30.254 30.533 40.394 20.129 60.613 40.000 10.000 40.820 10.649 40.749 30.000 10.782 30.282 30.863 10.000 10.288 50.006 40.220 40.633 30.542 1
CSC-Pretrainpermissive0.249 60.455 60.171 50.079 60.418 50.059 50.186 40.000 10.000 10.000 10.335 60.250 50.316 50.766 30.697 60.142 30.170 30.003 20.553 50.112 20.097 10.201 60.186 30.476 60.081 50.000 40.216 60.000 10.000 30.001 60.314 60.000 30.000 10.055 40.000 20.832 60.094 10.659 40.002 10.076 20.310 60.293 60.664 60.000 10.000 10.175 60.634 20.130 20.552 60.686 60.700 60.076 30.110 40.770 60.000 10.000 20.430 60.000 50.319 40.166 50.542 60.327 50.205 50.332 50.052 60.375 20.444 60.000 20.012 60.930 60.203 10.000 10.000 40.046 20.175 30.413 50.592 40.471 50.299 40.152 60.340 50.247 60.000 20.000 10.225 40.058 20.037 20.000 30.207 10.862 60.014 40.548 40.033 50.233 50.816 50.000 30.000 10.542 60.123 20.121 10.019 20.000 10.000 20.463 50.454 60.045 60.128 60.557 50.235 40.441 50.063 50.484 60.000 30.308 60.000 10.000 20.000 20.318 60.000 10.000 30.000 10.545 50.543 40.164 60.734 30.000 30.000 10.215 60.371 50.198 30.743 30.205 50.062 50.000 40.079 40.000 10.683 50.547 50.142 30.000 50.441 30.579 60.000 10.464 40.098 30.041 10.000 10.590 50.000 20.000 10.373 20.494 30.174 40.105 50.001 60.895 50.222 50.537 30.307 50.180 40.625 30.000 10.000 40.591 60.609 50.398 40.000 10.766 60.014 60.638 60.000 10.377 30.004 50.206 60.609 60.465 2
Ji Hou, Benjamin Graham, Matthias Nießner, Saining Xie: Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts. CVPR 2021
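
The IoU columns in the table above can be read as per-class intersection-over-union scores, with the "avg", "head", "common" and "tail" columns averaging over groups of classes. The following is a minimal sketch of that computation, not the benchmark's official evaluation script. The function names, the toy labels, the grouping into `head` and `tail`, and the choice to skip classes that never occur are all assumptions made for illustration.

```python
from typing import Iterable, List


def per_class_iou(pred: List[int], gt: List[int], num_classes: int) -> List[float]:
    """Return IoU = TP / (TP + FP + FN) for each class id in [0, num_classes)."""
    tp = [0] * num_classes
    fp = [0] * num_classes
    fn = [0] * num_classes
    for p, g in zip(pred, gt):
        if p == g:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted class p where it was not present
            fn[g] += 1  # missed the ground-truth class g
    ious = []
    for c in range(num_classes):
        denom = tp[c] + fp[c] + fn[c]
        # Assumption: a class that appears in neither pred nor gt is undefined (NaN).
        ious.append(tp[c] / denom if denom else float("nan"))
    return ious


def mean_iou(ious: List[float], class_ids: Iterable[int]) -> float:
    """Average IoU over a group of classes, skipping undefined (NaN) entries."""
    vals = [ious[c] for c in class_ids if ious[c] == ious[c]]  # NaN != NaN
    return sum(vals) / len(vals) if vals else float("nan")


# Toy example with three hypothetical classes (labels per surface point).
pred = [0, 0, 1, 2, 2, 1]
gt = [0, 1, 1, 2, 2, 2]
ious = per_class_iou(pred, gt, num_classes=3)

# Hypothetical grouping; the real head/common/tail split is defined by the benchmark.
head, tail = [0], [1, 2]
print("avg iou:", mean_iou(ious, range(3)))
print("head iou:", mean_iou(ious, head))
print("tail iou:", mean_iou(ious, tail))
```

In this sketch every class counts equally toward a group average regardless of how many points it covers, which is why rare classes with an IoU of 0.000 pull the "tail" column down so strongly.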


This table lists the benchmark results for the ScanNet200 3D semantic instance scenario. Each cell gives the AP 25% score followed by the method's rank in that column.




Method | Info | avg ap 25% | head ap 25% | common ap 25% | tail ap 25% | alarm clock | armchair | backpack | bag | ball | bar | basket | bathroom cabinet | bathroom counter | bathroom stall | bathroom stall door | bathroom vanity | bathtub | bed | bench | bicycle | bin | blackboard | blanket | blinds | board | book | bookshelf | bottle | bowl | box | broom | bucket | bulletin board | cabinet | calendar | candle | cart | case of water bottles | cd case | ceiling | ceiling light | chair | clock | closet | closet door | closet rod | closet wall | clothes | clothes dryer | coat rack | coffee kettle | coffee maker | coffee table | column | computer tower | container | copier | couch | counter | crate | cup | curtain | cushion | decoration | desk | dining table | dish rack | dishwasher | divider | door | doorframe | dresser | dumbbell | dustpan | end table | fan | file cabinet | fire alarm | fire extinguisher | fireplace | folded chair | furniture | guitar | guitar case | hair dryer | handicap bar | hat | headphones | ironing board | jacket | keyboard | keyboard piano | kitchen cabinet | kitchen counter | ladder | lamp | laptop | laundry basket | laundry detergent | laundry hamper | ledge | light | light switch | luggage | machine | mailbox | mat | mattress | microwave | mini fridge | mirror | monitor | mouse | music stand | nightstand | object | office chair | ottoman | oven | paper | paper bag | paper cutter | paper towel dispenser | paper towel roll | person | piano | picture | pillar | pillow | pipe | plant | plate | plunger | poster | potted plant | power outlet | power strip | printer | projector | projector screen | purse | rack | radiator | rail | range hood | recycling bin | refrigerator | scale | seat | shelf | shoe | shower | shower curtain | shower curtain rod | shower door | shower floor | shower head | shower wall | sign | sink | soap dish | soap dispenser | sofa chair | speaker | stair rail | stairs | stand | stool | storage bin | storage container | storage organizer | stove | structure | stuffed animal | suitcase | table | telephone | tissue box | toaster | toaster oven | toilet | toilet paper | toilet paper dispenser | toilet paper holder | toilet seat cover dispenser | towel | trash bin | trash can | tray | tube | tv | tv stand | vacuum cleaner | vent | wardrobe | washing machine | water bottle | water cooler | water pitcher | whiteboard | window | windowsill
TD3D Scannet2000.379 20.603 20.306 20.190 20.635 20.073 20.500 10.000 10.000 10.000 10.495 30.735 20.275 51.000 10.979 20.590 20.000 40.021 20.000 30.146 30.000 20.356 20.173 50.795 10.226 20.000 10.173 20.000 10.000 20.226 20.390 20.000 20.000 10.250 10.000 10.706 20.061 30.885 10.093 20.186 20.259 40.200 10.667 10.000 20.000 10.667 20.825 10.250 40.834 41.000 10.958 10.553 10.111 30.748 10.220 20.051 20.866 20.792 10.390 50.045 50.800 20.302 50.517 10.533 30.113 20.427 10.843 20.000 20.458 10.600 10.000 10.101 20.000 10.259 10.717 20.500 20.615 20.520 20.526 20.457 10.270 40.000 10.000 10.400 20.088 20.294 20.181 20.000 11.000 10.400 10.710 50.103 30.477 50.905 20.061 20.000 10.906 20.102 20.232 10.125 20.000 20.003 20.792 31.000 10.000 20.102 30.125 40.559 50.523 30.075 20.715 10.000 20.424 50.000 10.396 20.250 10.638 10.000 10.000 20.000 10.622 50.833 20.221 10.970 10.250 20.038 10.260 20.415 10.125 21.000 11.000 10.857 20.000 20.908 10.012 10.869 30.836 10.635 10.111 10.625 11.000 10.020 20.510 10.003 30.009 21.000 10.778 10.000 10.000 10.370 30.755 10.288 20.333 30.274 21.000 10.557 10.731 20.456 20.433 30.769 50.000 10.000 20.621 41.000 10.458 40.000 10.196 20.817 10.000 10.472 10.222 30.205 50.689 20.274 3
LGround Inst.permissive0.314 30.529 30.225 30.155 30.578 50.010 30.500 10.000 10.000 10.000 10.515 20.556 30.696 11.000 10.927 30.400 30.083 30.000 31.000 10.252 10.000 20.167 30.350 20.731 20.067 30.000 10.123 40.000 10.000 20.036 30.372 30.000 20.000 10.250 10.000 10.569 40.031 50.810 30.000 30.000 40.630 10.183 20.278 30.000 20.000 10.582 40.589 50.500 20.863 31.000 10.940 20.000 40.144 10.716 30.000 30.000 30.484 30.000 30.500 30.400 30.798 30.500 20.278 40.750 10.093 30.166 40.783 30.000 20.200 20.400 20.000 10.000 30.000 10.219 20.539 30.500 20.578 30.413 30.181 50.457 20.375 20.000 10.000 10.050 50.000 40.077 40.000 30.000 10.500 50.000 50.743 30.250 20.488 40.846 30.000 30.000 10.800 30.069 30.000 30.000 30.000 20.000 31.000 10.607 40.000 20.200 10.500 10.694 20.528 20.063 30.659 20.000 20.594 20.000 10.000 30.000 20.571 20.000 10.000 20.000 10.716 40.647 50.221 20.857 40.000 30.000 30.217 30.346 30.071 50.530 51.000 10.429 30.000 20.286 30.000 30.826 50.706 30.208 40.000 30.250 40.744 50.000 30.500 20.042 10.000 30.000 20.746 30.000 10.000 10.517 10.625 30.085 50.333 30.000 41.000 10.378 40.533 50.376 40.042 50.814 30.000 10.000 20.765 31.000 10.600 30.000 10.000 40.667 30.000 10.472 10.333 20.337 30.605 30.305 2
David Rozenberszki, Or Litany, Angela Dai: Language-Grounded Indoor 3D Semantic Segmentation in the Wild.
Minkowski 34D Inst.permissive0.280 40.488 40.192 50.124 40.593 40.010 40.500 10.000 10.000 10.000 10.447 40.535 40.445 31.000 10.861 40.400 30.225 20.000 30.000 30.142 40.000 20.074 40.342 30.467 50.067 30.000 10.119 50.000 10.000 20.000 40.337 50.000 20.000 10.000 40.000 10.506 50.070 20.804 40.000 30.000 40.333 30.172 30.150 50.000 20.000 10.479 50.745 30.000 50.830 51.000 10.904 30.167 20.090 40.732 20.000 30.000 30.443 40.000 30.500 30.542 10.772 50.396 40.077 50.385 40.044 40.118 50.777 40.000 20.000 40.200 30.000 10.000 30.000 10.148 40.502 40.500 20.419 40.159 50.281 40.404 50.317 30.000 10.000 10.200 30.000 40.077 30.000 30.000 10.750 30.200 30.715 40.021 40.551 20.828 50.000 30.000 10.743 40.059 50.000 30.000 30.000 20.000 30.125 50.648 30.000 20.191 20.500 10.669 40.502 40.000 50.568 40.000 20.516 40.000 10.000 30.000 20.305 50.000 10.000 20.000 10.825 10.833 20.021 50.918 20.000 30.000 30.191 40.346 40.100 40.981 31.000 10.286 40.000 20.000 50.000 30.868 40.648 50.292 30.000 30.375 31.000 10.000 30.500 20.000 40.333 10.000 20.538 50.000 10.000 10.213 50.518 40.098 40.528 10.250 30.997 30.284 50.677 30.398 30.167 40.790 40.000 10.000 20.618 50.903 50.200 50.000 10.333 10.333 40.000 10.442 30.083 40.213 40.587 40.131 5
C. Choy, J. Gwak, S. Savarese: 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks. CVPR 2019
CSC-Pretrain Inst.permissive0.275 50.466 50.218 40.110 50.625 30.007 50.500 10.000 10.000 10.000 10.000 50.222 50.377 41.000 10.661 50.400 30.000 40.000 30.000 30.119 50.000 20.000 50.277 40.685 40.067 30.000 10.132 30.000 10.000 20.000 40.367 40.000 20.000 10.000 40.000 10.591 30.055 40.783 50.000 30.014 30.500 20.161 40.278 30.000 20.000 10.667 20.768 20.500 20.866 21.000 10.829 50.000 40.019 50.555 50.000 30.000 30.305 50.000 30.750 10.200 40.783 40.429 30.395 30.677 20.020 50.286 30.584 50.000 20.000 40.115 50.000 10.000 30.000 10.145 50.423 50.500 20.364 50.369 40.571 10.448 30.206 50.000 10.000 10.200 30.106 10.065 50.000 30.000 10.750 30.200 30.774 20.000 50.501 30.841 40.000 30.000 10.692 50.063 40.000 30.000 30.000 20.000 30.500 40.649 20.000 20.084 40.125 40.719 10.413 50.004 40.450 50.000 20.638 10.000 10.000 30.000 20.505 30.000 10.000 20.000 10.727 30.833 20.221 20.779 50.000 30.000 30.168 50.311 50.125 20.571 40.500 50.143 50.000 20.250 40.000 30.869 20.667 40.162 50.000 30.250 41.000 10.000 30.500 20.000 40.000 30.000 20.689 40.000 10.000 10.312 40.383 50.114 30.333 30.000 40.997 30.420 30.613 40.212 50.500 20.819 20.000 10.000 20.768 21.000 10.918 10.000 10.000 40.278 50.000 10.333 50.000 50.353 20.546 50.258 4
Ji Hou, Benjamin Graham, Matthias Nießner, Saining Xie: Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts. CVPR 2021
Mask3D Scannet2000.445 10.653 10.392 10.254 10.648 10.097 10.125 50.000 10.000 10.000 10.657 10.971 10.451 21.000 11.000 10.640 10.500 10.045 11.000 10.241 20.409 10.363 10.440 10.686 30.300 10.000 10.201 10.000 10.009 10.290 10.556 11.000 10.000 10.063 30.000 10.830 10.573 10.844 20.333 10.204 10.058 50.158 50.552 20.056 10.000 11.000 10.725 40.750 10.927 11.000 10.888 40.042 30.120 20.615 40.226 10.250 10.890 10.792 10.677 20.510 20.818 10.699 10.512 20.167 50.125 10.315 20.943 10.309 10.017 30.200 30.000 10.188 10.000 10.183 30.815 11.000 10.827 10.741 10.442 30.414 40.600 10.000 10.000 10.458 10.049 30.321 10.381 10.000 10.908 20.400 10.841 10.260 10.710 10.966 10.265 10.000 10.924 10.152 10.025 20.500 10.027 10.028 11.000 10.556 50.016 10.080 50.500 10.694 30.608 10.084 10.604 30.194 10.538 30.000 10.500 10.000 20.354 40.000 11.000 10.000 10.761 20.930 10.053 40.890 31.000 10.008 20.262 10.358 21.000 11.000 10.792 40.966 11.000 10.765 20.004 20.930 10.780 20.330 20.027 20.625 10.974 40.050 10.412 50.021 20.000 30.000 20.778 10.000 10.000 10.493 20.746 20.454 10.335 20.396 10.930 50.551 21.000 10.552 10.606 10.853 10.000 10.004 10.806 11.000 10.727 20.000 10.042 30.745 20.000 10.399 40.391 10.630 10.721 10.619 1
Jonas Schult, Francis Engelmann, Alexander Hermans, Or Litany, Siyu Tang, Bastian Leibe: Mask3D for 3D Semantic Instance Segmentation. ICRA 2023


ScanNet Benchmark

This table lists the benchmark results for the ScanNet 3D semantic label scenario. Each cell gives the IoU score followed by the method's rank in that column.


Method | Info | avg iou | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | floor | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | wall | window
Mix3Dpermissive0.781 10.964 10.855 10.843 110.781 10.858 70.575 30.831 200.685 70.714 10.979 10.594 30.310 180.801 10.892 80.841 20.819 30.723 30.940 80.887 10.725 13
Alexey Nekrasov, Jonas Schult, Or Litany, Bastian Leibe, Francis Engelmann: Mix3D: Out-of-Context Data Augmentation for 3D Scenes. 3DV 2021 (Oral)
CU-Hybrid Net0.764 20.924 20.819 90.840 120.757 70.853 90.580 10.848 150.709 20.643 130.958 110.587 70.295 240.753 150.884 120.758 120.815 60.725 20.927 180.867 110.743 6
OccuSeg+Semantic0.764 20.758 470.796 210.839 130.746 120.907 10.562 50.850 140.680 90.672 60.978 20.610 10.335 80.777 40.819 320.847 10.830 10.691 80.972 10.885 20.727 11
O-CNNpermissive0.762 40.924 20.823 60.844 100.770 30.852 100.577 20.847 160.711 10.640 170.958 110.592 40.217 590.762 110.888 90.758 120.813 70.726 10.932 160.868 100.744 5
Peng-Shuai Wang, Yang Liu, Yu-Xiao Guo, Chun-Yu Sun, Xin Tong: O-CNN: Octree-based Convolutional Neural Networks for 3D Shape Analysis. SIGGRAPH 2017
OA-CNN-L_ScanNet200.756 50.783 350.826 50.858 40.776 20.837 200.548 90.896 50.649 170.675 50.962 80.586 80.335 80.771 70.802 360.770 80.787 220.691 80.936 110.880 50.761 3
PointTransformerV20.752 60.742 540.809 150.872 10.758 60.860 60.552 70.891 60.610 310.687 20.960 90.559 150.304 210.766 90.926 20.767 90.797 140.644 230.942 60.876 80.722 15
Xiaoyang Wu, Yixing Lao, Li Jiang, Xihui Liu, Hengshuang Zhao: Point Transformer V2: Grouped Vector Attention and Partition-based Pooling. NeurIPS 2022
DMF-Net0.752 60.906 60.793 240.802 300.689 280.825 320.556 60.867 100.681 80.602 310.960 90.555 170.365 30.779 30.859 170.747 150.795 180.717 40.917 210.856 190.764 2
PointConvFormer0.749 80.793 320.790 250.807 250.750 110.856 80.524 170.881 80.588 420.642 160.977 40.591 50.274 350.781 20.929 10.804 30.796 150.642 240.947 30.885 20.715 18
Wenxuan Wu, Qi Shan, Li Fuxin: PointConvFormer: Revenge of the Point-based Convolution.
BPNetcopyleft0.749 80.909 40.818 110.811 220.752 90.839 190.485 330.842 170.673 100.644 120.957 140.528 260.305 200.773 60.859 170.788 40.818 50.693 70.916 220.856 190.723 14
Wenbo Hu, Hengshuang Zhao, Li Jiang, Jiaya Jia, Tien-Tsin Wong: Bidirectional Projection Network for Cross Dimension Scene Understanding. CVPR 2021 (Oral)
MSP0.748 100.623 800.804 170.859 30.745 130.824 340.501 240.912 20.690 60.685 30.956 150.567 120.320 140.768 80.918 30.720 230.802 100.676 130.921 190.881 40.779 1
StratifiedFormerpermissive0.747 110.901 70.803 180.845 90.757 70.846 140.512 200.825 230.696 50.645 110.956 150.576 100.262 450.744 190.861 160.742 160.770 310.705 50.899 340.860 160.734 7
Xin Lai*, Jianhui Liu*, Li Jiang, Liwei Wang, Hengshuang Zhao, Shu Liu, Xiaojuan Qi, Jiaya Jia: Stratified Transformer for 3D Point Cloud Segmentation. CVPR 2022
Virtual MVFusion0.746 120.771 410.819 90.848 70.702 260.865 50.397 710.899 30.699 30.664 80.948 420.588 60.330 100.746 180.851 240.764 100.796 150.704 60.935 120.866 120.728 9
Abhijit Kundu, Xiaoqi Yin, Alireza Fathi, David Ross, Brian Brewington, Thomas Funkhouser, Caroline Pantofaru: Virtual Multi-view Fusion for 3D Semantic Segmentation. ECCV 2020
VMNetpermissive0.746 120.870 120.838 20.858 40.729 180.850 120.501 240.874 90.587 430.658 90.956 150.564 130.299 220.765 100.900 50.716 260.812 80.631 290.939 90.858 170.709 19
Zeyu HU, Xuyang Bai, Jiaxiang Shang, Runze Zhang, Jiayu Dong, Xin Wang, Guangyuan Sun, Hongbo Fu, Chiew-Lan Tai: VMNet: Voxel-Mesh Network for Geodesic-Aware 3D Semantic Segmentation. ICCV 2021 (Oral)
Retro-FPN0.744 140.842 190.800 190.767 420.740 140.836 230.541 110.914 10.672 110.626 200.958 110.552 180.272 370.777 40.886 110.696 330.801 110.674 150.941 70.858 170.717 16
EQ-Net0.743 150.620 810.799 200.849 60.730 170.822 360.493 310.897 40.664 120.681 40.955 190.562 140.378 10.760 120.903 40.738 170.801 110.673 160.907 270.877 60.745 4
Zetong Yang*, Li Jiang*, Yanan Sun, Bernt Schiele, Jiaya Jia: A Unified Query-based Paradigm for Point Cloud Understanding. CVPR 2022
SAT0.742 160.860 140.765 360.819 170.769 40.848 130.533 130.829 210.663 130.631 190.955 190.586 80.274 350.753 150.896 60.729 180.760 380.666 180.921 190.855 210.733 8
LRPNet0.742 160.816 270.806 160.807 250.752 90.828 300.575 30.839 190.699 30.637 180.954 240.520 280.320 140.755 140.834 280.760 110.772 280.676 130.915 230.862 140.717 16
TXC0.740 180.842 190.832 40.805 290.715 220.846 140.473 350.885 70.615 270.671 70.971 60.547 190.320 140.697 230.799 380.777 60.819 30.682 110.946 40.871 90.696 24
LargeKernel3D0.739 190.909 40.820 80.806 270.740 140.852 100.545 100.826 220.594 410.643 130.955 190.541 210.263 440.723 210.858 190.775 70.767 320.678 120.933 140.848 250.694 25
MinkowskiNetpermissive0.736 200.859 150.818 110.832 140.709 230.840 180.521 190.853 130.660 150.643 130.951 320.544 200.286 290.731 200.893 70.675 400.772 280.683 100.874 520.852 230.727 11
C. Choy, J. Gwak, S. Savarese: 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks. CVPR 2019
IPCA0.731 210.890 80.837 30.864 20.726 190.873 20.530 160.824 240.489 740.647 100.978 20.609 20.336 70.624 380.733 470.758 120.776 260.570 540.949 20.877 60.728 9
SparseConvNet0.725 220.647 770.821 70.846 80.721 200.869 30.533 130.754 440.603 370.614 240.955 190.572 110.325 120.710 220.870 130.724 210.823 20.628 300.934 130.865 130.683 28
PointTransformer++0.725 220.727 610.811 140.819 170.765 50.841 170.502 230.814 300.621 260.623 210.955 190.556 160.284 300.620 390.866 140.781 50.757 410.648 210.932 160.862 140.709 19
MatchingNet0.724 240.812 290.812 130.810 230.735 160.834 250.495 300.860 120.572 490.602 310.954 240.512 300.280 320.757 130.845 260.725 200.780 240.606 400.937 100.851 240.700 22
INS-Conv-semantic0.717 250.751 500.759 390.812 210.704 250.868 40.537 120.842 170.609 330.608 270.953 270.534 220.293 250.616 400.864 150.719 250.793 190.640 250.933 140.845 300.663 33
PointMetaBase0.714 260.835 210.785 270.821 150.684 300.846 140.531 150.865 110.614 280.596 350.953 270.500 330.246 510.674 240.888 90.692 340.764 340.624 310.849 670.844 310.675 30
contrastBoundarypermissive0.705 270.769 440.775 320.809 240.687 290.820 390.439 590.812 310.661 140.591 370.945 500.515 290.171 770.633 350.856 200.720 230.796 150.668 170.889 410.847 270.689 26
Liyao Tang, Yibing Zhan, Zhe Chen, Baosheng Yu, Dacheng Tao: Contrastive Boundary Learning for Point Cloud Segmentation. CVPR 2022
RFCR0.702 280.889 90.745 480.813 200.672 320.818 430.493 310.815 280.623 240.610 250.947 440.470 430.249 500.594 430.848 250.705 300.779 250.646 220.892 390.823 370.611 47
Jingyu Gong, Jiachen Xu, Xin Tan, Haichuan Song, Yanyun Qu, Yuan Xie, Lizhuang Ma: Omni-Supervised Point Cloud Segmentation via Gradual Receptive Field Component Reasoning. CVPR 2021
One Thing One Click0.701 290.825 250.796 210.723 490.716 210.832 260.433 610.816 260.634 220.609 260.969 70.418 680.344 50.559 550.833 290.715 270.808 90.560 580.902 310.847 270.680 29
JSENetpermissive0.699 300.881 110.762 370.821 150.667 330.800 560.522 180.792 360.613 290.607 280.935 700.492 350.205 640.576 480.853 220.691 350.758 400.652 200.872 550.828 340.649 37
Zeyu HU, Mingmin Zhen, Xuyang BAI, Hongbo Fu, Chiew-lan Tai: JSENet: Joint Semantic Segmentation and Edge Detection Network for 3D Point Clouds. ECCV 2020
PicassoNet-IIpermissive0.696 310.704 660.790 250.787 340.709 230.837 200.459 430.815 280.543 580.615 230.956 150.529 240.250 480.551 600.790 390.703 310.799 130.619 350.908 260.848 250.700 22
Huan Lei, Naveed Akhtar, Mubarak Shah, and Ajmal Mian: Geometric feature learning for 3D meshes.
One-Thing-One-Click0.693 320.743 530.794 230.655 730.684 300.822 360.497 290.719 540.622 250.617 220.977 40.447 550.339 60.750 170.664 620.703 310.790 210.596 440.946 40.855 210.647 38
Zhengzhe Liu, Xiaojuan Qi, Chi-Wing Fu: One Thing One Click: A Self-Training Approach for Weakly Supervised 3D Semantic Segmentation. CVPR 2021
Feature_GeometricNetpermissive0.690 330.884 100.754 430.795 330.647 390.818 430.422 630.802 340.612 300.604 290.945 500.462 460.189 720.563 540.853 220.726 190.765 330.632 280.904 290.821 400.606 51
Kangcheng Liu, Ben M. Chen: https://arxiv.org/abs/2012.09439. arXiv Preprint
FusionNet0.688 340.704 660.741 520.754 460.656 350.829 280.501 240.741 490.609 330.548 440.950 360.522 270.371 20.633 350.756 420.715 270.771 300.623 320.861 630.814 420.658 34
Feihu Zhang, Jin Fang, Benjamin Wah, Philip Torr: Deep FusionNet for Point Cloud Semantic Segmentation. ECCV 2020
Feature-Geometry Netpermissive0.685 350.866 130.748 450.819 170.645 410.794 590.450 480.802 340.587 430.604 290.945 500.464 450.201 670.554 570.840 270.723 220.732 500.602 420.907 270.822 390.603 54
VACNN++0.684 360.728 600.757 420.776 390.690 270.804 530.464 410.816 260.577 480.587 380.945 500.508 320.276 340.671 250.710 520.663 450.750 440.589 490.881 460.832 330.653 36
KP-FCNN0.684 360.847 180.758 410.784 360.647 390.814 460.473 350.772 390.605 350.594 360.935 700.450 530.181 750.587 440.805 350.690 360.785 230.614 360.882 450.819 410.632 43
H. Thomas, C. Qi, J. Deschaud, B. Marcotegui, F. Goulette, L. Guibas: KPConv: Flexible and Deformable Convolution for Point Clouds. ICCV 2019
DGNet0.684 360.712 650.784 280.782 380.658 340.835 240.499 280.823 250.641 190.597 340.950 360.487 360.281 310.575 490.619 650.647 530.764 340.620 340.871 580.846 290.688 27
Superpoint Network0.683 390.851 170.728 560.800 320.653 370.806 510.468 380.804 320.572 490.602 310.946 470.453 520.239 540.519 660.822 300.689 380.762 370.595 460.895 370.827 350.630 44
PointContrast_LA_SEM0.683 390.757 480.784 280.786 350.639 430.824 340.408 660.775 380.604 360.541 460.934 740.532 230.269 400.552 580.777 400.645 560.793 190.640 250.913 240.824 360.671 31
VI-PointConv0.676 410.770 430.754 430.783 370.621 470.814 460.552 70.758 420.571 510.557 420.954 240.529 240.268 420.530 640.682 570.675 400.719 530.603 410.888 420.833 320.665 32
Xingyi Li, Wenxuan Wu, Xiaoli Z. Fern, Li Fuxin: The Devils in the Point Clouds: Studying the Robustness of Point Cloud Convolutions.
ROSMRF3D0.673 420.789 330.748 450.763 440.635 450.814 460.407 680.747 460.581 470.573 390.950 360.484 370.271 390.607 410.754 430.649 500.774 270.596 440.883 440.823 370.606 51
SALANet0.670 430.816 270.770 340.768 410.652 380.807 500.451 450.747 460.659 160.545 450.924 800.473 420.149 870.571 510.811 340.635 590.746 450.623 320.892 390.794 540.570 64
PointConvpermissive0.666 440.781 360.759 390.699 580.644 420.822 360.475 340.779 370.564 540.504 620.953 270.428 620.203 660.586 460.754 430.661 460.753 420.588 500.902 310.813 440.642 39
Wenxuan Wu, Zhongang Qi, Li Fuxin: PointConv: Deep Convolutional Networks on 3D Point Clouds. CVPR 2019
PointASNLpermissive0.666 440.703 680.781 300.751 480.655 360.830 270.471 370.769 400.474 770.537 480.951 320.475 410.279 330.635 330.698 560.675 400.751 430.553 630.816 740.806 460.703 21
Xu Yan, Chaoda Zheng, Zhen Li, Sheng Wang, Shuguang Cui: PointASNL: Robust Point Clouds Processing using Nonlocal Neural Networks with Adaptive Sampling. CVPR 2020
PPCNN++permissive0.663 460.746 510.708 600.722 500.638 440.820 390.451 450.566 810.599 390.541 460.950 360.510 310.313 170.648 300.819 320.616 640.682 690.590 480.869 590.810 450.656 35
Pyunghwan Ahn, Juyoung Yang, Eojindl Yi, Chanho Lee, Junmo Kim: Projection-based Point Convolution for Efficient Point Cloud Segmentation. IEEE Access
DCM-Net0.658 470.778 370.702 630.806 270.619 480.813 490.468 380.693 620.494 690.524 540.941 610.449 540.298 230.510 680.821 310.675 400.727 520.568 560.826 720.803 480.637 41
Jonas Schult*, Francis Engelmann*, Theodora Kontogianni, Bastian Leibe: DualConvMesh-Net: Joint Geodesic and Euclidean Convolutions on 3D Meshes. CVPR 2020 [Oral]
HPGCNN0.656 480.698 690.743 500.650 740.564 650.820 390.505 220.758 420.631 230.479 670.945 500.480 390.226 550.572 500.774 410.690 360.735 480.614 360.853 660.776 690.597 57
Jisheng Dang, Qingyong Hu, Yulan Guo, Jun Yang: HPGCNN.
SAFNet-segpermissive0.654 490.752 490.734 540.664 710.583 600.815 450.399 700.754 440.639 200.535 500.942 590.470 430.309 190.665 260.539 710.650 490.708 580.635 270.857 650.793 560.642 39
Linqing Zhao, Jiwen Lu, Jie Zhou: Similarity-Aware Fusion Network for 3D Semantic Segmentation. IROS 2021
RandLA-Netpermissive0.645 500.778 370.731 550.699 580.577 610.829 280.446 500.736 500.477 760.523 560.945 500.454 500.269 400.484 750.749 460.618 620.738 460.599 430.827 710.792 590.621 46
MVPNetpermissive0.641 510.831 220.715 580.671 680.590 560.781 650.394 720.679 640.642 180.553 430.937 670.462 460.256 460.649 290.406 840.626 600.691 660.666 180.877 480.792 590.608 50
Maximilian Jaritz, Jiayuan Gu, Hao Su: Multi-view PointNet for 3D Scene Understanding. GMDL Workshop, ICCV 2019
PointConv-SFPN0.641 510.776 390.703 620.721 510.557 680.826 310.451 450.672 660.563 550.483 660.943 580.425 650.162 820.644 310.726 480.659 470.709 570.572 530.875 500.786 640.559 69
PointMRNet0.640 530.717 640.701 640.692 610.576 620.801 550.467 400.716 550.563 550.459 720.953 270.429 610.169 790.581 470.854 210.605 650.710 550.550 640.894 380.793 560.575 62
FPConvpermissive0.639 540.785 340.760 380.713 560.603 510.798 570.392 730.534 860.603 370.524 540.948 420.457 480.250 480.538 620.723 500.598 690.696 640.614 360.872 550.799 490.567 66
Yiqun Lin, Zizheng Yan, Haibin Huang, Dong Du, Ligang Liu, Shuguang Cui, Xiaoguang Han: FPConv: Learning Local Flattening for Point Convolution. CVPR 2020
PD-Net0.638 550.797 310.769 350.641 790.590 560.820 390.461 420.537 850.637 210.536 490.947 440.388 760.206 630.656 270.668 600.647 530.732 500.585 510.868 600.793 560.473 88
PointSPNet0.637 560.734 570.692 710.714 550.576 620.797 580.446 500.743 480.598 400.437 770.942 590.403 720.150 860.626 370.800 370.649 500.697 630.557 610.846 680.777 680.563 67
SConv0.636 570.830 230.697 670.752 470.572 640.780 670.445 520.716 550.529 610.530 510.951 320.446 560.170 780.507 700.666 610.636 580.682 690.541 690.886 430.799 490.594 58
Supervoxel-CNN0.635 580.656 750.711 590.719 520.613 490.757 760.444 550.765 410.534 600.566 400.928 780.478 400.272 370.636 320.531 730.664 440.645 790.508 770.864 620.792 590.611 47
joint point-basedpermissive0.634 590.614 820.778 310.667 700.633 460.825 320.420 640.804 320.467 790.561 410.951 320.494 340.291 260.566 520.458 790.579 760.764 340.559 600.838 690.814 420.598 56
Hung-Yueh Chiang, Yen-Liang Lin, Yueh-Cheng Liu, Winston H. Hsu: A Unified Point-Based Framework for 3D Segmentation. 3DV 2019
PointMTL0.632 600.731 580.688 740.675 650.591 550.784 640.444 550.565 820.610 310.492 640.949 400.456 490.254 470.587 440.706 530.599 680.665 750.612 390.868 600.791 630.579 61
3DSM_DMMF0.631 610.626 790.745 480.801 310.607 500.751 770.506 210.729 530.565 530.491 650.866 940.434 570.197 700.595 420.630 640.709 290.705 600.560 580.875 500.740 790.491 83
PointNet2-SFPN0.631 610.771 410.692 710.672 660.524 730.837 200.440 580.706 600.538 590.446 740.944 560.421 670.219 580.552 580.751 450.591 720.737 470.543 680.901 330.768 710.557 70
APCF-Net0.631 610.742 540.687 760.672 660.557 680.792 620.408 660.665 670.545 570.508 590.952 310.428 620.186 730.634 340.702 540.620 610.706 590.555 620.873 530.798 510.581 60
Haojia Lin: Adaptive Pyramid Context Fusion for Point Cloud Perception. GRSL
FusionAwareConv0.630 640.604 840.741 520.766 430.590 560.747 780.501 240.734 510.503 680.527 520.919 840.454 500.323 130.550 610.420 830.678 390.688 670.544 660.896 360.795 530.627 45
Jiazhao Zhang, Chenyang Zhu, Lintao Zheng, Kai Xu: Fusion-Aware Point Convolution for Online Semantic 3D Scene Segmentation. CVPR 2020
DenSeR0.628 650.800 300.625 860.719 520.545 710.806 510.445 520.597 760.448 830.519 570.938 660.481 380.328 110.489 740.499 780.657 480.759 390.592 470.881 460.797 520.634 42
SegGroup_sempermissive0.627 660.818 260.747 470.701 570.602 520.764 730.385 770.629 730.490 720.508 590.931 770.409 700.201 670.564 530.725 490.618 620.692 650.539 700.873 530.794 540.548 73
An Tao, Yueqi Duan, Yi Wei, Jiwen Lu, Jie Zhou: SegGroup: Seg-Level Supervision for 3D Instance and Semantic Segmentation. TIP 2022
SIConv0.625 670.830 230.694 690.757 450.563 660.772 710.448 490.647 700.520 630.509 580.949 400.431 600.191 710.496 720.614 660.647 530.672 730.535 720.876 490.783 650.571 63
HPEIN0.618 680.729 590.668 770.647 760.597 540.766 720.414 650.680 630.520 630.525 530.946 470.432 580.215 600.493 730.599 670.638 570.617 840.570 540.897 350.806 460.605 53
Li Jiang, Hengshuang Zhao, Shu Liu, Xiaoyong Shen, Chi-Wing Fu, Jiaya Jia: Hierarchical Point-Edge Interaction Network for Point Cloud Semantic Segmentation. ICCV 2019
SPH3D-GCNpermissive0.610 690.858 160.772 330.489 910.532 720.792 620.404 690.643 720.570 520.507 610.935 700.414 690.046 960.510 680.702 540.602 670.705 600.549 650.859 640.773 700.534 76
Huan Lei, Naveed Akhtar, and Ajmal Mian: Spherical Kernel for Efficient Graph Convolution on 3D Point Clouds. TPAMI 2020
AttAN0.609 700.760 460.667 780.649 750.521 740.793 600.457 440.648 690.528 620.434 790.947 440.401 730.153 850.454 770.721 510.648 520.717 540.536 710.904 290.765 720.485 84
Gege Zhang, Qinghua Ma, Licheng Jiao, Fang Liu and Qigong Sun: AttAN: Attention Adversarial Networks for 3D Point Cloud Semantic Segmentation. IJCAI 2020
wsss-transformer0.600 710.634 780.743 500.697 600.601 530.781 650.437 600.585 790.493 700.446 740.933 750.394 740.011 980.654 280.661 630.603 660.733 490.526 730.832 700.761 740.480 85
dtc_net0.596 720.683 700.725 570.715 540.549 700.803 540.444 550.647 700.493 700.495 630.941 610.409 700.000 1000.424 820.544 700.598 690.703 620.522 740.912 250.792 590.520 79
LAP-D0.594 730.720 620.692 710.637 800.456 830.773 700.391 750.730 520.587 430.445 760.940 640.381 770.288 270.434 800.453 810.591 720.649 770.581 520.777 780.749 780.610 49
DPC0.592 740.720 620.700 650.602 840.480 790.762 750.380 780.713 580.585 460.437 770.940 640.369 790.288 270.434 800.509 770.590 740.639 820.567 570.772 790.755 760.592 59
Francis Engelmann, Theodora Kontogianni, Bastian Leibe: Dilated Point Convolutions: On the Receptive Field Size of Point Convolutions on 3D Point Clouds. ICRA 2020
CCRFNet0.589 750.766 450.659 810.683 630.470 820.740 800.387 760.620 750.490 720.476 680.922 820.355 820.245 520.511 670.511 760.571 770.643 800.493 810.872 550.762 730.600 55
ROSMRF0.580 760.772 400.707 610.681 640.563 660.764 730.362 800.515 870.465 800.465 710.936 690.427 640.207 620.438 780.577 680.536 800.675 720.486 820.723 850.779 660.524 78
SD-DETR0.576 770.746 510.609 900.445 950.517 750.643 910.366 790.714 570.456 810.468 700.870 930.432 580.264 430.558 560.674 580.586 750.688 670.482 830.739 830.733 810.537 75
SQN_0.1%0.569 780.676 720.696 680.657 720.497 760.779 680.424 620.548 830.515 650.376 840.902 910.422 660.357 40.379 850.456 800.596 710.659 760.544 660.685 880.665 920.556 71
TextureNetpermissive0.566 790.672 740.664 790.671 680.494 770.719 810.445 520.678 650.411 890.396 820.935 700.356 810.225 560.412 830.535 720.565 780.636 830.464 850.794 770.680 890.568 65
Jingwei Huang, Haotian Zhang, Li Yi, Thomas Funkhouser, Matthias Niessner, Leonidas Guibas: TextureNet: Consistent Local Parametrizations for Learning from High-Resolution Signals on Meshes. CVPR
DVVNet0.562 800.648 760.700 650.770 400.586 590.687 850.333 840.650 680.514 660.475 690.906 880.359 800.223 570.340 870.442 820.422 910.668 740.501 780.708 860.779 660.534 76
Pointnet++ & Featurepermissive0.557 810.735 560.661 800.686 620.491 780.744 790.392 730.539 840.451 820.375 850.946 470.376 780.205 640.403 840.356 870.553 790.643 800.497 790.824 730.756 750.515 80
GMLPs0.538 820.495 920.693 700.647 760.471 810.793 600.300 870.477 880.505 670.358 860.903 900.327 850.081 930.472 760.529 740.448 890.710 550.509 750.746 810.737 800.554 72
PanopticFusion-label0.529 830.491 930.688 740.604 830.386 880.632 920.225 970.705 610.434 860.293 920.815 950.348 830.241 530.499 710.669 590.507 820.649 770.442 910.796 760.602 950.561 68
Gaku Narita, Takashi Seno, Tomoya Ishikawa, Yohsuke Kaji: PanopticFusion: Online Volumetric Semantic Mapping at the Level of Stuff and Things. IROS 2019 (to appear)
subcloud_weak0.516 840.676 720.591 930.609 810.442 840.774 690.335 830.597 760.422 880.357 870.932 760.341 840.094 920.298 890.528 750.473 870.676 710.495 800.602 940.721 840.349 95
Online SegFusion0.515 850.607 830.644 840.579 860.434 850.630 930.353 810.628 740.440 840.410 800.762 980.307 870.167 800.520 650.403 850.516 810.565 870.447 890.678 890.701 860.514 81
Davide Menini, Suryansh Kumar, Martin R. Oswald, Erik Sandstroem, Cristian Sminchisescu, Luc van Gool: A Real-Time Learning Framework for Joint 3D Reconstruction and Semantic Segmentation. Robotics and Automation Letters Submission
3DMV, FTSDF0.501 860.558 880.608 910.424 970.478 800.690 840.246 930.586 780.468 780.450 730.911 860.394 740.160 830.438 780.212 940.432 900.541 920.475 840.742 820.727 820.477 86
PCNN0.498 870.559 870.644 840.560 880.420 870.711 830.229 950.414 890.436 850.352 880.941 610.324 860.155 840.238 940.387 860.493 830.529 930.509 750.813 750.751 770.504 82
3DMV0.484 880.484 940.538 950.643 780.424 860.606 960.310 850.574 800.433 870.378 830.796 960.301 880.214 610.537 630.208 950.472 880.507 960.413 940.693 870.602 950.539 74
Angela Dai, Matthias Niessner: 3DMV: Joint 3D-Multi-View Prediction for 3D Semantic Scene Segmentation. ECCV'18
PointCNN with RGBpermissive0.458 890.577 860.611 890.356 990.321 960.715 820.299 890.376 930.328 960.319 900.944 560.285 900.164 810.216 970.229 920.484 850.545 910.456 870.755 800.709 850.475 87
Yangyan Li, Rui Bu, Mingchao Sun, Baoquan Chen: PointCNN. NeurIPS 2018
FCPNpermissive0.447 900.679 710.604 920.578 870.380 890.682 860.291 900.106 990.483 750.258 970.920 830.258 940.025 970.231 960.325 880.480 860.560 890.463 860.725 840.666 910.231 99
Dario Rethage, Johanna Wald, Jürgen Sturm, Nassir Navab, Federico Tombari: Fully-Convolutional Point Networks for Large-Scale Point Clouds. ECCV 2018
DGCNN_reproducecopyleft0.446 910.474 950.623 870.463 930.366 910.651 890.310 850.389 920.349 940.330 890.937 670.271 920.126 890.285 900.224 930.350 960.577 860.445 900.625 920.723 830.394 91
Yue Wang, Yongbin Sun, Ziwei Liu, Sanjay E. Sarma, Michael M. Bronstein, Justin M. Solomon: Dynamic Graph CNN for Learning on Point Clouds. TOG 2019
SurfaceConvPF0.442 920.505 910.622 880.380 980.342 940.654 880.227 960.397 910.367 920.276 940.924 800.240 950.198 690.359 860.262 900.366 930.581 850.435 920.640 910.668 900.398 90
Hao Pan, Shilin Liu, Yang Liu, Xin Tong: Convolutional Neural Networks on 3D Surfaces Using Parallel Frames.
PNET20.442 920.548 890.548 940.597 850.363 920.628 940.300 870.292 940.374 910.307 910.881 920.268 930.186 730.238 940.204 960.407 920.506 970.449 880.667 900.620 940.462 89
Tangent Convolutionspermissive0.438 940.437 970.646 830.474 920.369 900.645 900.353 810.258 960.282 980.279 930.918 850.298 890.147 880.283 910.294 890.487 840.562 880.427 930.619 930.633 930.352 94
Maxim Tatarchenko, Jaesik Park, Vladlen Koltun, Qian-Yi Zhou: Tangent Convolutions for Dense Prediction in 3D. CVPR 2018
3DWSSS0.425 950.525 900.647 820.522 890.324 950.488 990.077 1000.712 590.353 930.401 810.636 1000.281 910.176 760.340 870.565 690.175 1000.551 900.398 950.370 1000.602 950.361 93
SPLAT Netcopyleft0.393 960.472 960.511 960.606 820.311 970.656 870.245 940.405 900.328 960.197 980.927 790.227 970.000 1000.001 1010.249 910.271 990.510 940.383 970.593 950.699 870.267 97
Hang Su, Varun Jampani, Deqing Sun, Subhransu Maji, Evangelos Kalogerakis, Ming-Hsuan Yang, Jan Kautz: SPLATNet: Sparse Lattice Networks for Point Cloud Processing. CVPR 2018
ScanNet+FTSDF0.383 970.297 990.491 970.432 960.358 930.612 950.274 910.116 980.411 890.265 950.904 890.229 960.079 940.250 920.185 970.320 970.510 940.385 960.548 960.597 980.394 91
PointNet++permissive0.339 980.584 850.478 980.458 940.256 990.360 1000.250 920.247 970.278 990.261 960.677 990.183 980.117 900.212 980.145 990.364 940.346 1000.232 1000.548 960.523 990.252 98
Charles R. Qi, Li Yi, Hao Su, Leonidas J. Guibas: PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space.
SSC-UNetpermissive0.308 990.353 980.290 1000.278 1000.166 1000.553 970.169 990.286 950.147 1000.148 1000.908 870.182 990.064 950.023 1000.018 1010.354 950.363 980.345 980.546 980.685 880.278 96
ScanNetpermissive0.306 1000.203 1000.366 990.501 900.311 970.524 980.211 980.002 1010.342 950.189 990.786 970.145 1000.102 910.245 930.152 980.318 980.348 990.300 990.460 990.437 1000.182 100
Angela Dai, Angel X. Chang, Manolis Savva, Maciej Halber, Thomas Funkhouser, Matthias Nießner: ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes. CVPR'17
ERROR0.054 1010.000 1010.041 1010.172 1010.030 1010.062 1010.001 1010.035 1000.004 1010.051 1010.143 1010.019 1010.003 990.041 990.050 1000.003 1010.054 1010.018 1010.005 1010.264 1010.082 101
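
The scores in the table above are intersection-over-union (IoU) values, and the avg iou column is the mean of the per-class columns. The following is a minimal sketch of how such scores can be computed, assuming the usual per-class definition IoU = TP / (TP + FP + FN) over labelled points. The function name `per_class_iou` and the toy labels are illustrative only and are not taken from the benchmark's evaluation code.

```python
from collections import Counter

def per_class_iou(gt_labels, pred_labels, classes):
    """Return {class: IoU} with IoU = TP / (TP + FP + FN) per class."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for gt, pred in zip(gt_labels, pred_labels):
        if gt == pred:
            tp[gt] += 1
        else:
            fn[gt] += 1    # the true class was missed
            fp[pred] += 1  # the predicted class was wrongly assigned
    ious = {}
    for c in classes:
        denom = tp[c] + fp[c] + fn[c]
        if denom:  # skip classes that occur in neither labelling
            ious[c] = tp[c] / denom
    return ious

# Toy example (hypothetical labels for six points).
gt   = ["wall", "wall", "floor", "chair", "chair", "chair"]
pred = ["wall", "floor", "floor", "chair", "chair", "wall"]
ious = per_class_iou(gt, pred, ["wall", "floor", "chair"])
avg_iou = sum(ious.values()) / len(ious)
print(ious, avg_iou)
```

In this toy case the per-class values are 1/3 (wall), 1/2 (floor) and 2/3 (chair), so the average is 0.5.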


This table lists the benchmark results for the 3D semantic instance scenario.




Method | Info | avg ap 25% | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | window
TD3D0.875 11.000 10.976 120.877 90.783 140.970 10.889 10.828 120.945 30.803 60.713 100.720 100.709 81.000 10.936 60.934 30.873 71.000 10.791 5
Queryformer0.874 21.000 10.978 100.809 240.876 10.936 50.702 80.716 260.920 50.875 30.766 40.772 20.818 41.000 10.995 10.916 40.892 11.000 10.767 8
SoftGroup++0.874 21.000 10.972 130.947 10.839 50.898 120.556 230.913 20.881 110.756 80.828 20.748 60.821 21.000 10.937 50.937 10.887 21.000 10.821 2
Mask3D0.870 41.000 10.985 60.782 320.818 80.938 40.760 40.749 230.923 40.877 20.760 50.785 10.820 31.000 10.912 90.864 230.878 50.983 390.825 1
Jonas Schult, Francis Engelmann, Alexander Hermans, Or Litany, Siyu Tang, Bastian Leibe: Mask3D for 3D Semantic Instance Segmentation. ICRA 2023
SoftGrouppermissive0.865 51.000 10.969 140.860 120.860 20.913 80.558 210.899 30.911 60.760 70.828 10.736 70.802 60.981 290.919 80.875 140.877 61.000 10.820 3
Thang Vu, Kookhoi Kim, Tung M. Luu, Xuan Thanh Nguyen, Chang D. Yoo: SoftGroup for 3D Instance Segmentation on Point Clouds. CVPR 2022 [Oral]
IPCA-Inst0.851 61.000 10.968 150.884 80.842 40.862 240.693 100.812 170.888 90.677 200.783 30.698 110.807 51.000 10.911 130.865 220.865 91.000 10.757 10
SPFormerpermissive0.851 61.000 10.994 20.806 250.774 160.942 30.637 130.849 100.859 140.889 10.720 80.730 80.665 131.000 10.911 130.868 210.873 81.000 10.796 4
Sun Jiahao, Qing Chunmei, Tan Junpeng, Xu Xiangmin: Superpoint Transformer for 3D Scene Instance Segmentation. AAAI 2023 [Oral]
ISBNetpermissive0.845 81.000 10.976 110.798 260.794 110.916 60.757 50.667 330.882 100.842 40.715 90.757 40.832 11.000 10.905 160.803 430.843 121.000 10.715 17
Tuan Duc Ngo, Binh-Son Hua, Khoi Nguyen: ISBNet: a 3D Point Cloud Instance Segmentation Network with Instance-aware Sampling and Box-aware Dynamic Convolution. CVPR 2023
SphereSeg0.835 91.000 10.963 180.891 60.794 100.954 20.822 30.710 270.961 20.721 120.693 160.530 310.653 151.000 10.867 230.857 260.859 100.991 360.771 7
TopoSeg0.832 101.000 10.981 80.933 20.819 70.826 320.524 290.841 110.811 190.681 190.759 60.687 120.727 70.981 290.911 130.883 100.853 111.000 10.756 11
GraphCut0.832 101.000 10.922 320.724 410.798 90.902 110.701 90.856 80.859 130.715 130.706 110.748 50.640 261.000 10.934 70.862 240.880 31.000 10.729 13
PBNetpermissive0.825 121.000 10.963 170.837 160.843 30.865 190.822 20.647 350.878 120.733 100.639 240.683 130.650 161.000 10.853 240.870 180.820 131.000 10.744 12
SSEC0.820 131.000 10.983 70.924 30.826 60.817 350.415 380.899 40.793 230.673 210.731 70.636 180.653 141.000 10.939 40.804 410.878 41.000 10.780 6
DKNet0.815 141.000 10.930 240.844 140.765 200.915 70.534 270.805 190.805 210.807 50.654 180.763 30.650 161.000 10.794 360.881 110.766 171.000 10.758 9
Yizheng Wu, Min Shi, Shuaiyuan Du, Hao Lu, Zhiguo Cao, Weicai Zhong: 3D Instances as 1D Kernels. ECCV 2022
RPGN0.806 151.000 10.992 40.789 280.723 320.891 130.650 120.810 180.832 160.665 230.699 140.658 140.700 91.000 10.881 180.832 330.774 150.997 290.613 34
Shichao Dong, Guosheng Lin, Tzu-Yi Hung: Learning Regional Purity for Instance Segmentation on 3D Point Clouds. ECCV 2022
HAISpermissive0.803 161.000 10.994 20.820 200.759 210.855 250.554 240.882 50.827 180.615 290.676 170.638 170.646 241.000 10.912 90.797 450.767 160.994 340.726 14
Shaoyu Chen, Jiemin Fang, Qian Zhang, Wenyu Liu, Xinggang Wang: Hierarchical Aggregation for 3D Instance Segmentation. ICCV 2021
Box2Mask0.803 161.000 10.962 190.874 100.707 350.887 160.686 110.598 390.961 10.715 140.694 150.469 360.700 91.000 10.912 90.902 50.753 220.997 290.637 28
Julian Chibane, Francis Engelmann, Tuan Anh Tran, Gerard Pons-Moll: Box2Mask: Weakly Supervised 3D Semantic Instance Segmentation Using Bounding Boxes. ECCV 2022
Mask-Group0.792 181.000 10.968 160.812 210.766 190.864 200.460 320.815 160.888 80.598 320.651 210.639 160.600 300.918 330.941 20.896 60.721 291.000 10.723 15
Min Zhong, Xinghao Chen, Xiaokang Chen, Gang Zeng, Yunhe Wang: MaskGroup: Hierarchical Point Grouping and Masking for 3D Instance Segmentation. ICME 2022
CSC-Pretrained0.791 191.000 10.996 10.829 190.767 180.889 150.600 160.819 150.770 280.594 330.620 270.541 280.700 91.000 10.941 20.889 80.763 181.000 10.526 43
SSTNetpermissive0.789 201.000 10.840 460.888 70.717 330.835 280.717 70.684 320.627 420.724 110.652 200.727 90.600 301.000 10.912 90.822 360.757 211.000 10.691 22
Zhihao Liang, Zhihao Li, Songcen Xu, Mingkui Tan, Kui Jia: Instance Segmentation in 3D Scenes using Semantic Superpoint Tree Networks. ICCV 2021
GICN0.788 211.000 10.978 90.867 110.781 150.833 290.527 280.824 130.806 200.549 410.596 300.551 240.700 91.000 10.853 240.935 20.733 261.000 10.651 25
DENet0.786 221.000 10.929 250.736 390.750 270.720 470.755 60.934 10.794 220.590 340.561 360.537 290.650 161.000 10.882 170.804 420.789 141.000 10.719 16
DualGroup0.782 231.000 10.927 260.811 220.772 170.853 260.631 150.805 190.773 250.613 300.611 280.610 200.650 160.835 440.881 180.879 130.750 241.000 10.675 23
PointGroup0.778 241.000 10.900 360.798 270.715 340.863 210.493 300.706 280.895 70.569 390.701 120.576 220.639 271.000 10.880 200.851 280.719 300.997 290.709 19
Li Jiang, Hengshuang Zhao, Shaoshuai Shi, Shu Liu, Chi-Wing Fu, Jiaya Jia: PointGroup: Dual-Set Point Grouping for 3D Instance Segmentation. CVPR 2020 [oral]
PE0.776 251.000 10.900 370.860 120.728 310.869 170.400 390.857 70.774 240.568 400.701 130.602 210.646 240.933 320.843 270.890 70.691 370.997 290.709 18
Biao Zhang, Peter Wonka: Point Cloud Instance Segmentation using Probabilistic Embeddings. CVPR 2021
AOIA0.767 261.000 10.937 210.810 230.740 290.906 90.550 250.800 210.706 340.577 380.624 260.544 270.596 350.857 360.879 220.880 120.750 230.992 350.658 24
DD-UNet+Group0.764 271.000 10.897 390.837 150.753 240.830 310.459 340.824 130.699 360.629 270.653 190.438 390.650 161.000 10.880 200.858 250.690 381.000 10.650 26
H. Liu, R. Liu, K. Yang, J. Zhang, K. Peng, R. Stiefelhagen: HIDA: Towards Holistic Indoor Understanding for the Visually Impaired via Semantic Instance Segmentation with a Wearable Solid-State LiDAR Sensor. ICCVW 2021
INS-Conv-instance0.762 281.000 10.923 290.765 350.785 130.905 100.600 160.655 340.646 410.683 180.647 220.530 300.650 161.000 10.824 290.830 340.693 360.944 430.644 27
Dyco3Dcopyleft0.761 291.000 10.935 220.893 50.752 260.863 220.600 160.588 400.742 310.641 250.633 250.546 260.550 370.857 360.789 380.853 270.762 190.987 370.699 20
Tong He, Chunhua Shen, Anton van den Hengel: DyCo3D: Robust Instance Segmentation of 3D Point Clouds through Dynamic Convolution. CVPR 2021
OccuSeg+instance0.742 301.000 10.923 290.785 290.745 280.867 180.557 220.578 430.729 320.670 220.644 230.488 340.577 361.000 10.794 360.830 340.620 451.000 10.550 39
Lei Han, Tian Zheng, Lan Xu, Lu Fang: OccuSeg: Occupancy-aware 3D Instance Segmentation. CVPR 2020
RWSeg0.739 311.000 10.899 380.759 370.753 250.823 330.282 430.691 310.658 390.582 370.594 310.547 250.628 281.000 10.795 350.868 200.728 281.000 10.692 21
3D-MPA0.737 321.000 10.933 230.785 290.794 120.831 300.279 450.588 400.695 370.616 280.559 370.556 230.650 161.000 10.809 330.875 150.696 341.000 10.608 36
Francis Engelmann, Martin Bokeloh, Alireza Fathi, Bastian Leibe, Matthias Nießner: 3D-MPA: Multi Proposal Aggregation for 3D Semantic Instance Segmentation. CVPR 2020
MTML0.731 331.000 10.992 40.779 340.609 440.746 420.308 420.867 60.601 450.607 310.539 400.519 320.550 371.000 10.824 290.869 190.729 271.000 10.616 32
Jean Lahoud, Bernard Ghanem, Marc Pollefeys, Martin R. Oswald: 3D Instance Segmentation via Multi-task Metric Learning. ICCV 2019 [oral]
OSIS0.725 341.000 10.885 420.653 470.657 410.801 360.576 200.695 300.828 170.698 160.534 410.457 380.500 440.857 360.831 280.841 310.627 441.000 10.619 31
SSEN0.724 351.000 10.926 270.781 330.661 390.845 270.596 190.529 450.764 300.653 240.489 460.461 370.500 440.859 350.765 390.872 170.761 201.000 10.577 37
Dongsu Zhang, Junha Chun, Sang Kyun Cha, Young Min Kim: Spatial Semantic Embedding Network: Fast 3D Instance Segmentation with Deep Metric Learning. arXiv
NeuralBF0.718 361.000 10.945 200.901 40.754 230.817 340.460 320.700 290.772 260.688 170.568 350.000 560.500 440.981 290.606 470.872 160.740 251.000 10.614 33
Weiwei Sun, Daniel Rebain, Renjie Liao, Vladimir Tankovich, Soroosh Yazdani, Kwang Moo Yi, Andrea Tagliasacchi: NeuralBF: Neural Bilateral Filtering for Top-down Instance Segmentation on Point Clouds. WACV 2023
Sparse R-CNN0.714 371.000 10.926 280.694 420.699 370.890 140.636 140.516 460.693 380.743 90.588 320.369 420.601 290.594 490.800 340.886 90.676 390.986 380.546 40
SALoss-ResNet0.695 381.000 10.855 440.579 510.589 460.735 450.484 310.588 400.856 150.634 260.571 340.298 430.500 441.000 10.824 290.818 370.702 330.935 470.545 41
Zhidong Liang, Ming Yang, Hao Li, Chunxiang Wang: 3D Instance Embedding Learning With a Structure-Aware Loss Function for Point Cloud Segmentation. IEEE Robotics and Automation Letters (IROS2020)
PanopticFusion-inst0.693 391.000 10.852 450.655 460.616 430.788 370.334 410.763 220.771 270.457 510.555 380.652 150.518 410.857 360.765 390.732 510.631 420.944 430.577 38
Gaku Narita, Takashi Seno, Tomoya Ishikawa, Yohsuke Kaji: PanopticFusion: Online Volumetric Semantic Mapping at the Level of Stuff and Things. IROS 2019 (to appear)
Occipital-SCS0.688 401.000 10.913 330.730 400.737 300.743 440.442 350.855 90.655 400.546 420.546 390.263 450.508 430.889 340.568 480.771 480.705 320.889 500.625 30
3D-BoNet0.687 411.000 10.887 410.836 170.587 470.643 540.550 250.620 360.724 330.522 460.501 440.243 460.512 421.000 10.751 410.807 400.661 410.909 490.612 35
Bo Yang, Jianan Wang, Ronald Clark, Qingyong Hu, Sen Wang, Andrew Markham, Niki Trigoni: Learning Object Bounding Boxes for 3D Instance Segmentation on Point Clouds. NeurIPS 2019 Spotlight
PCJC0.684 421.000 10.895 400.757 380.659 400.862 230.189 520.739 240.606 440.712 150.581 330.515 330.650 160.857 360.357 530.785 460.631 430.889 500.635 29
SPG_WSIS0.678 431.000 10.880 430.836 170.701 360.727 460.273 470.607 380.706 350.541 440.515 430.174 480.600 300.857 360.716 420.846 300.711 311.000 10.506 44
One_Thing_One_Clickpermissive0.675 441.000 10.823 470.782 310.621 420.766 390.211 490.736 250.560 480.586 350.522 420.636 190.453 480.641 480.853 240.850 290.694 350.997 290.411 48
Zhengzhe Liu, Xiaojuan Qi, Chi-Wing Fu: One Thing One Click: A Self-Training Approach for Weakly Supervised 3D Semantic Segmentation. CVPR 2021
SegGroup_inspermissive0.637 451.000 10.923 310.593 500.561 480.746 430.143 540.504 470.766 290.485 490.442 470.372 410.530 400.714 450.815 320.775 470.673 401.000 10.431 47
An Tao, Yueqi Duan, Yi Wei, Jiwen Lu, Jie Zhou: SegGroup: Seg-Level Supervision for 3D Instance and Semantic Segmentation. TIP 2022
MASCpermissive0.615 460.711 520.802 480.540 520.757 220.777 380.029 550.577 440.588 470.521 470.600 290.436 400.534 390.697 460.616 460.838 320.526 470.980 400.534 42
Chen Liu, Yasutaka Furukawa: MASC: Multi-scale Affinity with Sparse Convolution for 3D Instance Segmentation.
UNet-backbone0.605 471.000 10.909 340.764 360.603 450.704 480.415 370.301 520.548 490.461 500.394 480.267 440.386 500.857 360.649 450.817 380.504 480.959 410.356 51
3D-SISpermissive0.558 481.000 10.773 490.614 490.503 500.691 500.200 500.412 480.498 520.546 430.311 530.103 520.600 300.857 360.382 500.799 440.445 540.938 460.371 49
Ji Hou, Angela Dai, Matthias Niessner: 3D-SIS: 3D Semantic Instance Segmentation of RGB-D Scans. CVPR 2019
R-PointNet0.544 490.500 550.655 550.661 450.663 380.765 400.432 360.214 540.612 430.584 360.499 450.204 470.286 540.429 520.655 440.650 560.539 460.950 420.499 45
Hier3Dcopyleft0.540 501.000 10.727 500.626 480.467 530.693 490.200 500.412 480.480 530.528 450.318 520.077 550.600 300.688 470.382 500.768 490.472 500.941 450.350 52
Tan: HCFS3D: Hierarchical Coupled Feature Selection Network for 3D Semantic and Instance Segmentation.
Region-18class0.497 510.250 570.902 350.689 430.540 490.747 410.276 460.610 370.268 560.489 480.348 490.000 560.243 560.220 550.663 430.814 390.459 520.928 480.496 46
tmp0.474 521.000 10.727 500.433 550.481 520.673 520.022 570.380 500.517 510.436 530.338 510.128 500.343 520.429 520.291 550.728 520.473 490.833 530.300 54
SemRegionNet-20cls0.470 531.000 10.727 500.447 540.481 510.678 510.024 560.380 500.518 500.440 520.339 500.128 500.350 510.429 520.212 560.711 530.465 510.833 530.290 55
ASIS0.422 540.333 560.707 530.676 440.401 540.650 530.350 400.177 550.594 460.376 540.202 540.077 540.404 490.571 500.197 570.674 550.447 530.500 560.260 56
3D-BEVIS0.401 550.667 530.687 540.419 560.137 570.587 550.188 530.235 530.359 550.211 560.093 570.080 530.311 530.571 500.382 500.754 500.300 560.874 520.357 50
Cathrin Elich, Francis Engelmann, Jonas Schult, Theodora Kontogianni, Bastian Leibe: 3D-BEVIS: Birds-Eye-View Instance Segmentation.
Sgpn_scannet0.390 560.556 540.636 560.493 530.353 550.539 560.271 480.160 560.450 540.359 550.178 550.146 490.250 550.143 560.347 540.698 540.436 550.667 550.331 53
MaskRCNN 2d->3d Proj0.261 570.903 510.081 570.008 570.233 560.175 570.280 440.106 570.150 570.203 570.175 560.480 350.218 570.143 560.542 490.404 570.153 570.393 570.049 57
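
Each data row above concatenates the method name with a sequence of score/rank pairs, so the rank digits run directly into the next score. The following is a minimal sketch of how such a row could be split back into cells, assuming every score has exactly three decimals and that the `permissive` / `copyleft` suffixes fused to some method names are Info-column tags. The helper `parse_row` is hypothetical and not part of any benchmark tooling.

```python
import re

# One cell: a score such as 0.875 or 1.000, a space, then the rank.
# The rank is matched lazily and must be followed by the next score or the end of the row.
CELL = re.compile(r"(\d\.\d{3}) (\d+?)(?=\d\.\d{3}|$)")

# Suffixes that appear fused to some method names (assumed to be Info-column tags).
INFO_TAGS = ("permissive", "copyleft")

def parse_row(row):
    """Split a fused leaderboard row into (name, info, [(score, rank), ...])."""
    row = row.strip()
    first = CELL.search(row)
    if first is None:
        raise ValueError("no score/rank cells found")
    name, info = row[:first.start()], ""
    for tag in INFO_TAGS:
        if name.endswith(tag):
            name, info = name[:-len(tag)], tag
    cells = [(float(s), int(r)) for s, r in CELL.findall(row[first.start():])]
    return name, info, cells

# The TD3D row, copied from the table above.
row = ("TD3D0.875 11.000 10.976 120.877 90.783 140.970 10.889 10.828 120.945 3"
       "0.803 60.713 100.720 100.709 81.000 10.936 60.934 30.873 71.000 10.791 5")
name, info, cells = parse_row(row)
avg_score, avg_rank = cells[0]
class_scores = [score for score, _ in cells[1:]]
mean_of_classes = sum(class_scores) / len(class_scores)
print(name, avg_score, avg_rank, round(mean_of_classes, 3))
```

For this row the mean of the 18 class columns rounds to the listed average of 0.875, which suggests the first column is a plain mean over classes.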


This table lists the benchmark results for the 2D semantic label scenario.


Method | Info | avg iou | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | floor | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | wall | window
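Virtual MVFusion (R) | | 0.745 (1) | 0.861 (1) | 0.839 (1) | 0.881 (1) | 0.672 (1) | 0.512 (1) | 0.422 (15) | 0.898 (1) | 0.723 (1) | 0.714 (1) | 0.954 (2) | 0.454 (1) | 0.509 (1) | 0.773 (1) | 0.895 (1) | 0.756 (1) | 0.820 (1) | 0.653 (1) | 0.935 (1) | 0.891 (1) | 0.728 (1)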
Abhijit Kundu, Xiaoqi Yin, Alireza Fathi, David Ross, Brian Brewington, Thomas Funkhouser, Caroline Pantofaru: Virtual Multi-view Fusion for 3D Semantic Segmentation. ECCV 2020
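BPNet_2D | copyleft | 0.670 (2) | 0.822 (3) | 0.795 (3) | 0.836 (2) | 0.659 (2) | 0.481 (2) | 0.451 (11) | 0.769 (3) | 0.656 (3) | 0.567 (3) | 0.931 (3) | 0.395 (4) | 0.390 (4) | 0.700 (3) | 0.534 (3) | 0.689 (9) | 0.770 (2) | 0.574 (3) | 0.865 (6) | 0.831 (3) | 0.675 (4)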
Wenbo Hu, Hengshuang Zhao, Li Jiang, Jiaya Jia and Tien-Tsin Wong: Bidirectional Projection Network for Cross Dimension Scene Understanding. CVPR 2021 (Oral)
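CU-Hybrid-2D Net | | 0.636 (3) | 0.825 (2) | 0.820 (2) | 0.179 (20) | 0.648 (3) | 0.463 (3) | 0.549 (2) | 0.742 (6) | 0.676 (2) | 0.628 (2) | 0.961 (1) | 0.420 (2) | 0.379 (5) | 0.684 (6) | 0.381 (15) | 0.732 (2) | 0.723 (3) | 0.599 (2) | 0.827 (13) | 0.851 (2) | 0.634 (6)
CMX | | 0.613 (4) | 0.681 (7) | 0.725 (9) | 0.502 (12) | 0.634 (5) | 0.297 (15) | 0.478 (9) | 0.830 (2) | 0.651 (4) | 0.537 (6) | 0.924 (4) | 0.375 (5) | 0.315 (12) | 0.686 (5) | 0.451 (12) | 0.714 (4) | 0.543 (18) | 0.504 (5) | 0.894 (4) | 0.823 (4) | 0.688 (3)
DMMF_3d | | 0.605 (5) | 0.651 (8) | 0.744 (7) | 0.782 (3) | 0.637 (4) | 0.387 (4) | 0.536 (3) | 0.732 (7) | 0.590 (6) | 0.540 (5) | 0.856 (18) | 0.359 (9) | 0.306 (13) | 0.596 (11) | 0.539 (2) | 0.627 (18) | 0.706 (4) | 0.497 (7) | 0.785 (18) | 0.757 (16) | 0.476 (19)
MCA-Net | | 0.595 (6) | 0.533 (17) | 0.756 (6) | 0.746 (4) | 0.590 (8) | 0.334 (7) | 0.506 (6) | 0.670 (12) | 0.587 (7) | 0.500 (10) | 0.905 (8) | 0.366 (8) | 0.352 (8) | 0.601 (10) | 0.506 (6) | 0.669 (15) | 0.648 (7) | 0.501 (6) | 0.839 (12) | 0.769 (12) | 0.516 (18)
RFBNet | | 0.592 (7) | 0.616 (9) | 0.758 (5) | 0.659 (5) | 0.581 (9) | 0.330 (8) | 0.469 (10) | 0.655 (15) | 0.543 (12) | 0.524 (7) | 0.924 (4) | 0.355 (10) | 0.336 (10) | 0.572 (14) | 0.479 (8) | 0.671 (13) | 0.648 (7) | 0.480 (9) | 0.814 (16) | 0.814 (5) | 0.614 (9)
FAN_NV_RVC | | 0.586 (8) | 0.510 (18) | 0.764 (4) | 0.079 (23) | 0.620 (7) | 0.330 (8) | 0.494 (7) | 0.753 (4) | 0.573 (8) | 0.556 (4) | 0.884 (13) | 0.405 (3) | 0.303 (14) | 0.718 (2) | 0.452 (11) | 0.672 (12) | 0.658 (5) | 0.509 (4) | 0.898 (3) | 0.813 (6) | 0.727 (2)
DCRedNet | | 0.583 (9) | 0.682 (6) | 0.723 (10) | 0.542 (11) | 0.510 (17) | 0.310 (12) | 0.451 (11) | 0.668 (13) | 0.549 (11) | 0.520 (8) | 0.920 (6) | 0.375 (5) | 0.446 (2) | 0.528 (17) | 0.417 (13) | 0.670 (14) | 0.577 (15) | 0.478 (10) | 0.862 (7) | 0.806 (7) | 0.628 (8)
MIX6D_RVC | | 0.582 (10) | 0.695 (4) | 0.687 (14) | 0.225 (18) | 0.632 (6) | 0.328 (10) | 0.550 (1) | 0.748 (5) | 0.623 (5) | 0.494 (13) | 0.890 (11) | 0.350 (12) | 0.254 (20) | 0.688 (4) | 0.454 (10) | 0.716 (3) | 0.597 (14) | 0.489 (8) | 0.881 (5) | 0.768 (13) | 0.575 (12)
SSMA | copyleft | 0.577 (11) | 0.695 (4) | 0.716 (12) | 0.439 (14) | 0.563 (11) | 0.314 (11) | 0.444 (13) | 0.719 (8) | 0.551 (10) | 0.503 (9) | 0.887 (12) | 0.346 (13) | 0.348 (9) | 0.603 (9) | 0.353 (17) | 0.709 (5) | 0.600 (12) | 0.457 (12) | 0.901 (2) | 0.786 (8) | 0.599 (11)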
Abhinav Valada, Rohit Mohan, Wolfram Burgard: Self-Supervised Model Adaptation for Multimodal Semantic Segmentation. International Journal of Computer Vision, 2019
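UNIV_CNP_RVC_UE | | 0.566 (12) | 0.569 (16) | 0.686 (16) | 0.435 (15) | 0.524 (14) | 0.294 (16) | 0.421 (16) | 0.712 (9) | 0.543 (12) | 0.463 (15) | 0.872 (14) | 0.320 (14) | 0.363 (7) | 0.611 (8) | 0.477 (9) | 0.686 (10) | 0.627 (9) | 0.443 (15) | 0.862 (7) | 0.775 (11) | 0.639 (5)
EMSAFormer | | 0.564 (13) | 0.581 (13) | 0.736 (8) | 0.564 (10) | 0.546 (13) | 0.219 (20) | 0.517 (4) | 0.675 (11) | 0.486 (17) | 0.427 (19) | 0.904 (9) | 0.352 (11) | 0.320 (11) | 0.589 (12) | 0.528 (4) | 0.708 (6) | 0.464 (21) | 0.413 (19) | 0.847 (11) | 0.786 (8) | 0.611 (10)
SN_RN152pyrx8_RVC | copyleft | 0.546 (14) | 0.572 (14) | 0.663 (18) | 0.638 (7) | 0.518 (15) | 0.298 (14) | 0.366 (21) | 0.633 (18) | 0.510 (15) | 0.446 (17) | 0.864 (16) | 0.296 (17) | 0.267 (17) | 0.542 (16) | 0.346 (18) | 0.704 (7) | 0.575 (16) | 0.431 (16) | 0.853 (10) | 0.766 (14) | 0.630 (7)
UDSSEG_RVC | | 0.545 (15) | 0.610 (11) | 0.661 (19) | 0.588 (8) | 0.556 (12) | 0.268 (18) | 0.482 (8) | 0.642 (17) | 0.572 (9) | 0.475 (14) | 0.836 (20) | 0.312 (15) | 0.367 (6) | 0.630 (7) | 0.189 (20) | 0.639 (17) | 0.495 (20) | 0.452 (13) | 0.826 (14) | 0.756 (17) | 0.541 (14)
segfomer with 6d | | 0.542 (16) | 0.594 (12) | 0.687 (14) | 0.146 (21) | 0.579 (10) | 0.308 (13) | 0.515 (5) | 0.703 (10) | 0.472 (18) | 0.498 (11) | 0.868 (15) | 0.369 (7) | 0.282 (15) | 0.589 (12) | 0.390 (14) | 0.701 (8) | 0.556 (17) | 0.416 (18) | 0.860 (9) | 0.759 (15) | 0.539 (16)
FuseNet | permissive | 0.535 (17) | 0.570 (15) | 0.681 (17) | 0.182 (19) | 0.512 (16) | 0.290 (17) | 0.431 (14) | 0.659 (14) | 0.504 (16) | 0.495 (12) | 0.903 (10) | 0.308 (16) | 0.428 (3) | 0.523 (18) | 0.365 (16) | 0.676 (11) | 0.621 (11) | 0.470 (11) | 0.762 (19) | 0.779 (10) | 0.541 (14)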
Caner Hazirbas, Lingni Ma, Csaba Domokos, Daniel Cremers: FuseNet: Incorporating Depth into Semantic Segmentation via Fusion-based CNN Architecture. ACCV 2016
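| AdapNet++ | copyleft | 0.503 (18) | 0.613 (10) | 0.722 (11) | 0.418 (16) | 0.358 (23) | 0.337 (6) | 0.370 (20) | 0.479 (21) | 0.443 (19) | 0.368 (21) | 0.907 (7) | 0.207 (20) | 0.213 (22) | 0.464 (21) | 0.525 (5) | 0.618 (19) | 0.657 (6) | 0.450 (14) | 0.788 (17) | 0.721 (20) | 0.408 (22) |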
Abhinav Valada, Rohit Mohan, Wolfram Burgard: Self-Supervised Model Adaptation for Multimodal Semantic Segmentation. International Journal of Computer Vision, 2019
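| 3DMV (2d proj) | | 0.498 (19) | 0.481 (21) | 0.612 (20) | 0.579 (9) | 0.456 (19) | 0.343 (5) | 0.384 (18) | 0.623 (19) | 0.525 (14) | 0.381 (20) | 0.845 (19) | 0.254 (19) | 0.264 (19) | 0.557 (15) | 0.182 (21) | 0.581 (21) | 0.598 (13) | 0.429 (17) | 0.760 (20) | 0.661 (22) | 0.446 (21) |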
Angela Dai, Matthias Niessner: 3DMV: Joint 3D-Multi-View Prediction for 3D Semantic Scene Segmentation. ECCV'18
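| MSeg1080_RVC | permissive | 0.485 (20) | 0.505 (19) | 0.709 (13) | 0.092 (22) | 0.427 (20) | 0.241 (19) | 0.411 (17) | 0.654 (16) | 0.385 (23) | 0.457 (16) | 0.861 (17) | 0.053 (23) | 0.279 (16) | 0.503 (19) | 0.481 (7) | 0.645 (16) | 0.626 (10) | 0.365 (21) | 0.748 (21) | 0.725 (19) | 0.529 (17) |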
John Lambert*, Zhuang Liu*, Ozan Sener, James Hays, Vladlen Koltun: MSeg: A Composite Dataset for Multi-domain Semantic Segmentation. CVPR 2020
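| ILC-PSPNet | | 0.475 (21) | 0.490 (20) | 0.581 (21) | 0.289 (17) | 0.507 (18) | 0.067 (23) | 0.379 (19) | 0.610 (20) | 0.417 (21) | 0.435 (18) | 0.822 (22) | 0.278 (18) | 0.267 (17) | 0.503 (19) | 0.228 (19) | 0.616 (20) | 0.533 (19) | 0.375 (20) | 0.820 (15) | 0.729 (18) | 0.560 (13) |
| Enet (reimpl) | | 0.376 (22) | 0.264 (23) | 0.452 (23) | 0.452 (13) | 0.365 (21) | 0.181 (21) | 0.143 (23) | 0.456 (22) | 0.409 (22) | 0.346 (22) | 0.769 (23) | 0.164 (21) | 0.218 (21) | 0.359 (22) | 0.123 (23) | 0.403 (23) | 0.381 (23) | 0.313 (23) | 0.571 (22) | 0.685 (21) | 0.472 (20) |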
Re-implementation of Adam Paszke, Abhishek Chaurasia, Sangpil Kim, Eugenio Culurciello: ENet: A Deep Neural Network Architecture for Real-Time Semantic Segmentation.
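| ScanNet (2d proj) | permissive | 0.330 (23) | 0.293 (22) | 0.521 (22) | 0.657 (6) | 0.361 (22) | 0.161 (22) | 0.250 (22) | 0.004 (23) | 0.440 (20) | 0.183 (23) | 0.836 (20) | 0.125 (22) | 0.060 (23) | 0.319 (23) | 0.132 (22) | 0.417 (22) | 0.412 (22) | 0.344 (22) | 0.541 (23) | 0.427 (23) | 0.109 (23) |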
Angela Dai, Angel X. Chang, Manolis Savva, Maciej Halber, Thomas Funkhouser, Matthias Nießner: ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes. CVPR'17
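| DMMF | | 0.003 (24) | 0.000 (24) | 0.005 (24) | 0.000 (24) | 0.000 (24) | 0.037 (24) | 0.001 (24) | 0.000 (24) | 0.001 (24) | 0.005 (24) | 0.003 (24) | 0.000 (24) | 0.000 (24) | 0.000 (24) | 0.000 (24) | 0.000 (24) | 0.002 (24) | 0.001 (24) | 0.000 (24) | 0.006 (24) | 0.000 (24) |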


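This table lists the benchmark results for the 2D semantic instance scenario. Each cell gives the score followed by the method's rank in that column in parentheses.

| Method | Info | avg ap | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | window |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| UniDet_RVC | | 0.205 (1) | 0.381 (1) | 0.323 (1) | 0.037 (1) | 0.226 (1) | 0.177 (1) | 0.063 (1) | 0.277 (1) | 0.120 (1) | 0.067 (1) | 0.131 (1) | 0.074 (2) | 0.317 (1) | 0.080 (1) | 0.235 (1) | 0.289 (1) | 0.141 (1) | 0.678 (1) | 0.080 (1) |
| MaskRCNN_ScanNet | permissive | 0.119 (2) | 0.129 (2) | 0.212 (2) | 0.002 (2) | 0.112 (2) | 0.148 (2) | 0.014 (2) | 0.205 (2) | 0.044 (2) | 0.066 (2) | 0.078 (2) | 0.095 (1) | 0.142 (2) | 0.030 (2) | 0.128 (2) | 0.139 (2) | 0.080 (2) | 0.459 (2) | 0.057 (2) |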
Re-implementation of Kaiming He, Georgia Gkioxari, Piotr Dollár, Ross Girshick: Mask R-CNN. ICCV'17


This table lists the benchmark results for the scene type classification scenario. Each cell gives the score followed by the method's rank in that column in parentheses.

| Method | Info | avg recall | apartment | bathroom | bedroom / hotel | bookstore / library | conference room | copy/mail room | hallway | kitchen | laundry room | living room / lounge | misc | office | storage / basement / garage |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| multi-task | permissive | 0.700 (1) | 0.500 (1) | 1.000 (1) | 0.882 (2) | 0.500 (2) | 1.000 (1) | 1.000 (1) | 0.500 (2) | 1.000 (1) | 1.000 (1) | 0.778 (1) | 0.000 (2) | 0.938 (1) | 0.000 (2) |
Shengyu Huang, Mikhail Usvyatsov, Konrad Schindler: Indoor Scene Recognition in 3D. IROS 2020
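| 3DASPP-SCE | | 0.691 (2) | 0.500 (1) | 0.938 (2) | 0.824 (3) | 1.000 (1) | 1.000 (1) | 0.500 (2) | 1.000 (1) | 0.857 (2) | 0.500 (2) | 0.556 (3) | 0.000 (2) | 0.812 (2) | 0.500 (1) |
| SE-ResNeXt-SSMA | | 0.498 (3) | 0.000 (4) | 0.812 (3) | 0.941 (1) | 0.500 (2) | 0.500 (3) | 0.500 (2) | 0.500 (2) | 0.429 (4) | 0.500 (2) | 0.667 (2) | 0.500 (1) | 0.625 (3) | 0.000 (2) |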
Abhinav Valada, Rohit Mohan, Wolfram Burgard: Self-Supervised Model Adaptation for Multimodal Semantic Segmentation. arXiv
| resnet50_scannet | | 0.353 (4) | 0.250 (3) | 0.812 (3) | 0.529 (4) | 0.500 (2) | 0.500 (3) | 0.000 (4) | 0.500 (2) | 0.571 (3) | 0.000 (4) | 0.556 (3) | 0.000 (2) | 0.375 (4) | 0.000 (2) |
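
The avg recall values in the scene type classification results appear consistent with an unweighted mean over the 13 per-class recalls. The page does not state the formula, so treat this as an assumption. A minimal Python sketch that checks it against the multi-task row; the helper name `macro_average` is hypothetical and not part of any benchmark tooling:

```python
def macro_average(per_class_scores):
    """Unweighted mean over classes: every class counts equally."""
    return sum(per_class_scores) / len(per_class_scores)


# Per-class recalls of the multi-task row, in column order
# (apartment, bathroom, ..., storage / basement / garage).
multi_task = [0.500, 1.000, 0.882, 0.500, 1.000, 1.000, 0.500,
              1.000, 1.000, 0.778, 0.000, 0.938, 0.000]

# Compare against the listed avg recall of 0.700.
print(f"{macro_average(multi_task):.3f}")
```

Under this reading, the two scene types with a recall of 0.000 in that row lower the average as much as any frequent class would, since no class is weighted by its number of scenes.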