Presenting the ScanNet200 Benchmark

We present the ScanNet200 benchmark, which studies an order of magnitude more class categories than the previous version of ScanNet. The scene geometry is shared between the two tasks, but the parsing of the surface annotations allows for a larger vocabulary and a more realistic setting for in-the-wild 3D understanding methods.

The ScanNet200 benchmark includes both finer-grained categories and a large number of previously unaddressed classes. This yields a much more challenging setting that covers the diversity of semantic classes naturally observed in the raw ScanNet RGB-D observations, and the data also reflects naturally encountered class imbalances. The difference in category frequencies between ScanNet and ScanNet200 can be seen in the figure above.

ScanNet200 Benchmark

This table lists the benchmark results for the ScanNet200 3D semantic label scenario. Each cell gives the score followed by the method's rank in that column.




Method | Info | avg iou | head iou | common iou | tail iou | alarm clock | armchair | backpack | bag | ball | bar | basket | bathroom cabinet | bathroom counter | bathroom stall | bathroom stall door | bathroom vanity | bathtub | bed | bench | bicycle | bin | blackboard | blanket | blinds | board | book | bookshelf | bottle | bowl | box | broom | bucket | bulletin board | cabinet | calendar | candle | cart | case of water bottles | cd case | ceiling | ceiling light | chair | clock | closet | closet door | closet rod | closet wall | clothes | clothes dryer | coat rack | coffee kettle | coffee maker | coffee table | column | computer tower | container | copier | couch | counter | crate | cup | curtain | cushion | decoration | desk | dining table | dish rack | dishwasher | divider | door | doorframe | dresser | dumbbell | dustpan | end table | fan | file cabinet | fire alarm | fire extinguisher | fireplace | floor | folded chair | furniture | guitar | guitar case | hair dryer | handicap bar | hat | headphones | ironing board | jacket | keyboard | keyboard piano | kitchen cabinet | kitchen counter | ladder | lamp | laptop | laundry basket | laundry detergent | laundry hamper | ledge | light | light switch | luggage | machine | mailbox | mat | mattress | microwave | mini fridge | mirror | monitor | mouse | music stand | nightstand | object | office chair | ottoman | oven | paper | paper bag | paper cutter | paper towel dispenser | paper towel roll | person | piano | picture | pillar | pillow | pipe | plant | plate | plunger | poster | potted plant | power outlet | power strip | printer | projector | projector screen | purse | rack | radiator | rail | range hood | recycling bin | refrigerator | scale | seat | shelf | shoe | shower | shower curtain | shower curtain rod | shower door | shower floor | shower head | shower wall | sign | sink | soap dish | soap dispenser | sofa chair | speaker | stair rail | stairs | stand | stool | storage bin | storage container | storage organizer | stove | structure | stuffed animal | suitcase | table | telephone | tissue box | toaster | toaster oven | toilet | toilet paper | toilet paper dispenser | toilet paper holder | toilet seat cover dispenser | towel | trash bin | trash can | tray | tube | tv | tv stand | vacuum cleaner | vent | wall | wardrobe | washing machine | water bottle | water cooler | water pitcher | whiteboard | window | windowsill
OA-CNN-L_ScanNet200 | | 0.333 2 | 0.558 1 | 0.269 2 | 0.124 4 | 0.448 6 | 0.080 4 | 0.272 3 | 0.000 1 | 0.000 1 | 0.000 1 | 0.342 5 | 0.515 2 | 0.524 2 | 0.713 8 | 0.789 2 | 0.158 4 | 0.384 3 | 0.000 3 | 0.806 2 | 0.125 2 | 0.000 4 | 0.496 3 | 0.332 2 | 0.498 7 | 0.227 4 | 0.024 2 | 0.474 1 | 0.000 1 | 0.003 2 | 0.071 4 | 0.487 1 | 0.000 3 | 0.000 1 | 0.110 2 | 0.000 2 | 0.876 1 | 0.013 8 | 0.703 1 | 0.000 3 | 0.076 4 | 0.473 4 | 0.355 4 | 0.906 2 | 0.000 1 | 0.000 1 | 0.476 4 | 0.706 1 | 0.000 7 | 0.672 6 | 0.835 5 | 0.748 2 | 0.015 7 | 0.223 2 | 0.860 3 | 0.000 1 | 0.000 4 | 0.572 2 | 0.000 5 | 0.509 3 | 0.313 2 | 0.662 1 | 0.398 5 | 0.396 1 | 0.411 6 | 0.276 1 | 0.527 1 | 0.711 1 | 0.000 2 | 0.076 5 | 0.946 1 | 0.166 2 | 0.000 1 | 0.022 2 | 0.160 1 | 0.183 4 | 0.493 4 | 0.699 3 | 0.637 2 | 0.403 2 | 0.330 5 | 0.406 4 | 0.526 2 | 0.024 1 | 0.000 1 | 0.392 4 | 0.000 5 | 0.016 8 | 0.000 3 | 0.196 2 | 0.915 2 | 0.112 3 | 0.557 3 | 0.197 1 | 0.352 4 | 0.877 2 | 0.000 3 | 0.000 1 | 0.592 6 | 0.103 6 | 0.000 8 | 0.067 1 | 0.000 1 | 0.089 1 | 0.735 3 | 0.625 3 | 0.130 5 | 0.568 2 | 0.836 2 | 0.271 1 | 0.534 3 | 0.043 6 | 0.799 2 | 0.001 2 | 0.445 1 | 0.000 1 | 0.000 2 | 0.024 1 | 0.661 2 | 0.000 1 | 0.262 1 | 0.000 1 | 0.591 3 | 0.517 7 | 0.373 3 | 0.788 3 | 0.021 2 | 0.000 1 | 0.455 1 | 0.517 4 | 0.320 3 | 0.823 4 | 0.200 8 | 0.001 8 | 0.150 3 | 0.100 4 | 0.000 1 | 0.736 2 | 0.668 2 | 0.103 6 | 0.052 3 | 0.662 1 | 0.720 1 | 0.000 1 | 0.602 4 | 0.112 3 | 0.002 3 | 0.000 1 | 0.637 4 | 0.000 2 | 0.000 1 | 0.621 3 | 0.569 1 | 0.398 2 | 0.412 3 | 0.234 3 | 0.949 1 | 0.363 1 | 0.492 7 | 0.495 3 | 0.251 3 | 0.665 3 | 0.000 1 | 0.001 5 | 0.805 2 | 0.833 2 | 0.794 4 | 0.000 1 | 0.821 1 | 0.314 2 | 0.843 5 | 0.000 1 | 0.560 2 | 0.245 2 | 0.262 2 | 0.713 1 | 0.370 5
PPT-SpUNet-F.T. | | 0.332 3 | 0.556 2 | 0.270 1 | 0.123 5 | 0.519 1 | 0.091 2 | 0.349 2 | 0.000 1 | 0.000 1 | 0.000 1 | 0.339 6 | 0.383 5 | 0.498 5 | 0.833 2 | 0.807 1 | 0.241 1 | 0.584 2 | 0.000 3 | 0.755 3 | 0.124 3 | 0.000 4 | 0.608 2 | 0.330 3 | 0.530 4 | 0.314 1 | 0.000 4 | 0.374 3 | 0.000 1 | 0.000 3 | 0.197 1 | 0.459 3 | 0.000 3 | 0.000 1 | 0.117 1 | 0.000 2 | 0.876 1 | 0.095 1 | 0.682 2 | 0.000 3 | 0.086 3 | 0.518 2 | 0.433 1 | 0.930 1 | 0.000 1 | 0.000 1 | 0.563 3 | 0.542 5 | 0.077 4 | 0.715 2 | 0.858 3 | 0.756 1 | 0.008 8 | 0.171 4 | 0.874 2 | 0.000 1 | 0.039 1 | 0.550 3 | 0.000 5 | 0.545 2 | 0.256 3 | 0.657 3 | 0.453 1 | 0.351 3 | 0.449 5 | 0.213 2 | 0.392 3 | 0.611 4 | 0.000 2 | 0.037 6 | 0.946 1 | 0.138 5 | 0.000 1 | 0.000 4 | 0.063 3 | 0.308 1 | 0.537 2 | 0.796 1 | 0.673 1 | 0.323 5 | 0.392 3 | 0.400 5 | 0.509 3 | 0.000 2 | 0.000 1 | 0.649 1 | 0.000 5 | 0.023 5 | 0.000 3 | 0.000 3 | 0.914 3 | 0.002 7 | 0.506 7 | 0.163 5 | 0.359 3 | 0.872 3 | 0.000 3 | 0.000 1 | 0.623 2 | 0.112 3 | 0.001 7 | 0.000 4 | 0.000 1 | 0.021 2 | 0.753 1 | 0.565 7 | 0.150 1 | 0.579 1 | 0.806 4 | 0.267 2 | 0.616 1 | 0.042 7 | 0.783 4 | 0.000 3 | 0.374 5 | 0.000 1 | 0.000 2 | 0.000 2 | 0.620 4 | 0.000 1 | 0.000 4 | 0.000 1 | 0.572 6 | 0.634 1 | 0.350 4 | 0.792 2 | 0.000 3 | 0.000 1 | 0.376 4 | 0.535 2 | 0.378 2 | 0.855 1 | 0.672 1 | 0.074 5 | 0.000 4 | 0.185 3 | 0.000 1 | 0.727 3 | 0.660 3 | 0.076 8 | 0.000 6 | 0.432 4 | 0.646 3 | 0.000 1 | 0.594 5 | 0.006 7 | 0.000 4 | 0.000 1 | 0.658 2 | 0.000 2 | 0.000 1 | 0.661 1 | 0.549 3 | 0.300 5 | 0.291 5 | 0.045 5 | 0.942 4 | 0.304 2 | 0.600 3 | 0.572 2 | 0.135 7 | 0.695 1 | 0.000 1 | 0.008 3 | 0.793 3 | 0.942 1 | 0.899 2 | 0.000 1 | 0.816 2 | 0.181 4 | 0.897 1 | 0.000 1 | 0.679 1 | 0.223 3 | 0.264 1 | 0.691 2 | 0.345 6
OctFormer ScanNet200 | permissive | 0.326 4 | 0.539 4 | 0.265 3 | 0.131 3 | 0.499 2 | 0.110 1 | 0.522 1 | 0.000 1 | 0.000 1 | 0.000 1 | 0.318 8 | 0.427 4 | 0.455 6 | 0.743 6 | 0.765 4 | 0.175 3 | 0.842 1 | 0.000 3 | 0.828 1 | 0.204 1 | 0.033 2 | 0.429 4 | 0.335 1 | 0.601 1 | 0.312 2 | 0.000 4 | 0.357 4 | 0.000 1 | 0.000 3 | 0.047 5 | 0.423 4 | 0.000 3 | 0.000 1 | 0.105 3 | 0.000 2 | 0.873 3 | 0.079 6 | 0.670 4 | 0.000 3 | 0.117 1 | 0.471 5 | 0.432 2 | 0.829 5 | 0.000 1 | 0.000 1 | 0.584 2 | 0.417 8 | 0.089 3 | 0.684 5 | 0.837 4 | 0.705 7 | 0.021 6 | 0.178 3 | 0.892 1 | 0.000 1 | 0.028 2 | 0.505 5 | 0.000 5 | 0.457 4 | 0.200 6 | 0.662 1 | 0.412 3 | 0.244 6 | 0.496 3 | 0.000 8 | 0.451 2 | 0.626 3 | 0.000 2 | 0.102 4 | 0.943 3 | 0.138 5 | 0.000 1 | 0.000 4 | 0.149 2 | 0.291 2 | 0.534 3 | 0.722 2 | 0.632 3 | 0.331 4 | 0.253 7 | 0.453 3 | 0.487 4 | 0.000 2 | 0.000 1 | 0.479 2 | 0.000 5 | 0.022 6 | 0.000 3 | 0.000 3 | 0.900 4 | 0.128 2 | 0.684 1 | 0.164 4 | 0.413 1 | 0.854 5 | 0.000 3 | 0.000 1 | 0.512 8 | 0.074 8 | 0.003 6 | 0.000 4 | 0.000 1 | 0.000 3 | 0.469 6 | 0.613 4 | 0.132 4 | 0.529 3 | 0.871 1 | 0.227 7 | 0.582 2 | 0.026 8 | 0.787 3 | 0.000 3 | 0.339 6 | 0.000 1 | 0.000 2 | 0.000 2 | 0.626 3 | 0.000 1 | 0.029 3 | 0.000 1 | 0.587 4 | 0.612 3 | 0.411 2 | 0.724 5 | 0.000 3 | 0.000 1 | 0.407 3 | 0.552 1 | 0.513 1 | 0.849 2 | 0.655 2 | 0.408 1 | 0.000 4 | 0.296 1 | 0.000 1 | 0.686 6 | 0.645 5 | 0.145 3 | 0.022 4 | 0.414 5 | 0.633 4 | 0.000 1 | 0.637 1 | 0.224 1 | 0.000 4 | 0.000 1 | 0.650 3 | 0.000 2 | 0.000 1 | 0.622 2 | 0.535 4 | 0.343 3 | 0.483 2 | 0.230 4 | 0.943 3 | 0.289 3 | 0.618 2 | 0.596 1 | 0.140 6 | 0.679 2 | 0.000 1 | 0.022 1 | 0.783 4 | 0.620 6 | 0.906 1 | 0.000 1 | 0.806 3 | 0.137 6 | 0.865 2 | 0.000 1 | 0.378 4 | 0.000 7 | 0.168 8 | 0.680 3 | 0.227 7
Peng-Shuai Wang: OctFormer: Octree-based Transformers for 3D Point Clouds. SIGGRAPH 2023
CeCo | | 0.340 1 | 0.551 3 | 0.247 4 | 0.181 1 | 0.475 4 | 0.057 8 | 0.142 7 | 0.000 1 | 0.000 1 | 0.000 1 | 0.387 3 | 0.463 3 | 0.499 4 | 0.924 1 | 0.774 3 | 0.213 2 | 0.257 4 | 0.000 3 | 0.546 8 | 0.100 5 | 0.006 3 | 0.615 1 | 0.177 8 | 0.534 2 | 0.246 3 | 0.000 4 | 0.400 2 | 0.000 1 | 0.338 1 | 0.006 7 | 0.484 2 | 0.609 1 | 0.000 1 | 0.083 4 | 0.000 2 | 0.873 3 | 0.089 4 | 0.661 5 | 0.000 3 | 0.048 8 | 0.560 1 | 0.408 3 | 0.892 3 | 0.000 1 | 0.000 1 | 0.586 1 | 0.616 3 | 0.000 7 | 0.692 4 | 0.900 1 | 0.721 3 | 0.162 1 | 0.228 1 | 0.860 3 | 0.000 1 | 0.000 4 | 0.575 1 | 0.083 2 | 0.550 1 | 0.347 1 | 0.624 4 | 0.410 4 | 0.360 2 | 0.740 1 | 0.109 5 | 0.321 6 | 0.660 2 | 0.000 2 | 0.121 2 | 0.939 4 | 0.143 3 | 0.000 1 | 0.400 1 | 0.003 5 | 0.190 3 | 0.564 1 | 0.652 4 | 0.615 4 | 0.421 1 | 0.304 6 | 0.579 1 | 0.547 1 | 0.000 2 | 0.000 1 | 0.296 5 | 0.000 5 | 0.030 4 | 0.096 1 | 0.000 3 | 0.916 1 | 0.037 4 | 0.551 4 | 0.171 3 | 0.376 2 | 0.865 4 | 0.286 1 | 0.000 1 | 0.633 1 | 0.102 7 | 0.027 4 | 0.011 3 | 0.000 1 | 0.000 3 | 0.474 5 | 0.742 1 | 0.133 3 | 0.311 4 | 0.824 3 | 0.242 4 | 0.503 5 | 0.068 4 | 0.828 1 | 0.000 3 | 0.429 2 | 0.000 1 | 0.063 1 | 0.000 2 | 0.781 1 | 0.000 1 | 0.000 4 | 0.000 1 | 0.665 1 | 0.633 2 | 0.450 1 | 0.818 1 | 0.000 3 | 0.000 1 | 0.429 2 | 0.532 3 | 0.226 4 | 0.825 3 | 0.510 6 | 0.377 2 | 0.709 1 | 0.079 6 | 0.000 1 | 0.753 1 | 0.683 1 | 0.102 7 | 0.063 2 | 0.401 7 | 0.620 6 | 0.000 1 | 0.619 2 | 0.000 8 | 0.000 4 | 0.000 1 | 0.595 6 | 0.000 2 | 0.000 1 | 0.345 5 | 0.564 2 | 0.411 1 | 0.603 1 | 0.384 2 | 0.945 2 | 0.266 4 | 0.643 1 | 0.367 5 | 0.304 1 | 0.663 4 | 0.000 1 | 0.010 2 | 0.726 6 | 0.767 3 | 0.898 3 | 0.000 1 | 0.784 4 | 0.435 1 | 0.861 4 | 0.000 1 | 0.447 3 | 0.000 7 | 0.257 3 | 0.656 4 | 0.377 4
Understanding Imbalanced Semantic Segmentation Through Neural Collapse.
AWCS | | 0.305 5 | 0.508 5 | 0.225 5 | 0.142 2 | 0.463 5 | 0.063 6 | 0.195 5 | 0.000 1 | 0.000 1 | 0.000 1 | 0.467 2 | 0.551 1 | 0.504 3 | 0.773 3 | 0.764 5 | 0.142 5 | 0.029 8 | 0.000 3 | 0.626 6 | 0.100 5 | 0.000 4 | 0.360 5 | 0.179 6 | 0.507 6 | 0.137 6 | 0.006 3 | 0.300 5 | 0.000 1 | 0.000 3 | 0.172 3 | 0.364 6 | 0.512 2 | 0.000 1 | 0.056 5 | 0.000 2 | 0.865 5 | 0.093 3 | 0.634 8 | 0.000 3 | 0.071 6 | 0.396 6 | 0.296 7 | 0.876 4 | 0.000 1 | 0.000 1 | 0.373 5 | 0.436 7 | 0.063 6 | 0.749 1 | 0.877 2 | 0.721 3 | 0.131 2 | 0.124 5 | 0.804 6 | 0.000 1 | 0.000 4 | 0.515 4 | 0.010 4 | 0.452 5 | 0.252 4 | 0.578 5 | 0.417 2 | 0.179 8 | 0.484 4 | 0.171 3 | 0.337 5 | 0.606 5 | 0.000 2 | 0.115 3 | 0.937 5 | 0.142 4 | 0.000 1 | 0.008 3 | 0.000 7 | 0.157 7 | 0.484 5 | 0.402 8 | 0.501 6 | 0.339 3 | 0.553 1 | 0.529 2 | 0.478 5 | 0.000 2 | 0.000 1 | 0.404 3 | 0.001 4 | 0.022 6 | 0.077 2 | 0.000 3 | 0.894 6 | 0.219 1 | 0.628 2 | 0.093 6 | 0.305 5 | 0.886 1 | 0.233 2 | 0.000 1 | 0.603 3 | 0.112 3 | 0.023 5 | 0.000 4 | 0.000 1 | 0.000 3 | 0.741 2 | 0.664 2 | 0.097 6 | 0.253 5 | 0.782 5 | 0.264 3 | 0.523 4 | 0.154 1 | 0.707 7 | 0.000 3 | 0.411 3 | 0.000 1 | 0.000 2 | 0.000 2 | 0.332 7 | 0.000 1 | 0.000 4 | 0.000 1 | 0.602 2 | 0.595 4 | 0.185 7 | 0.656 7 | 0.159 1 | 0.000 1 | 0.355 5 | 0.424 6 | 0.154 6 | 0.729 6 | 0.516 5 | 0.220 4 | 0.620 2 | 0.084 5 | 0.000 1 | 0.707 5 | 0.651 4 | 0.173 1 | 0.014 5 | 0.381 8 | 0.582 7 | 0.000 1 | 0.619 2 | 0.049 6 | 0.000 4 | 0.000 1 | 0.702 1 | 0.000 2 | 0.000 1 | 0.302 7 | 0.489 6 | 0.317 4 | 0.334 4 | 0.392 1 | 0.922 5 | 0.254 5 | 0.533 6 | 0.394 4 | 0.129 8 | 0.613 6 | 0.000 1 | 0.000 6 | 0.820 1 | 0.649 5 | 0.749 5 | 0.000 1 | 0.782 5 | 0.282 3 | 0.863 3 | 0.000 1 | 0.288 7 | 0.006 5 | 0.220 5 | 0.633 5 | 0.542 1
LGround | permissive | 0.272 6 | 0.485 6 | 0.184 6 | 0.106 6 | 0.476 3 | 0.077 5 | 0.218 4 | 0.000 1 | 0.000 1 | 0.000 1 | 0.547 1 | 0.295 6 | 0.540 1 | 0.746 5 | 0.745 6 | 0.058 7 | 0.112 7 | 0.005 1 | 0.658 5 | 0.077 8 | 0.000 4 | 0.322 6 | 0.178 7 | 0.512 5 | 0.190 5 | 0.199 1 | 0.277 6 | 0.000 1 | 0.000 3 | 0.173 2 | 0.399 5 | 0.000 3 | 0.000 1 | 0.039 7 | 0.000 2 | 0.858 6 | 0.085 5 | 0.676 3 | 0.002 1 | 0.103 2 | 0.498 3 | 0.323 5 | 0.703 6 | 0.000 1 | 0.000 1 | 0.296 6 | 0.549 4 | 0.216 1 | 0.702 3 | 0.768 6 | 0.718 5 | 0.028 4 | 0.092 7 | 0.786 7 | 0.000 1 | 0.000 4 | 0.453 7 | 0.022 3 | 0.251 8 | 0.252 4 | 0.572 6 | 0.348 6 | 0.321 4 | 0.514 2 | 0.063 6 | 0.279 7 | 0.552 6 | 0.000 2 | 0.019 7 | 0.932 6 | 0.132 7 | 0.000 1 | 0.000 4 | 0.000 7 | 0.156 8 | 0.457 6 | 0.623 5 | 0.518 5 | 0.265 7 | 0.358 4 | 0.381 6 | 0.395 6 | 0.000 2 | 0.000 1 | 0.127 8 | 0.012 3 | 0.051 1 | 0.000 3 | 0.000 3 | 0.886 7 | 0.014 5 | 0.437 8 | 0.179 2 | 0.244 6 | 0.826 6 | 0.000 3 | 0.000 1 | 0.599 4 | 0.136 1 | 0.085 2 | 0.000 4 | 0.000 1 | 0.000 3 | 0.565 4 | 0.612 5 | 0.143 2 | 0.207 6 | 0.566 6 | 0.232 6 | 0.446 6 | 0.127 2 | 0.708 6 | 0.000 3 | 0.384 4 | 0.000 1 | 0.000 2 | 0.000 2 | 0.402 5 | 0.000 1 | 0.059 2 | 0.000 1 | 0.525 8 | 0.566 5 | 0.229 6 | 0.659 6 | 0.000 3 | 0.000 1 | 0.265 6 | 0.446 5 | 0.147 7 | 0.720 8 | 0.597 4 | 0.066 6 | 0.000 4 | 0.187 2 | 0.000 1 | 0.726 4 | 0.467 8 | 0.134 5 | 0.000 6 | 0.413 6 | 0.629 5 | 0.000 1 | 0.363 7 | 0.055 5 | 0.022 2 | 0.000 1 | 0.626 5 | 0.000 2 | 0.000 1 | 0.323 6 | 0.479 8 | 0.154 7 | 0.117 6 | 0.028 7 | 0.901 6 | 0.243 6 | 0.415 8 | 0.295 8 | 0.143 5 | 0.610 7 | 0.000 1 | 0.000 6 | 0.777 5 | 0.397 8 | 0.324 7 | 0.000 1 | 0.778 6 | 0.179 5 | 0.702 7 | 0.000 1 | 0.274 8 | 0.404 1 | 0.233 4 | 0.622 6 | 0.398 3
David Rozenberszki, Or Litany, Angela Dai: Language-Grounded Indoor 3D Semantic Segmentation in the Wild. arXiv
CSC-Pretrain | permissive | 0.249 8 | 0.455 8 | 0.171 7 | 0.079 8 | 0.418 7 | 0.059 7 | 0.186 6 | 0.000 1 | 0.000 1 | 0.000 1 | 0.335 7 | 0.250 7 | 0.316 7 | 0.766 4 | 0.697 8 | 0.142 5 | 0.170 5 | 0.003 2 | 0.553 7 | 0.112 4 | 0.097 1 | 0.201 8 | 0.186 5 | 0.476 8 | 0.081 7 | 0.000 4 | 0.216 8 | 0.000 1 | 0.000 3 | 0.001 8 | 0.314 8 | 0.000 3 | 0.000 1 | 0.055 6 | 0.000 2 | 0.832 8 | 0.094 2 | 0.659 6 | 0.002 1 | 0.076 4 | 0.310 8 | 0.293 8 | 0.664 8 | 0.000 1 | 0.000 1 | 0.175 8 | 0.634 2 | 0.130 2 | 0.552 8 | 0.686 8 | 0.700 8 | 0.076 3 | 0.110 6 | 0.770 8 | 0.000 1 | 0.000 4 | 0.430 8 | 0.000 5 | 0.319 6 | 0.166 7 | 0.542 8 | 0.327 7 | 0.205 7 | 0.332 7 | 0.052 7 | 0.375 4 | 0.444 8 | 0.000 2 | 0.012 8 | 0.930 8 | 0.203 1 | 0.000 1 | 0.000 4 | 0.046 4 | 0.175 5 | 0.413 7 | 0.592 6 | 0.471 7 | 0.299 6 | 0.152 8 | 0.340 7 | 0.247 8 | 0.000 2 | 0.000 1 | 0.225 6 | 0.058 2 | 0.037 2 | 0.000 3 | 0.207 1 | 0.862 8 | 0.014 5 | 0.548 5 | 0.033 7 | 0.233 7 | 0.816 7 | 0.000 3 | 0.000 1 | 0.542 7 | 0.123 2 | 0.121 1 | 0.019 2 | 0.000 1 | 0.000 3 | 0.463 7 | 0.454 8 | 0.045 8 | 0.128 8 | 0.557 7 | 0.235 5 | 0.441 7 | 0.063 5 | 0.484 8 | 0.000 3 | 0.308 8 | 0.000 1 | 0.000 2 | 0.000 2 | 0.318 8 | 0.000 1 | 0.000 4 | 0.000 1 | 0.545 7 | 0.543 6 | 0.164 8 | 0.734 4 | 0.000 3 | 0.000 1 | 0.215 8 | 0.371 7 | 0.198 5 | 0.743 5 | 0.205 7 | 0.062 7 | 0.000 4 | 0.079 6 | 0.000 1 | 0.683 7 | 0.547 7 | 0.142 4 | 0.000 6 | 0.441 3 | 0.579 8 | 0.000 1 | 0.464 6 | 0.098 4 | 0.041 1 | 0.000 1 | 0.590 7 | 0.000 2 | 0.000 1 | 0.373 4 | 0.494 5 | 0.174 6 | 0.105 7 | 0.001 8 | 0.895 7 | 0.222 7 | 0.537 5 | 0.307 7 | 0.180 4 | 0.625 5 | 0.000 1 | 0.000 6 | 0.591 8 | 0.609 7 | 0.398 6 | 0.000 1 | 0.766 8 | 0.014 8 | 0.638 8 | 0.000 1 | 0.377 5 | 0.004 6 | 0.206 7 | 0.609 8 | 0.465 2
Ji Hou, Benjamin Graham, Matthias Nießner, Saining Xie: Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts. CVPR 2021
Minkowski 34D | permissive | 0.253 7 | 0.463 7 | 0.154 8 | 0.102 7 | 0.381 8 | 0.084 3 | 0.134 8 | 0.000 1 | 0.000 1 | 0.000 1 | 0.386 4 | 0.141 8 | 0.279 8 | 0.737 7 | 0.703 7 | 0.014 8 | 0.164 6 | 0.000 3 | 0.663 4 | 0.092 7 | 0.000 4 | 0.224 7 | 0.291 4 | 0.531 3 | 0.056 8 | 0.000 4 | 0.242 7 | 0.000 1 | 0.000 3 | 0.013 6 | 0.331 7 | 0.000 3 | 0.000 1 | 0.035 8 | 0.001 1 | 0.858 6 | 0.059 7 | 0.650 7 | 0.000 3 | 0.056 7 | 0.353 7 | 0.299 6 | 0.670 7 | 0.000 1 | 0.000 1 | 0.284 7 | 0.484 6 | 0.071 5 | 0.594 7 | 0.720 7 | 0.710 6 | 0.027 5 | 0.068 8 | 0.813 5 | 0.000 1 | 0.005 3 | 0.492 6 | 0.164 1 | 0.274 7 | 0.111 8 | 0.571 7 | 0.307 8 | 0.293 5 | 0.307 8 | 0.150 4 | 0.163 8 | 0.531 7 | 0.002 1 | 0.545 1 | 0.932 6 | 0.093 8 | 0.000 1 | 0.000 4 | 0.002 6 | 0.159 6 | 0.368 8 | 0.581 7 | 0.440 8 | 0.228 8 | 0.406 2 | 0.282 8 | 0.294 7 | 0.000 2 | 0.000 1 | 0.189 7 | 0.060 1 | 0.036 3 | 0.000 3 | 0.000 3 | 0.897 5 | 0.000 8 | 0.525 6 | 0.025 8 | 0.205 8 | 0.771 8 | 0.000 3 | 0.000 1 | 0.593 5 | 0.108 5 | 0.044 3 | 0.000 4 | 0.000 1 | 0.000 3 | 0.282 8 | 0.589 6 | 0.094 7 | 0.169 7 | 0.466 8 | 0.227 7 | 0.419 8 | 0.125 3 | 0.757 5 | 0.002 1 | 0.334 7 | 0.000 1 | 0.000 2 | 0.000 2 | 0.357 6 | 0.000 1 | 0.000 4 | 0.000 1 | 0.582 5 | 0.513 8 | 0.337 5 | 0.612 8 | 0.000 3 | 0.000 1 | 0.250 7 | 0.352 8 | 0.136 8 | 0.724 7 | 0.655 2 | 0.280 3 | 0.000 4 | 0.046 8 | 0.000 1 | 0.606 8 | 0.559 6 | 0.159 2 | 0.102 1 | 0.445 2 | 0.655 2 | 0.000 1 | 0.310 8 | 0.117 2 | 0.000 4 | 0.000 1 | 0.581 8 | 0.026 1 | 0.000 1 | 0.265 8 | 0.483 7 | 0.084 8 | 0.097 8 | 0.044 6 | 0.865 8 | 0.142 8 | 0.588 4 | 0.351 6 | 0.272 2 | 0.596 8 | 0.000 1 | 0.003 4 | 0.622 7 | 0.720 4 | 0.096 8 | 0.000 1 | 0.771 7 | 0.016 7 | 0.772 6 | 0.000 1 | 0.302 6 | 0.194 4 | 0.214 6 | 0.621 7 | 0.197 8
C. Choy, J. Gwak, S. Savarese: 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks. CVPR 2019
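
The IoU columns in the table above are per-class intersection-over-union scores, and the avg, head, common, and tail columns average them over groups of classes. The following is a minimal sketch of that computation, assuming one integer label per point; the function names and the toy class groups are hypothetical, and the official evaluation script may differ in details such as how classes absent from the ground truth are handled.

```python
def per_class_iou(gt_labels, pred_labels, num_classes):
    """Return per-class IoU = TP / (TP + FP + FN); None if a class never occurs."""
    tp = [0] * num_classes
    fp = [0] * num_classes
    fn = [0] * num_classes
    for gt, pred in zip(gt_labels, pred_labels):
        if gt == pred:
            tp[gt] += 1
        else:
            fn[gt] += 1    # the true class was missed
            fp[pred] += 1  # the predicted class was wrongly assigned
    ious = []
    for c in range(num_classes):
        denom = tp[c] + fp[c] + fn[c]
        ious.append(tp[c] / denom if denom > 0 else None)
    return ious


def mean_iou(ious, class_ids=None):
    """Average IoU over class_ids (all classes if None), skipping undefined entries."""
    ids = range(len(ious)) if class_ids is None else class_ids
    vals = [ious[c] for c in ids if ious[c] is not None]
    return sum(vals) / len(vals) if vals else float("nan")


# Toy example with three classes; the grouping below is illustrative only.
gt = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
ious = per_class_iou(gt, pred, num_classes=3)
avg_iou = mean_iou(ious)
head_iou = mean_iou(ious, class_ids=[0, 1])  # hypothetical "head" group
tail_iou = mean_iou(ious, class_ids=[2])     # hypothetical "tail" group
```

Averaging per class rather than per point means that rare classes weigh as much as frequent ones, which is why the tail column is typically much lower than the head column.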


This table lists the benchmark results for the ScanNet200 3D semantic instance scenario. Each cell gives the score followed by the method's rank in that column.




Method | Info | avg ap 50% | head ap 50% | common ap 50% | tail ap 50% | alarm clock | armchair | backpack | bag | ball | bar | basket | bathroom cabinet | bathroom counter | bathroom stall | bathroom stall door | bathroom vanity | bathtub | bed | bench | bicycle | bin | blackboard | blanket | blinds | board | book | bookshelf | bottle | bowl | box | broom | bucket | bulletin board | cabinet | calendar | candle | cart | case of water bottles | cd case | ceiling | ceiling light | chair | clock | closet | closet door | closet rod | closet wall | clothes | clothes dryer | coat rack | coffee kettle | coffee maker | coffee table | column | computer tower | container | copier | couch | counter | crate | cup | curtain | cushion | decoration | desk | dining table | dish rack | dishwasher | divider | door | doorframe | dresser | dumbbell | dustpan | end table | fan | file cabinet | fire alarm | fire extinguisher | fireplace | folded chair | furniture | guitar | guitar case | hair dryer | handicap bar | hat | headphones | ironing board | jacket | keyboard | keyboard piano | kitchen cabinet | kitchen counter | ladder | lamp | laptop | laundry basket | laundry detergent | laundry hamper | ledge | light | light switch | luggage | machine | mailbox | mat | mattress | microwave | mini fridge | mirror | monitor | mouse | music stand | nightstand | object | office chair | ottoman | oven | paper | paper bag | paper cutter | paper towel dispenser | paper towel roll | person | piano | picture | pillar | pillow | pipe | plant | plate | plunger | poster | potted plant | power outlet | power strip | printer | projector | projector screen | purse | rack | radiator | rail | range hood | recycling bin | refrigerator | scale | seat | shelf | shoe | shower | shower curtain | shower curtain rod | shower door | shower floor | shower head | shower wall | sign | sink | soap dish | soap dispenser | sofa chair | speaker | stair rail | stairs | stand | stool | storage bin | storage container | storage organizer | stove | structure | stuffed animal | suitcase | table | telephone | tissue box | toaster | toaster oven | toilet | toilet paper | toilet paper dispenser | toilet paper holder | toilet seat cover dispenser | towel | trash bin | trash can | tray | tube | tv | tv stand | vacuum cleaner | vent | wardrobe | washing machine | water bottle | water cooler | water pitcher | whiteboard | window | windowsill
TD3D Scannet200 | | 0.320 2 | 0.501 2 | 0.264 2 | 0.164 2 | 0.506 3 | 0.062 2 | 0.500 1 | 0.000 1 | 0.000 1 | 0.000 1 | 0.208 1 | 0.431 2 | 0.252 3 | 1.000 1 | 0.733 3 | 0.587 2 | 0.000 2 | 0.008 2 | 0.000 3 | 0.106 1 | 0.000 2 | 0.356 1 | 0.123 4 | 0.686 1 | 0.101 2 | 0.000 1 | 0.152 2 | 0.000 1 | 0.000 2 | 0.226 1 | 0.280 3 | 0.000 2 | 0.000 1 | 0.250 1 | 0.000 1 | 0.619 2 | 0.061 3 | 0.841 1 | 0.000 1 | 0.000 2 | 0.167 1 | 0.194 1 | 0.333 2 | 0.000 2 | 0.000 1 | 0.667 2 | 0.820 1 | 0.250 3 | 0.790 4 | 1.000 1 | 0.879 2 | 0.077 1 | 0.094 3 | 0.708 1 | 0.217 2 | 0.049 2 | 0.634 1 | 0.792 1 | 0.331 4 | 0.033 5 | 0.716 2 | 0.159 2 | 0.396 2 | 0.331 4 | 0.099 2 | 0.415 1 | 0.842 1 | 0.000 2 | 0.458 1 | 0.542 1 | 0.000 1 | 0.101 2 | 0.000 1 | 0.218 1 | 0.513 2 | 0.500 2 | 0.458 2 | 0.104 2 | 0.516 1 | 0.456 1 | 0.268 4 | 0.000 1 | 0.000 1 | 0.400 1 | 0.022 1 | 0.233 2 | 0.143 2 | 0.000 1 | 0.677 1 | 0.400 1 | 0.504 5 | 0.095 3 | 0.083 5 | 0.890 2 | 0.061 2 | 0.000 1 | 0.906 1 | 0.076 2 | 0.231 1 | 0.125 2 | 0.000 2 | 0.003 2 | 0.792 3 | 0.881 1 | 0.000 2 | 0.098 3 | 0.125 4 | 0.498 5 | 0.459 2 | 0.063 1 | 0.715 1 | 0.000 2 | 0.241 4 | 0.000 1 | 0.396 2 | 0.063 1 | 0.605 1 | 0.000 1 | 0.000 2 | 0.000 1 | 0.448 5 | 0.629 3 | 0.202 2 | 0.967 1 | 0.250 2 | 0.038 1 | 0.192 1 | 0.185 2 | 0.083 4 | 1.000 1 | 1.000 1 | 0.857 2 | 0.000 2 | 0.470 2 | 0.012 1 | 0.565 3 | 0.798 1 | 0.621 1 | 0.111 1 | 0.500 1 | 1.000 1 | 0.017 2 | 0.509 1 | 0.000 1 | 0.008 1 | 1.000 1 | 0.525 2 | 0.000 1 | 0.000 1 | 0.332 3 | 0.679 1 | 0.264 2 | 0.333 2 | 0.267 1 | 1.000 1 | 0.549 1 | 0.299 5 | 0.387 2 | 0.328 3 | 0.744 4 | 0.000 1 | 0.000 2 | 0.435 5 | 1.000 1 | 0.283 4 | 0.000 1 | 0.196 1 | 0.817 1 | 0.000 1 | 0.472 1 | 0.222 3 | 0.123 4 | 0.560 2 | 0.156 2
Mask3D Scannet200 | | 0.388 1 | 0.542 1 | 0.357 1 | 0.237 1 | 0.610 1 | 0.091 1 | 0.125 5 | 0.000 1 | 0.000 1 | 0.000 1 | 0.065 3 | 0.668 1 | 0.451 1 | 1.000 1 | 0.955 1 | 0.640 1 | 0.500 1 | 0.039 1 | 0.125 2 | 0.063 2 | 0.409 1 | 0.311 2 | 0.291 1 | 0.609 3 | 0.266 1 | 0.000 1 | 0.163 1 | 0.000 1 | 0.008 1 | 0.044 2 | 0.496 1 | 1.000 1 | 0.000 1 | 0.018 2 | 0.000 1 | 0.756 1 | 0.573 1 | 0.808 2 | 0.000 1 | 0.010 1 | 0.042 3 | 0.130 3 | 0.552 1 | 0.042 1 | 0.000 1 | 1.000 1 | 0.725 4 | 0.750 1 | 0.883 1 | 1.000 1 | 0.832 4 | 0.024 2 | 0.107 1 | 0.614 3 | 0.226 1 | 0.250 1 | 0.628 2 | 0.792 1 | 0.677 2 | 0.400 1 | 0.741 1 | 0.278 1 | 0.511 1 | 0.077 5 | 0.111 1 | 0.313 2 | 0.715 2 | 0.302 1 | 0.017 3 | 0.200 2 | 0.000 1 | 0.188 1 | 0.000 1 | 0.178 2 | 0.736 1 | 1.000 1 | 0.615 1 | 0.514 1 | 0.409 2 | 0.380 5 | 0.600 1 | 0.000 1 | 0.000 1 | 0.400 1 | 0.013 2 | 0.254 1 | 0.381 1 | 0.000 1 | 0.123 4 | 0.400 1 | 0.839 1 | 0.258 1 | 0.463 1 | 0.926 1 | 0.265 1 | 0.000 1 | 0.857 2 | 0.099 1 | 0.021 2 | 0.500 1 | 0.027 1 | 0.028 1 | 1.000 1 | 0.502 5 | 0.016 1 | 0.076 4 | 0.500 1 | 0.612 1 | 0.578 1 | 0.005 2 | 0.597 2 | 0.194 1 | 0.497 1 | 0.000 1 | 0.500 1 | 0.000 2 | 0.323 4 | 0.000 1 | 1.000 1 | 0.000 1 | 0.748 1 | 0.708 2 | 0.050 4 | 0.890 2 | 1.000 1 | 0.008 2 | 0.151 3 | 0.301 1 | 1.000 1 | 1.000 1 | 0.792 3 | 0.945 1 | 1.000 1 | 0.511 1 | 0.004 2 | 0.753 1 | 0.776 2 | 0.287 2 | 0.020 2 | 0.003 4 | 0.974 3 | 0.033 1 | 0.412 5 | 0.000 1 | 0.000 2 | 0.000 2 | 0.667 1 | 0.000 1 | 0.000 1 | 0.491 1 | 0.676 2 | 0.352 1 | 0.335 1 | 0.060 2 | 0.822 5 | 0.527 2 | 1.000 1 | 0.517 1 | 0.606 1 | 0.853 1 | 0.000 1 | 0.004 1 | 0.806 1 | 1.000 1 | 0.727 1 | 0.000 1 | 0.042 2 | 0.739 2 | 0.000 1 | 0.399 3 | 0.391 1 | 0.504 1 | 0.591 1 | 0.571 1
Jonas Schult, Francis Engelmann, Alexander Hermans, Or Litany, Siyu Tang, Bastian Leibe: Mask3D for 3D Semantic Instance Segmentation. ICRA 2023
Minkowski 34D Inst. | permissive | 0.203 5 | 0.369 4 | 0.134 5 | 0.078 5 | 0.479 4 | 0.003 4 | 0.500 1 | 0.000 1 | 0.000 1 | 0.000 1 | 0.100 2 | 0.371 3 | 0.300 2 | 0.667 4 | 0.746 2 | 0.400 3 | 0.000 2 | 0.000 3 | 0.000 3 | 0.031 3 | 0.000 2 | 0.074 4 | 0.165 3 | 0.413 5 | 0.000 4 | 0.000 1 | 0.070 4 | 0.000 1 | 0.000 2 | 0.000 3 | 0.221 5 | 0.000 2 | 0.000 1 | 0.000 3 | 0.000 1 | 0.372 5 | 0.070 2 | 0.706 4 | 0.000 1 | 0.000 2 | 0.000 5 | 0.123 4 | 0.033 5 | 0.000 2 | 0.000 1 | 0.422 5 | 0.732 3 | 0.000 4 | 0.778 5 | 1.000 1 | 0.845 3 | 0.000 3 | 0.090 4 | 0.636 2 | 0.000 3 | 0.000 3 | 0.158 4 | 0.000 3 | 0.250 5 | 0.050 4 | 0.693 3 | 0.123 4 | 0.051 5 | 0.385 3 | 0.009 4 | 0.118 5 | 0.406 5 | 0.000 2 | 0.000 4 | 0.200 2 | 0.000 1 | 0.000 3 | 0.000 1 | 0.133 4 | 0.307 5 | 0.500 2 | 0.251 4 | 0.000 4 | 0.281 3 | 0.402 4 | 0.317 2 | 0.000 1 | 0.000 1 | 0.000 3 | 0.000 3 | 0.060 4 | 0.000 3 | 0.000 1 | 0.396 2 | 0.200 3 | 0.669 2 | 0.021 4 | 0.218 4 | 0.720 5 | 0.000 3 | 0.000 1 | 0.696 3 | 0.025 4 | 0.000 3 | 0.000 3 | 0.000 2 | 0.000 3 | 0.125 5 | 0.596 2 | 0.000 2 | 0.191 1 | 0.500 1 | 0.595 2 | 0.369 4 | 0.000 3 | 0.500 4 | 0.000 2 | 0.143 5 | 0.000 1 | 0.000 3 | 0.000 2 | 0.226 5 | 0.000 1 | 0.000 2 | 0.000 1 | 0.701 2 | 0.511 4 | 0.000 5 | 0.851 4 | 0.000 3 | 0.000 3 | 0.150 4 | 0.052 5 | 0.100 3 | 0.981 3 | 0.500 4 | 0.286 3 | 0.000 2 | 0.000 5 | 0.000 3 | 0.545 4 | 0.522 5 | 0.250 3 | 0.000 3 | 0.000 5 | 0.522 5 | 0.000 3 | 0.500 2 | 0.000 1 | 0.000 2 | 0.000 2 | 0.282 5 | 0.000 1 | 0.000 1 | 0.178 5 | 0.382 4 | 0.018 5 | 0.056 4 | 0.000 3 | 0.997 3 | 0.107 5 | 0.677 2 | 0.313 4 | 0.000 4 | 0.726 5 | 0.000 1 | 0.000 2 | 0.583 4 | 0.903 4 | 0.200 5 | 0.000 1 | 0.000 3 | 0.333 4 | 0.000 1 | 0.442 2 | 0.083 4 | 0.109 5 | 0.387 4 | 0.000 5
C. Choy, J. Gwak, S. Savarese: 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks. CVPR 2019
CSC-Pretrain Inst. | permissive | 0.209 4 | 0.361 5 | 0.157 4 | 0.085 4 | 0.506 2 | 0.007 3 | 0.500 1 | 0.000 1 | 0.000 1 | 0.000 1 | 0.000 5 | 0.093 5 | 0.221 4 | 0.667 4 | 0.524 5 | 0.400 3 | 0.000 2 | 0.000 3 | 0.000 3 | 0.004 4 | 0.000 2 | 0.000 5 | 0.109 5 | 0.589 4 | 0.000 4 | 0.000 1 | 0.059 5 | 0.000 1 | 0.000 2 | 0.000 3 | 0.322 2 | 0.000 2 | 0.000 1 | 0.000 3 | 0.000 1 | 0.405 3 | 0.055 4 | 0.700 5 | 0.000 1 | 0.000 2 | 0.028 4 | 0.091 5 | 0.083 3 | 0.000 2 | 0.000 1 | 0.667 2 | 0.768 2 | 0.000 4 | 0.807 3 | 1.000 1 | 0.776 5 | 0.000 3 | 0.000 5 | 0.340 5 | 0.000 3 | 0.000 3 | 0.103 5 | 0.000 3 | 0.750 1 | 0.200 3 | 0.634 5 | 0.053 5 | 0.246 3 | 0.677 2 | 0.006 5 | 0.198 3 | 0.432 4 | 0.000 2 | 0.000 4 | 0.050 4 | 0.000 1 | 0.000 3 | 0.000 1 | 0.111 5 | 0.356 4 | 0.500 2 | 0.188 5 | 0.000 4 | 0.220 4 | 0.448 2 | 0.050 5 | 0.000 1 | 0.000 1 | 0.000 3 | 0.000 3 | 0.032 5 | 0.000 3 | 0.000 1 | 0.396 2 | 0.000 4 | 0.573 4 | 0.000 5 | 0.228 3 | 0.747 4 | 0.000 3 | 0.000 1 | 0.573 5 | 0.021 5 | 0.000 3 | 0.000 3 | 0.000 2 | 0.000 3 | 0.500 4 | 0.573 3 | 0.000 2 | 0.000 5 | 0.125 4 | 0.592 3 | 0.364 5 | 0.000 3 | 0.450 5 | 0.000 2 | 0.364 2 | 0.000 1 | 0.000 3 | 0.000 2 | 0.340 3 | 0.000 1 | 0.000 2 | 0.000 1 | 0.610 3 | 0.833 1 | 0.221 1 | 0.702 5 | 0.000 3 | 0.000 3 | 0.135 5 | 0.094 4 | 0.125 2 | 0.571 4 | 0.500 4 | 0.143 5 | 0.000 2 | 0.125 3 | 0.000 3 | 0.618 2 | 0.667 4 | 0.115 5 | 0.000 3 | 0.125 2 | 1.000 1 | 0.000 3 | 0.500 2 | 0.000 1 | 0.000 2 | 0.000 2 | 0.502 4 | 0.000 1 | 0.000 1 | 0.312 4 | 0.248 5 | 0.050 4 | 0.000 5 | 0.000 3 | 0.997 3 | 0.420 3 | 0.500 4 | 0.149 5 | 0.451 2 | 0.748 2 | 0.000 1 | 0.000 2 | 0.636 3 | 0.667 5 | 0.600 2 | 0.000 1 | 0.000 3 | 0.278 5 | 0.000 1 | 0.333 4 | 0.000 5 | 0.294 2 | 0.381 5 | 0.110 3
Ji Hou, Benjamin Graham, Matthias Nießner, Saining Xie: Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts. CVPR 2021
LGround Inst. | permissive | 0.246 3 | 0.413 3 | 0.170 3 | 0.130 3 | 0.455 5 | 0.003 5 | 0.500 1 | 0.000 1 | 0.000 1 | 0.000 1 | 0.017 4 | 0.333 4 | 0.111 5 | 1.000 1 | 0.681 4 | 0.400 3 | 0.000 2 | 0.000 3 | 1.000 1 | 0.003 5 | 0.000 2 | 0.167 3 | 0.190 2 | 0.637 2 | 0.067 3 | 0.000 1 | 0.081 3 | 0.000 1 | 0.000 2 | 0.000 3 | 0.264 4 | 0.000 2 | 0.000 1 | 0.000 3 | 0.000 1 | 0.387 4 | 0.031 5 | 0.754 3 | 0.000 1 | 0.000 2 | 0.151 2 | 0.135 2 | 0.056 4 | 0.000 2 | 0.000 1 | 0.582 4 | 0.589 5 | 0.500 2 | 0.815 2 | 1.000 1 | 0.903 1 | 0.000 3 | 0.097 2 | 0.588 4 | 0.000 3 | 0.000 3 | 0.234 3 | 0.000 3 | 0.500 3 | 0.400 1 | 0.682 4 | 0.156 3 | 0.159 4 | 0.750 1 | 0.046 3 | 0.125 4 | 0.660 3 | 0.000 2 | 0.200 2 | 0.000 5 | 0.000 1 | 0.000 3 | 0.000 1 | 0.164 3 | 0.402 3 | 0.500 2 | 0.373 3 | 0.025 3 | 0.143 5 | 0.426 3 | 0.317 2 | 0.000 1 | 0.000 1 | 0.000 3 | 0.000 3 | 0.063 3 | 0.000 3 | 0.000 1 | 0.000 5 | 0.000 4 | 0.575 3 | 0.250 2 | 0.241 2 | 0.772 3 | 0.000 3 | 0.000 1 | 0.653 4 | 0.034 3 | 0.000 3 | 0.000 3 | 0.000 2 | 0.000 3 | 1.000 1 | 0.561 4 | 0.000 2 | 0.100 2 | 0.500 1 | 0.541 4 | 0.452 3 | 0.000 3 | 0.581 3 | 0.000 2 | 0.364 2 | 0.000 1 | 0.000 3 | 0.000 2 | 0.571 2 | 0.000 1 | 0.000 2 | 0.000 1 | 0.568 4 | 0.511 4 | 0.167 3 | 0.857 3 | 0.000 3 | 0.000 3 | 0.164 2 | 0.112 3 | 0.000 5 | 0.530 5 | 1.000 1 | 0.286 3 | 0.000 2 | 0.125 3 | 0.000 3 | 0.464 5 | 0.706 3 | 0.208 4 | 0.000 3 | 0.125 2 | 0.744 4 | 0.000 3 | 0.500 2 | 0.000 1 | 0.000 2 | 0.000 2 | 0.511 3 | 0.000 1 | 0.000 1 | 0.344 2 | 0.541 3 | 0.068 3 | 0.333 2 | 0.000 3 | 1.000 1 | 0.196 4 | 0.533 3 | 0.318 3 | 0.000 4 | 0.748 3 | 0.000 1 | 0.000 2 | 0.690 2 | 1.000 1 | 0.400 3 | 0.000 1 | 0.000 3 | 0.667 3 | 0.000 1 | 0.333 4 | 0.333 2 | 0.270 3 | 0.399 3 | 0.083 4
David Rozenberszki, Or Litany, Angela Dai: Language-Grounded Indoor 3D Semantic Segmentation in the Wild.
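
The "ap 50%" columns in the table above report average precision where a predicted instance counts as correct only if it overlaps a ground-truth instance with IoU of at least 50%. The following is a simplified sketch for a single class, assuming each instance is a set of point indices and each prediction carries a confidence score; the function names are hypothetical, and the official evaluation script may differ in details such as the exact threshold comparison, interpolation of the precision-recall curve, and handling of ignored regions.

```python
def mask_iou(a, b):
    """IoU between two instances given as sets of point indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def average_precision(preds, gts, iou_thresh=0.5):
    """Non-interpolated AP for one class.

    preds: list of (confidence, set_of_point_ids); gts: list of sets.
    """
    if not gts:
        return float("nan")  # AP is undefined without ground-truth instances
    matched = set()
    hits = []
    # Visit predictions from most to least confident.
    for _, mask in sorted(preds, key=lambda p: p[0], reverse=True):
        best_iou, best_j = 0.0, None
        for j, gt in enumerate(gts):
            if j in matched:
                continue  # each ground-truth instance is matched at most once
            iou = mask_iou(mask, gt)
            if iou > best_iou:
                best_iou, best_j = iou, j
        if best_j is not None and best_iou >= iou_thresh:
            matched.add(best_j)
            hits.append(True)
        else:
            hits.append(False)
    # Sum precision at every true positive, normalized by the number of GT instances.
    ap, tp = 0.0, 0
    for k, hit in enumerate(hits, start=1):
        if hit:
            tp += 1
            ap += tp / k
    return ap / len(gts)


# Toy example: two ground-truth instances, three predictions.
gts = [{0, 1, 2, 3}, {10, 11, 12, 13}]
preds = [(0.9, {0, 1, 2}), (0.8, {20, 21}), (0.7, {10, 11, 12, 13, 14})]
ap50 = average_precision(preds, gts, iou_thresh=0.5)
```

In this toy case the first and third predictions match, while the second is a false positive that lowers the precision at the second match.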


ScanNet Benchmark

This table lists the benchmark results for the 3D semantic label scenario. Each cell gives the score followed by the method's rank in that column.


Method | Info | avg iou | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | floor | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | wall | window
Mix3D | permissive | 0.781 1 | 0.964 1 | 0.855 1 | 0.843 13 | 0.781 3 | 0.858 9 | 0.575 4 | 0.831 24 | 0.685 9 | 0.714 2 | 0.979 1 | 0.594 4 | 0.310 20 | 0.801 1 | 0.892 11 | 0.841 2 | 0.819 3 | 0.723 3 | 0.940 8 | 0.887 2 | 0.725 17
Alexey Nekrasov, Jonas Schult, Or Litany, Bastian Leibe, Francis Engelmann: Mix3D: Out-of-Context Data Augmentation for 3D Scenes. 3DV 2021 (Oral)
Swin3D | permissive | 0.779 2 | 0.861 16 | 0.818 11 | 0.836 16 | 0.790 1 | 0.875 2 | 0.576 3 | 0.905 2 | 0.704 3 | 0.739 1 | 0.969 7 | 0.611 1 | 0.349 6 | 0.756 14 | 0.958 1 | 0.702 35 | 0.805 12 | 0.708 6 | 0.916 23 | 0.898 1 | 0.801 1
PPT-SpUNet-Joint | | 0.766 3 | 0.932 2 | 0.794 25 | 0.829 18 | 0.751 13 | 0.854 11 | 0.540 13 | 0.903 3 | 0.630 25 | 0.672 9 | 0.963 9 | 0.565 15 | 0.357 4 | 0.788 2 | 0.900 7 | 0.737 20 | 0.802 13 | 0.685 12 | 0.950 2 | 0.887 2 | 0.780 2
OctFormer | permissive | 0.766 3 | 0.925 3 | 0.808 18 | 0.849 7 | 0.786 2 | 0.846 17 | 0.566 6 | 0.876 11 | 0.690 7 | 0.674 8 | 0.960 11 | 0.576 11 | 0.226 56 | 0.753 16 | 0.904 5 | 0.777 7 | 0.815 6 | 0.722 4 | 0.923 19 | 0.877 8 | 0.776 4
Peng-Shuai Wang: OctFormer: Octree-based Transformers for 3D Point Clouds. SIGGRAPH 2023
CU-Hybrid Net | | 0.764 5 | 0.924 4 | 0.819 9 | 0.840 14 | 0.757 9 | 0.853 12 | 0.580 1 | 0.848 18 | 0.709 2 | 0.643 17 | 0.958 14 | 0.587 8 | 0.295 27 | 0.753 16 | 0.884 14 | 0.758 14 | 0.815 6 | 0.725 2 | 0.927 18 | 0.867 14 | 0.743 9
OccuSeg+Semantic | | 0.764 5 | 0.758 50 | 0.796 23 | 0.839 15 | 0.746 15 | 0.907 1 | 0.562 7 | 0.850 17 | 0.680 11 | 0.672 9 | 0.978 2 | 0.610 2 | 0.335 10 | 0.777 5 | 0.819 34 | 0.847 1 | 0.830 1 | 0.691 10 | 0.972 1 | 0.885 4 | 0.727 15
O-CNN | permissive | 0.762 7 | 0.924 4 | 0.823 6 | 0.844 12 | 0.770 5 | 0.852 13 | 0.577 2 | 0.847 20 | 0.711 1 | 0.640 21 | 0.958 14 | 0.592 5 | 0.217 62 | 0.762 11 | 0.888 12 | 0.758 14 | 0.813 8 | 0.726 1 | 0.932 16 | 0.868 13 | 0.744 8
Peng-Shuai Wang, Yang Liu, Yu-Xiao Guo, Chun-Yu Sun, Xin Tong: O-CNN: Octree-based Convolutional Neural Networks for 3D Shape Analysis. SIGGRAPH 2017
OA-CNN-L_ScanNet20 | | 0.756 8 | 0.783 38 | 0.826 5 | 0.858 5 | 0.776 4 | 0.837 24 | 0.548 11 | 0.896 6 | 0.649 18 | 0.675 6 | 0.962 10 | 0.586 9 | 0.335 10 | 0.771 7 | 0.802 38 | 0.770 10 | 0.787 25 | 0.691 10 | 0.936 11 | 0.880 7 | 0.761 6
PointTransformerV2 | | 0.752 9 | 0.742 57 | 0.809 16 | 0.872 1 | 0.758 8 | 0.860 8 | 0.552 9 | 0.891 8 | 0.610 33 | 0.687 3 | 0.960 11 | 0.559 18 | 0.304 23 | 0.766 9 | 0.926 3 | 0.767 11 | 0.797 16 | 0.644 25 | 0.942 7 | 0.876 11 | 0.722 19
Xiaoyang Wu, Yixing Lao, Li Jiang, Xihui Liu, Hengshuang Zhao: Point Transformer V2: Grouped Vector Attention and Partition-based Pooling. NeurIPS 2022
DMF-Net | | 0.752 9 | 0.906 8 | 0.793 27 | 0.802 34 | 0.689 30 | 0.825 35 | 0.556 8 | 0.867 13 | 0.681 10 | 0.602 34 | 0.960 11 | 0.555 20 | 0.365 3 | 0.779 4 | 0.859 19 | 0.747 17 | 0.795 20 | 0.717 5 | 0.917 22 | 0.856 22 | 0.764 5
PointConvFormer | | 0.749 11 | 0.793 35 | 0.790 28 | 0.807 29 | 0.750 14 | 0.856 10 | 0.524 19 | 0.881 10 | 0.588 44 | 0.642 20 | 0.977 4 | 0.591 6 | 0.274 38 | 0.781 3 | 0.929 2 | 0.804 3 | 0.796 17 | 0.642 27 | 0.947 4 | 0.885 4 | 0.715 22
Wenxuan Wu, Qi Shan, Li Fuxin: PointConvFormer: Revenge of the Point-based Convolution.
BPNet | copyleft | 0.749 11 | 0.909 6 | 0.818 11 | 0.811 26 | 0.752 11 | 0.839 23 | 0.485 36 | 0.842 21 | 0.673 12 | 0.644 16 | 0.957 16 | 0.528 28 | 0.305 22 | 0.773 6 | 0.859 19 | 0.788 4 | 0.818 5 | 0.693 9 | 0.916 23 | 0.856 22 | 0.723 18
Wenbo Hu, Hengshuang Zhao, Li Jiang, Jiaya Jia, Tien-Tsin Wong: Bidirectional Projection Network for Cross Dimension Scene Understanding. CVPR 2021 (Oral)
MSP | | 0.748 13 | 0.623 83 | 0.804 20 | 0.859 4 | 0.745 16 | 0.824 37 | 0.501 27 | 0.912 1 | 0.690 7 | 0.685 4 | 0.956 17 | 0.567 14 | 0.320 16 | 0.768 8 | 0.918 4 | 0.720 26 | 0.802 13 | 0.676 16 | 0.921 20 | 0.881 6 | 0.779 3
StratifiedFormer | permissive | 0.747 14 | 0.901 9 | 0.803 21 | 0.845 11 | 0.757 9 | 0.846 17 | 0.512 23 | 0.825 27 | 0.696 6 | 0.645 15 | 0.956 17 | 0.576 11 | 0.262 47 | 0.744 21 | 0.861 18 | 0.742 18 | 0.770 34 | 0.705 7 | 0.899 37 | 0.860 20 | 0.734 10
Xin Lai*, Jianhui Liu*, Li Jiang, Liwei Wang, Hengshuang Zhao, Shu Liu, Xiaojuan Qi, Jiaya Jia: Stratified Transformer for 3D Point Cloud Segmentation. CVPR 2022
VMNet | permissive | 0.746 15 | 0.870 14 | 0.838 2 | 0.858 5 | 0.729 20 | 0.850 15 | 0.501 27 | 0.874 12 | 0.587 45 | 0.658 13 | 0.956 17 | 0.564 16 | 0.299 24 | 0.765 10 | 0.900 7 | 0.716 29 | 0.812 9 | 0.631 32 | 0.939 9 | 0.858 21 | 0.709 23
Zeyu Hu, Xuyang Bai, Jiaxiang Shang, Runze Zhang, Jiayu Dong, Xin Wang, Guangyuan Sun, Hongbo Fu, Chiew-Lan Tai: VMNet: Voxel-Mesh Network for Geodesic-Aware 3D Semantic Segmentation. ICCV 2021 (Oral)
Virtual MVFusion | | 0.746 15 | 0.771 44 | 0.819 9 | 0.848 9 | 0.702 28 | 0.865 7 | 0.397 74 | 0.899 4 | 0.699 4 | 0.664 12 | 0.948 45 | 0.588 7 | 0.330 12 | 0.746 20 | 0.851 26 | 0.764 12 | 0.796 17 | 0.704 8 | 0.935 12 | 0.866 16 | 0.728 13
Abhijit Kundu, Xiaoqi Yin, Alireza Fathi, David Ross, Brian Brewington, Thomas Funkhouser, Caroline Pantofaru: Virtual Multi-view Fusion for 3D Semantic Segmentation. ECCV 2020
EQ-Net | | 0.743 17 | 0.620 84 | 0.799 22 | 0.849 7 | 0.730 19 | 0.822 39 | 0.493 34 | 0.897 5 | 0.664 13 | 0.681 5 | 0.955 21 | 0.562 17 | 0.378 1 | 0.760 12 | 0.903 6 | 0.738 19 | 0.801 15 | 0.673 18 | 0.907 29 | 0.877 8 | 0.745 7
Zetong Yang*, Li Jiang*, Yanan Sun, Bernt Schiele, Jiaya Jia: A Unified Query-based Paradigm for Point Cloud Understanding. CVPR 2022
LRPNet | | 0.742 18 | 0.816 30 | 0.806 19 | 0.807 29 | 0.752 11 | 0.828 33 | 0.575 4 | 0.839 23 | 0.699 4 | 0.637 22 | 0.954 26 | 0.520 30 | 0.320 16 | 0.755 15 | 0.834 30 | 0.760 13 | 0.772 31 | 0.676 16 | 0.915 25 | 0.862 18 | 0.717 21
SAT | | 0.742 18 | 0.860 17 | 0.765 39 | 0.819 21 | 0.769 6 | 0.848 16 | 0.533 15 | 0.829 25 | 0.663 14 | 0.631 23 | 0.955 21 | 0.586 9 | 0.274 38 | 0.753 16 | 0.896 9 | 0.729 21 | 0.760 41 | 0.666 20 | 0.921 20 | 0.855 24 | 0.733 11
TXC | | 0.740 20 | 0.842 22 | 0.832 4 | 0.805 33 | 0.715 24 | 0.846 17 | 0.473 38 | 0.885 9 | 0.615 29 | 0.671 11 | 0.971 6 | 0.547 21 | 0.320 16 | 0.697 25 | 0.799 40 | 0.777 7 | 0.819 3 | 0.682 14 | 0.946 5 | 0.871 12 | 0.696 27
LargeKernel3D | | 0.739 21 | 0.909 6 | 0.820 8 | 0.806 31 | 0.740 17 | 0.852 13 | 0.545 12 | 0.826 26 | 0.594 43 | 0.643 17 | 0.955 21 | 0.541 23 | 0.263 46 | 0.723 23 | 0.858 21 | 0.775 9 | 0.767 35 | 0.678 15 | 0.933 14 | 0.848 29 | 0.694 28
Yukang Chen*, Jianhui Liu*, Xiangyu Zhang, Xiaojuan Qi, Jiaya Jia: LargeKernel3D: Scaling up Kernels in 3D Sparse CNNs. CVPR 2023
MinkowskiNet | permissive | 0.736 22 | 0.859 18 | 0.818 11 | 0.832 17 | 0.709 26 | 0.840 22 | 0.521 21 | 0.853 16 | 0.660 16 | 0.643 17 | 0.951 35 | 0.544 22 | 0.286 32 | 0.731 22 | 0.893 10 | 0.675 43 | 0.772 31 | 0.683 13 | 0.874 55 | 0.852 27 | 0.727 15
C. Choy, J. Gwak, S. Savarese: 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks. CVPR 2019
IPCA | | 0.731 23 | 0.890 10 | 0.837 3 | 0.864 2 | 0.726 21 | 0.873 3 | 0.530 18 | 0.824 28 | 0.489 77 | 0.647 14 | 0.978 2 | 0.609 3 | 0.336 9 | 0.624 40 | 0.733 49 | 0.758 14 | 0.776 29 | 0.570 56 | 0.949 3 | 0.877 8 | 0.728 13
Retro-FPN | | 0.725 24 | 0.827 27 | 0.809 16 | 0.862 3 | 0.711 25 | 0.836 26 | 0.443 60 | 0.893 7 | 0.579 50 | 0.675 6 | 0.956 17 | 0.519 31 | 0.298 25 | 0.606 44 | 0.766 43 | 0.782 5 | 0.792 23 | 0.644 25 | 0.914 26 | 0.867 14 | 0.722 19
PointTransformer++ | | 0.725 24 | 0.727 65 | 0.811 15 | 0.819 21 | 0.765 7 | 0.841 21 | 0.502 26 | 0.814 33 | 0.621 28 | 0.623 25 | 0.955 21 | 0.556 19 | 0.284 33 | 0.620 41 | 0.866 16 | 0.781 6 | 0.757 44 | 0.648 23 | 0.932 16 | 0.862 18 | 0.709 23
SparseConvNet | | 0.725 24 | 0.647 80 | 0.821 7 | 0.846 10 | 0.721 22 | 0.869 4 | 0.533 15 | 0.754 47 | 0.603 39 | 0.614 27 | 0.955 21 | 0.572 13 | 0.325 14 | 0.710 24 | 0.870 15 | 0.724 24 | 0.823 2 | 0.628 33 | 0.934 13 | 0.865 17 | 0.683 31
MatchingNet | | 0.724 27 | 0.812 32 | 0.812 14 | 0.810 27 | 0.735 18 | 0.834 28 | 0.495 33 | 0.860 15 | 0.572 52 | 0.602 34 | 0.954 26 | 0.512 33 | 0.280 35 | 0.757 13 | 0.845 28 | 0.725 23 | 0.780 27 | 0.606 42 | 0.937 10 | 0.851 28 | 0.700 26
INS-Conv-semantic | | 0.717 28 | 0.751 53 | 0.759 42 | 0.812 25 | 0.704 27 | 0.868 5 | 0.537 14 | 0.842 21 | 0.609 35 | 0.608 30 | 0.953 29 | 0.534 25 | 0.293 28 | 0.616 42 | 0.864 17 | 0.719 28 | 0.793 21 | 0.640 28 | 0.933 14 | 0.845 33 | 0.663 36
PointMetaBase | | 0.714 29 | 0.835 23 | 0.785 29 | 0.821 19 | 0.684 32 | 0.846 17 | 0.531 17 | 0.865 14 | 0.614 30 | 0.596 38 | 0.953 29 | 0.500 36 | 0.246 52 | 0.674 26 | 0.888 12 | 0.692 36 | 0.764 37 | 0.624 34 | 0.849 70 | 0.844 34 | 0.675 33
contrastBoundary | permissive | 0.705 30 | 0.769 47 | 0.775 34 | 0.809 28 | 0.687 31 | 0.820 42 | 0.439 62 | 0.812 34 | 0.661 15 | 0.591 40 | 0.945 53 | 0.515 32 | 0.171 80 | 0.633 37 | 0.856 22 | 0.720 26 | 0.796 17 | 0.668 19 | 0.889 44 | 0.847 30 | 0.689 29
Liyao Tang, Yibing Zhan, Zhe Chen, Baosheng Yu, Dacheng Tao: Contrastive Boundary Learning for Point Cloud Segmentation. CVPR 2022
RFCR0.702 310.889 110.745 510.813 240.672 350.818 460.493 340.815 320.623 260.610 280.947 470.470 460.249 510.594 460.848 270.705 330.779 280.646 240.892 420.823 400.611 50
Jingyu Gong, Jiachen Xu, Xin Tan, Haichuan Song, Yanyun Qu, Yuan Xie, Lizhuang Ma: Omni-Supervised Point Cloud Segmentation via Gradual Receptive Field Component Reasoning. CVPR 2021
One Thing One Click0.701 320.825 280.796 230.723 520.716 230.832 290.433 640.816 300.634 230.609 290.969 70.418 710.344 70.559 580.833 310.715 300.808 110.560 610.902 340.847 300.680 32
JSENetpermissive0.699 330.881 130.762 400.821 190.667 360.800 590.522 200.792 390.613 310.607 310.935 730.492 380.205 670.576 510.853 240.691 370.758 430.652 220.872 580.828 370.649 40
Zeyu HU, Mingmin Zhen, Xuyang BAI, Hongbo Fu, Chiew-lan Tai: JSENet: Joint Semantic Segmentation and Edge Detection Network for 3D Point Clouds. ECCV 2020
One-Thing-One-Click0.693 340.743 560.794 250.655 760.684 320.822 390.497 320.719 570.622 270.617 260.977 40.447 580.339 80.750 190.664 650.703 340.790 240.596 460.946 50.855 240.647 41
Zhengzhe Liu, Xiaojuan Qi, Chi-Wing Fu: One Thing One Click: A Self-Training Approach for Weakly Supervised 3D Semantic Segmentation. CVPR 2021
PicassoNet-IIpermissive0.692 350.732 610.772 350.786 380.677 340.866 60.517 220.848 180.509 690.626 240.952 330.536 240.225 580.545 640.704 560.689 400.810 100.564 600.903 330.854 260.729 12
Huan Lei, Naveed Akhtar, Mubarak Shah, and Ajmal Mian: Geometric feature learning for 3D meshes.
Feature_GeometricNetpermissive0.690 360.884 120.754 460.795 370.647 420.818 460.422 660.802 370.612 320.604 320.945 530.462 490.189 750.563 570.853 240.726 220.765 360.632 310.904 310.821 430.606 54
Kangcheng Liu, Ben M. Chen: https://arxiv.org/abs/2012.09439. arXiv Preprint
FusionNet0.688 370.704 700.741 550.754 490.656 380.829 310.501 270.741 520.609 350.548 470.950 390.522 290.371 20.633 370.756 440.715 300.771 330.623 350.861 660.814 450.658 37
Feihu Zhang, Jin Fang, Benjamin Wah, Philip Torr: Deep FusionNet for Point Cloud Semantic Segmentation. ECCV 2020
Feature-Geometry Netpermissive0.685 380.866 150.748 480.819 210.645 440.794 620.450 500.802 370.587 450.604 320.945 530.464 480.201 700.554 600.840 290.723 250.732 530.602 440.907 290.822 420.603 57
KP-FCNN0.684 390.847 210.758 440.784 400.647 420.814 490.473 380.772 420.605 370.594 390.935 730.450 560.181 780.587 470.805 370.690 380.785 260.614 380.882 480.819 440.632 46
H. Thomas, C. Qi, J. Deschaud, B. Marcotegui, F. Goulette, L. Guibas.: KPConv: Flexible and Deformable Convolution for Point Clouds. ICCV 2019
DGNet0.684 390.712 690.784 300.782 420.658 370.835 270.499 310.823 290.641 200.597 370.950 390.487 390.281 340.575 520.619 680.647 560.764 370.620 370.871 610.846 320.688 30
VACNN++0.684 390.728 640.757 450.776 430.690 290.804 560.464 440.816 300.577 510.587 410.945 530.508 350.276 370.671 270.710 540.663 480.750 470.589 510.881 490.832 360.653 39
Superpoint Network0.683 420.851 200.728 590.800 360.653 400.806 540.468 410.804 350.572 520.602 340.946 500.453 550.239 550.519 690.822 320.689 400.762 400.595 480.895 400.827 380.630 47
PointContrast_LA_SEM0.683 420.757 510.784 300.786 380.639 460.824 370.408 690.775 410.604 380.541 490.934 770.532 260.269 420.552 610.777 410.645 590.793 210.640 280.913 270.824 390.671 34
VI-PointConv0.676 440.770 460.754 460.783 410.621 500.814 490.552 90.758 450.571 540.557 450.954 260.529 270.268 440.530 670.682 600.675 430.719 560.603 430.888 450.833 350.665 35
Xingyi Li, Wenxuan Wu, Xiaoli Z. Fern, Li Fuxin: The Devils in the Point Clouds: Studying the Robustness of Point Cloud Convolutions.
ROSMRF3D0.673 450.789 360.748 480.763 470.635 480.814 490.407 710.747 490.581 490.573 420.950 390.484 400.271 410.607 430.754 450.649 530.774 300.596 460.883 470.823 400.606 54
SALANet0.670 460.816 300.770 370.768 450.652 410.807 530.451 470.747 490.659 170.545 480.924 830.473 450.149 900.571 540.811 360.635 620.746 480.623 350.892 420.794 570.570 67
PointConvpermissive0.666 470.781 390.759 420.699 610.644 450.822 390.475 370.779 400.564 570.504 650.953 290.428 650.203 690.586 490.754 450.661 490.753 450.588 520.902 340.813 470.642 42
Wenxuan Wu, Zhongang Qi, Li Fuxin: PointConv: Deep Convolutional Networks on 3D Point Clouds. CVPR 2019
PointASNLpermissive0.666 470.703 710.781 320.751 510.655 390.830 300.471 400.769 430.474 800.537 510.951 350.475 440.279 360.635 350.698 590.675 430.751 460.553 660.816 770.806 490.703 25
Xu Yan, Chaoda Zheng, Zhen Li, Sheng Wang, Shuguang Cui: PointASNL: Robust Point Clouds Processing using Nonlocal Neural Networks with Adaptive Sampling. CVPR 2020
PPCNN++permissive0.663 490.746 540.708 630.722 530.638 470.820 420.451 470.566 840.599 410.541 490.950 390.510 340.313 190.648 320.819 340.616 670.682 720.590 500.869 620.810 480.656 38
Pyunghwan Ahn, Juyoung Yang, Eojindl Yi, Chanho Lee, Junmo Kim: Projection-based Point Convolution for Efficient Point Cloud Segmentation. IEEE Access
DCM-Net0.658 500.778 400.702 660.806 310.619 510.813 520.468 410.693 650.494 720.524 570.941 640.449 570.298 250.510 710.821 330.675 430.727 550.568 580.826 750.803 510.637 44
Jonas Schult*, Francis Engelmann*, Theodora Kontogianni, Bastian Leibe: DualConvMesh-Net: Joint Geodesic and Euclidean Convolutions on 3D Meshes. CVPR 2020 [Oral]
HPGCNN0.656 510.698 720.743 530.650 770.564 680.820 420.505 250.758 450.631 240.479 700.945 530.480 420.226 560.572 530.774 420.690 380.735 510.614 380.853 690.776 720.597 60
Jisheng Dang, Qingyong Hu, Yulan Guo, Jun Yang: HPGCNN.
SAFNet-segpermissive0.654 520.752 520.734 570.664 740.583 630.815 480.399 730.754 470.639 210.535 530.942 620.470 460.309 210.665 280.539 740.650 520.708 610.635 300.857 680.793 590.642 42
Linqing Zhao, Jiwen Lu, Jie Zhou: Similarity-Aware Fusion Network for 3D Semantic Segmentation. IROS 2021
RandLA-Netpermissive0.645 530.778 400.731 580.699 610.577 640.829 310.446 520.736 530.477 790.523 590.945 530.454 530.269 420.484 780.749 480.618 650.738 490.599 450.827 740.792 620.621 49
MVPNetpermissive0.641 540.831 240.715 610.671 710.590 590.781 680.394 750.679 670.642 190.553 460.937 700.462 490.256 480.649 310.406 870.626 630.691 690.666 200.877 510.792 620.608 53
Maximilian Jaritz, Jiayuan Gu, Hao Su: Multi-view PointNet for 3D Scene Understanding. GMDL Workshop, ICCV 2019
PointConv-SFPN0.641 540.776 420.703 650.721 540.557 710.826 340.451 470.672 690.563 580.483 690.943 610.425 680.162 850.644 330.726 500.659 500.709 600.572 550.875 530.786 670.559 72
PointMRNet0.640 560.717 680.701 670.692 640.576 650.801 580.467 430.716 580.563 580.459 750.953 290.429 640.169 820.581 500.854 230.605 680.710 580.550 670.894 410.793 590.575 65
FPConvpermissive0.639 570.785 370.760 410.713 590.603 540.798 600.392 760.534 890.603 390.524 570.948 450.457 510.250 500.538 650.723 520.598 720.696 670.614 380.872 580.799 520.567 69
Yiqun Lin, Zizheng Yan, Haibin Huang, Dong Du, Ligang Liu, Shuguang Cui, Xiaoguang Han: FPConv: Learning Local Flattening for Point Convolution. CVPR 2020
PD-Net0.638 580.797 340.769 380.641 820.590 590.820 420.461 450.537 880.637 220.536 520.947 470.388 790.206 660.656 290.668 630.647 560.732 530.585 530.868 630.793 590.473 91
PointSPNet0.637 590.734 600.692 740.714 580.576 650.797 610.446 520.743 510.598 420.437 800.942 620.403 750.150 890.626 390.800 390.649 530.697 660.557 640.846 710.777 710.563 70
SConv0.636 600.830 250.697 700.752 500.572 670.780 700.445 540.716 580.529 630.530 540.951 350.446 590.170 810.507 730.666 640.636 610.682 720.541 720.886 460.799 520.594 61
Supervoxel-CNN0.635 610.656 780.711 620.719 550.613 520.757 790.444 570.765 440.534 620.566 430.928 810.478 430.272 400.636 340.531 760.664 470.645 820.508 800.864 650.792 620.611 50
joint point-basedpermissive0.634 620.614 850.778 330.667 730.633 490.825 350.420 670.804 350.467 820.561 440.951 350.494 370.291 290.566 550.458 820.579 790.764 370.559 630.838 720.814 450.598 59
Hung-Yueh Chiang, Yen-Liang Lin, Yueh-Cheng Liu, Winston H. Hsu: A Unified Point-Based Framework for 3D Segmentation. 3DV 2019
PointMTL0.632 630.731 620.688 770.675 680.591 580.784 670.444 570.565 850.610 330.492 670.949 430.456 520.254 490.587 470.706 550.599 710.665 780.612 410.868 630.791 660.579 64
APCF-Net0.631 640.742 570.687 790.672 690.557 710.792 650.408 690.665 700.545 600.508 620.952 330.428 650.186 760.634 360.702 570.620 640.706 620.555 650.873 560.798 540.581 63
Haojia Lin: Adaptive Pyramid Context Fusion for Point Cloud Perception. GRSL
3DSM_DMMF0.631 640.626 820.745 510.801 350.607 530.751 800.506 240.729 560.565 560.491 680.866 970.434 600.197 730.595 450.630 670.709 320.705 630.560 610.875 530.740 820.491 86
PointNet2-SFPN0.631 640.771 440.692 740.672 690.524 760.837 240.440 610.706 630.538 610.446 770.944 590.421 700.219 610.552 610.751 470.591 750.737 500.543 710.901 360.768 740.557 73
FusionAwareConv0.630 670.604 870.741 550.766 460.590 590.747 810.501 270.734 540.503 710.527 550.919 870.454 530.323 150.550 630.420 860.678 420.688 700.544 690.896 390.795 560.627 48
Jiazhao Zhang, Chenyang Zhu, Lintao Zheng, Kai Xu: Fusion-Aware Point Convolution for Online Semantic 3D Scene Segmentation. CVPR 2020
DenSeR0.628 680.800 330.625 890.719 550.545 740.806 540.445 540.597 790.448 860.519 600.938 690.481 410.328 130.489 770.499 810.657 510.759 420.592 490.881 490.797 550.634 45
SegGroup_sempermissive0.627 690.818 290.747 500.701 600.602 550.764 760.385 800.629 760.490 750.508 620.931 800.409 730.201 700.564 560.725 510.618 650.692 680.539 730.873 560.794 570.548 76
An Tao, Yueqi Duan, Yi Wei, Jiwen Lu, Jie Zhou: SegGroup: Seg-Level Supervision for 3D Instance and Semantic Segmentation. TIP 2022
SIConv0.625 700.830 250.694 720.757 480.563 690.772 740.448 510.647 730.520 650.509 610.949 430.431 630.191 740.496 750.614 690.647 560.672 760.535 750.876 520.783 680.571 66
HPEIN0.618 710.729 630.668 800.647 790.597 570.766 750.414 680.680 660.520 650.525 560.946 500.432 610.215 630.493 760.599 700.638 600.617 870.570 560.897 380.806 490.605 56
Li Jiang, Hengshuang Zhao, Shu Liu, Xiaoyong Shen, Chi-Wing Fu, Jiaya Jia: Hierarchical Point-Edge Interaction Network for Point Cloud Semantic Segmentation. ICCV 2019
SPH3D-GCNpermissive0.610 720.858 190.772 350.489 940.532 750.792 650.404 720.643 750.570 550.507 640.935 730.414 720.046 990.510 710.702 570.602 700.705 630.549 680.859 670.773 730.534 79
Huan Lei, Naveed Akhtar, and Ajmal Mian: Spherical Kernel for Efficient Graph Convolution on 3D Point Clouds. TPAMI 2020
AttAN0.609 730.760 490.667 810.649 780.521 770.793 630.457 460.648 720.528 640.434 820.947 470.401 760.153 880.454 800.721 530.648 550.717 570.536 740.904 310.765 750.485 87
Gege Zhang, Qinghua Ma, Licheng Jiao, Fang Liu and Qigong Sun: AttAN: Attention Adversarial Networks for 3D Point Cloud Semantic Segmentation. IJCAI 2020
wsss-transformer0.600 740.634 810.743 530.697 630.601 560.781 680.437 630.585 820.493 730.446 770.933 780.394 770.011 1010.654 300.661 660.603 690.733 520.526 760.832 730.761 770.480 88
dtc_net0.596 750.683 730.725 600.715 570.549 730.803 570.444 570.647 730.493 730.495 660.941 640.409 730.000 1030.424 850.544 730.598 720.703 650.522 770.912 280.792 620.520 82
LAP-D0.594 760.720 660.692 740.637 830.456 860.773 730.391 780.730 550.587 450.445 790.940 670.381 800.288 300.434 830.453 840.591 750.649 800.581 540.777 810.749 810.610 52
DPC0.592 770.720 660.700 680.602 870.480 820.762 780.380 810.713 610.585 480.437 800.940 670.369 820.288 300.434 830.509 800.590 770.639 850.567 590.772 820.755 790.592 62
Francis Engelmann, Theodora Kontogianni, Bastian Leibe: Dilated Point Convolutions: On the Receptive Field Size of Point Convolutions on 3D Point Clouds. ICRA 2020
CCRFNet0.589 780.766 480.659 840.683 660.470 850.740 830.387 790.620 780.490 750.476 710.922 850.355 850.245 530.511 700.511 790.571 800.643 830.493 840.872 580.762 760.600 58
ROSMRF0.580 790.772 430.707 640.681 670.563 690.764 760.362 830.515 900.465 830.465 740.936 720.427 670.207 650.438 810.577 710.536 830.675 750.486 850.723 880.779 690.524 81
SD-DETR0.576 800.746 540.609 930.445 980.517 780.643 940.366 820.714 600.456 840.468 730.870 960.432 610.264 450.558 590.674 610.586 780.688 700.482 860.739 860.733 840.537 78
SQN_0.1%0.569 810.676 750.696 710.657 750.497 790.779 710.424 650.548 860.515 670.376 870.902 940.422 690.357 40.379 880.456 830.596 740.659 790.544 690.685 910.665 950.556 74
TextureNetpermissive0.566 820.672 770.664 820.671 710.494 800.719 840.445 540.678 680.411 920.396 850.935 730.356 840.225 580.412 860.535 750.565 810.636 860.464 880.794 800.680 920.568 68
Jingwei Huang, Haotian Zhang, Li Yi, Thomas Funkhouser, Matthias Niessner, Leonidas Guibas: TextureNet: Consistent Local Parametrizations for Learning from High-Resolution Signals on Meshes. CVPR
DVVNet0.562 830.648 790.700 680.770 440.586 620.687 880.333 870.650 710.514 680.475 720.906 910.359 830.223 600.340 900.442 850.422 940.668 770.501 810.708 890.779 690.534 79
Pointnet++ & Featurepermissive0.557 840.735 590.661 830.686 650.491 810.744 820.392 760.539 870.451 850.375 880.946 500.376 810.205 670.403 870.356 900.553 820.643 830.497 820.824 760.756 780.515 83
GMLPs0.538 850.495 950.693 730.647 790.471 840.793 630.300 900.477 910.505 700.358 890.903 930.327 880.081 960.472 790.529 770.448 920.710 580.509 780.746 840.737 830.554 75
PanopticFusion-label0.529 860.491 960.688 770.604 860.386 910.632 950.225 1000.705 640.434 890.293 950.815 980.348 860.241 540.499 740.669 620.507 850.649 800.442 940.796 790.602 980.561 71
Gaku Narita, Takashi Seno, Tomoya Ishikawa, Yohsuke Kaji: PanopticFusion: Online Volumetric Semantic Mapping at the Level of Stuff and Things. IROS 2019 (to appear)
subcloud_weak0.516 870.676 750.591 960.609 840.442 870.774 720.335 860.597 790.422 910.357 900.932 790.341 870.094 950.298 920.528 780.473 900.676 740.495 830.602 970.721 870.349 98
Online SegFusion0.515 880.607 860.644 870.579 890.434 880.630 960.353 840.628 770.440 870.410 830.762 1010.307 900.167 830.520 680.403 880.516 840.565 900.447 920.678 920.701 890.514 84
Davide Menini, Suryansh Kumar, Martin R. Oswald, Erik Sandstroem, Cristian Sminchisescu, Luc van Gool: A Real-Time Learning Framework for Joint 3D Reconstruction and Semantic Segmentation. Robotics and Automation Letters Submission
3DMV, FTSDF0.501 890.558 910.608 940.424 1000.478 830.690 870.246 960.586 810.468 810.450 760.911 890.394 770.160 860.438 810.212 970.432 930.541 950.475 870.742 850.727 850.477 89
PCNN0.498 900.559 900.644 870.560 910.420 900.711 860.229 980.414 920.436 880.352 910.941 640.324 890.155 870.238 970.387 890.493 860.529 960.509 780.813 780.751 800.504 85
3DMV0.484 910.484 970.538 980.643 810.424 890.606 990.310 880.574 830.433 900.378 860.796 990.301 910.214 640.537 660.208 980.472 910.507 990.413 970.693 900.602 980.539 77
Angela Dai, Matthias Niessner: 3DMV: Joint 3D-Multi-View Prediction for 3D Semantic Scene Segmentation. ECCV'18
PointCNN with RGBpermissive0.458 920.577 890.611 920.356 1020.321 990.715 850.299 920.376 960.328 990.319 930.944 590.285 930.164 840.216 1000.229 950.484 880.545 940.456 900.755 830.709 880.475 90
Yangyan Li, Rui Bu, Mingchao Sun, Baoquan Chen: PointCNN. NeurIPS 2018
FCPNpermissive0.447 930.679 740.604 950.578 900.380 920.682 890.291 930.106 1020.483 780.258 1000.920 860.258 970.025 1000.231 990.325 910.480 890.560 920.463 890.725 870.666 940.231 102
Dario Rethage, Johanna Wald, Jürgen Sturm, Nassir Navab, Federico Tombari: Fully-Convolutional Point Networks for Large-Scale Point Clouds. ECCV 2018
DGCNN_reproducecopyleft0.446 940.474 980.623 900.463 960.366 940.651 920.310 880.389 950.349 970.330 920.937 700.271 950.126 920.285 930.224 960.350 990.577 890.445 930.625 950.723 860.394 94
Yue Wang, Yongbin Sun, Ziwei Liu, Sanjay E. Sarma, Michael M. Bronstein, Justin M. Solomon: Dynamic Graph CNN for Learning on Point Clouds. TOG 2019
PNET20.442 950.548 920.548 970.597 880.363 950.628 970.300 900.292 970.374 940.307 940.881 950.268 960.186 760.238 970.204 990.407 950.506 1000.449 910.667 930.620 970.462 92
SurfaceConvPF0.442 950.505 940.622 910.380 1010.342 970.654 910.227 990.397 940.367 950.276 970.924 830.240 980.198 720.359 890.262 930.366 960.581 880.435 950.640 940.668 930.398 93
Hao Pan, Shilin Liu, Yang Liu, Xin Tong: Convolutional Neural Networks on 3D Surfaces Using Parallel Frames.
Tangent Convolutionspermissive0.438 970.437 1000.646 860.474 950.369 930.645 930.353 840.258 990.282 1010.279 960.918 880.298 920.147 910.283 940.294 920.487 870.562 910.427 960.619 960.633 960.352 97
Maxim Tatarchenko, Jaesik Park, Vladlen Koltun, Qian-Yi Zhou: Tangent Convolutions for Dense Prediction in 3D. CVPR 2018
3DWSSS0.425 980.525 930.647 850.522 920.324 980.488 1020.077 1030.712 620.353 960.401 840.636 1030.281 940.176 790.340 900.565 720.175 1030.551 930.398 980.370 1030.602 980.361 96
SPLAT Netcopyleft0.393 990.472 990.511 990.606 850.311 1000.656 900.245 970.405 930.328 990.197 1010.927 820.227 1000.000 1030.001 1040.249 940.271 1020.510 970.383 1000.593 980.699 900.267 100
Hang Su, Varun Jampani, Deqing Sun, Subhransu Maji, Evangelos Kalogerakis, Ming-Hsuan Yang, Jan Kautz: SPLATNet: Sparse Lattice Networks for Point Cloud Processing. CVPR 2018
ScanNet+FTSDF0.383 1000.297 1020.491 1000.432 990.358 960.612 980.274 940.116 1010.411 920.265 980.904 920.229 990.079 970.250 950.185 1000.320 1000.510 970.385 990.548 990.597 1010.394 94
PointNet++permissive0.339 1010.584 880.478 1010.458 970.256 1020.360 1030.250 950.247 1000.278 1020.261 990.677 1020.183 1010.117 930.212 1010.145 1020.364 970.346 1030.232 1030.548 990.523 1020.252 101
Charles R. Qi, Li Yi, Hao Su, Leonidas J. Guibas: PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space.
SSC-UNetpermissive0.308 1020.353 1010.290 1030.278 1030.166 1030.553 1000.169 1020.286 980.147 1030.148 1030.908 900.182 1020.064 980.023 1030.018 1040.354 980.363 1010.345 1010.546 1010.685 910.278 99
ScanNetpermissive0.306 1030.203 1030.366 1020.501 930.311 1000.524 1010.211 1010.002 1040.342 980.189 1020.786 1000.145 1030.102 940.245 960.152 1010.318 1010.348 1020.300 1020.460 1020.437 1030.182 103
Angela Dai, Angel X. Chang, Manolis Savva, Maciej Halber, Thomas Funkhouser, Matthias Nießner: ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes. CVPR'17
ERROR0.054 1040.000 1040.041 1040.172 1040.030 1040.062 1040.001 1040.035 1030.004 1040.051 1040.143 1040.019 1040.003 1020.041 1020.050 1030.003 1040.054 1040.018 1040.005 1040.264 1040.082 104
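
The avg iou value in each row appears to be the unweighted mean of the per-class scores that follow it. The following is a minimal Python sketch of that check, using the 20 per-class values transcribed from the LargeKernel3D and MinkowskiNet rows above. The names `rows` and `mean_gap` are illustrative only, and the benchmark's own evaluation script may compute the average differently (for example, from unrounded per-class values).

```python
from statistics import mean

# Reported average and the 20 per-class scores, transcribed from the table rows.
# Ranks (the integer after each score) are omitted here.
rows = {
    "LargeKernel3D": (0.739, [
        0.909, 0.820, 0.806, 0.740, 0.852, 0.545, 0.826, 0.594, 0.643, 0.955,
        0.541, 0.263, 0.723, 0.858, 0.775, 0.767, 0.678, 0.933, 0.848, 0.694,
    ]),
    "MinkowskiNet": (0.736, [
        0.859, 0.818, 0.832, 0.709, 0.840, 0.521, 0.853, 0.660, 0.643, 0.951,
        0.544, 0.286, 0.731, 0.893, 0.675, 0.772, 0.683, 0.874, 0.852, 0.727,
    ]),
}


def mean_gap(reported, per_class):
    """Absolute difference between the reported average and the plain mean."""
    return abs(mean(per_class) - reported)


for name, (reported, per_class) in rows.items():
    # Every value is rounded to three decimals, so allow that much slack.
    consistent = mean_gap(reported, per_class) < 1e-3
    print(name, round(mean(per_class), 4), reported, consistent)
```

Because all published values are rounded to three decimals, the comparison uses a tolerance of 0.001 rather than exact equality.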


This table lists the benchmark results for the 3D semantic instance scenario.




Method | Info | avg ap 50% | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | window
Queryformer0.787 11.000 10.933 10.601 340.754 10.886 40.558 20.661 250.767 30.665 40.716 30.639 110.808 31.000 10.844 10.897 20.804 21.000 10.624 2
Mask3D0.780 21.000 10.786 270.716 250.696 50.885 50.500 40.714 180.810 20.672 30.715 40.679 70.809 21.000 10.831 20.833 80.787 41.000 10.602 6
Jonas Schult, Francis Engelmann, Alexander Hermans, Or Litany, Siyu Tang, Bastian Leibe: Mask3D for 3D Semantic Instance Segmentation. ICRA 2023
SPFormerpermissive0.770 30.903 390.903 20.806 130.609 170.886 30.568 10.815 60.705 70.711 10.655 60.652 100.685 111.000 10.789 40.809 140.776 71.000 10.583 11
Sun Jiahao, Qing Chunmei, Tan Junpeng, Xu Xiangmin: Superpoint Transformer for 3D Scene Instance Segmentation. AAAI 2023 [Oral]
SoftGroup++0.769 41.000 10.803 200.937 10.684 60.865 70.213 200.870 20.664 90.571 100.758 10.702 40.807 41.000 10.653 160.902 10.792 31.000 10.626 1
ISBNetpermissive0.763 51.000 10.873 50.717 240.666 90.858 110.508 30.667 230.764 40.643 50.676 50.688 60.825 11.000 10.773 50.741 270.777 61.000 10.556 17
Tuan Duc Ngo, Binh-Son Hua, Khoi Nguyen: ISBNet: a 3D Point Cloud Instance Segmentation Network with Instance-aware Sampling and Box-aware Dynamic Convolution. CVPR 2023
SoftGrouppermissive0.761 61.000 10.808 170.845 80.716 20.862 90.243 170.824 40.655 110.620 60.734 20.699 50.791 60.981 250.716 80.844 50.769 81.000 10.594 9
Thang Vu, Kookhoi Kim, Tung M. Luu, Xuan Thanh Nguyen, Chang D. Yoo: SoftGroup for 3D Instance Segmentation on Point Clouds. CVPR 2022 [Oral]
TD3D0.751 71.000 10.774 280.867 70.621 130.934 10.404 70.706 190.812 10.605 80.633 110.626 120.690 101.000 10.640 180.820 110.777 51.000 10.612 4
PBNetpermissive0.747 81.000 10.818 130.837 100.713 30.844 120.457 60.647 280.711 60.614 70.617 130.657 90.650 131.000 10.692 100.822 100.765 101.000 10.595 8
GraphCut0.732 91.000 10.788 250.724 230.642 110.859 100.248 160.787 110.618 140.596 90.653 80.722 20.583 311.000 10.766 60.861 30.825 11.000 10.504 23
IPCA-Inst0.731 101.000 10.788 260.884 60.698 40.788 270.252 150.760 130.646 120.511 180.637 100.665 80.804 51.000 10.644 170.778 170.747 121.000 10.561 15
TopoSeg0.725 111.000 10.806 190.933 20.668 80.758 300.272 140.734 170.630 130.549 140.654 70.606 130.697 90.966 270.612 220.839 60.754 111.000 10.573 12
DKNet0.718 121.000 10.814 140.782 160.619 140.872 60.224 180.751 150.569 180.677 20.585 160.724 10.633 230.981 250.515 320.819 120.736 131.000 10.617 3
Yizheng Wu, Min Shi, Shuaiyuan Du, Hao Lu, Zhiguo Cao, Weicai Zhong: 3D Instances as 1D Kernels. ECCV 2022
SSEC0.707 131.000 10.850 70.924 30.648 100.747 330.162 220.862 30.572 170.520 160.624 120.549 160.649 211.000 10.560 270.706 330.768 91.000 10.591 10
HAISpermissive0.699 141.000 10.849 80.820 110.675 70.808 210.279 120.757 140.465 230.517 170.596 140.559 150.600 251.000 10.654 150.767 190.676 170.994 350.560 16
Shaoyu Chen, Jiemin Fang, Qian Zhang, Wenyu Liu, Xinggang Wang: Hierarchical Aggregation for 3D Instance Segmentation. ICCV 2021
SSTNetpermissive0.698 151.000 10.697 440.888 50.556 230.803 220.387 80.626 300.417 270.556 130.585 170.702 30.600 251.000 10.824 30.720 320.692 151.000 10.509 22
Zhihao Liang, Zhihao Li, Songcen Xu, Mingkui Tan, Kui Jia: Instance Segmentation in 3D Scenes using Semantic Superpoint Tree Networks. ICCV 2021
DualGroup0.694 161.000 10.799 220.811 120.622 120.817 160.376 90.805 90.590 160.487 210.568 200.525 200.650 130.835 380.600 230.829 90.655 191.000 10.526 19
SphereSeg0.680 171.000 10.856 60.744 220.618 150.893 20.151 230.651 270.713 50.537 150.579 190.430 290.651 121.000 10.389 410.744 260.697 140.991 370.601 7
Box2Mask0.677 181.000 10.847 90.771 180.509 310.816 170.277 130.558 370.482 200.562 120.640 90.448 250.700 71.000 10.666 110.852 40.578 310.997 300.488 27
Julian Chibane, Francis Engelmann, Tuan Anh Tran, Gerard Pons-Moll: Box2Mask: Weakly Supervised 3D Semantic Instance Segmentation Using Bounding Boxes. ECCV 2022
OccuSeg+instance0.672 191.000 10.758 360.682 280.576 210.842 130.477 50.504 410.524 190.567 110.585 180.451 240.557 321.000 10.751 70.797 150.563 341.000 10.467 31
Lei Han, Tian Zheng, Lan Xu, Lu Fang: OccuSeg: Occupancy-aware 3D Instance Segmentation. CVPR 2020
Mask-Group0.664 201.000 10.822 120.764 210.616 160.815 180.139 270.694 210.597 150.459 250.566 210.599 140.600 250.516 480.715 90.819 130.635 231.000 10.603 5
Min Zhong, Xinghao Chen, Xiaokang Chen, Gang Zeng, Yunhe Wang: MaskGroup: Hierarchical Point Grouping and Masking for 3D Instance Segmentation. ICME 2022
INS-Conv-instance0.657 211.000 10.760 340.667 300.581 190.863 80.323 100.655 260.477 210.473 230.549 230.432 280.650 131.000 10.655 140.738 280.585 300.944 410.472 30
CSC-Pretrained0.648 221.000 10.810 150.768 190.523 290.813 190.143 260.819 50.389 300.422 330.511 270.443 260.650 131.000 10.624 200.732 290.634 241.000 10.375 38
PE0.645 231.000 10.773 300.798 150.538 250.786 280.088 340.799 100.350 340.435 320.547 240.545 170.646 220.933 280.562 260.761 220.556 390.997 300.501 25
Biao Zhang, Peter Wonka: Point Cloud Instance Segmentation using Probabilistic Embeddings. CVPR 2021
RPGN0.643 241.000 10.758 350.582 400.539 240.826 150.046 380.765 120.372 320.436 310.588 150.539 190.650 131.000 10.577 240.750 240.653 210.997 300.495 26
Shichao Dong, Guosheng Lin, Tzu-Yi Hung: Learning Regional Purity for Instance Segmentation on 3D Point Clouds. ECCV 2022
Dyco3Dcopyleft0.641 251.000 10.841 100.893 40.531 270.802 230.115 310.588 350.448 240.438 290.537 260.430 300.550 330.857 300.534 300.764 210.657 180.987 380.568 13
Tong He, Chunhua Shen, Anton van den Hengel: DyCo3D: Robust Instance Segmentation of 3D Point Clouds through Dynamic Convolution. CVPR 2021
GICN0.638 261.000 10.895 40.800 140.480 350.676 370.144 250.737 160.354 330.447 260.400 390.365 350.700 71.000 10.569 250.836 70.599 261.000 10.473 29
PointGroup0.636 271.000 10.765 310.624 320.505 330.797 240.116 300.696 200.384 310.441 270.559 220.476 220.596 281.000 10.666 110.756 230.556 380.997 300.513 21
Li Jiang, Hengshuang Zhao, Shaoshuai Shi, Shu Liu, Chi-Wing Fu, Jiaya Jia: PointGroup: Dual-Set Point Grouping for 3D Instance Segmentation. CVPR 2020 [oral]
DD-UNet+Group0.635 280.667 400.797 240.714 260.562 220.774 290.146 240.810 80.429 260.476 220.546 250.399 320.633 231.000 10.632 190.722 310.609 251.000 10.514 20
H. Liu, R. Liu, K. Yang, J. Zhang, K. Peng, R. Stiefelhagen: HIDA: Towards Holistic Indoor Understanding for the Visually Impaired via Semantic Instance Segmentation with a Wearable Solid-State LiDAR Sensor. ICCVW 2021
DENet0.629 291.000 10.797 230.608 330.589 180.627 410.219 190.882 10.310 360.402 380.383 410.396 330.650 131.000 10.663 130.543 490.691 161.000 10.568 14
3D-MPA0.611 301.000 10.833 110.765 200.526 280.756 310.136 290.588 350.470 220.438 300.432 360.358 360.650 130.857 300.429 370.765 200.557 371.000 10.430 33
Francis Engelmann, Martin Bokeloh, Alireza Fathi, Bastian Leibe, Matthias Nießner: 3D-MPA: Multi Proposal Aggregation for 3D Semantic Instance Segmentation. CVPR 2020
OSIS0.605 311.000 10.801 210.599 350.535 260.728 350.286 110.436 450.679 80.491 190.433 340.256 380.404 450.857 300.620 210.724 300.510 431.000 10.539 18
AOIA0.601 321.000 10.761 330.687 270.485 340.828 140.008 440.663 240.405 290.405 370.425 370.490 210.596 280.714 410.553 290.779 160.597 270.992 360.424 35
PCJC0.578 331.000 10.810 160.583 390.449 380.813 200.042 390.603 330.341 350.490 200.465 310.410 310.650 130.835 380.264 470.694 370.561 350.889 450.504 24
SSEN0.575 341.000 10.761 320.473 420.477 360.795 250.066 350.529 380.658 100.460 240.461 320.380 340.331 470.859 290.401 400.692 390.653 201.000 10.348 40
Dongsu Zhang, Junha Chun, Sang Kyun Cha, Young Min Kim: Spatial Semantic Embedding Network: Fast 3D Instance Segmentation with Deep Metric Learning. arXiv
RWSeg0.567 350.528 500.708 430.626 310.580 200.745 340.063 360.627 290.240 400.400 390.497 280.464 230.515 341.000 10.475 340.745 250.571 321.000 10.429 34
NeuralBF0.555 360.667 400.896 30.843 90.517 300.751 320.029 400.519 390.414 280.439 280.465 300.000 560.484 360.857 300.287 450.693 380.651 221.000 10.485 28
Weiwei Sun, Daniel Rebain, Renjie Liao, Vladimir Tankovich, Soroosh Yazdani, Kwang Moo Yi, Andrea Tagliasacchi: NeuralBF: Neural Bilateral Filtering for Top-down Instance Segmentation on Point Clouds. WACV 2023
MTML0.549 371.000 10.807 180.588 380.327 430.647 390.004 460.815 70.180 420.418 340.364 430.182 410.445 391.000 10.442 360.688 400.571 331.000 10.396 36
Jean Lahoud, Bernard Ghanem, Marc Pollefeys, Martin R. Oswald: 3D Instance Segmentation via Multi-task Metric Learning. ICCV 2019 [oral]
One_Thing_One_Clickpermissive0.529 380.667 400.718 390.777 170.399 390.683 360.000 490.669 220.138 450.391 400.374 420.539 180.360 460.641 450.556 280.774 180.593 280.997 300.251 45
Zhengzhe Liu, Xiaojuan Qi, Chi-Wing Fu: One Thing One Click: A Self-Training Approach for Weakly Supervised 3D Semantic Segmentation. CVPR 2021
Sparse R-CNN0.515 391.000 10.538 510.282 450.468 370.790 260.173 210.345 470.429 250.413 360.484 290.176 420.595 300.591 460.522 310.668 410.476 440.986 390.327 41
Occipital-SCS0.512 401.000 10.716 400.509 410.506 320.611 420.092 330.602 340.177 430.346 430.383 400.165 430.442 400.850 370.386 420.618 450.543 400.889 450.389 37
3D-BoNet0.488 411.000 10.672 460.590 370.301 450.484 520.098 320.620 310.306 370.341 440.259 470.125 450.434 420.796 400.402 390.499 510.513 420.909 440.439 32
Bo Yang, Jianan Wang, Ronald Clark, Qingyong Hu, Sen Wang, Andrew Markham, Niki Trigoni: Learning Object Bounding Boxes for 3D Instance Segmentation on Point Clouds. NeurIPS 2019 Spotlight
PanopticFusion-inst0.478 420.667 400.712 420.595 360.259 480.550 480.000 490.613 320.175 440.250 490.434 330.437 270.411 440.857 300.485 330.591 480.267 540.944 410.359 39
Gaku Narita, Takashi Seno, Tomoya Ishikawa, Yohsuke Kaji: PanopticFusion: Online Volumetric Semantic Mapping at the Level of Stuff and Things. IROS 2019 (to appear)
SPG_WSIS0.470 430.667 400.685 450.677 290.372 410.562 460.000 490.482 420.244 390.316 460.298 440.052 510.442 410.857 300.267 460.702 340.559 361.000 10.287 43
SALoss-ResNet0.459 441.000 10.737 380.159 550.259 470.587 440.138 280.475 430.217 410.416 350.408 380.128 440.315 480.714 410.411 380.536 500.590 290.873 480.304 42
Zhidong Liang, Ming Yang, Hao Li, Chunxiang Wang: 3D Instance Embedding Learning With a Structure-Aware Loss Function for Point Cloud Segmentation. IEEE Robotics and Automation Letters (IROS 2020)
MASCpermissive0.447 450.528 500.555 490.381 430.382 400.633 400.002 470.509 400.260 380.361 420.432 350.327 370.451 380.571 470.367 430.639 430.386 450.980 400.276 44
Chen Liu, Yasutaka Furukawa: MASC: Multi-scale Affinity with Sparse Convolution for 3D Instance Segmentation.
SegGroup_inspermissive0.445 460.667 400.773 290.185 520.317 440.656 380.000 490.407 460.134 460.381 410.267 460.217 400.476 370.714 410.452 350.629 440.514 411.000 10.222 48
An Tao, Yueqi Duan, Yi Wei, Jiwen Lu, Jie Zhou: SegGroup: Seg-Level Supervision for 3D Instance and Semantic Segmentation. TIP 2022
3D-SISpermissive0.382 471.000 10.432 530.245 470.190 490.577 450.013 430.263 490.033 520.320 450.240 480.075 470.422 430.857 300.117 510.699 350.271 530.883 470.235 47
Ji Hou, Angela Dai, Matthias Niessner: 3D-SIS: 3D Semantic Instance Segmentation of RGB-D Scans. CVPR 2019
Hier3Dcopyleft0.323 480.667 400.542 500.264 460.157 520.550 470.000 490.205 520.009 530.270 480.218 490.075 470.500 350.688 440.007 570.698 360.301 500.459 540.200 49
Tan: HCFS3D: Hierarchical Coupled Feature Selection Network for 3D Semantic and Instance Segmentation.
UNet-backbone0.319 490.667 400.715 410.233 480.189 500.479 530.008 440.218 500.067 510.201 510.173 500.107 460.123 530.438 490.150 490.615 460.355 460.916 430.093 56
R-PointNet0.306 500.500 520.405 540.311 440.348 420.589 430.054 370.068 550.126 470.283 470.290 450.028 520.219 510.214 520.331 440.396 550.275 510.821 500.245 46
Region-18class0.284 510.250 560.751 370.228 500.270 460.521 490.000 490.468 440.008 550.205 500.127 510.000 560.068 550.070 550.262 480.652 420.323 480.740 510.173 50
SemRegionNet-20cls0.250 520.333 530.613 470.229 490.163 510.493 500.000 490.304 480.107 480.147 530.100 520.052 500.231 490.119 530.039 530.445 530.325 470.654 520.141 52
tmp0.248 530.667 400.437 520.188 510.153 530.491 510.000 490.208 510.094 500.153 520.099 530.057 490.217 520.119 530.039 530.466 520.302 490.640 530.140 53
3D-BEVIS0.248 530.667 400.566 480.076 560.035 570.394 550.027 420.035 560.098 490.099 550.030 560.025 530.098 540.375 510.126 500.604 470.181 550.854 490.171 51
Cathrin Elich, Francis Engelmann, Jonas Schult, Theodora Kontogianni, Bastian Leibe: 3D-BEVIS: Birds-Eye-View Instance Segmentation.
ASIS0.199 550.333 530.253 560.167 540.140 540.438 540.000 490.177 530.008 540.121 540.069 540.004 550.231 500.429 500.036 550.445 540.273 520.333 560.119 55
Sgpn_scannet0.143 560.208 570.390 550.169 530.065 550.275 560.029 410.069 540.000 560.087 560.043 550.014 540.027 570.000 560.112 520.351 560.168 560.438 550.138 54
MaskRCNN 2d->3d Proj0.058 570.333 530.002 570.000 570.053 560.002 570.002 480.021 570.000 560.045 570.024 570.238 390.065 560.000 560.014 560.107 570.020 570.110 570.006 57
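
The integer after each score is the method's rank in that column. Tied scores appear to share a rank, with the following rank skipped: tmp and 3D-BEVIS both show 0.248 at rank 53, and ASIS follows at rank 55. The following is a minimal Python sketch of that tie-handling scheme (standard competition ranking), applied to the last eight avg ap 50% values above. The function name `competition_ranks` and the `first_rank` parameter are hypothetical, and the benchmark site may break ties on unrounded scores that are not shown here.

```python
def competition_ranks(scores, first_rank=1):
    """Rank scores in descending order ("1, 2, 2, 4" style).

    Equal scores share a rank; the rank after a tie is skipped.
    """
    ordered = sorted(scores, reverse=True)
    # Position of the first occurrence of each score in the sorted order,
    # i.e. the number of strictly larger scores.
    first_position = {}
    for position, score in enumerate(ordered):
        first_position.setdefault(score, position)
    return [first_rank + first_position[score] for score in scores]


# avg ap 50% of the last eight rows (R-PointNet through MaskRCNN 2d->3d Proj).
tail_avg_ap50 = [0.306, 0.284, 0.250, 0.248, 0.248, 0.199, 0.143, 0.058]

# These rows start at rank 50 in the full table.
tail_ranks = competition_ranks(tail_avg_ap50, first_rank=50)
print(tail_ranks)
```

With `first_rank=50`, the computed ranks match the ones listed for these eight rows, including the shared rank 53 and the skipped rank 54.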


This table lists the benchmark results for the 2D semantic label scenario.


Method | Info | avg iou | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | floor | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | wall | window
Virtual MVFusion (R) | | 0.745 (1) | 0.861 (1) | 0.839 (1) | 0.881 (1) | 0.672 (1) | 0.512 (1) | 0.422 (15) | 0.898 (1) | 0.723 (1) | 0.714 (1) | 0.954 (2) | 0.454 (1) | 0.509 (1) | 0.773 (1) | 0.895 (1) | 0.756 (1) | 0.820 (1) | 0.653 (1) | 0.935 (1) | 0.891 (1) | 0.728 (1)
Abhijit Kundu, Xiaoqi Yin, Alireza Fathi, David Ross, Brian Brewington, Thomas Funkhouser, Caroline Pantofaru: Virtual Multi-view Fusion for 3D Semantic Segmentation. ECCV 2020
BPNet_2D | copyleft | 0.670 (2) | 0.822 (3) | 0.795 (3) | 0.836 (2) | 0.659 (2) | 0.481 (2) | 0.451 (11) | 0.769 (3) | 0.656 (3) | 0.567 (3) | 0.931 (3) | 0.395 (4) | 0.390 (4) | 0.700 (3) | 0.534 (3) | 0.689 (9) | 0.770 (2) | 0.574 (3) | 0.865 (6) | 0.831 (3) | 0.675 (4)
Wenbo Hu, Hengshuang Zhao, Li Jiang, Jiaya Jia and Tien-Tsin Wong: Bidirectional Projection Network for Cross Dimension Scene Understanding. CVPR 2021 (Oral)
CU-Hybrid-2D Net | | 0.636 (3) | 0.825 (2) | 0.820 (2) | 0.179 (20) | 0.648 (3) | 0.463 (3) | 0.549 (2) | 0.742 (6) | 0.676 (2) | 0.628 (2) | 0.961 (1) | 0.420 (2) | 0.379 (5) | 0.684 (6) | 0.381 (15) | 0.732 (2) | 0.723 (3) | 0.599 (2) | 0.827 (13) | 0.851 (2) | 0.634 (6)
CMX | | 0.613 (4) | 0.681 (7) | 0.725 (9) | 0.502 (12) | 0.634 (5) | 0.297 (15) | 0.478 (9) | 0.830 (2) | 0.651 (4) | 0.537 (6) | 0.924 (4) | 0.375 (5) | 0.315 (12) | 0.686 (5) | 0.451 (12) | 0.714 (4) | 0.543 (18) | 0.504 (5) | 0.894 (4) | 0.823 (4) | 0.688 (3)
DMMF_3d | | 0.605 (5) | 0.651 (8) | 0.744 (7) | 0.782 (3) | 0.637 (4) | 0.387 (4) | 0.536 (3) | 0.732 (7) | 0.590 (6) | 0.540 (5) | 0.856 (18) | 0.359 (9) | 0.306 (13) | 0.596 (11) | 0.539 (2) | 0.627 (18) | 0.706 (4) | 0.497 (7) | 0.785 (18) | 0.757 (16) | 0.476 (19)
MCA-Net | | 0.595 (6) | 0.533 (17) | 0.756 (6) | 0.746 (4) | 0.590 (8) | 0.334 (7) | 0.506 (6) | 0.670 (12) | 0.587 (7) | 0.500 (10) | 0.905 (8) | 0.366 (8) | 0.352 (8) | 0.601 (10) | 0.506 (6) | 0.669 (15) | 0.648 (7) | 0.501 (6) | 0.839 (12) | 0.769 (12) | 0.516 (18)
RFBNet | | 0.592 (7) | 0.616 (9) | 0.758 (5) | 0.659 (5) | 0.581 (9) | 0.330 (8) | 0.469 (10) | 0.655 (15) | 0.543 (12) | 0.524 (7) | 0.924 (4) | 0.355 (10) | 0.336 (10) | 0.572 (14) | 0.479 (8) | 0.671 (13) | 0.648 (7) | 0.480 (9) | 0.814 (16) | 0.814 (5) | 0.614 (9)
FAN_NV_RVC | | 0.586 (8) | 0.510 (18) | 0.764 (4) | 0.079 (23) | 0.620 (7) | 0.330 (8) | 0.494 (7) | 0.753 (4) | 0.573 (8) | 0.556 (4) | 0.884 (13) | 0.405 (3) | 0.303 (14) | 0.718 (2) | 0.452 (11) | 0.672 (12) | 0.658 (5) | 0.509 (4) | 0.898 (3) | 0.813 (6) | 0.727 (2)
DCRedNet | | 0.583 (9) | 0.682 (6) | 0.723 (10) | 0.542 (11) | 0.510 (17) | 0.310 (12) | 0.451 (11) | 0.668 (13) | 0.549 (11) | 0.520 (8) | 0.920 (6) | 0.375 (5) | 0.446 (2) | 0.528 (17) | 0.417 (13) | 0.670 (14) | 0.577 (15) | 0.478 (10) | 0.862 (7) | 0.806 (7) | 0.628 (8)
MIX6D_RVC | | 0.582 (10) | 0.695 (4) | 0.687 (14) | 0.225 (18) | 0.632 (6) | 0.328 (10) | 0.550 (1) | 0.748 (5) | 0.623 (5) | 0.494 (13) | 0.890 (11) | 0.350 (12) | 0.254 (20) | 0.688 (4) | 0.454 (10) | 0.716 (3) | 0.597 (14) | 0.489 (8) | 0.881 (5) | 0.768 (13) | 0.575 (12)
SSMA | copyleft | 0.577 (11) | 0.695 (4) | 0.716 (12) | 0.439 (14) | 0.563 (11) | 0.314 (11) | 0.444 (13) | 0.719 (8) | 0.551 (10) | 0.503 (9) | 0.887 (12) | 0.346 (13) | 0.348 (9) | 0.603 (9) | 0.353 (17) | 0.709 (5) | 0.600 (12) | 0.457 (12) | 0.901 (2) | 0.786 (8) | 0.599 (11)
Abhinav Valada, Rohit Mohan, Wolfram Burgard: Self-Supervised Model Adaptation for Multimodal Semantic Segmentation. International Journal of Computer Vision, 2019
UNIV_CNP_RVC_UE | | 0.566 (12) | 0.569 (16) | 0.686 (16) | 0.435 (15) | 0.524 (14) | 0.294 (16) | 0.421 (16) | 0.712 (9) | 0.543 (12) | 0.463 (15) | 0.872 (14) | 0.320 (14) | 0.363 (7) | 0.611 (8) | 0.477 (9) | 0.686 (10) | 0.627 (9) | 0.443 (15) | 0.862 (7) | 0.775 (11) | 0.639 (5)
EMSAFormer | | 0.564 (13) | 0.581 (13) | 0.736 (8) | 0.564 (10) | 0.546 (13) | 0.219 (20) | 0.517 (4) | 0.675 (11) | 0.486 (17) | 0.427 (19) | 0.904 (9) | 0.352 (11) | 0.320 (11) | 0.589 (12) | 0.528 (4) | 0.708 (6) | 0.464 (21) | 0.413 (19) | 0.847 (11) | 0.786 (8) | 0.611 (10)
SN_RN152pyrx8_RVC | copyleft | 0.546 (14) | 0.572 (14) | 0.663 (18) | 0.638 (7) | 0.518 (15) | 0.298 (14) | 0.366 (21) | 0.633 (18) | 0.510 (15) | 0.446 (17) | 0.864 (16) | 0.296 (17) | 0.267 (17) | 0.542 (16) | 0.346 (18) | 0.704 (7) | 0.575 (16) | 0.431 (16) | 0.853 (10) | 0.766 (14) | 0.630 (7)
UDSSEG_RVC | | 0.545 (15) | 0.610 (11) | 0.661 (19) | 0.588 (8) | 0.556 (12) | 0.268 (18) | 0.482 (8) | 0.642 (17) | 0.572 (9) | 0.475 (14) | 0.836 (20) | 0.312 (15) | 0.367 (6) | 0.630 (7) | 0.189 (20) | 0.639 (17) | 0.495 (20) | 0.452 (13) | 0.826 (14) | 0.756 (17) | 0.541 (14)
segfomer with 6d | | 0.542 (16) | 0.594 (12) | 0.687 (14) | 0.146 (21) | 0.579 (10) | 0.308 (13) | 0.515 (5) | 0.703 (10) | 0.472 (18) | 0.498 (11) | 0.868 (15) | 0.369 (7) | 0.282 (15) | 0.589 (12) | 0.390 (14) | 0.701 (8) | 0.556 (17) | 0.416 (18) | 0.860 (9) | 0.759 (15) | 0.539 (16)
FuseNet | permissive | 0.535 (17) | 0.570 (15) | 0.681 (17) | 0.182 (19) | 0.512 (16) | 0.290 (17) | 0.431 (14) | 0.659 (14) | 0.504 (16) | 0.495 (12) | 0.903 (10) | 0.308 (16) | 0.428 (3) | 0.523 (18) | 0.365 (16) | 0.676 (11) | 0.621 (11) | 0.470 (11) | 0.762 (19) | 0.779 (10) | 0.541 (14)
Caner Hazirbas, Lingni Ma, Csaba Domokos, Daniel Cremers: FuseNet: Incorporating Depth into Semantic Segmentation via Fusion-based CNN Architecture. ACCV 2016
AdapNet++ | copyleft | 0.503 (18) | 0.613 (10) | 0.722 (11) | 0.418 (16) | 0.358 (23) | 0.337 (6) | 0.370 (20) | 0.479 (21) | 0.443 (19) | 0.368 (21) | 0.907 (7) | 0.207 (20) | 0.213 (22) | 0.464 (21) | 0.525 (5) | 0.618 (19) | 0.657 (6) | 0.450 (14) | 0.788 (17) | 0.721 (20) | 0.408 (22)
Abhinav Valada, Rohit Mohan, Wolfram Burgard: Self-Supervised Model Adaptation for Multimodal Semantic Segmentation. International Journal of Computer Vision, 2019
3DMV (2d proj) | | 0.498 (19) | 0.481 (21) | 0.612 (20) | 0.579 (9) | 0.456 (19) | 0.343 (5) | 0.384 (18) | 0.623 (19) | 0.525 (14) | 0.381 (20) | 0.845 (19) | 0.254 (19) | 0.264 (19) | 0.557 (15) | 0.182 (21) | 0.581 (21) | 0.598 (13) | 0.429 (17) | 0.760 (20) | 0.661 (22) | 0.446 (21)
Angela Dai, Matthias Niessner: 3DMV: Joint 3D-Multi-View Prediction for 3D Semantic Scene Segmentation. ECCV'18
MSeg1080_RVC | permissive | 0.485 (20) | 0.505 (19) | 0.709 (13) | 0.092 (22) | 0.427 (20) | 0.241 (19) | 0.411 (17) | 0.654 (16) | 0.385 (23) | 0.457 (16) | 0.861 (17) | 0.053 (23) | 0.279 (16) | 0.503 (19) | 0.481 (7) | 0.645 (16) | 0.626 (10) | 0.365 (21) | 0.748 (21) | 0.725 (19) | 0.529 (17)
John Lambert*, Zhuang Liu*, Ozan Sener, James Hays, Vladlen Koltun: MSeg: A Composite Dataset for Multi-domain Semantic Segmentation. CVPR 2020
ILC-PSPNet | | 0.475 (21) | 0.490 (20) | 0.581 (21) | 0.289 (17) | 0.507 (18) | 0.067 (23) | 0.379 (19) | 0.610 (20) | 0.417 (21) | 0.435 (18) | 0.822 (22) | 0.278 (18) | 0.267 (17) | 0.503 (19) | 0.228 (19) | 0.616 (20) | 0.533 (19) | 0.375 (20) | 0.820 (15) | 0.729 (18) | 0.560 (13)
Enet (reimpl) | | 0.376 (22) | 0.264 (23) | 0.452 (23) | 0.452 (13) | 0.365 (21) | 0.181 (21) | 0.143 (23) | 0.456 (22) | 0.409 (22) | 0.346 (22) | 0.769 (23) | 0.164 (21) | 0.218 (21) | 0.359 (22) | 0.123 (23) | 0.403 (23) | 0.381 (23) | 0.313 (23) | 0.571 (22) | 0.685 (21) | 0.472 (20)
Re-implementation of Adam Paszke, Abhishek Chaurasia, Sangpil Kim, Eugenio Culurciello: ENet: A Deep Neural Network Architecture for Real-Time Semantic Segmentation.
ScanNet (2d proj) | permissive | 0.330 (23) | 0.293 (22) | 0.521 (22) | 0.657 (6) | 0.361 (22) | 0.161 (22) | 0.250 (22) | 0.004 (23) | 0.440 (20) | 0.183 (23) | 0.836 (20) | 0.125 (22) | 0.060 (23) | 0.319 (23) | 0.132 (22) | 0.417 (22) | 0.412 (22) | 0.344 (22) | 0.541 (23) | 0.427 (23) | 0.109 (23)
Angela Dai, Angel X. Chang, Manolis Savva, Maciej Halber, Thomas Funkhouser, Matthias Nießner: ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes. CVPR'17
DMMF | | 0.003 (24) | 0.000 (24) | 0.005 (24) | 0.000 (24) | 0.000 (24) | 0.037 (24) | 0.001 (24) | 0.000 (24) | 0.001 (24) | 0.005 (24) | 0.003 (24) | 0.000 (24) | 0.000 (24) | 0.000 (24) | 0.000 (24) | 0.000 (24) | 0.002 (24) | 0.001 (24) | 0.000 (24) | 0.006 (24) | 0.000 (24)
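
The per-class scores in the 2D semantic label table are intersection-over-union (IoU) values, and the avg iou column is consistent with an unweighted mean over the 20 classes. The sketch below is illustrative only and is not the benchmark's evaluation script: the function names, the flat integer label lists, and the `ignore_label` convention are assumptions made for the example. It computes per-class IoU as TP / (TP + FP + FN) and then checks the averaging against the Virtual MVFusion (R) row.

```python
def per_class_iou(pred, gt, num_classes, ignore_label=-1):
    """Per-class IoU from flat lists of integer labels (hypothetical helper).

    Returns one value per class, or None for a class that never appears
    in either the prediction or the ground truth.
    """
    tp = [0] * num_classes
    fp = [0] * num_classes
    fn = [0] * num_classes
    for p, g in zip(pred, gt):
        if g == ignore_label:  # assumed convention: unannotated pixels are skipped
            continue
        if p == g:
            tp[g] += 1
        else:
            fp[p] += 1
            fn[g] += 1
    ious = []
    for c in range(num_classes):
        denom = tp[c] + fp[c] + fn[c]
        ious.append(tp[c] / denom if denom else None)
    return ious


def mean_over_classes(values):
    """Unweighted mean over the classes that have a value."""
    present = [v for v in values if v is not None]
    return sum(present) / len(present) if present else None


# Toy example with two classes and four pixels.
toy_ious = per_class_iou(pred=[0, 1, 1, 1], gt=[0, 0, 1, 1], num_classes=2)
toy_mean = mean_over_classes(toy_ious)

# Per-class scores of the Virtual MVFusion (R) row, copied from the table.
mvfusion_row = [0.861, 0.839, 0.881, 0.672, 0.512, 0.422, 0.898, 0.723,
                0.714, 0.954, 0.454, 0.509, 0.773, 0.895, 0.756, 0.820,
                0.653, 0.935, 0.891, 0.728]
mvfusion_mean = mean_over_classes(mvfusion_row)
print(toy_ious, toy_mean, mvfusion_mean)
```

The mean of the copied row agrees with the reported avg iou of 0.745 to within the rounding of the three-decimal table entries.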


This table lists the benchmark results for the 2D semantic instance scenario.




Method | Info | avg ap 50% | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | window
UniDet_RVC | | 0.358 (1) | 0.554 (1) | 0.543 (1) | 0.128 (1) | 0.402 (1) | 0.381 (1) | 0.200 (1) | 0.461 (1) | 0.328 (1) | 0.138 (1) | 0.232 (1) | 0.148 (2) | 0.466 (1) | 0.109 (1) | 0.538 (1) | 0.506 (1) | 0.294 (1) | 0.862 (1) | 0.159 (1)
MaskRCNN_ScanNet | permissive | 0.227 (2) | 0.228 (2) | 0.381 (2) | 0.013 (2) | 0.237 (2) | 0.339 (2) | 0.089 (2) | 0.339 (2) | 0.150 (2) | 0.134 (2) | 0.143 (2) | 0.179 (1) | 0.255 (2) | 0.053 (2) | 0.331 (2) | 0.244 (2) | 0.154 (2) | 0.687 (2) | 0.127 (2)
Re-implementation of Kaiming He, Georgia Gkioxari, Piotr Dollár, Ross Girshick: Mask R-CNN. ICCV'17
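
The 2D semantic instance table reports average precision at an overlap threshold of 50% (ap 50%). As a rough illustration of what such a score measures, the sketch below matches predicted instance masks to ground-truth masks at IoU >= 0.5 and computes a non-interpolated AP for one class. It is a simplification under stated assumptions and not the benchmark's evaluation code: masks are represented as sets of pixel ids, matching is greedy in order of confidence, and the names `mask_iou`, `match_predictions` and `average_precision` are hypothetical. The last lines check that the avg ap 50% entry of the UniDet_RVC row is consistent with an unweighted mean of its 18 per-class values.

```python
def mask_iou(a, b):
    """IoU of two masks given as collections of pixel ids."""
    a, b = set(a), set(b)
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def match_predictions(preds, gts, threshold=0.5):
    """Greedy matching (assumed scheme): visit predictions by descending
    confidence and assign each to its best still-unmatched ground truth.

    preds: list of (confidence, pixel_ids); gts: list of pixel_ids.
    Returns one boolean per prediction, in descending-confidence order.
    """
    used = set()
    flags = []
    for _, mask in sorted(preds, key=lambda p: p[0], reverse=True):
        best, best_iou = None, threshold
        for i, gt in enumerate(gts):
            if i in used:
                continue
            iou = mask_iou(mask, gt)
            if iou >= best_iou:
                best, best_iou = i, iou
        if best is not None:
            used.add(best)
        flags.append(best is not None)
    return flags


def average_precision(flags, num_gt):
    """Non-interpolated AP: mean of precision at each true positive."""
    if num_gt == 0:
        return None
    true_positives = 0
    precision_sum = 0.0
    for rank, is_match in enumerate(flags, start=1):
        if is_match:
            true_positives += 1
            precision_sum += true_positives / rank
    return precision_sum / num_gt


# Toy example: two ground-truth instances, three predictions.
gts = [{1, 2, 3, 4}, {10, 11}]
preds = [(0.9, {1, 2, 3}), (0.8, {20, 21}), (0.7, {10, 11})]
flags = match_predictions(preds, gts)
ap = average_precision(flags, num_gt=len(gts))

# Per-class scores of the UniDet_RVC row, copied from the table.
unidet_row = [0.554, 0.543, 0.128, 0.402, 0.381, 0.200, 0.461, 0.328, 0.138,
              0.232, 0.148, 0.466, 0.109, 0.538, 0.506, 0.294, 0.862, 0.159]
unidet_mean = sum(unidet_row) / len(unidet_row)
print(flags, ap, unidet_mean)
```

In the toy example the second prediction overlaps no ground-truth instance, so it counts as a false positive and lowers the precision at the third rank.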


This table lists the benchmark results for the scene type classification scenario.




Method | Info | avg recall | apartment | bathroom | bedroom / hotel | bookstore / library | conference room | copy/mail room | hallway | kitchen | laundry room | living room / lounge | misc | office | storage / basement / garage
multi-task | permissive | 0.700 (1) | 0.500 (1) | 1.000 (1) | 0.882 (2) | 0.500 (2) | 1.000 (1) | 1.000 (1) | 0.500 (2) | 1.000 (1) | 1.000 (1) | 0.778 (1) | 0.000 (2) | 0.938 (1) | 0.000 (2)
Shengyu Huang, Mikhail Usvyatsov, Konrad Schindler: Indoor Scene Recognition in 3D. IROS 2020
3DASPP-SCE | | 0.691 (2) | 0.500 (1) | 0.938 (2) | 0.824 (3) | 1.000 (1) | 1.000 (1) | 0.500 (2) | 1.000 (1) | 0.857 (2) | 0.500 (2) | 0.556 (3) | 0.000 (2) | 0.812 (2) | 0.500 (1)
SE-ResNeXt-SSMA | | 0.498 (3) | 0.000 (4) | 0.812 (3) | 0.941 (1) | 0.500 (2) | 0.500 (3) | 0.500 (2) | 0.500 (2) | 0.429 (4) | 0.500 (2) | 0.667 (2) | 0.500 (1) | 0.625 (3) | 0.000 (2)
Abhinav Valada, Rohit Mohan, Wolfram Burgard: Self-Supervised Model Adaptation for Multimodal Semantic Segmentation. arXiv
resnet50_scannet | | 0.353 (4) | 0.250 (3) | 0.812 (3) | 0.529 (4) | 0.500 (2) | 0.500 (3) | 0.000 (4) | 0.500 (2) | 0.571 (3) | 0.000 (4) | 0.556 (3) | 0.000 (2) | 0.375 (4) | 0.000 (2)
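
The scene type table reports recall per scene category, and its avg recall column is consistent with an unweighted mean over the 13 categories. The following minimal sketch shows that computation on made-up labels; the helper names and the toy inputs are assumptions for illustration, not the benchmark's own tooling. It also checks the averaging against the multi-task row.

```python
from collections import Counter


def per_class_recall(true_labels, predicted_labels):
    """Recall per class: correctly classified scenes / scenes of that class."""
    totals = Counter(true_labels)
    correct = Counter(
        t for t, p in zip(true_labels, predicted_labels) if t == p
    )
    return {c: correct[c] / totals[c] for c in totals}


def average_recall(recalls):
    """Unweighted mean of the per-class recalls."""
    return sum(recalls.values()) / len(recalls)


# Toy example with four scenes (labels are made up for illustration).
true_labels = ["kitchen", "kitchen", "office", "bathroom"]
predicted_labels = ["kitchen", "office", "office", "bathroom"]
toy_recalls = per_class_recall(true_labels, predicted_labels)
toy_average = average_recall(toy_recalls)

# Per-class scores of the multi-task row, copied from the table.
multi_task_row = [0.500, 1.000, 0.882, 0.500, 1.000, 1.000, 0.500,
                  1.000, 1.000, 0.778, 0.000, 0.938, 0.000]
multi_task_mean = sum(multi_task_row) / len(multi_task_row)
print(toy_recalls, toy_average, multi_task_mean)
```

Because every category carries the same weight in this average, a category with few test scenes (for example one scored 0.000 or 0.500) moves the average as much as a frequent one.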