Presenting the ScanNet200 Benchmark

We present the ScanNet200 benchmark, which studies an order of magnitude more class categories than the previous version of ScanNet. The scene geometry is shared between the two, but the parsing of surface annotations allows for a larger vocabulary and a more realistic setting for in-the-wild 3D understanding methods.

The ScanNet200 benchmark includes both finer-grained categories and a large number of previously unaddressed classes. This yields a much more challenging setting that reflects the diversity of semantic classes naturally observed in the raw ScanNet RGB-D data, including their naturally occurring class imbalance. The difference in category frequencies between ScanNet and ScanNet200 can be seen in the figure above.
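
The semantic results tables report an average IoU together with head, common, and tail IoU. The following is a minimal sketch of how such grouped scores could be computed, assuming the three groups partition the classes by how frequently they occur. The function names, class ids, and group assignments are hypothetical, the sketch does not handle unlabeled points, and it is not the official evaluation script.

```python
def per_class_iou(pred, gt, num_classes):
    """Per-class IoU = intersection / union; None if the class never appears."""
    inter = [0] * num_classes
    union = [0] * num_classes
    for p, g in zip(pred, gt):
        if p == g:
            inter[g] += 1
            union[g] += 1
        else:
            # A mismatched point enlarges the union of both classes involved.
            union[p] += 1
            union[g] += 1
    return [inter[c] / union[c] if union[c] else None for c in range(num_classes)]


def mean_iou(ious, class_ids):
    """Mean IoU over the given class ids, skipping classes with no points."""
    vals = [ious[c] for c in class_ids if ious[c] is not None]
    return sum(vals) / len(vals) if vals else float("nan")


# Toy labels for 6 hypothetical classes; the grouping below is made up.
gt = [0, 0, 0, 0, 1, 1, 2, 3, 4, 5]
pred = [0, 0, 0, 1, 1, 1, 2, 3, 5, 5]
groups = {"head": [0, 1], "common": [2, 3], "tail": [4, 5]}

ious = per_class_iou(pred, gt, num_classes=6)
summary = {name: mean_iou(ious, ids) for name, ids in groups.items()}
summary["avg"] = mean_iou(ious, range(6))
print(summary)
```

In this toy example the rare classes drag the tail score well below the head score, which is the kind of imbalance effect the grouped columns are meant to expose.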

ScanNet200 Benchmark

This table lists the benchmark results for the ScanNet200 3D semantic label scenario. Each entry is a score followed by the method's rank in that column.




Method | Info | avg iou | head iou | common iou | tail iou | alarm clock | armchair | backpack | bag | ball | bar | basket | bathroom cabinet | bathroom counter | bathroom stall | bathroom stall door | bathroom vanity | bathtub | bed | bench | bicycle | bin | blackboard | blanket | blinds | board | book | bookshelf | bottle | bowl | box | broom | bucket | bulletin board | cabinet | calendar | candle | cart | case of water bottles | cd case | ceiling | ceiling light | chair | clock | closet | closet door | closet rod | closet wall | clothes | clothes dryer | coat rack | coffee kettle | coffee maker | coffee table | column | computer tower | container | copier | couch | counter | crate | cup | curtain | cushion | decoration | desk | dining table | dish rack | dishwasher | divider | door | doorframe | dresser | dumbbell | dustpan | end table | fan | file cabinet | fire alarm | fire extinguisher | fireplace | floor | folded chair | furniture | guitar | guitar case | hair dryer | handicap bar | hat | headphones | ironing board | jacket | keyboard | keyboard piano | kitchen cabinet | kitchen counter | ladder | lamp | laptop | laundry basket | laundry detergent | laundry hamper | ledge | light | light switch | luggage | machine | mailbox | mat | mattress | microwave | mini fridge | mirror | monitor | mouse | music stand | nightstand | object | office chair | ottoman | oven | paper | paper bag | paper cutter | paper towel dispenser | paper towel roll | person | piano | picture | pillar | pillow | pipe | plant | plate | plunger | poster | potted plant | power outlet | power strip | printer | projector | projector screen | purse | rack | radiator | rail | range hood | recycling bin | refrigerator | scale | seat | shelf | shoe | shower | shower curtain | shower curtain rod | shower door | shower floor | shower head | shower wall | sign | sink | soap dish | soap dispenser | sofa chair | speaker | stair rail | stairs | stand | stool | storage bin | storage container | storage organizer | stove | structure | stuffed animal | suitcase | table | telephone | tissue box | toaster | toaster oven | toilet | toilet paper | toilet paper dispenser | toilet paper holder | toilet seat cover dispenser | towel | trash bin | trash can | tray | tube | tv | tv stand | vacuum cleaner | vent | wall | wardrobe | washing machine | water bottle | water cooler | water pitcher | whiteboard | window | windowsill
OA-CNN-L_ScanNet2000.333 20.558 10.269 20.124 40.448 60.080 40.272 30.000 10.000 10.000 10.342 50.515 20.524 20.713 80.789 20.158 40.384 30.000 30.806 20.125 20.000 40.496 30.332 20.498 70.227 40.024 20.474 10.000 10.003 20.071 40.487 10.000 30.000 10.110 20.000 20.876 10.013 80.703 10.000 30.076 40.473 40.355 40.906 20.000 10.000 10.476 40.706 10.000 70.672 60.835 50.748 20.015 70.223 20.860 30.000 10.000 40.572 20.000 50.509 30.313 20.662 10.398 50.396 10.411 60.276 10.527 10.711 10.000 20.076 50.946 10.166 20.000 10.022 20.160 10.183 40.493 40.699 30.637 20.403 20.330 50.406 40.526 20.024 10.000 10.392 40.000 50.016 80.000 30.196 20.915 20.112 30.557 30.197 10.352 40.877 20.000 30.000 10.592 60.103 60.000 80.067 10.000 10.089 10.735 30.625 30.130 50.568 20.836 20.271 10.534 30.043 60.799 20.001 20.445 10.000 10.000 20.024 10.661 20.000 10.262 10.000 10.591 30.517 70.373 30.788 30.021 20.000 10.455 10.517 40.320 30.823 40.200 80.001 80.150 30.100 40.000 10.736 20.668 20.103 60.052 30.662 10.720 10.000 10.602 40.112 30.002 30.000 10.637 40.000 20.000 10.621 30.569 10.398 20.412 30.234 30.949 10.363 10.492 70.495 30.251 30.665 30.000 10.001 50.805 20.833 20.794 40.000 10.821 10.314 20.843 50.000 10.560 20.245 20.262 20.713 10.370 5
PPT-SpUNet-F.T.0.332 30.556 20.270 10.123 50.519 10.091 20.349 20.000 10.000 10.000 10.339 60.383 50.498 50.833 20.807 10.241 10.584 20.000 30.755 30.124 30.000 40.608 20.330 30.530 40.314 10.000 40.374 30.000 10.000 30.197 10.459 30.000 30.000 10.117 10.000 20.876 10.095 10.682 20.000 30.086 30.518 20.433 10.930 10.000 10.000 10.563 30.542 50.077 40.715 20.858 30.756 10.008 80.171 40.874 20.000 10.039 10.550 30.000 50.545 20.256 30.657 30.453 10.351 30.449 50.213 20.392 30.611 40.000 20.037 60.946 10.138 50.000 10.000 40.063 30.308 10.537 20.796 10.673 10.323 50.392 30.400 50.509 30.000 20.000 10.649 10.000 50.023 50.000 30.000 30.914 30.002 70.506 70.163 50.359 30.872 30.000 30.000 10.623 20.112 30.001 70.000 40.000 10.021 20.753 10.565 70.150 10.579 10.806 40.267 20.616 10.042 70.783 40.000 30.374 50.000 10.000 20.000 20.620 40.000 10.000 40.000 10.572 60.634 10.350 40.792 20.000 30.000 10.376 40.535 20.378 20.855 10.672 10.074 50.000 40.185 30.000 10.727 30.660 30.076 80.000 60.432 40.646 30.000 10.594 50.006 70.000 40.000 10.658 20.000 20.000 10.661 10.549 30.300 50.291 50.045 50.942 40.304 20.600 30.572 20.135 70.695 10.000 10.008 30.793 30.942 10.899 20.000 10.816 20.181 40.897 10.000 10.679 10.223 30.264 10.691 20.345 6
Xiaoyang Wu, Zhuotao Tian, Xin Wen, Bohao Peng, Xihui Liu, Kaicheng Yu, Hengshuang Zhao: Towards Large-scale 3D Representation Learning with Multi-dataset Point Prompt Training.
OctFormer ScanNet200permissive0.326 40.539 40.265 30.131 30.499 20.110 10.522 10.000 10.000 10.000 10.318 80.427 40.455 60.743 60.765 40.175 30.842 10.000 30.828 10.204 10.033 20.429 40.335 10.601 10.312 20.000 40.357 40.000 10.000 30.047 50.423 40.000 30.000 10.105 30.000 20.873 30.079 60.670 40.000 30.117 10.471 50.432 20.829 50.000 10.000 10.584 20.417 80.089 30.684 50.837 40.705 70.021 60.178 30.892 10.000 10.028 20.505 50.000 50.457 40.200 60.662 10.412 30.244 60.496 30.000 80.451 20.626 30.000 20.102 40.943 30.138 50.000 10.000 40.149 20.291 20.534 30.722 20.632 30.331 40.253 70.453 30.487 40.000 20.000 10.479 20.000 50.022 60.000 30.000 30.900 40.128 20.684 10.164 40.413 10.854 50.000 30.000 10.512 80.074 80.003 60.000 40.000 10.000 30.469 60.613 40.132 40.529 30.871 10.227 70.582 20.026 80.787 30.000 30.339 60.000 10.000 20.000 20.626 30.000 10.029 30.000 10.587 40.612 30.411 20.724 50.000 30.000 10.407 30.552 10.513 10.849 20.655 20.408 10.000 40.296 10.000 10.686 60.645 50.145 30.022 40.414 50.633 40.000 10.637 10.224 10.000 40.000 10.650 30.000 20.000 10.622 20.535 40.343 30.483 20.230 40.943 30.289 30.618 20.596 10.140 60.679 20.000 10.022 10.783 40.620 60.906 10.000 10.806 30.137 60.865 20.000 10.378 40.000 70.168 80.680 30.227 7
Peng-Shuai Wang: OctFormer: Octree-based Transformers for 3D Point Clouds. SIGGRAPH 2023
CeCo0.340 10.551 30.247 40.181 10.475 40.057 80.142 70.000 10.000 10.000 10.387 30.463 30.499 40.924 10.774 30.213 20.257 40.000 30.546 80.100 50.006 30.615 10.177 80.534 20.246 30.000 40.400 20.000 10.338 10.006 70.484 20.609 10.000 10.083 40.000 20.873 30.089 40.661 50.000 30.048 80.560 10.408 30.892 30.000 10.000 10.586 10.616 30.000 70.692 40.900 10.721 30.162 10.228 10.860 30.000 10.000 40.575 10.083 20.550 10.347 10.624 40.410 40.360 20.740 10.109 50.321 60.660 20.000 20.121 20.939 40.143 30.000 10.400 10.003 50.190 30.564 10.652 40.615 40.421 10.304 60.579 10.547 10.000 20.000 10.296 50.000 50.030 40.096 10.000 30.916 10.037 40.551 40.171 30.376 20.865 40.286 10.000 10.633 10.102 70.027 40.011 30.000 10.000 30.474 50.742 10.133 30.311 40.824 30.242 40.503 50.068 40.828 10.000 30.429 20.000 10.063 10.000 20.781 10.000 10.000 40.000 10.665 10.633 20.450 10.818 10.000 30.000 10.429 20.532 30.226 40.825 30.510 60.377 20.709 10.079 60.000 10.753 10.683 10.102 70.063 20.401 70.620 60.000 10.619 20.000 80.000 40.000 10.595 60.000 20.000 10.345 50.564 20.411 10.603 10.384 20.945 20.266 40.643 10.367 50.304 10.663 40.000 10.010 20.726 60.767 30.898 30.000 10.784 40.435 10.861 40.000 10.447 30.000 70.257 30.656 40.377 4
: Understanding Imbalanced Semantic Segmentation Through Neural Collapse.
AWCS0.305 50.508 50.225 50.142 20.463 50.063 60.195 50.000 10.000 10.000 10.467 20.551 10.504 30.773 30.764 50.142 50.029 80.000 30.626 60.100 50.000 40.360 50.179 60.507 60.137 60.006 30.300 50.000 10.000 30.172 30.364 60.512 20.000 10.056 50.000 20.865 50.093 30.634 80.000 30.071 60.396 60.296 70.876 40.000 10.000 10.373 50.436 70.063 60.749 10.877 20.721 30.131 20.124 50.804 60.000 10.000 40.515 40.010 40.452 50.252 40.578 50.417 20.179 80.484 40.171 30.337 50.606 50.000 20.115 30.937 50.142 40.000 10.008 30.000 70.157 70.484 50.402 80.501 60.339 30.553 10.529 20.478 50.000 20.000 10.404 30.001 40.022 60.077 20.000 30.894 60.219 10.628 20.093 60.305 50.886 10.233 20.000 10.603 30.112 30.023 50.000 40.000 10.000 30.741 20.664 20.097 60.253 50.782 50.264 30.523 40.154 10.707 70.000 30.411 30.000 10.000 20.000 20.332 70.000 10.000 40.000 10.602 20.595 40.185 70.656 70.159 10.000 10.355 50.424 60.154 60.729 60.516 50.220 40.620 20.084 50.000 10.707 50.651 40.173 10.014 50.381 80.582 70.000 10.619 20.049 60.000 40.000 10.702 10.000 20.000 10.302 70.489 60.317 40.334 40.392 10.922 50.254 50.533 60.394 40.129 80.613 60.000 10.000 60.820 10.649 50.749 50.000 10.782 50.282 30.863 30.000 10.288 70.006 50.220 50.633 50.542 1
LGroundpermissive0.272 60.485 60.184 60.106 60.476 30.077 50.218 40.000 10.000 10.000 10.547 10.295 60.540 10.746 50.745 60.058 70.112 70.005 10.658 50.077 80.000 40.322 60.178 70.512 50.190 50.199 10.277 60.000 10.000 30.173 20.399 50.000 30.000 10.039 70.000 20.858 60.085 50.676 30.002 10.103 20.498 30.323 50.703 60.000 10.000 10.296 60.549 40.216 10.702 30.768 60.718 50.028 40.092 70.786 70.000 10.000 40.453 70.022 30.251 80.252 40.572 60.348 60.321 40.514 20.063 60.279 70.552 60.000 20.019 70.932 60.132 70.000 10.000 40.000 70.156 80.457 60.623 50.518 50.265 70.358 40.381 60.395 60.000 20.000 10.127 80.012 30.051 10.000 30.000 30.886 70.014 50.437 80.179 20.244 60.826 60.000 30.000 10.599 40.136 10.085 20.000 40.000 10.000 30.565 40.612 50.143 20.207 60.566 60.232 60.446 60.127 20.708 60.000 30.384 40.000 10.000 20.000 20.402 50.000 10.059 20.000 10.525 80.566 50.229 60.659 60.000 30.000 10.265 60.446 50.147 70.720 80.597 40.066 60.000 40.187 20.000 10.726 40.467 80.134 50.000 60.413 60.629 50.000 10.363 70.055 50.022 20.000 10.626 50.000 20.000 10.323 60.479 80.154 70.117 60.028 70.901 60.243 60.415 80.295 80.143 50.610 70.000 10.000 60.777 50.397 80.324 70.000 10.778 60.179 50.702 70.000 10.274 80.404 10.233 40.622 60.398 3
David Rozenberszki, Or Litany, Angela Dai: Language-Grounded Indoor 3D Semantic Segmentation in the Wild. arXiv
CSC-Pretrainpermissive0.249 80.455 80.171 70.079 80.418 70.059 70.186 60.000 10.000 10.000 10.335 70.250 70.316 70.766 40.697 80.142 50.170 50.003 20.553 70.112 40.097 10.201 80.186 50.476 80.081 70.000 40.216 80.000 10.000 30.001 80.314 80.000 30.000 10.055 60.000 20.832 80.094 20.659 60.002 10.076 40.310 80.293 80.664 80.000 10.000 10.175 80.634 20.130 20.552 80.686 80.700 80.076 30.110 60.770 80.000 10.000 40.430 80.000 50.319 60.166 70.542 80.327 70.205 70.332 70.052 70.375 40.444 80.000 20.012 80.930 80.203 10.000 10.000 40.046 40.175 50.413 70.592 60.471 70.299 60.152 80.340 70.247 80.000 20.000 10.225 60.058 20.037 20.000 30.207 10.862 80.014 50.548 50.033 70.233 70.816 70.000 30.000 10.542 70.123 20.121 10.019 20.000 10.000 30.463 70.454 80.045 80.128 80.557 70.235 50.441 70.063 50.484 80.000 30.308 80.000 10.000 20.000 20.318 80.000 10.000 40.000 10.545 70.543 60.164 80.734 40.000 30.000 10.215 80.371 70.198 50.743 50.205 70.062 70.000 40.079 60.000 10.683 70.547 70.142 40.000 60.441 30.579 80.000 10.464 60.098 40.041 10.000 10.590 70.000 20.000 10.373 40.494 50.174 60.105 70.001 80.895 70.222 70.537 50.307 70.180 40.625 50.000 10.000 60.591 80.609 70.398 60.000 10.766 80.014 80.638 80.000 10.377 50.004 60.206 70.609 80.465 2
Ji Hou, Benjamin Graham, Matthias Nießner, Saining Xie: Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts. CVPR 2021
Minkowski 34Dpermissive0.253 70.463 70.154 80.102 70.381 80.084 30.134 80.000 10.000 10.000 10.386 40.141 80.279 80.737 70.703 70.014 80.164 60.000 30.663 40.092 70.000 40.224 70.291 40.531 30.056 80.000 40.242 70.000 10.000 30.013 60.331 70.000 30.000 10.035 80.001 10.858 60.059 70.650 70.000 30.056 70.353 70.299 60.670 70.000 10.000 10.284 70.484 60.071 50.594 70.720 70.710 60.027 50.068 80.813 50.000 10.005 30.492 60.164 10.274 70.111 80.571 70.307 80.293 50.307 80.150 40.163 80.531 70.002 10.545 10.932 60.093 80.000 10.000 40.002 60.159 60.368 80.581 70.440 80.228 80.406 20.282 80.294 70.000 20.000 10.189 70.060 10.036 30.000 30.000 30.897 50.000 80.525 60.025 80.205 80.771 80.000 30.000 10.593 50.108 50.044 30.000 40.000 10.000 30.282 80.589 60.094 70.169 70.466 80.227 70.419 80.125 30.757 50.002 10.334 70.000 10.000 20.000 20.357 60.000 10.000 40.000 10.582 50.513 80.337 50.612 80.000 30.000 10.250 70.352 80.136 80.724 70.655 20.280 30.000 40.046 80.000 10.606 80.559 60.159 20.102 10.445 20.655 20.000 10.310 80.117 20.000 40.000 10.581 80.026 10.000 10.265 80.483 70.084 80.097 80.044 60.865 80.142 80.588 40.351 60.272 20.596 80.000 10.003 40.622 70.720 40.096 80.000 10.771 70.016 70.772 60.000 10.302 60.194 40.214 60.621 70.197 8
C. Choy, J. Gwak, S. Savarese: 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks. CVPR 2019


This table lists the benchmark results for the ScanNet200 3D semantic instance scenario. Each entry is a score followed by the method's rank in that column.




Method | Info | avg ap 25% | head ap 25% | common ap 25% | tail ap 25% | alarm clock | armchair | backpack | bag | ball | bar | basket | bathroom cabinet | bathroom counter | bathroom stall | bathroom stall door | bathroom vanity | bathtub | bed | bench | bicycle | bin | blackboard | blanket | blinds | board | book | bookshelf | bottle | bowl | box | broom | bucket | bulletin board | cabinet | calendar | candle | cart | case of water bottles | cd case | ceiling | ceiling light | chair | clock | closet | closet door | closet rod | closet wall | clothes | clothes dryer | coat rack | coffee kettle | coffee maker | coffee table | column | computer tower | container | copier | couch | counter | crate | cup | curtain | cushion | decoration | desk | dining table | dish rack | dishwasher | divider | door | doorframe | dresser | dumbbell | dustpan | end table | fan | file cabinet | fire alarm | fire extinguisher | fireplace | folded chair | furniture | guitar | guitar case | hair dryer | handicap bar | hat | headphones | ironing board | jacket | keyboard | keyboard piano | kitchen cabinet | kitchen counter | ladder | lamp | laptop | laundry basket | laundry detergent | laundry hamper | ledge | light | light switch | luggage | machine | mailbox | mat | mattress | microwave | mini fridge | mirror | monitor | mouse | music stand | nightstand | object | office chair | ottoman | oven | paper | paper bag | paper cutter | paper towel dispenser | paper towel roll | person | piano | picture | pillar | pillow | pipe | plant | plate | plunger | poster | potted plant | power outlet | power strip | printer | projector | projector screen | purse | rack | radiator | rail | range hood | recycling bin | refrigerator | scale | seat | shelf | shoe | shower | shower curtain | shower curtain rod | shower door | shower floor | shower head | shower wall | sign | sink | soap dish | soap dispenser | sofa chair | speaker | stair rail | stairs | stand | stool | storage bin | storage container | storage organizer | stove | structure | stuffed animal | suitcase | table | telephone | tissue box | toaster | toaster oven | toilet | toilet paper | toilet paper dispenser | toilet paper holder | toilet seat cover dispenser | towel | trash bin | trash can | tray | tube | tv | tv stand | vacuum cleaner | vent | wardrobe | washing machine | water bottle | water cooler | water pitcher | whiteboard | window | windowsill
Mask3D Scannet2000.445 10.653 10.392 10.254 10.648 10.097 10.125 50.000 10.000 10.000 10.657 10.971 10.451 21.000 11.000 10.640 10.500 10.045 11.000 10.241 20.409 10.363 10.440 10.686 30.300 10.000 10.201 10.000 10.009 10.290 10.556 11.000 10.000 10.063 30.000 10.830 10.573 10.844 20.333 10.204 10.058 50.158 50.552 20.056 10.000 11.000 10.725 40.750 10.927 11.000 10.888 40.042 30.120 20.615 40.226 10.250 10.890 10.792 10.677 20.510 20.818 10.699 10.512 20.167 50.125 10.315 20.943 10.309 10.017 30.200 30.000 10.188 10.000 10.183 30.815 11.000 10.827 10.741 10.442 30.414 40.600 10.000 10.000 10.458 10.049 30.321 10.381 10.000 10.908 20.400 10.841 10.260 10.710 10.966 10.265 10.000 10.924 10.152 10.025 20.500 10.027 10.028 11.000 10.556 50.016 10.080 50.500 10.694 30.608 10.084 10.604 30.194 10.538 30.000 10.500 10.000 20.354 40.000 11.000 10.000 10.761 20.930 10.053 40.890 31.000 10.008 20.262 10.358 21.000 11.000 10.792 40.966 11.000 10.765 20.004 20.930 10.780 20.330 20.027 20.625 10.974 40.050 10.412 50.021 20.000 30.000 20.778 10.000 10.000 10.493 20.746 20.454 10.335 20.396 10.930 50.551 21.000 10.552 10.606 10.853 10.000 10.004 10.806 11.000 10.727 20.000 10.042 30.745 20.000 10.399 40.391 10.630 10.721 10.619 1
Jonas Schult, Francis Engelmann, Alexander Hermans, Or Litany, Siyu Tang, Bastian Leibe: Mask3D for 3D Semantic Instance Segmentation. ICRA 2023
TD3D Scannet2000.379 20.603 20.306 20.190 20.635 20.073 20.500 10.000 10.000 10.000 10.495 30.735 20.275 51.000 10.979 20.590 20.000 40.021 20.000 30.146 30.000 20.356 20.173 50.795 10.226 20.000 10.173 20.000 10.000 20.226 20.390 20.000 20.000 10.250 10.000 10.706 20.061 30.885 10.093 20.186 20.259 40.200 10.667 10.000 20.000 10.667 20.825 10.250 40.834 41.000 10.958 10.553 10.111 30.748 10.220 20.051 20.866 20.792 10.390 50.045 50.800 20.302 50.517 10.533 30.113 20.427 10.843 20.000 20.458 10.600 10.000 10.101 20.000 10.259 10.717 20.500 20.615 20.520 20.526 20.457 10.270 40.000 10.000 10.400 20.088 20.294 20.181 20.000 11.000 10.400 10.710 50.103 30.477 50.905 20.061 20.000 10.906 20.102 20.232 10.125 20.000 20.003 20.792 31.000 10.000 20.102 30.125 40.559 50.523 30.075 20.715 10.000 20.424 50.000 10.396 20.250 10.638 10.000 10.000 20.000 10.622 50.833 20.221 10.970 10.250 20.038 10.260 20.415 10.125 21.000 11.000 10.857 20.000 20.908 10.012 10.869 30.836 10.635 10.111 10.625 11.000 10.020 20.510 10.003 30.009 21.000 10.778 10.000 10.000 10.370 30.755 10.288 20.333 30.274 21.000 10.557 10.731 20.456 20.433 30.769 50.000 10.000 20.621 41.000 10.458 40.000 10.196 20.817 10.000 10.472 10.222 30.205 50.689 20.274 3
Minkowski 34D Inst.permissive0.280 40.488 40.192 50.124 40.593 40.010 40.500 10.000 10.000 10.000 10.447 40.535 40.445 31.000 10.861 40.400 30.225 20.000 30.000 30.142 40.000 20.074 40.342 30.467 50.067 30.000 10.119 50.000 10.000 20.000 40.337 50.000 20.000 10.000 40.000 10.506 50.070 20.804 40.000 30.000 40.333 30.172 30.150 50.000 20.000 10.479 50.745 30.000 50.830 51.000 10.904 30.167 20.090 40.732 20.000 30.000 30.443 40.000 30.500 30.542 10.772 50.396 40.077 50.385 40.044 40.118 50.777 40.000 20.000 40.200 30.000 10.000 30.000 10.148 40.502 40.500 20.419 40.159 50.281 40.404 50.317 30.000 10.000 10.200 30.000 40.077 30.000 30.000 10.750 30.200 30.715 40.021 40.551 20.828 50.000 30.000 10.743 40.059 50.000 30.000 30.000 20.000 30.125 50.648 30.000 20.191 20.500 10.669 40.502 40.000 50.568 40.000 20.516 40.000 10.000 30.000 20.305 50.000 10.000 20.000 10.825 10.833 20.021 50.918 20.000 30.000 30.191 40.346 40.100 40.981 31.000 10.286 40.000 20.000 50.000 30.868 40.648 50.292 30.000 30.375 31.000 10.000 30.500 20.000 40.333 10.000 20.538 50.000 10.000 10.213 50.518 40.098 40.528 10.250 30.997 30.284 50.677 30.398 30.167 40.790 40.000 10.000 20.618 50.903 50.200 50.000 10.333 10.333 40.000 10.442 30.083 40.213 40.587 40.131 5
C. Choy, J. Gwak, S. Savarese: 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks. CVPR 2019
CSC-Pretrain Inst.permissive0.275 50.466 50.218 40.110 50.625 30.007 50.500 10.000 10.000 10.000 10.000 50.222 50.377 41.000 10.661 50.400 30.000 40.000 30.000 30.119 50.000 20.000 50.277 40.685 40.067 30.000 10.132 30.000 10.000 20.000 40.367 40.000 20.000 10.000 40.000 10.591 30.055 40.783 50.000 30.014 30.500 20.161 40.278 30.000 20.000 10.667 20.768 20.500 20.866 21.000 10.829 50.000 40.019 50.555 50.000 30.000 30.305 50.000 30.750 10.200 40.783 40.429 30.395 30.677 20.020 50.286 30.584 50.000 20.000 40.115 50.000 10.000 30.000 10.145 50.423 50.500 20.364 50.369 40.571 10.448 30.206 50.000 10.000 10.200 30.106 10.065 50.000 30.000 10.750 30.200 30.774 20.000 50.501 30.841 40.000 30.000 10.692 50.063 40.000 30.000 30.000 20.000 30.500 40.649 20.000 20.084 40.125 40.719 10.413 50.004 40.450 50.000 20.638 10.000 10.000 30.000 20.505 30.000 10.000 20.000 10.727 30.833 20.221 20.779 50.000 30.000 30.168 50.311 50.125 20.571 40.500 50.143 50.000 20.250 40.000 30.869 20.667 40.162 50.000 30.250 41.000 10.000 30.500 20.000 40.000 30.000 20.689 40.000 10.000 10.312 40.383 50.114 30.333 30.000 40.997 30.420 30.613 40.212 50.500 20.819 20.000 10.000 20.768 21.000 10.918 10.000 10.000 40.278 50.000 10.333 50.000 50.353 20.546 50.258 4
Ji Hou, Benjamin Graham, Matthias Nießner, Saining Xie: Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts. CVPR 2021
LGround Inst.permissive0.314 30.529 30.225 30.155 30.578 50.010 30.500 10.000 10.000 10.000 10.515 20.556 30.696 11.000 10.927 30.400 30.083 30.000 31.000 10.252 10.000 20.167 30.350 20.731 20.067 30.000 10.123 40.000 10.000 20.036 30.372 30.000 20.000 10.250 10.000 10.569 40.031 50.810 30.000 30.000 40.630 10.183 20.278 30.000 20.000 10.582 40.589 50.500 20.863 31.000 10.940 20.000 40.144 10.716 30.000 30.000 30.484 30.000 30.500 30.400 30.798 30.500 20.278 40.750 10.093 30.166 40.783 30.000 20.200 20.400 20.000 10.000 30.000 10.219 20.539 30.500 20.578 30.413 30.181 50.457 20.375 20.000 10.000 10.050 50.000 40.077 40.000 30.000 10.500 50.000 50.743 30.250 20.488 40.846 30.000 30.000 10.800 30.069 30.000 30.000 30.000 20.000 31.000 10.607 40.000 20.200 10.500 10.694 20.528 20.063 30.659 20.000 20.594 20.000 10.000 30.000 20.571 20.000 10.000 20.000 10.716 40.647 50.221 20.857 40.000 30.000 30.217 30.346 30.071 50.530 51.000 10.429 30.000 20.286 30.000 30.826 50.706 30.208 40.000 30.250 40.744 50.000 30.500 20.042 10.000 30.000 20.746 30.000 10.000 10.517 10.625 30.085 50.333 30.000 41.000 10.378 40.533 50.376 40.042 50.814 30.000 10.000 20.765 31.000 10.600 30.000 10.000 40.667 30.000 10.472 10.333 20.337 30.605 30.305 2
David Rozenberszki, Or Litany, Angela Dai: Language-Grounded Indoor 3D Semantic Segmentation in the Wild.


ScanNet Benchmark

This table lists the benchmark results for the 3D semantic label scenario.


Method Infoavg ioubathtubbedbookshelfcabinetchaircountercurtaindeskdoorfloorotherfurniturepicturerefrigeratorshower curtainsinksofatabletoiletwallwindow
sorted bysort bysort bysort bysort bysort bysort bysort bysort bysort bysort bysort bysort bysort bysort bysort bysort bysort bysort bysort bysort by
Mix3Dpermissive0.781 10.964 10.855 10.843 130.781 30.858 90.575 40.831 250.685 90.714 20.979 10.594 40.310 200.801 10.892 120.841 20.819 30.723 30.940 80.887 20.725 17
Alexey Nekrasov, Jonas Schult, Or Litany, Bastian Leibe, Francis Engelmann: Mix3D: Out-of-Context Data Augmentation for 3D Scenes. 3DV 2021 (Oral)
Swin3Dpermissive0.779 20.861 160.818 100.836 160.790 10.875 20.576 30.905 30.704 30.739 10.969 60.611 10.349 60.756 160.958 10.702 350.805 120.708 60.916 240.898 10.801 1
OctFormerpermissive0.766 30.925 30.808 160.849 70.786 20.846 190.566 60.876 100.690 70.674 80.960 110.576 110.226 580.753 180.904 60.777 70.815 50.722 40.923 200.877 80.776 4
Peng-Shuai Wang: OctFormer: Octree-based Transformers for 3D Point Clouds. SIGGRAPH 2023
PPT-SpUNet-Joint0.766 30.932 20.794 250.829 190.751 150.854 110.540 140.903 40.630 260.672 90.963 90.565 150.357 40.788 20.900 80.737 190.802 130.685 120.950 20.887 20.780 2
Xiaoyang Wu, Zhuotao Tian, Xin Wen, Bohao Peng, Xihui Liu, Kaicheng Yu, Hengshuang Zhao: Towards Large-scale 3D Representation Learning with Multi-dataset Point Prompt Training.
CU-Hybrid Net0.764 50.924 40.819 80.840 140.757 100.853 130.580 10.848 190.709 20.643 160.958 140.587 80.295 260.753 180.884 160.758 130.815 50.725 20.927 180.867 140.743 9
OccuSeg+Semantic0.764 50.758 520.796 230.839 150.746 170.907 10.562 70.850 180.680 110.672 90.978 20.610 20.335 110.777 50.819 370.847 10.830 10.691 100.972 10.885 40.727 15
O-CNNpermissive0.762 70.924 40.823 50.844 120.770 50.852 140.577 20.847 210.711 10.640 200.958 140.592 50.217 640.762 130.888 130.758 130.813 80.726 10.932 160.868 130.744 8
Peng-Shuai Wang, Yang Liu, Yu-Xiao Guo, Chun-Yu Sun, Xin Tong: O-CNN: Octree-based Convolutional Neural Networks for 3D Shape Analysis. SIGGRAPH 2017
OA-CNN-L_ScanNet200.756 80.783 380.826 40.858 40.776 40.837 250.548 110.896 70.649 190.675 70.962 100.586 90.335 110.771 80.802 410.770 90.787 270.691 100.936 110.880 70.761 6
PointTransformerV20.752 90.742 590.809 150.872 10.758 90.860 80.552 90.891 80.610 340.687 30.960 110.559 180.304 230.766 100.926 30.767 100.797 180.644 250.942 60.876 110.722 19
Xiaoyang Wu, Yixing Lao, Li Jiang, Xihui Liu, Hengshuang Zhao: Point Transformer V2: Grouped Vector Attention and Partition-based Pooling. NeurIPS 2022
DMF-Net0.752 90.906 80.793 270.802 340.689 320.825 360.556 80.867 120.681 100.602 360.960 110.555 200.365 30.779 40.859 220.747 160.795 220.717 50.917 230.856 220.764 5
C.Yang, Y.Yan, W.Zhao, J.Ye, X.Yang, A.Hussain, B.Dong, K.Huang: Towards Deeper and Better Multi-view Feature Fusion for 3D Semantic Segmentation. ICONIP 2023
BPNetcopyleft0.749 110.909 60.818 100.811 270.752 130.839 240.485 390.842 220.673 120.644 150.957 170.528 290.305 220.773 70.859 220.788 40.818 40.693 90.916 240.856 220.723 18
Wenbo Hu, Hengshuang Zhao, Li Jiang, Jiaya Jia, Tien-Tsin Wong: Bidirectional Projection Network for Cross Dimension Scene Understanding. CVPR 2021 (Oral)
PointConvFormer0.749 110.793 350.790 280.807 300.750 160.856 100.524 200.881 90.588 460.642 190.977 40.591 60.274 370.781 30.929 20.804 30.796 190.642 260.947 40.885 40.715 22
Wenxuan Wu, Qi Shan, Li Fuxin: PointConvFormer: Revenge of the Point-based Convolution.
MSP0.748 130.623 850.804 180.859 30.745 180.824 380.501 290.912 20.690 70.685 50.956 190.567 140.320 170.768 90.918 40.720 250.802 130.676 150.921 210.881 60.779 3
StratifiedFormerpermissive0.747 140.901 90.803 190.845 110.757 100.846 190.512 240.825 280.696 60.645 140.956 190.576 110.262 490.744 230.861 210.742 170.770 360.705 70.899 370.860 190.734 10
Xin Lai*, Jianhui Liu*, Li Jiang, Liwei Wang, Hengshuang Zhao, Shu Liu, Xiaojuan Qi, Jiaya Jia: Stratified Transformer for 3D Point Cloud Segmentation. CVPR 2022
VMNetpermissive0.746 150.870 140.838 20.858 40.729 230.850 160.501 290.874 110.587 470.658 120.956 190.564 160.299 240.765 110.900 80.716 280.812 90.631 310.939 90.858 200.709 23
Zeyu HU, Xuyang Bai, Jiaxiang Shang, Runze Zhang, Jiayu Dong, Xin Wang, Guangyuan Sun, Hongbo Fu, Chiew-Lan Tai: VMNet: Voxel-Mesh Network for Geodesic-Aware 3D Semantic Segmentation. ICCV 2021 (Oral)
Virtual MVFusion0.746 150.771 460.819 80.848 90.702 290.865 70.397 760.899 50.699 40.664 110.948 470.588 70.330 130.746 220.851 290.764 110.796 190.704 80.935 120.866 150.728 13
Abhijit Kundu, Xiaoqi Yin, Alireza Fathi, David Ross, Brian Brewington, Thomas Funkhouser, Caroline Pantofaru: Virtual Multi-view Fusion for 3D Semantic Segmentation. ECCV 2020
Retro-FPN0.744 170.842 220.800 200.767 470.740 190.836 270.541 130.914 10.672 130.626 250.958 140.552 210.272 390.777 50.886 150.696 360.801 160.674 170.941 70.858 200.717 20
Peng Xiang*, Xin Wen*, Yu-Shen Liu, Hui Zhang, Yi Fang, Zhizhong Han: Retrospective Feature Pyramid Network for Point Cloud Semantic Segmentation. ICCV 2023
EQ-Net0.743 180.620 860.799 220.849 70.730 220.822 410.493 360.897 60.664 140.681 60.955 220.562 170.378 10.760 140.903 70.738 180.801 160.673 180.907 290.877 80.745 7
Zetong Yang*, Li Jiang*, Yanan Sun, Bernt Schiele, Jiaya JIa: A Unified Query-based Paradigm for Point Cloud Understanding. CVPR 2022
LRPNet0.742 190.816 300.806 170.807 300.752 130.828 340.575 40.839 240.699 40.637 210.954 280.520 310.320 170.755 170.834 330.760 120.772 330.676 150.915 260.862 170.717 20
SAT0.742 190.860 170.765 410.819 220.769 60.848 170.533 160.829 260.663 150.631 230.955 220.586 90.274 370.753 180.896 100.729 200.760 430.666 200.921 210.855 240.733 11
LargeKernel3D0.739 210.909 60.820 70.806 320.740 190.852 140.545 120.826 270.594 450.643 160.955 220.541 230.263 480.723 260.858 240.775 80.767 370.678 140.933 140.848 300.694 29
Yukang Chen*, Jianhui Liu*, Xiangyu Zhang, Xiaojuan Qi, Jiaya Jia: LargeKernel3D: Scaling up Kernels in 3D Sparse CNNs. CVPR 2023
RPN0.736 220.776 420.790 280.851 60.754 120.854 110.491 380.866 130.596 440.686 40.955 220.536 240.342 80.624 420.869 180.787 50.802 130.628 320.927 180.875 120.704 26
MinkowskiNetpermissive0.736 220.859 180.818 100.832 180.709 270.840 230.521 220.853 170.660 170.643 160.951 370.544 220.286 310.731 240.893 110.675 440.772 330.683 130.874 570.852 280.727 15
C. Choy, J. Gwak, S. Savarese: 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks. CVPR 2019
IPCA0.731 240.890 100.837 30.864 20.726 240.873 30.530 190.824 290.489 780.647 130.978 20.609 30.336 100.624 420.733 510.758 130.776 310.570 570.949 30.877 80.728 13
PointTransformer++0.725 250.727 670.811 140.819 220.765 70.841 220.502 280.814 340.621 290.623 270.955 220.556 190.284 320.620 440.866 190.781 60.757 460.648 230.932 160.862 170.709 23
SparseConvNet0.725 250.647 820.821 60.846 100.721 250.869 40.533 160.754 490.603 400.614 290.955 220.572 130.325 150.710 270.870 170.724 230.823 20.628 320.934 130.865 160.683 32
MatchingNet0.724 270.812 320.812 130.810 280.735 210.834 290.495 350.860 160.572 530.602 360.954 280.512 330.280 340.757 150.845 310.725 220.780 290.606 430.937 100.851 290.700 28
PNE0.721 280.840 230.789 300.833 170.690 300.823 400.509 250.864 150.618 300.629 240.957 170.500 360.266 460.763 120.797 430.674 480.791 250.621 370.892 420.855 240.708 25
INS-Conv-semantic0.717 290.751 550.759 440.812 260.704 280.868 50.537 150.842 220.609 360.608 320.953 310.534 260.293 270.616 450.864 200.719 270.793 230.640 270.933 140.845 340.663 37
PointMetaBase0.714 300.835 240.785 310.821 200.684 340.846 190.531 180.865 140.614 310.596 400.953 310.500 360.246 540.674 280.888 130.692 370.764 390.624 340.849 720.844 350.675 34
contrastBoundarypermissive0.705 310.769 490.775 360.809 290.687 330.820 440.439 640.812 350.661 160.591 420.945 550.515 320.171 820.633 390.856 250.720 250.796 190.668 190.889 450.847 310.689 30
Liyao Tang, Yibing Zhan, Zhe Chen, Baosheng Yu, Dacheng Tao: Contrastive Boundary Learning for Point Cloud Segmentation. CVPR 2022
ClickSeg_Semantic0.703 320.774 440.800 200.793 380.760 80.847 180.471 420.802 380.463 850.634 220.968 80.491 400.271 410.726 250.910 50.706 320.815 50.551 680.878 520.833 360.570 68
RFCR0.702 330.889 110.745 530.813 250.672 370.818 480.493 360.815 330.623 270.610 300.947 490.470 480.249 530.594 480.848 300.705 330.779 300.646 240.892 420.823 420.611 51
Jingyu Gong, Jiachen Xu, Xin Tan, Haichuan Song, Yanyun Qu, Yuan Xie, Lizhuang Ma: Omni-Supervised Point Cloud Segmentation via Gradual Receptive Field Component Reasoning. CVPR 2021
One Thing One Click0.701 340.825 280.796 230.723 540.716 260.832 300.433 660.816 310.634 240.609 310.969 60.418 730.344 70.559 600.833 340.715 290.808 110.560 620.902 340.847 310.680 33
JSENetpermissive0.699 350.881 130.762 420.821 200.667 380.800 610.522 210.792 410.613 320.607 330.935 750.492 390.205 690.576 530.853 270.691 380.758 450.652 220.872 600.828 390.649 41
Zeyu Hu, Mingmin Zhen, Xuyang Bai, Hongbo Fu, Chiew-Lan Tai: JSENet: Joint Semantic Segmentation and Edge Detection Network for 3D Point Clouds. ECCV 2020
One-Thing-One-Click0.693 360.743 580.794 250.655 780.684 340.822 410.497 340.719 590.622 280.617 280.977 40.447 600.339 90.750 210.664 670.703 340.790 260.596 470.946 50.855 240.647 42
Zhengzhe Liu, Xiaojuan Qi, Chi-Wing Fu: One Thing One Click: A Self-Training Approach for Weakly Supervised 3D Semantic Segmentation. CVPR 2021
PicassoNet-IIpermissive0.692 370.732 630.772 370.786 390.677 360.866 60.517 230.848 190.509 700.626 250.952 350.536 240.225 600.545 660.704 580.689 410.810 100.564 610.903 330.854 270.729 12
Huan Lei, Naveed Akhtar, Mubarak Shah, and Ajmal Mian: Geometric feature learning for 3D meshes.
Feature_GeometricNetpermissive0.690 380.884 120.754 480.795 370.647 440.818 480.422 680.802 380.612 330.604 340.945 550.462 510.189 770.563 590.853 270.726 210.765 380.632 300.904 310.821 450.606 55
Kangcheng Liu, Ben M. Chen: https://arxiv.org/abs/2012.09439. arXiv Preprint
FusionNet0.688 390.704 720.741 570.754 510.656 400.829 320.501 290.741 540.609 360.548 490.950 410.522 300.371 20.633 390.756 460.715 290.771 350.623 350.861 680.814 470.658 38
Feihu Zhang, Jin Fang, Benjamin Wah, Philip Torr: Deep FusionNet for Point Cloud Semantic Segmentation. ECCV 2020
Feature-Geometry Netpermissive0.685 400.866 150.748 500.819 220.645 460.794 640.450 530.802 380.587 470.604 340.945 550.464 500.201 720.554 620.840 320.723 240.732 550.602 450.907 290.822 440.603 58
DGNet0.684 410.712 710.784 320.782 430.658 390.835 280.499 330.823 300.641 210.597 390.950 410.487 410.281 330.575 540.619 700.647 580.764 390.620 380.871 630.846 330.688 31
KP-FCNN0.684 410.847 210.758 460.784 410.647 440.814 510.473 410.772 440.605 380.594 410.935 750.450 580.181 800.587 490.805 400.690 390.785 280.614 390.882 490.819 460.632 47
H. Thomas, C. Qi, J. Deschaud, B. Marcotegui, F. Goulette, L. Guibas: KPConv: Flexible and Deformable Convolution for Point Clouds. ICCV 2019
VACNN++0.684 410.728 660.757 470.776 440.690 300.804 580.464 470.816 310.577 520.587 430.945 550.508 350.276 360.671 290.710 560.663 500.750 490.589 520.881 500.832 380.653 40
Superpoint Network0.683 440.851 200.728 610.800 360.653 420.806 560.468 440.804 360.572 530.602 360.946 520.453 570.239 570.519 710.822 350.689 410.762 420.595 490.895 400.827 400.630 48
PointContrast_LA_SEM0.683 440.757 530.784 320.786 390.639 480.824 380.408 710.775 430.604 390.541 510.934 790.532 270.269 430.552 630.777 440.645 610.793 230.640 270.913 270.824 410.671 35
VI-PointConv0.676 460.770 480.754 480.783 420.621 520.814 510.552 90.758 470.571 550.557 470.954 280.529 280.268 450.530 690.682 620.675 440.719 580.603 440.888 460.833 360.665 36
Xingyi Li, Wenxuan Wu, Xiaoli Z. Fern, Li Fuxin: The Devils in the Point Clouds: Studying the Robustness of Point Cloud Convolutions.
ROSMRF3D0.673 470.789 360.748 500.763 490.635 500.814 510.407 730.747 510.581 510.573 440.950 410.484 420.271 410.607 460.754 470.649 550.774 320.596 470.883 480.823 420.606 55
SALANet0.670 480.816 300.770 390.768 460.652 430.807 550.451 500.747 510.659 180.545 500.924 850.473 470.149 920.571 560.811 390.635 640.746 500.623 350.892 420.794 590.570 68
PointASNLpermissive0.666 490.703 730.781 340.751 530.655 410.830 310.471 420.769 450.474 810.537 530.951 370.475 460.279 350.635 370.698 610.675 440.751 480.553 670.816 790.806 510.703 27
Xu Yan, Chaoda Zheng, Zhen Li, Sheng Wang, Shuguang Cui: PointASNL: Robust Point Clouds Processing using Nonlocal Neural Networks with Adaptive Sampling. CVPR 2020
PointConvpermissive0.666 490.781 390.759 440.699 630.644 470.822 410.475 400.779 420.564 580.504 670.953 310.428 670.203 710.586 510.754 470.661 510.753 470.588 530.902 340.813 490.642 43
Wenxuan Wu, Zhongang Qi, Li Fuxin: PointConv: Deep Convolutional Networks on 3D Point Clouds. CVPR 2019
PPCNN++permissive0.663 510.746 560.708 650.722 550.638 490.820 440.451 500.566 860.599 420.541 510.950 410.510 340.313 190.648 340.819 370.616 690.682 740.590 510.869 640.810 500.656 39
Pyunghwan Ahn, Juyoung Yang, Eojindl Yi, Chanho Lee, Junmo Kim: Projection-based Point Convolution for Efficient Point Cloud Segmentation. IEEE Access
DCM-Net0.658 520.778 400.702 680.806 320.619 530.813 540.468 440.693 670.494 730.524 590.941 660.449 590.298 250.510 730.821 360.675 440.727 570.568 590.826 770.803 530.637 45
Jonas Schult*, Francis Engelmann*, Theodora Kontogianni, Bastian Leibe: DualConvMesh-Net: Joint Geodesic and Euclidean Convolutions on 3D Meshes. CVPR 2020 [Oral]
HPGCNN0.656 530.698 740.743 550.650 790.564 700.820 440.505 270.758 470.631 250.479 720.945 550.480 440.226 580.572 550.774 450.690 390.735 530.614 390.853 710.776 740.597 61
Jisheng Dang, Qingyong Hu, Yulan Guo, Jun Yang: HPGCNN.
SAFNet-segpermissive0.654 540.752 540.734 590.664 760.583 650.815 500.399 750.754 490.639 220.535 550.942 640.470 480.309 210.665 300.539 760.650 540.708 630.635 290.857 700.793 610.642 43
Linqing Zhao, Jiwen Lu, Jie Zhou: Similarity-Aware Fusion Network for 3D Semantic Segmentation. IROS 2021
RandLA-Netpermissive0.645 550.778 400.731 600.699 630.577 660.829 320.446 550.736 550.477 800.523 610.945 550.454 550.269 430.484 800.749 500.618 670.738 510.599 460.827 760.792 640.621 50
PointConv-SFPN0.641 560.776 420.703 670.721 560.557 730.826 350.451 500.672 710.563 590.483 710.943 630.425 700.162 870.644 350.726 520.659 520.709 620.572 560.875 550.786 690.559 74
MVPNetpermissive0.641 560.831 250.715 630.671 730.590 610.781 700.394 770.679 690.642 200.553 480.937 720.462 510.256 500.649 330.406 890.626 650.691 710.666 200.877 530.792 640.608 54
Maximilian Jaritz, Jiayuan Gu, Hao Su: Multi-view PointNet for 3D Scene Understanding. GMDL Workshop, ICCV 2019
PointMRNet0.640 580.717 700.701 690.692 660.576 670.801 600.467 460.716 600.563 590.459 770.953 310.429 660.169 840.581 520.854 260.605 700.710 600.550 690.894 410.793 610.575 66
FPConvpermissive0.639 590.785 370.760 430.713 610.603 560.798 620.392 780.534 910.603 400.524 590.948 470.457 530.250 520.538 670.723 540.598 740.696 690.614 390.872 600.799 540.567 71
Yiqun Lin, Zizheng Yan, Haibin Huang, Dong Du, Ligang Liu, Shuguang Cui, Xiaoguang Han: FPConv: Learning Local Flattening for Point Convolution. CVPR 2020
PD-Net0.638 600.797 340.769 400.641 840.590 610.820 440.461 480.537 900.637 230.536 540.947 490.388 810.206 680.656 310.668 650.647 580.732 550.585 540.868 650.793 610.473 93
PointSPNet0.637 610.734 620.692 760.714 600.576 670.797 630.446 550.743 530.598 430.437 820.942 640.403 770.150 910.626 410.800 420.649 550.697 680.557 650.846 730.777 730.563 72
SConv0.636 620.830 260.697 720.752 520.572 690.780 720.445 570.716 600.529 640.530 560.951 370.446 610.170 830.507 750.666 660.636 630.682 740.541 740.886 470.799 540.594 62
Supervoxel-CNN0.635 630.656 800.711 640.719 570.613 540.757 810.444 600.765 460.534 630.566 450.928 830.478 450.272 390.636 360.531 780.664 490.645 840.508 820.864 670.792 640.611 51
joint point-basedpermissive0.634 640.614 870.778 350.667 750.633 510.825 360.420 690.804 360.467 830.561 460.951 370.494 380.291 280.566 570.458 840.579 810.764 390.559 640.838 740.814 470.598 60
Hung-Yueh Chiang, Yen-Liang Lin, Yueh-Cheng Liu, Winston H. Hsu: A Unified Point-Based Framework for 3D Segmentation. 3DV 2019
PointMTL0.632 650.731 640.688 790.675 700.591 600.784 690.444 600.565 870.610 340.492 690.949 450.456 540.254 510.587 490.706 570.599 730.665 800.612 420.868 650.791 680.579 65
PointNet2-SFPN0.631 660.771 460.692 760.672 710.524 780.837 250.440 630.706 650.538 620.446 790.944 610.421 720.219 630.552 630.751 490.591 770.737 520.543 730.901 360.768 760.557 75
APCF-Net0.631 660.742 590.687 810.672 710.557 730.792 670.408 710.665 720.545 610.508 640.952 350.428 670.186 780.634 380.702 590.620 660.706 640.555 660.873 580.798 560.581 64
Haojia Lin: Adaptive Pyramid Context Fusion for Point Cloud Perception. GRSL
3DSM_DMMF0.631 660.626 840.745 530.801 350.607 550.751 820.506 260.729 580.565 570.491 700.866 990.434 620.197 750.595 470.630 690.709 310.705 650.560 620.875 550.740 840.491 88
FusionAwareConv0.630 690.604 890.741 570.766 480.590 610.747 830.501 290.734 560.503 720.527 570.919 890.454 550.323 160.550 650.420 880.678 430.688 720.544 710.896 390.795 580.627 49
Jiazhao Zhang, Chenyang Zhu, Lintao Zheng, Kai Xu: Fusion-Aware Point Convolution for Online Semantic 3D Scene Segmentation. CVPR 2020
DenSeR0.628 700.800 330.625 910.719 570.545 760.806 560.445 570.597 810.448 880.519 620.938 710.481 430.328 140.489 790.499 830.657 530.759 440.592 500.881 500.797 570.634 46
SegGroup_sempermissive0.627 710.818 290.747 520.701 620.602 570.764 780.385 820.629 780.490 760.508 640.931 820.409 750.201 720.564 580.725 530.618 670.692 700.539 750.873 580.794 590.548 78
An Tao, Yueqi Duan, Yi Wei, Jiwen Lu, Jie Zhou: SegGroup: Seg-Level Supervision for 3D Instance and Semantic Segmentation. TIP 2022
SIConv0.625 720.830 260.694 740.757 500.563 710.772 760.448 540.647 750.520 660.509 630.949 450.431 650.191 760.496 770.614 710.647 580.672 780.535 770.876 540.783 700.571 67
HPEIN0.618 730.729 650.668 820.647 810.597 590.766 770.414 700.680 680.520 660.525 580.946 520.432 630.215 650.493 780.599 720.638 620.617 890.570 570.897 380.806 510.605 57
Li Jiang, Hengshuang Zhao, Shu Liu, Xiaoyong Shen, Chi-Wing Fu, Jiaya Jia: Hierarchical Point-Edge Interaction Network for Point Cloud Semantic Segmentation. ICCV 2019
SPH3D-GCNpermissive0.610 740.858 190.772 370.489 960.532 770.792 670.404 740.643 770.570 560.507 660.935 750.414 740.046 1010.510 730.702 590.602 720.705 650.549 700.859 690.773 750.534 81
Huan Lei, Naveed Akhtar, and Ajmal Mian: Spherical Kernel for Efficient Graph Convolution on 3D Point Clouds. TPAMI 2020
AttAN0.609 750.760 510.667 830.649 800.521 790.793 650.457 490.648 740.528 650.434 840.947 490.401 780.153 900.454 820.721 550.648 570.717 590.536 760.904 310.765 770.485 89
Gege Zhang, Qinghua Ma, Licheng Jiao, Fang Liu, Qigong Sun: AttAN: Attention Adversarial Networks for 3D Point Cloud Semantic Segmentation. IJCAI 2020
wsss-transformer0.600 760.634 830.743 550.697 650.601 580.781 700.437 650.585 840.493 740.446 790.933 800.394 790.011 1030.654 320.661 680.603 710.733 540.526 780.832 750.761 790.480 90
dtc_net0.596 770.683 750.725 620.715 590.549 750.803 590.444 600.647 750.493 740.495 680.941 660.409 750.000 1050.424 870.544 750.598 740.703 670.522 790.912 280.792 640.520 84
LAP-D0.594 780.720 680.692 760.637 850.456 880.773 750.391 800.730 570.587 470.445 810.940 690.381 820.288 290.434 850.453 860.591 770.649 820.581 550.777 830.749 830.610 53
DPC0.592 790.720 680.700 700.602 890.480 840.762 800.380 830.713 630.585 500.437 820.940 690.369 840.288 290.434 850.509 820.590 790.639 870.567 600.772 840.755 810.592 63
Francis Engelmann, Theodora Kontogianni, Bastian Leibe: Dilated Point Convolutions: On the Receptive Field Size of Point Convolutions on 3D Point Clouds. ICRA 2020
CCRFNet0.589 800.766 500.659 860.683 680.470 870.740 850.387 810.620 800.490 760.476 730.922 870.355 870.245 550.511 720.511 810.571 820.643 850.493 860.872 600.762 780.600 59
ROSMRF0.580 810.772 450.707 660.681 690.563 710.764 780.362 850.515 920.465 840.465 760.936 740.427 690.207 670.438 830.577 730.536 850.675 770.486 870.723 900.779 710.524 83
SD-DETR0.576 820.746 560.609 950.445 1000.517 800.643 960.366 840.714 620.456 860.468 750.870 980.432 630.264 470.558 610.674 630.586 800.688 720.482 880.739 880.733 860.537 80
SQN_0.1%0.569 830.676 770.696 730.657 770.497 810.779 730.424 670.548 880.515 680.376 890.902 960.422 710.357 40.379 900.456 850.596 760.659 810.544 710.685 930.665 970.556 76
TextureNetpermissive0.566 840.672 790.664 840.671 730.494 820.719 860.445 570.678 700.411 940.396 870.935 750.356 860.225 600.412 880.535 770.565 830.636 880.464 900.794 820.680 940.568 70
Jingwei Huang, Haotian Zhang, Li Yi, Thomas Funkhouser, Matthias Niessner, Leonidas Guibas: TextureNet: Consistent Local Parametrizations for Learning from High-Resolution Signals on Meshes. CVPR
DVVNet0.562 850.648 810.700 700.770 450.586 640.687 900.333 890.650 730.514 690.475 740.906 930.359 850.223 620.340 920.442 870.422 960.668 790.501 830.708 910.779 710.534 81
Pointnet++ & Featurepermissive0.557 860.735 610.661 850.686 670.491 830.744 840.392 780.539 890.451 870.375 900.946 520.376 830.205 690.403 890.356 920.553 840.643 850.497 840.824 780.756 800.515 85
GMLPs0.538 870.495 970.693 750.647 810.471 860.793 650.300 920.477 930.505 710.358 910.903 950.327 900.081 980.472 810.529 790.448 940.710 600.509 800.746 860.737 850.554 77
PanopticFusion-label0.529 880.491 980.688 790.604 880.386 930.632 970.225 1020.705 660.434 910.293 970.815 1000.348 880.241 560.499 760.669 640.507 870.649 820.442 960.796 810.602 1000.561 73
Gaku Narita, Takashi Seno, Tomoya Ishikawa, Yohsuke Kaji: PanopticFusion: Online Volumetric Semantic Mapping at the Level of Stuff and Things. IROS 2019 (to appear)
subcloud_weak0.516 890.676 770.591 980.609 860.442 890.774 740.335 880.597 810.422 930.357 920.932 810.341 890.094 970.298 940.528 800.473 920.676 760.495 850.602 990.721 890.349 100
Online SegFusion0.515 900.607 880.644 890.579 910.434 900.630 980.353 860.628 790.440 890.410 850.762 1030.307 920.167 850.520 700.403 900.516 860.565 920.447 940.678 940.701 910.514 86
Davide Menini, Suryansh Kumar, Martin R. Oswald, Erik Sandstroem, Cristian Sminchisescu, Luc van Gool: A Real-Time Learning Framework for Joint 3D Reconstruction and Semantic Segmentation. Robotics and Automation Letters Submission
3DMV, FTSDF0.501 910.558 930.608 960.424 1020.478 850.690 890.246 980.586 830.468 820.450 780.911 910.394 790.160 880.438 830.212 990.432 950.541 970.475 890.742 870.727 870.477 91
PCNN0.498 920.559 920.644 890.560 930.420 920.711 880.229 1000.414 940.436 900.352 930.941 660.324 910.155 890.238 990.387 910.493 880.529 980.509 800.813 800.751 820.504 87
3DMV0.484 930.484 990.538 1000.643 830.424 910.606 1010.310 900.574 850.433 920.378 880.796 1010.301 930.214 660.537 680.208 1000.472 930.507 1010.413 990.693 920.602 1000.539 79
Angela Dai, Matthias Niessner: 3DMV: Joint 3D-Multi-View Prediction for 3D Semantic Scene Segmentation. ECCV'18
PointCNN with RGBpermissive0.458 940.577 910.611 940.356 1040.321 1010.715 870.299 940.376 980.328 1010.319 950.944 610.285 950.164 860.216 1020.229 970.484 900.545 960.456 920.755 850.709 900.475 92
Yangyan Li, Rui Bu, Mingchao Sun, Baoquan Chen: PointCNN. NeurIPS 2018
FCPNpermissive0.447 950.679 760.604 970.578 920.380 940.682 910.291 950.106 1040.483 790.258 1020.920 880.258 990.025 1020.231 1010.325 930.480 910.560 940.463 910.725 890.666 960.231 104
Dario Rethage, Johanna Wald, Jürgen Sturm, Nassir Navab, Federico Tombari: Fully-Convolutional Point Networks for Large-Scale Point Clouds. ECCV 2018
DGCNN_reproducecopyleft0.446 960.474 1000.623 920.463 980.366 960.651 940.310 900.389 970.349 990.330 940.937 720.271 970.126 940.285 950.224 980.350 1010.577 910.445 950.625 970.723 880.394 96
Yue Wang, Yongbin Sun, Ziwei Liu, Sanjay E. Sarma, Michael M. Bronstein, Justin M. Solomon: Dynamic Graph CNN for Learning on Point Clouds. TOG 2019
PNET20.442 970.548 940.548 990.597 900.363 970.628 990.300 920.292 990.374 960.307 960.881 970.268 980.186 780.238 990.204 1010.407 970.506 1020.449 930.667 950.620 990.462 94
SurfaceConvPF0.442 970.505 960.622 930.380 1030.342 990.654 930.227 1010.397 960.367 970.276 990.924 850.240 1000.198 740.359 910.262 950.366 980.581 900.435 970.640 960.668 950.398 95
Hao Pan, Shilin Liu, Yang Liu, Xin Tong: Convolutional Neural Networks on 3D Surfaces Using Parallel Frames.
Tangent Convolutionspermissive0.438 990.437 1020.646 880.474 970.369 950.645 950.353 860.258 1010.282 1030.279 980.918 900.298 940.147 930.283 960.294 940.487 890.562 930.427 980.619 980.633 980.352 99
Maxim Tatarchenko, Jaesik Park, Vladlen Koltun, Qian-Yi Zhou: Tangent Convolutions for Dense Prediction in 3D. CVPR 2018
3DWSSS0.425 1000.525 950.647 870.522 940.324 1000.488 1040.077 1050.712 640.353 980.401 860.636 1050.281 960.176 810.340 920.565 740.175 1050.551 950.398 1000.370 1050.602 1000.361 98
SPLAT Netcopyleft0.393 1010.472 1010.511 1010.606 870.311 1020.656 920.245 990.405 950.328 1010.197 1030.927 840.227 1020.000 1050.001 1060.249 960.271 1040.510 990.383 1020.593 1000.699 920.267 102
Hang Su, Varun Jampani, Deqing Sun, Subhransu Maji, Evangelos Kalogerakis, Ming-Hsuan Yang, Jan Kautz: SPLATNet: Sparse Lattice Networks for Point Cloud Processing. CVPR 2018
ScanNet+FTSDF0.383 1020.297 1040.491 1020.432 1010.358 980.612 1000.274 960.116 1030.411 940.265 1000.904 940.229 1010.079 990.250 970.185 1020.320 1020.510 990.385 1010.548 1010.597 1030.394 96
PointNet++permissive0.339 1030.584 900.478 1030.458 990.256 1040.360 1050.250 970.247 1020.278 1040.261 1010.677 1040.183 1030.117 950.212 1030.145 1040.364 990.346 1050.232 1050.548 1010.523 1040.252 103
Charles R. Qi, Li Yi, Hao Su, Leonidas J. Guibas: PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space.
SSC-UNetpermissive0.308 1040.353 1030.290 1050.278 1050.166 1050.553 1020.169 1040.286 1000.147 1050.148 1050.908 920.182 1040.064 1000.023 1050.018 1060.354 1000.363 1030.345 1030.546 1030.685 930.278 101
ScanNetpermissive0.306 1050.203 1050.366 1040.501 950.311 1020.524 1030.211 1030.002 1060.342 1000.189 1040.786 1020.145 1050.102 960.245 980.152 1030.318 1030.348 1040.300 1040.460 1040.437 1050.182 105
Angela Dai, Angel X. Chang, Manolis Savva, Maciej Halber, Thomas Funkhouser, Matthias Nießner: ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes. CVPR'17
ERROR0.054 1060.000 1060.041 1060.172 1060.030 1060.062 1060.001 1060.035 1050.004 1060.051 1060.143 1060.019 1060.003 1040.041 1040.050 1050.003 1060.054 1060.018 1060.005 1060.264 1060.082 106
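The IoU scores in the table above can be read with the usual intersection-over-union definition in mind. The following is a minimal sketch, assuming per-class IoU is TP / (TP + FP + FN) over labeled points and that the average is an unweighted mean over classes. The function names and the toy labels are hypothetical, and the official evaluation script may differ in how it handles unlabeled points.

```python
def per_class_iou(gt_labels, pred_labels, class_ids):
    """Return {class_id: IoU}, with IoU = TP / (TP + FP + FN) per class."""
    ious = {}
    pairs = list(zip(gt_labels, pred_labels))
    for c in class_ids:
        tp = sum(1 for g, p in pairs if g == c and p == c)
        fp = sum(1 for g, p in pairs if g != c and p == c)
        fn = sum(1 for g, p in pairs if g == c and p != c)
        denom = tp + fp + fn
        if denom > 0:  # skip classes absent from both ground truth and prediction
            ious[c] = tp / denom
    return ious


def mean_iou(ious):
    """Unweighted mean over the classes that received a score."""
    return sum(ious.values()) / len(ious)


# Toy example (hypothetical labels, one entry per point).
gt = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
ious = per_class_iou(gt, pred, class_ids=[0, 1, 2])
avg = mean_iou(ious)
```

Because the mean is unweighted, a rare class counts as much as a frequent one, which is why a method can rank well on average while ranking poorly on individual classes.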


This table lists the benchmark results for the 3D semantic instance scenario.




Method | Info | avg ap 25% | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | window
TD3D0.875 11.000 10.976 130.877 90.783 150.970 10.889 10.828 120.945 30.803 70.713 110.720 110.709 91.000 10.936 60.934 30.873 71.000 10.791 6
Queryformer0.874 21.000 10.978 110.809 250.876 10.936 60.702 90.716 260.920 50.875 40.766 40.772 30.818 41.000 10.995 10.916 40.892 11.000 10.767 9
SoftGroup++0.874 21.000 10.972 140.947 10.839 50.898 130.556 250.913 20.881 110.756 90.828 20.748 70.821 21.000 10.937 50.937 10.887 21.000 10.821 3
Mask3D0.870 41.000 10.985 70.782 330.818 90.938 50.760 50.749 230.923 40.877 30.760 50.785 20.820 31.000 10.912 90.864 230.878 50.983 400.825 2
Jonas Schult, Francis Engelmann, Alexander Hermans, Or Litany, Siyu Tang, Bastian Leibe: Mask3D for 3D Semantic Instance Segmentation. ICRA 2023
SoftGrouppermissive0.865 51.000 10.969 150.860 120.860 20.913 90.558 220.899 30.911 60.760 80.828 10.736 80.802 60.981 300.919 80.875 140.877 61.000 10.820 4
Thang Vu, Kookhoi Kim, Tung M. Luu, Xuan Thanh Nguyen, Chang D. Yoo: SoftGroup for 3D Instance Segmentation on Point Clouds. CVPR 2022 [Oral]
MAFT0.860 61.000 10.990 60.810 240.829 60.949 30.809 40.688 320.836 160.904 10.751 70.796 10.741 71.000 10.864 240.848 300.837 131.000 10.828 1
IPCA-Inst0.851 71.000 10.968 160.884 80.842 40.862 250.693 110.812 170.888 90.677 210.783 30.698 120.807 51.000 10.911 130.865 220.865 91.000 10.757 11
SPFormerpermissive0.851 71.000 10.994 20.806 260.774 170.942 40.637 140.849 100.859 140.889 20.720 90.730 90.665 141.000 10.911 130.868 210.873 81.000 10.796 5
Sun Jiahao, Qing Chunmei, Tan Junpeng, Xu Xiangmin: Superpoint Transformer for 3D Scene Instance Segmentation. AAAI 2023 [Oral]
ISBNetpermissive0.845 91.000 10.976 120.798 270.794 120.916 70.757 60.667 340.882 100.842 50.715 100.757 50.832 11.000 10.905 160.803 450.843 121.000 10.715 18
Tuan Duc Ngo, Binh-Son Hua, Khoi Nguyen: ISBNet: a 3D Point Cloud Instance Segmentation Network with Instance-aware Sampling and Box-aware Dynamic Convolution. CVPR 2023
SphereSeg0.835 101.000 10.963 190.891 60.794 110.954 20.822 30.710 270.961 20.721 130.693 170.530 320.653 161.000 10.867 230.857 260.859 100.991 370.771 8
TopoSeg0.832 111.000 10.981 90.933 20.819 80.826 330.524 310.841 110.811 200.681 200.759 60.687 130.727 80.981 300.911 130.883 100.853 111.000 10.756 12
GraphCut0.832 111.000 10.922 330.724 420.798 100.902 120.701 100.856 80.859 130.715 140.706 120.748 60.640 271.000 10.934 70.862 240.880 31.000 10.729 14
PBNetpermissive0.825 131.000 10.963 180.837 160.843 30.865 200.822 20.647 360.878 120.733 110.639 250.683 140.650 171.000 10.853 250.870 180.820 141.000 10.744 13
W. Zhao, Y. Yan, C. Yang, J. Ye, X. Yang, K. Huang: Divide and Conquer: 3D Instance Segmentation With Point-Wise Binarization. ICCV 2023
SSEC0.820 141.000 10.983 80.924 30.826 70.817 360.415 400.899 40.793 240.673 220.731 80.636 190.653 151.000 10.939 40.804 430.878 41.000 10.780 7
DKNet0.815 151.000 10.930 250.844 140.765 210.915 80.534 290.805 190.805 220.807 60.654 190.763 40.650 171.000 10.794 370.881 110.766 181.000 10.758 10
Yizheng Wu, Min Shi, Shuaiyuan Du, Hao Lu, Zhiguo Cao, Weicai Zhong: 3D Instances as 1D Kernels. ECCV 2022
RPGN0.806 161.000 10.992 40.789 290.723 330.891 140.650 130.810 180.832 170.665 240.699 150.658 150.700 101.000 10.881 180.832 340.774 160.997 300.613 35
Shichao Dong, Guosheng Lin, Tzu-Yi Hung: Learning Regional Purity for Instance Segmentation on 3D Point Clouds. ECCV 2022
HAISpermissive0.803 171.000 10.994 20.820 200.759 220.855 260.554 260.882 50.827 190.615 300.676 180.638 180.646 251.000 10.912 90.797 470.767 170.994 350.726 15
Shaoyu Chen, Jiemin Fang, Qian Zhang, Wenyu Liu, Xinggang Wang: Hierarchical Aggregation for 3D Instance Segmentation. ICCV 2021
Box2Mask0.803 171.000 10.962 200.874 100.707 370.887 170.686 120.598 400.961 10.715 150.694 160.469 370.700 101.000 10.912 90.902 50.753 230.997 300.637 29
Julian Chibane, Francis Engelmann, Tuan Anh Tran, Gerard Pons-Moll: Box2Mask: Weakly Supervised 3D Semantic Instance Segmentation Using Bounding Boxes. ECCV 2022
Mask-Group0.792 191.000 10.968 170.812 210.766 200.864 210.460 340.815 160.888 80.598 340.651 220.639 170.600 320.918 350.941 20.896 60.721 301.000 10.723 16
Min Zhong, Xinghao Chen, Xiaokang Chen, Gang Zeng, Yunhe Wang: MaskGroup: Hierarchical Point Grouping and Masking for 3D Instance Segmentation. ICME 2022
CSC-Pretrained0.791 201.000 10.996 10.829 190.767 190.889 160.600 170.819 150.770 290.594 350.620 280.541 290.700 101.000 10.941 20.889 80.763 191.000 10.526 44
SSTNetpermissive0.789 211.000 10.840 470.888 70.717 340.835 290.717 80.684 330.627 430.724 120.652 210.727 100.600 321.000 10.912 90.822 370.757 221.000 10.691 23
Zhihao Liang, Zhihao Li, Songcen Xu, Mingkui Tan, Kui Jia: Instance Segmentation in 3D Scenes using Semantic Superpoint Tree Networks. ICCV 2021
GICN0.788 221.000 10.978 100.867 110.781 160.833 300.527 300.824 130.806 210.549 430.596 310.551 250.700 101.000 10.853 250.935 20.733 271.000 10.651 26
DENet0.786 231.000 10.929 260.736 400.750 280.720 490.755 70.934 10.794 230.590 360.561 370.537 300.650 171.000 10.882 170.804 440.789 151.000 10.719 17
DualGroup0.782 241.000 10.927 270.811 220.772 180.853 270.631 160.805 190.773 260.613 310.611 290.610 210.650 170.835 460.881 180.879 130.750 251.000 10.675 24
PointGroup0.778 251.000 10.900 370.798 280.715 350.863 220.493 320.706 280.895 70.569 410.701 130.576 230.639 281.000 10.880 200.851 280.719 310.997 300.709 20
Li Jiang, Hengshuang Zhao, Shaoshuai Shi, Shu Liu, Chi-Wing Fu, Jiaya Jia: PointGroup: Dual-Set Point Grouping for 3D Instance Segmentation. CVPR 2020 [oral]
PE0.776 261.000 10.900 380.860 120.728 320.869 180.400 410.857 70.774 250.568 420.701 140.602 220.646 250.933 340.843 280.890 70.691 380.997 300.709 19
Biao Zhang, Peter Wonka: Point Cloud Instance Segmentation using Probabilistic Embeddings. CVPR 2021
AOIA0.767 271.000 10.937 220.810 230.740 300.906 100.550 270.800 210.706 350.577 400.624 270.544 280.596 370.857 380.879 220.880 120.750 240.992 360.658 25
DD-UNet+Group0.764 281.000 10.897 400.837 150.753 250.830 320.459 360.824 130.699 370.629 280.653 200.438 400.650 171.000 10.880 200.858 250.690 391.000 10.650 27
H. Liu, R. Liu, K. Yang, J. Zhang, K. Peng, R. Stiefelhagen: HIDA: Towards Holistic Indoor Understanding for the Visually Impaired via Semantic Instance Segmentation with a Wearable Solid-State LiDAR Sensor. ICCVW 2021
INS-Conv-instance0.762 291.000 10.923 300.765 360.785 140.905 110.600 170.655 350.646 420.683 190.647 230.530 310.650 171.000 10.824 300.830 350.693 370.944 440.644 28
Dyco3Dcopyleft0.761 301.000 10.935 230.893 50.752 270.863 230.600 170.588 410.742 320.641 260.633 260.546 270.550 390.857 380.789 390.853 270.762 200.987 380.699 21
Tong He, Chunhua Shen, Anton van den Hengel: DyCo3D: Robust Instance Segmentation of 3D Point Clouds through Dynamic Convolution. CVPR 2021
OccuSeg+instance0.742 311.000 10.923 300.785 300.745 290.867 190.557 230.578 440.729 330.670 230.644 240.488 350.577 381.000 10.794 370.830 350.620 471.000 10.550 40
Lei Han, Tian Zheng, Lan Xu, Lu Fang: OccuSeg: Occupancy-aware 3D Instance Segmentation. CVPR 2020
RWSeg0.739 321.000 10.899 390.759 380.753 260.823 340.282 450.691 310.658 400.582 390.594 320.547 260.628 301.000 10.795 360.868 200.728 291.000 10.692 22
3D-MPA0.737 331.000 10.933 240.785 300.794 130.831 310.279 470.588 410.695 380.616 290.559 380.556 240.650 171.000 10.809 340.875 150.696 351.000 10.608 37
Francis Engelmann, Martin Bokeloh, Alireza Fathi, Bastian Leibe, Matthias Nießner: 3D-MPA: Multi Proposal Aggregation for 3D Semantic Instance Segmentation. CVPR 2020
MTML0.731 341.000 10.992 40.779 350.609 460.746 440.308 440.867 60.601 460.607 320.539 410.519 330.550 391.000 10.824 300.869 190.729 281.000 10.616 33
Jean Lahoud, Bernard Ghanem, Marc Pollefeys, Martin R. Oswald: 3D Instance Segmentation via Multi-task Metric Learning. ICCV 2019 [oral]
OSIS0.725 351.000 10.885 430.653 480.657 430.801 370.576 210.695 300.828 180.698 170.534 420.457 390.500 460.857 380.831 290.841 320.627 451.000 10.619 32
SSEN0.724 361.000 10.926 280.781 340.661 410.845 280.596 200.529 470.764 310.653 250.489 480.461 380.500 460.859 370.765 400.872 170.761 211.000 10.577 38
Dongsu Zhang, Junha Chun, Sang Kyun Cha, Young Min Kim: Spatial Semantic Embedding Network: Fast 3D Instance Segmentation with Deep Metric Learning. arXiv
NeuralBF0.718 371.000 10.945 210.901 40.754 240.817 350.460 340.700 290.772 270.688 180.568 360.000 580.500 460.981 300.606 490.872 160.740 261.000 10.614 34
Weiwei Sun, Daniel Rebain, Renjie Liao, Vladimir Tankovich, Soroosh Yazdani, Kwang Moo Yi, Andrea Tagliasacchi: NeuralBF: Neural Bilateral Filtering for Top-down Instance Segmentation on Point Clouds. WACV 2023
Sparse R-CNN0.714 381.000 10.926 290.694 430.699 390.890 150.636 150.516 480.693 390.743 100.588 330.369 440.601 310.594 510.800 350.886 90.676 400.986 390.546 41
SALoss-ResNet0.695 391.000 10.855 450.579 530.589 480.735 470.484 330.588 410.856 150.634 270.571 350.298 450.500 461.000 10.824 300.818 380.702 340.935 490.545 42
Zhidong Liang, Ming Yang, Hao Li, Chunxiang Wang: 3D Instance Embedding Learning With a Structure-Aware Loss Function for Point Cloud Segmentation. IEEE Robotics and Automation Letters (IROS2020)
PanopticFusion-inst0.693 401.000 10.852 460.655 470.616 450.788 390.334 430.763 220.771 280.457 530.555 390.652 160.518 430.857 380.765 400.732 530.631 430.944 440.577 39
Gaku Narita, Takashi Seno, Tomoya Ishikawa, Yohsuke Kaji: PanopticFusion: Online Volumetric Semantic Mapping at the Level of Stuff and Things. IROS 2019 (to appear)
Occipital-SCS0.688 411.000 10.913 340.730 410.737 310.743 460.442 370.855 90.655 410.546 440.546 400.263 470.508 450.889 360.568 500.771 500.705 330.889 520.625 31
3D-BoNet0.687 421.000 10.887 420.836 170.587 490.643 560.550 270.620 370.724 340.522 480.501 460.243 480.512 441.000 10.751 420.807 420.661 420.909 510.612 36
Bo Yang, Jianan Wang, Ronald Clark, Qingyong Hu, Sen Wang, Andrew Markham, Niki Trigoni: Learning Object Bounding Boxes for 3D Instance Segmentation on Point Clouds. NeurIPS 2019 Spotlight
ClickSeg_Instance | | 0.685 (43) | 1.000 (1) | 0.818 (49) | 0.600 (51) | 0.715 (36) | 0.795 (38) | 0.557 (23) | 0.533 (46) | 0.591 (48) | 0.601 (33) | 0.519 (44) | 0.429 (42) | 0.638 (29) | 0.938 (33) | 0.706 (44) | 0.817 (40) | 0.624 (46) | 0.944 (44) | 0.502 (46)
PCJC | | 0.684 (44) | 1.000 (1) | 0.895 (41) | 0.757 (39) | 0.659 (42) | 0.862 (24) | 0.189 (54) | 0.739 (24) | 0.606 (45) | 0.712 (16) | 0.581 (34) | 0.515 (34) | 0.650 (17) | 0.857 (38) | 0.357 (55) | 0.785 (48) | 0.631 (44) | 0.889 (52) | 0.635 (30)
SPG_WSIS | | 0.678 (45) | 1.000 (1) | 0.880 (44) | 0.836 (17) | 0.701 (38) | 0.727 (48) | 0.273 (49) | 0.607 (39) | 0.706 (36) | 0.541 (46) | 0.515 (45) | 0.174 (50) | 0.600 (32) | 0.857 (38) | 0.716 (43) | 0.846 (31) | 0.711 (32) | 1.000 (1) | 0.506 (45)
One_Thing_One_Click | permissive | 0.675 (46) | 1.000 (1) | 0.823 (48) | 0.782 (32) | 0.621 (44) | 0.766 (41) | 0.211 (51) | 0.736 (25) | 0.560 (50) | 0.586 (37) | 0.522 (43) | 0.636 (20) | 0.453 (50) | 0.641 (50) | 0.853 (25) | 0.850 (29) | 0.694 (36) | 0.997 (30) | 0.411 (50)
Zhengzhe Liu, Xiaojuan Qi, Chi-Wing Fu: One Thing One Click: A Self-Training Approach for Weakly Supervised 3D Semantic Segmentation. CVPR 2021
SegGroup_ins | permissive | 0.637 (47) | 1.000 (1) | 0.923 (32) | 0.593 (52) | 0.561 (50) | 0.746 (45) | 0.143 (56) | 0.504 (49) | 0.766 (30) | 0.485 (51) | 0.442 (49) | 0.372 (43) | 0.530 (42) | 0.714 (47) | 0.815 (33) | 0.775 (49) | 0.673 (41) | 1.000 (1) | 0.431 (49)
An Tao, Yueqi Duan, Yi Wei, Jiwen Lu, Jie Zhou: SegGroup: Seg-Level Supervision for 3D Instance and Semantic Segmentation. TIP 2022
MASC | permissive | 0.615 (48) | 0.711 (54) | 0.802 (50) | 0.540 (54) | 0.757 (23) | 0.777 (40) | 0.029 (57) | 0.577 (45) | 0.588 (49) | 0.521 (49) | 0.600 (30) | 0.436 (41) | 0.534 (41) | 0.697 (48) | 0.616 (48) | 0.838 (33) | 0.526 (49) | 0.980 (41) | 0.534 (43)
Chen Liu, Yasutaka Furukawa: MASC: Multi-scale Affinity with Sparse Convolution for 3D Instance Segmentation.
UNet-backbone | | 0.605 (49) | 1.000 (1) | 0.909 (35) | 0.764 (37) | 0.603 (47) | 0.704 (50) | 0.415 (39) | 0.301 (54) | 0.548 (51) | 0.461 (52) | 0.394 (50) | 0.267 (46) | 0.386 (52) | 0.857 (38) | 0.649 (47) | 0.817 (39) | 0.504 (50) | 0.959 (42) | 0.356 (53)
3D-SIS | permissive | 0.558 (50) | 1.000 (1) | 0.773 (51) | 0.614 (50) | 0.503 (52) | 0.691 (52) | 0.200 (52) | 0.412 (50) | 0.498 (54) | 0.546 (45) | 0.311 (55) | 0.103 (54) | 0.600 (32) | 0.857 (38) | 0.382 (52) | 0.799 (46) | 0.445 (56) | 0.938 (48) | 0.371 (51)
Ji Hou, Angela Dai, Matthias Niessner: 3D-SIS: 3D Semantic Instance Segmentation of RGB-D Scans. CVPR 2019
R-PointNet | | 0.544 (51) | 0.500 (57) | 0.655 (57) | 0.661 (46) | 0.663 (40) | 0.765 (42) | 0.432 (38) | 0.214 (56) | 0.612 (44) | 0.584 (38) | 0.499 (47) | 0.204 (49) | 0.286 (56) | 0.429 (54) | 0.655 (46) | 0.650 (58) | 0.539 (48) | 0.950 (43) | 0.499 (47)
Hier3D | copyleft | 0.540 (52) | 1.000 (1) | 0.727 (52) | 0.626 (49) | 0.467 (55) | 0.693 (51) | 0.200 (52) | 0.412 (50) | 0.480 (55) | 0.528 (47) | 0.318 (54) | 0.077 (57) | 0.600 (32) | 0.688 (49) | 0.382 (52) | 0.768 (51) | 0.472 (52) | 0.941 (47) | 0.350 (54)
Tan: HCFS3D: Hierarchical Coupled Feature Selection Network for 3D Semantic and Instance Segmentation.
Region-18class | | 0.497 (53) | 0.250 (59) | 0.902 (36) | 0.689 (44) | 0.540 (51) | 0.747 (43) | 0.276 (48) | 0.610 (38) | 0.268 (58) | 0.489 (50) | 0.348 (51) | 0.000 (58) | 0.243 (58) | 0.220 (57) | 0.663 (45) | 0.814 (41) | 0.459 (54) | 0.928 (50) | 0.496 (48)
tmp | | 0.474 (54) | 1.000 (1) | 0.727 (52) | 0.433 (57) | 0.481 (54) | 0.673 (54) | 0.022 (59) | 0.380 (52) | 0.517 (53) | 0.436 (55) | 0.338 (53) | 0.128 (52) | 0.343 (54) | 0.429 (54) | 0.291 (57) | 0.728 (54) | 0.473 (51) | 0.833 (55) | 0.300 (56)
SemRegionNet-20cls | | 0.470 (55) | 1.000 (1) | 0.727 (52) | 0.447 (56) | 0.481 (53) | 0.678 (53) | 0.024 (58) | 0.380 (52) | 0.518 (52) | 0.440 (54) | 0.339 (52) | 0.128 (52) | 0.350 (53) | 0.429 (54) | 0.212 (58) | 0.711 (55) | 0.465 (53) | 0.833 (55) | 0.290 (57)
ASIS | | 0.422 (56) | 0.333 (58) | 0.707 (55) | 0.676 (45) | 0.401 (56) | 0.650 (55) | 0.350 (42) | 0.177 (57) | 0.594 (47) | 0.376 (56) | 0.202 (56) | 0.077 (56) | 0.404 (51) | 0.571 (52) | 0.197 (59) | 0.674 (57) | 0.447 (55) | 0.500 (58) | 0.260 (58)
3D-BEVIS | | 0.401 (57) | 0.667 (55) | 0.687 (56) | 0.419 (58) | 0.137 (59) | 0.587 (57) | 0.188 (55) | 0.235 (55) | 0.359 (57) | 0.211 (58) | 0.093 (59) | 0.080 (55) | 0.311 (55) | 0.571 (52) | 0.382 (52) | 0.754 (52) | 0.300 (58) | 0.874 (54) | 0.357 (52)
Cathrin Elich, Francis Engelmann, Jonas Schult, Theodora Kontogianni, Bastian Leibe: 3D-BEVIS: Birds-Eye-View Instance Segmentation.
Sgpn_scannet | | 0.390 (58) | 0.556 (56) | 0.636 (58) | 0.493 (55) | 0.353 (57) | 0.539 (58) | 0.271 (50) | 0.160 (58) | 0.450 (56) | 0.359 (57) | 0.178 (57) | 0.146 (51) | 0.250 (57) | 0.143 (58) | 0.347 (56) | 0.698 (56) | 0.436 (57) | 0.667 (57) | 0.331 (55)
MaskRCNN 2d->3d Proj | | 0.261 (59) | 0.903 (53) | 0.081 (59) | 0.008 (59) | 0.233 (58) | 0.175 (59) | 0.280 (46) | 0.106 (59) | 0.150 (59) | 0.203 (59) | 0.175 (58) | 0.480 (36) | 0.218 (59) | 0.143 (58) | 0.542 (51) | 0.404 (59) | 0.153 (59) | 0.393 (59) | 0.049 (59)


This table lists the benchmark results for the 2D semantic label scenario.


Method | Info | avg iou | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | floor | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | wall | window
Virtual MVFusion (R) | | 0.745 (1) | 0.861 (1) | 0.839 (1) | 0.881 (1) | 0.672 (1) | 0.512 (1) | 0.422 (15) | 0.898 (1) | 0.723 (1) | 0.714 (1) | 0.954 (2) | 0.454 (1) | 0.509 (1) | 0.773 (1) | 0.895 (1) | 0.756 (1) | 0.820 (1) | 0.653 (1) | 0.935 (1) | 0.891 (1) | 0.728 (1)
Abhijit Kundu, Xiaoqi Yin, Alireza Fathi, David Ross, Brian Brewington, Thomas Funkhouser, Caroline Pantofaru: Virtual Multi-view Fusion for 3D Semantic Segmentation. ECCV 2020
BPNet_2D | copyleft | 0.670 (2) | 0.822 (3) | 0.795 (3) | 0.836 (2) | 0.659 (2) | 0.481 (2) | 0.451 (11) | 0.769 (3) | 0.656 (3) | 0.567 (3) | 0.931 (3) | 0.395 (4) | 0.390 (4) | 0.700 (3) | 0.534 (3) | 0.689 (9) | 0.770 (2) | 0.574 (3) | 0.865 (6) | 0.831 (3) | 0.675 (4)
Wenbo Hu, Hengshuang Zhao, Li Jiang, Jiaya Jia and Tien-Tsin Wong: Bidirectional Projection Network for Cross Dimension Scene Understanding. CVPR 2021 (Oral)
CU-Hybrid-2D Net | | 0.636 (3) | 0.825 (2) | 0.820 (2) | 0.179 (20) | 0.648 (3) | 0.463 (3) | 0.549 (2) | 0.742 (6) | 0.676 (2) | 0.628 (2) | 0.961 (1) | 0.420 (2) | 0.379 (5) | 0.684 (6) | 0.381 (15) | 0.732 (2) | 0.723 (3) | 0.599 (2) | 0.827 (13) | 0.851 (2) | 0.634 (6)
CMX | | 0.613 (4) | 0.681 (7) | 0.725 (9) | 0.502 (12) | 0.634 (5) | 0.297 (15) | 0.478 (9) | 0.830 (2) | 0.651 (4) | 0.537 (6) | 0.924 (4) | 0.375 (5) | 0.315 (12) | 0.686 (5) | 0.451 (12) | 0.714 (4) | 0.543 (18) | 0.504 (5) | 0.894 (4) | 0.823 (4) | 0.688 (3)
DMMF_3d | | 0.605 (5) | 0.651 (8) | 0.744 (7) | 0.782 (3) | 0.637 (4) | 0.387 (4) | 0.536 (3) | 0.732 (7) | 0.590 (6) | 0.540 (5) | 0.856 (18) | 0.359 (9) | 0.306 (13) | 0.596 (11) | 0.539 (2) | 0.627 (18) | 0.706 (4) | 0.497 (7) | 0.785 (18) | 0.757 (16) | 0.476 (19)
MCA-Net | | 0.595 (6) | 0.533 (17) | 0.756 (6) | 0.746 (4) | 0.590 (8) | 0.334 (7) | 0.506 (6) | 0.670 (12) | 0.587 (7) | 0.500 (10) | 0.905 (8) | 0.366 (8) | 0.352 (8) | 0.601 (10) | 0.506 (6) | 0.669 (15) | 0.648 (7) | 0.501 (6) | 0.839 (12) | 0.769 (12) | 0.516 (18)
RFBNet | | 0.592 (7) | 0.616 (9) | 0.758 (5) | 0.659 (5) | 0.581 (9) | 0.330 (8) | 0.469 (10) | 0.655 (15) | 0.543 (12) | 0.524 (7) | 0.924 (4) | 0.355 (10) | 0.336 (10) | 0.572 (14) | 0.479 (8) | 0.671 (13) | 0.648 (7) | 0.480 (9) | 0.814 (16) | 0.814 (5) | 0.614 (9)
FAN_NV_RVC | | 0.586 (8) | 0.510 (18) | 0.764 (4) | 0.079 (23) | 0.620 (7) | 0.330 (8) | 0.494 (7) | 0.753 (4) | 0.573 (8) | 0.556 (4) | 0.884 (13) | 0.405 (3) | 0.303 (14) | 0.718 (2) | 0.452 (11) | 0.672 (12) | 0.658 (5) | 0.509 (4) | 0.898 (3) | 0.813 (6) | 0.727 (2)
DCRedNet | | 0.583 (9) | 0.682 (6) | 0.723 (10) | 0.542 (11) | 0.510 (17) | 0.310 (12) | 0.451 (11) | 0.668 (13) | 0.549 (11) | 0.520 (8) | 0.920 (6) | 0.375 (5) | 0.446 (2) | 0.528 (17) | 0.417 (13) | 0.670 (14) | 0.577 (15) | 0.478 (10) | 0.862 (7) | 0.806 (7) | 0.628 (8)
MIX6D_RVC | | 0.582 (10) | 0.695 (4) | 0.687 (14) | 0.225 (18) | 0.632 (6) | 0.328 (10) | 0.550 (1) | 0.748 (5) | 0.623 (5) | 0.494 (13) | 0.890 (11) | 0.350 (12) | 0.254 (20) | 0.688 (4) | 0.454 (10) | 0.716 (3) | 0.597 (14) | 0.489 (8) | 0.881 (5) | 0.768 (13) | 0.575 (12)
SSMA | copyleft | 0.577 (11) | 0.695 (4) | 0.716 (12) | 0.439 (14) | 0.563 (11) | 0.314 (11) | 0.444 (13) | 0.719 (8) | 0.551 (10) | 0.503 (9) | 0.887 (12) | 0.346 (13) | 0.348 (9) | 0.603 (9) | 0.353 (17) | 0.709 (5) | 0.600 (12) | 0.457 (12) | 0.901 (2) | 0.786 (8) | 0.599 (11)
Abhinav Valada, Rohit Mohan, Wolfram Burgard: Self-Supervised Model Adaptation for Multimodal Semantic Segmentation. International Journal of Computer Vision, 2019
UNIV_CNP_RVC_UE | | 0.566 (12) | 0.569 (16) | 0.686 (16) | 0.435 (15) | 0.524 (14) | 0.294 (16) | 0.421 (16) | 0.712 (9) | 0.543 (12) | 0.463 (15) | 0.872 (14) | 0.320 (14) | 0.363 (7) | 0.611 (8) | 0.477 (9) | 0.686 (10) | 0.627 (9) | 0.443 (15) | 0.862 (7) | 0.775 (11) | 0.639 (5)
EMSAFormer | | 0.564 (13) | 0.581 (13) | 0.736 (8) | 0.564 (10) | 0.546 (13) | 0.219 (20) | 0.517 (4) | 0.675 (11) | 0.486 (17) | 0.427 (19) | 0.904 (9) | 0.352 (11) | 0.320 (11) | 0.589 (12) | 0.528 (4) | 0.708 (6) | 0.464 (21) | 0.413 (19) | 0.847 (11) | 0.786 (8) | 0.611 (10)
SN_RN152pyrx8_RVC | copyleft | 0.546 (14) | 0.572 (14) | 0.663 (18) | 0.638 (7) | 0.518 (15) | 0.298 (14) | 0.366 (21) | 0.633 (18) | 0.510 (15) | 0.446 (17) | 0.864 (16) | 0.296 (17) | 0.267 (17) | 0.542 (16) | 0.346 (18) | 0.704 (7) | 0.575 (16) | 0.431 (16) | 0.853 (10) | 0.766 (14) | 0.630 (7)
UDSSEG_RVC | | 0.545 (15) | 0.610 (11) | 0.661 (19) | 0.588 (8) | 0.556 (12) | 0.268 (18) | 0.482 (8) | 0.642 (17) | 0.572 (9) | 0.475 (14) | 0.836 (20) | 0.312 (15) | 0.367 (6) | 0.630 (7) | 0.189 (20) | 0.639 (17) | 0.495 (20) | 0.452 (13) | 0.826 (14) | 0.756 (17) | 0.541 (14)
segfomer with 6d | | 0.542 (16) | 0.594 (12) | 0.687 (14) | 0.146 (21) | 0.579 (10) | 0.308 (13) | 0.515 (5) | 0.703 (10) | 0.472 (18) | 0.498 (11) | 0.868 (15) | 0.369 (7) | 0.282 (15) | 0.589 (12) | 0.390 (14) | 0.701 (8) | 0.556 (17) | 0.416 (18) | 0.860 (9) | 0.759 (15) | 0.539 (16)
FuseNet | permissive | 0.535 (17) | 0.570 (15) | 0.681 (17) | 0.182 (19) | 0.512 (16) | 0.290 (17) | 0.431 (14) | 0.659 (14) | 0.504 (16) | 0.495 (12) | 0.903 (10) | 0.308 (16) | 0.428 (3) | 0.523 (18) | 0.365 (16) | 0.676 (11) | 0.621 (11) | 0.470 (11) | 0.762 (19) | 0.779 (10) | 0.541 (14)
Caner Hazirbas, Lingni Ma, Csaba Domokos, Daniel Cremers: FuseNet: Incorporating Depth into Semantic Segmentation via Fusion-based CNN Architecture. ACCV 2016
AdapNet++ | copyleft | 0.503 (18) | 0.613 (10) | 0.722 (11) | 0.418 (16) | 0.358 (23) | 0.337 (6) | 0.370 (20) | 0.479 (21) | 0.443 (19) | 0.368 (21) | 0.907 (7) | 0.207 (20) | 0.213 (22) | 0.464 (21) | 0.525 (5) | 0.618 (19) | 0.657 (6) | 0.450 (14) | 0.788 (17) | 0.721 (20) | 0.408 (22)
Abhinav Valada, Rohit Mohan, Wolfram Burgard: Self-Supervised Model Adaptation for Multimodal Semantic Segmentation. International Journal of Computer Vision, 2019
3DMV (2d proj) | | 0.498 (19) | 0.481 (21) | 0.612 (20) | 0.579 (9) | 0.456 (19) | 0.343 (5) | 0.384 (18) | 0.623 (19) | 0.525 (14) | 0.381 (20) | 0.845 (19) | 0.254 (19) | 0.264 (19) | 0.557 (15) | 0.182 (21) | 0.581 (21) | 0.598 (13) | 0.429 (17) | 0.760 (20) | 0.661 (22) | 0.446 (21)
Angela Dai, Matthias Niessner: 3DMV: Joint 3D-Multi-View Prediction for 3D Semantic Scene Segmentation. ECCV'18
MSeg1080_RVC | permissive | 0.485 (20) | 0.505 (19) | 0.709 (13) | 0.092 (22) | 0.427 (20) | 0.241 (19) | 0.411 (17) | 0.654 (16) | 0.385 (23) | 0.457 (16) | 0.861 (17) | 0.053 (23) | 0.279 (16) | 0.503 (19) | 0.481 (7) | 0.645 (16) | 0.626 (10) | 0.365 (21) | 0.748 (21) | 0.725 (19) | 0.529 (17)
John Lambert*, Zhuang Liu*, Ozan Sener, James Hays, Vladlen Koltun: MSeg: A Composite Dataset for Multi-domain Semantic Segmentation. CVPR 2020
ILC-PSPNet | | 0.475 (21) | 0.490 (20) | 0.581 (21) | 0.289 (17) | 0.507 (18) | 0.067 (23) | 0.379 (19) | 0.610 (20) | 0.417 (21) | 0.435 (18) | 0.822 (22) | 0.278 (18) | 0.267 (17) | 0.503 (19) | 0.228 (19) | 0.616 (20) | 0.533 (19) | 0.375 (20) | 0.820 (15) | 0.729 (18) | 0.560 (13)
Enet (reimpl) | | 0.376 (22) | 0.264 (23) | 0.452 (23) | 0.452 (13) | 0.365 (21) | 0.181 (21) | 0.143 (23) | 0.456 (22) | 0.409 (22) | 0.346 (22) | 0.769 (23) | 0.164 (21) | 0.218 (21) | 0.359 (22) | 0.123 (23) | 0.403 (23) | 0.381 (23) | 0.313 (23) | 0.571 (22) | 0.685 (21) | 0.472 (20)
Re-implementation of Adam Paszke, Abhishek Chaurasia, Sangpil Kim, Eugenio Culurciello: ENet: A Deep Neural Network Architecture for Real-Time Semantic Segmentation.
ScanNet (2d proj) | permissive | 0.330 (23) | 0.293 (22) | 0.521 (22) | 0.657 (6) | 0.361 (22) | 0.161 (22) | 0.250 (22) | 0.004 (23) | 0.440 (20) | 0.183 (23) | 0.836 (20) | 0.125 (22) | 0.060 (23) | 0.319 (23) | 0.132 (22) | 0.417 (22) | 0.412 (22) | 0.344 (22) | 0.541 (23) | 0.427 (23) | 0.109 (23)
Angela Dai, Angel X. Chang, Manolis Savva, Maciej Halber, Thomas Funkhouser, Matthias Nießner: ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes. CVPR'17
DMMF | | 0.003 (24) | 0.000 (24) | 0.005 (24) | 0.000 (24) | 0.000 (24) | 0.037 (24) | 0.001 (24) | 0.000 (24) | 0.001 (24) | 0.005 (24) | 0.003 (24) | 0.000 (24) | 0.000 (24) | 0.000 (24) | 0.000 (24) | 0.000 (24) | 0.002 (24) | 0.001 (24) | 0.000 (24) | 0.006 (24) | 0.000 (24)
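
The avg iou column appears to be the unweighted mean of the 20 per-class IoU columns. The short Python sketch below checks this for the Virtual MVFusion (R) row, using values copied from the table above. The variable names are illustrative only, and because the published per-class values are rounded to three decimals, the comparison uses a small tolerance rather than exact equality.

```python
from statistics import mean

# Per-class IoU for the "Virtual MVFusion (R)" row, copied from the table above.
virtual_mvfusion_iou = {
    "bathtub": 0.861, "bed": 0.839, "bookshelf": 0.881, "cabinet": 0.672,
    "chair": 0.512, "counter": 0.422, "curtain": 0.898, "desk": 0.723,
    "door": 0.714, "floor": 0.954, "otherfurniture": 0.454, "picture": 0.509,
    "refrigerator": 0.773, "shower curtain": 0.895, "sink": 0.756,
    "sofa": 0.820, "table": 0.653, "toilet": 0.935, "wall": 0.891,
    "window": 0.728,
}

reported_avg_iou = 0.745  # "avg iou" value listed for the same row

# Assumption: the average is an unweighted (macro) mean over classes.
macro_avg_iou = mean(virtual_mvfusion_iou.values())

# Inputs are rounded to 3 decimals, so allow a tolerance of one rounding step.
matches_reported = abs(macro_avg_iou - reported_avg_iou) < 1e-3
print(f"macro mean = {macro_avg_iou:.4f}, matches reported: {matches_reported}")
```

The same check can be repeated for any other row by swapping in its per-class values; it is a consistency check on the published numbers, not a description of the evaluation server's code.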


This table lists the benchmark results for the 2D semantic instance scenario.




Method | Info | avg ap | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | window
UniDet_RVC | | 0.205 (1) | 0.381 (1) | 0.323 (1) | 0.037 (1) | 0.226 (1) | 0.177 (1) | 0.063 (1) | 0.277 (1) | 0.120 (1) | 0.067 (1) | 0.131 (1) | 0.074 (2) | 0.317 (1) | 0.080 (1) | 0.235 (1) | 0.289 (1) | 0.141 (1) | 0.678 (1) | 0.080 (1)
MaskRCNN_ScanNet | permissive | 0.119 (2) | 0.129 (2) | 0.212 (2) | 0.002 (2) | 0.112 (2) | 0.148 (2) | 0.014 (2) | 0.205 (2) | 0.044 (2) | 0.066 (2) | 0.078 (2) | 0.095 (1) | 0.142 (2) | 0.030 (2) | 0.128 (2) | 0.139 (2) | 0.080 (2) | 0.459 (2) | 0.057 (2)
Re-implementation of Kaiming He, Georgia Gkioxari, Piotr Dollár, Ross Girshick: Mask R-CNN. ICCV'17


This table lists the benchmark results for the scene type classification scenario.




Method | Info | avg recall | apartment | bathroom | bedroom / hotel | bookstore / library | conference room | copy/mail room | hallway | kitchen | laundry room | living room / lounge | misc | office | storage / basement / garage
multi-task | permissive | 0.700 (1) | 0.500 (1) | 1.000 (1) | 0.882 (2) | 0.500 (2) | 1.000 (1) | 1.000 (1) | 0.500 (2) | 1.000 (1) | 1.000 (1) | 0.778 (1) | 0.000 (2) | 0.938 (1) | 0.000 (2)
Shengyu Huang, Mikhail Usvyatsov, Konrad Schindler: Indoor Scene Recognition in 3D. IROS 2020
3DASPP-SCE | | 0.691 (2) | 0.500 (1) | 0.938 (2) | 0.824 (3) | 1.000 (1) | 1.000 (1) | 0.500 (2) | 1.000 (1) | 0.857 (2) | 0.500 (2) | 0.556 (3) | 0.000 (2) | 0.812 (2) | 0.500 (1)
SE-ResNeXt-SSMA | | 0.498 (3) | 0.000 (4) | 0.812 (3) | 0.941 (1) | 0.500 (2) | 0.500 (3) | 0.500 (2) | 0.500 (2) | 0.429 (4) | 0.500 (2) | 0.667 (2) | 0.500 (1) | 0.625 (3) | 0.000 (2)
Abhinav Valada, Rohit Mohan, Wolfram Burgard: Self-Supervised Model Adaptation for Multimodal Semantic Segmentation. arXiv
resnet50_scannet | | 0.353 (4) | 0.250 (3) | 0.812 (3) | 0.529 (4) | 0.500 (2) | 0.500 (3) | 0.000 (4) | 0.500 (2) | 0.571 (3) | 0.000 (4) | 0.556 (3) | 0.000 (2) | 0.375 (4) | 0.000 (2)
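
The small integer next to each score is the method's rank within that column. Judging from the rows above, tied scores share the best rank and the ranks that follow are skipped (standard competition ranking, "1, 1, 3, 4"). The minimal Python sketch below assumes that convention and reproduces the ranks of the apartment and misc columns; the function name `competition_ranks` is illustrative and not part of any ScanNet tooling.

```python
def competition_ranks(scores):
    """Rank scores from highest to lowest; ties share the best rank.

    A score's rank is 1 plus the number of strictly higher scores,
    so the ranks following a tie are skipped.
    """
    return [1 + sum(other > score for other in scores) for score in scores]


# Row order: multi-task, 3DASPP-SCE, SE-ResNeXt-SSMA, resnet50_scannet
apartment_recall = [0.500, 0.500, 0.000, 0.250]
misc_recall = [0.000, 0.000, 0.500, 0.000]

apartment_ranks = competition_ranks(apartment_recall)
misc_ranks = competition_ranks(misc_recall)

print(apartment_ranks)
print(misc_ranks)
```

Under this assumption, three methods tied at 0.000 in the misc column all receive rank 2, which is what the table lists.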