The 3D semantic labeling task involves predicting a semantic label for each vertex of a 3D scan mesh.

Evaluation and metrics

Our evaluation ranks all methods according to the PASCAL VOC intersection-over-union metric (IoU). IoU = TP/(TP+FP+FN), where TP, FP, and FN are the numbers of true positive, false positive, and false negative vertices, respectively. Predicted labels are evaluated per-vertex over the respective 3D scan mesh; 3D approaches that operate on other representations, such as grids or points, should map their predicted labels onto the mesh vertices (an example of mapping grid predictions to mesh vertices is provided in the evaluation helpers).
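As an illustration of the metric described above, the following is a minimal sketch, not the benchmark's official evaluation code. The function names `voxel_labels_to_vertices` and `per_class_iou`, the dense `(X, Y, Z)` label grid layout, and the `origin` and `voxel_size` parameters are all assumptions made for this example. It first assigns each mesh vertex the label of the voxel containing it, then computes IoU = TP/(TP+FP+FN) per class over vertices.

```python
import numpy as np


def voxel_labels_to_vertices(vertices, grid_labels, origin, voxel_size):
    """Assign each mesh vertex the label of the voxel that contains it.

    vertices:    (N, 3) float array of vertex positions
    grid_labels: (X, Y, Z) int array of predicted labels (assumed layout)
    origin:      (3,) position of the grid's minimum corner (assumed)
    voxel_size:  edge length of one voxel (assumed uniform)
    """
    idx = np.floor((vertices - origin) / voxel_size).astype(int)
    # Clamp indices so vertices on or outside the boundary use the nearest voxel.
    idx = np.clip(idx, 0, np.array(grid_labels.shape) - 1)
    return grid_labels[idx[:, 0], idx[:, 1], idx[:, 2]]


def per_class_iou(pred, gt, class_ids):
    """Return {class_id: IoU}, with TP, FP, and FN counted over vertices."""
    ious = {}
    for c in class_ids:
        tp = np.sum((pred == c) & (gt == c))
        fp = np.sum((pred == c) & (gt != c))
        fn = np.sum((pred != c) & (gt == c))
        denom = tp + fp + fn
        # A class absent from both prediction and ground truth has no defined IoU.
        ious[c] = float(tp) / float(denom) if denom > 0 else float("nan")
    return ious


# Toy example: six vertices, three classes.
gt = np.array([1, 1, 2, 2, 2, 3])
pred = np.array([1, 2, 2, 2, 3, 3])
ious = per_class_iou(pred, gt, [1, 2, 3])
mean_iou = float(np.nanmean(list(ious.values())))

# Toy grid of two voxels along x, mapped onto two vertices.
grid = np.array([[[5]], [[7]]])
verts = np.array([[0.1, 0.1, 0.1], [1.5, 0.2, 0.2]])
vertex_labels = voxel_labels_to_vertices(verts, grid, np.zeros(3), 1.0)
```

Averaging the per-class values, as `mean_iou` does here, gives a single score per method; how the benchmark handles unlabeled vertices or classes missing from a scene is not covered by this sketch.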



This table lists the benchmark results for the ScanNet200 3D semantic label scenario. For each column, a method's entry gives its IoU followed by its rank among the listed methods.




Method | Info | avg iou | head iou | common iou | tail iou | wall | chair | floor | table | door | couch | cabinet | shelf | desk | office chair | bed | pillow | sink | picture | window | toilet | bookshelf | monitor | curtain | book | armchair | coffee table | box | refrigerator | lamp | kitchen cabinet | towel | clothes | tv | nightstand | counter | dresser | stool | cushion | plant | ceiling | bathtub | end table | dining table | keyboard | bag | backpack | toilet paper | printer | tv stand | whiteboard | blanket | shower curtain | trash can | closet | stairs | microwave | stove | shoe | computer tower | bottle | bin | ottoman | bench | board | washing machine | mirror | copier | basket | sofa chair | file cabinet | fan | laptop | shower | paper | person | paper towel dispenser | oven | blinds | rack | plate | blackboard | piano | suitcase | rail | radiator | recycling bin | container | wardrobe | soap dispenser | telephone | bucket | clock | stand | light | laundry basket | pipe | clothes dryer | guitar | toilet paper holder | seat | speaker | column | bicycle | ladder | bathroom stall | shower wall | cup | jacket | storage bin | coffee maker | dishwasher | paper towel roll | machine | mat | windowsill | bar | toaster | bulletin board | ironing board | fireplace | soap dish | kitchen counter | doorframe | toilet paper dispenser | mini fridge | fire extinguisher | ball | hat | shower curtain rod | water cooler | paper cutter | tray | shower door | pillar | ledge | toaster oven | mouse | toilet seat cover dispenser | furniture | cart | storage container | scale | tissue box | light switch | crate | power outlet | decoration | sign | projector | closet door | vacuum cleaner | candle | plunger | stuffed animal | headphones | dish rack | broom | guitar case | range hood | dustpan | hair dryer | water bottle | handicap bar | purse | vent | shower floor | water pitcher | mailbox | bowl | paper bag | alarm clock | music stand | projector screen | divider | laundry detergent | bathroom counter | object | bathroom vanity | closet wall | laundry hamper | bathroom stall door | ceiling light | trash bin | dumbbell | stair rail | tube | bathroom cabinet | cd case | closet rod | coffee kettle | structure | shower head | keyboard piano | case of water bottles | coat rack | storage organizer | folded chair | fire alarm | power strip | calendar | poster | potted plant | luggage | mattress
OctFormer ScanNet200permissive0.326 40.539 40.265 30.131 30.806 30.670 40.943 30.535 40.662 10.705 70.423 40.407 30.505 50.003 60.765 40.582 20.686 60.227 70.680 30.943 30.601 10.854 50.892 10.335 10.417 80.357 40.724 50.453 30.632 30.596 10.432 20.783 40.512 80.021 60.244 60.637 10.000 10.787 30.873 30.743 60.000 80.000 50.534 30.110 10.499 20.289 30.626 30.620 60.168 80.204 10.849 20.679 20.117 10.633 40.684 10.650 30.552 10.684 50.312 20.000 30.175 30.429 40.865 20.413 10.837 40.000 10.145 30.626 30.451 20.487 40.513 10.000 10.529 30.613 40.000 40.033 20.000 10.000 30.828 10.871 10.622 20.587 40.411 20.137 60.645 50.343 30.000 30.000 30.000 10.022 60.000 20.026 80.829 50.000 10.022 40.089 30.842 10.253 70.318 80.296 10.178 30.291 20.224 10.584 20.200 60.132 40.000 30.128 20.227 70.000 10.230 40.047 50.149 20.331 40.412 30.618 20.164 40.102 40.522 10.000 10.655 20.378 40.469 60.000 10.000 50.000 30.105 30.000 40.000 30.483 20.000 30.000 20.028 20.000 10.000 10.906 10.000 10.339 60.000 10.000 40.457 40.000 10.612 30.000 10.000 10.408 10.000 70.900 40.000 40.000 30.000 10.029 30.000 10.074 80.455 60.479 20.427 40.079 60.140 60.496 30.414 50.022 10.000 10.471 50.000 10.000 20.000 40.722 20.000 20.000 10.000 10.138 50.000 20.000 20.000 30.000 1
Peng-Shuai Wang: OctFormer: Octree-based Transformers for 3D Point Clouds. SIGGRAPH 2023
CeCo0.340 10.551 30.247 40.181 10.784 40.661 50.939 40.564 20.624 40.721 30.484 20.429 20.575 10.027 40.774 30.503 50.753 10.242 40.656 40.945 20.534 20.865 40.860 30.177 80.616 30.400 20.818 10.579 10.615 40.367 50.408 30.726 60.633 10.162 10.360 20.619 20.000 10.828 10.873 30.924 10.109 50.083 20.564 10.057 80.475 40.266 40.781 10.767 30.257 30.100 50.825 30.663 40.048 80.620 60.551 40.595 60.532 30.692 40.246 30.000 30.213 20.615 10.861 40.376 20.900 10.000 10.102 70.660 20.321 60.547 10.226 40.000 10.311 40.742 10.011 30.006 30.000 10.000 30.546 80.824 30.345 50.665 10.450 10.435 10.683 10.411 10.338 10.000 30.000 10.030 40.000 20.068 40.892 30.000 10.063 20.000 70.257 40.304 60.387 30.079 60.228 10.190 30.000 80.586 10.347 10.133 30.000 30.037 40.377 40.000 10.384 20.006 70.003 50.421 10.410 40.643 10.171 30.121 20.142 70.000 10.510 60.447 30.474 50.000 10.000 50.286 10.083 40.000 40.000 30.603 10.096 10.063 10.000 40.000 10.000 10.898 30.000 10.429 20.000 10.400 10.550 10.000 10.633 20.000 10.000 10.377 20.000 70.916 10.000 40.000 30.000 10.000 40.000 10.102 70.499 40.296 50.463 30.089 40.304 10.740 10.401 70.010 20.000 10.560 10.000 10.000 20.709 10.652 40.000 20.000 10.000 10.143 30.000 20.000 20.609 10.000 1
Understanding Imbalanced Semantic Segmentation Through Neural Collapse.
PPT-SpUNet-F.T.0.332 30.556 20.270 10.123 50.816 20.682 20.946 10.549 30.657 30.756 10.459 30.376 40.550 30.001 70.807 10.616 10.727 30.267 20.691 20.942 40.530 40.872 30.874 20.330 30.542 50.374 30.792 20.400 50.673 10.572 20.433 10.793 30.623 20.008 80.351 30.594 50.000 10.783 40.876 10.833 20.213 20.000 50.537 20.091 20.519 10.304 20.620 40.942 10.264 10.124 30.855 10.695 10.086 30.646 30.506 70.658 20.535 20.715 20.314 10.000 30.241 10.608 20.897 10.359 30.858 30.000 10.076 80.611 40.392 30.509 30.378 20.000 10.579 10.565 70.000 40.000 40.000 10.000 30.755 30.806 40.661 10.572 60.350 40.181 40.660 30.300 50.000 30.000 30.000 10.023 50.000 20.042 70.930 10.000 10.000 60.077 40.584 20.392 30.339 60.185 30.171 40.308 10.006 70.563 30.256 30.150 10.000 30.002 70.345 60.000 10.045 50.197 10.063 30.323 50.453 10.600 30.163 50.037 60.349 20.000 10.672 10.679 10.753 10.000 10.000 50.000 30.117 10.000 40.000 30.291 50.000 30.000 20.039 10.000 10.000 10.899 20.000 10.374 50.000 10.000 40.545 20.000 10.634 10.000 10.000 10.074 50.223 30.914 30.000 40.021 20.000 10.000 40.000 10.112 30.498 50.649 10.383 50.095 10.135 70.449 50.432 40.008 30.000 10.518 20.000 10.000 20.000 40.796 10.000 20.000 10.000 10.138 50.000 20.000 20.000 30.000 1
Xiaoyang Wu, Zhuotao Tian, Xin Wen, Bohao Peng, Xihui Liu, Kaicheng Yu, Hengshuang Zhao: Towards Large-scale 3D Representation Learning with Multi-dataset Point Prompt Training.
OA-CNN-L_ScanNet2000.333 20.558 10.269 20.124 40.821 10.703 10.946 10.569 10.662 10.748 20.487 10.455 10.572 20.000 80.789 20.534 30.736 20.271 10.713 10.949 10.498 70.877 20.860 30.332 20.706 10.474 10.788 30.406 40.637 20.495 30.355 40.805 20.592 60.015 70.396 10.602 40.000 10.799 20.876 10.713 80.276 10.000 50.493 40.080 40.448 60.363 10.661 20.833 20.262 20.125 20.823 40.665 30.076 40.720 10.557 30.637 40.517 40.672 60.227 40.000 30.158 40.496 30.843 50.352 40.835 50.000 10.103 60.711 10.527 10.526 20.320 30.000 10.568 20.625 30.067 10.000 40.000 10.001 20.806 20.836 20.621 30.591 30.373 30.314 20.668 20.398 20.003 20.000 30.000 10.016 80.024 10.043 60.906 20.000 10.052 30.000 70.384 30.330 50.342 50.100 40.223 20.183 40.112 30.476 40.313 20.130 50.196 20.112 30.370 50.000 10.234 30.071 40.160 10.403 20.398 50.492 70.197 10.076 50.272 30.000 10.200 80.560 20.735 30.000 10.000 50.000 30.110 20.002 30.021 20.412 30.000 30.000 20.000 40.000 10.000 10.794 40.000 10.445 10.000 10.022 20.509 30.000 10.517 70.000 10.000 10.001 80.245 20.915 20.024 20.089 10.000 10.262 10.000 10.103 60.524 20.392 40.515 20.013 80.251 30.411 60.662 10.001 50.000 10.473 40.000 10.000 20.150 30.699 30.000 20.000 10.000 10.166 20.000 20.024 10.000 30.000 1
AWCS0.305 50.508 50.225 50.142 20.782 50.634 80.937 50.489 60.578 50.721 30.364 60.355 50.515 40.023 50.764 50.523 40.707 50.264 30.633 50.922 50.507 60.886 10.804 60.179 60.436 70.300 50.656 70.529 20.501 60.394 40.296 70.820 10.603 30.131 20.179 80.619 20.000 10.707 70.865 50.773 30.171 30.010 40.484 50.063 60.463 50.254 50.332 70.649 50.220 50.100 50.729 60.613 60.071 60.582 70.628 20.702 10.424 60.749 10.137 60.000 30.142 50.360 50.863 30.305 50.877 20.000 10.173 10.606 50.337 50.478 50.154 60.000 10.253 50.664 20.000 40.000 40.000 10.000 30.626 60.782 50.302 70.602 20.185 70.282 30.651 40.317 40.000 30.000 30.000 10.022 60.000 20.154 10.876 40.000 10.014 50.063 60.029 80.553 10.467 20.084 50.124 50.157 70.049 60.373 50.252 40.097 60.000 30.219 10.542 10.000 10.392 10.172 30.000 70.339 30.417 20.533 60.093 60.115 30.195 50.000 10.516 50.288 70.741 20.000 10.001 40.233 20.056 50.000 40.159 10.334 40.077 20.000 20.000 40.000 10.000 10.749 50.000 10.411 30.000 10.008 30.452 50.000 10.595 40.000 10.000 10.220 40.006 50.894 60.006 30.000 30.000 10.000 40.000 10.112 30.504 30.404 30.551 10.093 30.129 80.484 40.381 80.000 60.000 10.396 60.000 10.000 20.620 20.402 80.000 20.000 10.000 10.142 40.000 20.000 20.512 20.000 1
LGroundpermissive0.272 60.485 60.184 60.106 60.778 60.676 30.932 60.479 80.572 60.718 50.399 50.265 60.453 70.085 20.745 60.446 60.726 40.232 60.622 60.901 60.512 50.826 60.786 70.178 70.549 40.277 60.659 60.381 60.518 50.295 80.323 50.777 50.599 40.028 40.321 40.363 70.000 10.708 60.858 60.746 50.063 60.022 30.457 60.077 50.476 30.243 60.402 50.397 80.233 40.077 80.720 80.610 70.103 20.629 50.437 80.626 50.446 50.702 30.190 50.005 10.058 70.322 60.702 70.244 60.768 60.000 10.134 50.552 60.279 70.395 60.147 70.000 10.207 60.612 50.000 40.000 40.000 10.000 30.658 50.566 60.323 60.525 80.229 60.179 50.467 80.154 70.000 30.002 10.000 10.051 10.000 20.127 20.703 60.000 10.000 60.216 10.112 70.358 40.547 10.187 20.092 70.156 80.055 50.296 60.252 40.143 20.000 30.014 50.398 30.000 10.028 70.173 20.000 70.265 70.348 60.415 80.179 20.019 70.218 40.000 10.597 40.274 80.565 40.000 10.012 30.000 30.039 70.022 20.000 30.117 60.000 30.000 20.000 40.000 10.000 10.324 70.000 10.384 40.000 10.000 40.251 80.000 10.566 50.000 10.000 10.066 60.404 10.886 70.199 10.000 30.000 10.059 20.000 10.136 10.540 10.127 80.295 60.085 50.143 50.514 20.413 60.000 60.000 10.498 30.000 10.000 20.000 40.623 50.000 20.000 10.000 10.132 70.000 20.000 20.000 30.000 1
David Rozenberszki, Or Litany, Angela Dai: Language-Grounded Indoor 3D Semantic Segmentation in the Wild. arXiv
CSC-Pretrainpermissive0.249 80.455 80.171 70.079 80.766 80.659 60.930 80.494 50.542 80.700 80.314 80.215 80.430 80.121 10.697 80.441 70.683 70.235 50.609 80.895 70.476 80.816 70.770 80.186 50.634 20.216 80.734 40.340 70.471 70.307 70.293 80.591 80.542 70.076 30.205 70.464 60.000 10.484 80.832 80.766 40.052 70.000 50.413 70.059 70.418 70.222 70.318 80.609 70.206 70.112 40.743 50.625 50.076 40.579 80.548 50.590 70.371 70.552 80.081 70.003 20.142 50.201 80.638 80.233 70.686 80.000 10.142 40.444 80.375 40.247 80.198 50.000 10.128 80.454 80.019 20.097 10.000 10.000 30.553 70.557 70.373 40.545 70.164 80.014 80.547 70.174 60.000 30.002 10.000 10.037 20.000 20.063 50.664 80.000 10.000 60.130 20.170 50.152 80.335 70.079 60.110 60.175 50.098 40.175 80.166 70.045 80.207 10.014 50.465 20.000 10.001 80.001 80.046 40.299 60.327 70.537 50.033 70.012 80.186 60.000 10.205 70.377 50.463 70.000 10.058 20.000 30.055 60.041 10.000 30.105 70.000 30.000 20.000 40.000 10.000 10.398 60.000 10.308 80.000 10.000 40.319 60.000 10.543 60.000 10.000 10.062 70.004 60.862 80.000 40.000 30.000 10.000 40.000 10.123 20.316 70.225 60.250 70.094 20.180 40.332 70.441 30.000 60.000 10.310 80.000 10.000 20.000 40.592 60.000 20.000 10.000 10.203 10.000 20.000 20.000 30.000 1
Ji Hou, Benjamin Graham, Matthias Nießner, Saining Xie: Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts. CVPR 2021
Minkowski 34Dpermissive0.253 70.463 70.154 80.102 70.771 70.650 70.932 60.483 70.571 70.710 60.331 70.250 70.492 60.044 30.703 70.419 80.606 80.227 70.621 70.865 80.531 30.771 80.813 50.291 40.484 60.242 70.612 80.282 80.440 80.351 60.299 60.622 70.593 50.027 50.293 50.310 80.000 10.757 50.858 60.737 70.150 40.164 10.368 80.084 30.381 80.142 80.357 60.720 40.214 60.092 70.724 70.596 80.056 70.655 20.525 60.581 80.352 80.594 70.056 80.000 30.014 80.224 70.772 60.205 80.720 70.000 10.159 20.531 70.163 80.294 70.136 80.000 10.169 70.589 60.000 40.000 40.000 10.002 10.663 40.466 80.265 80.582 50.337 50.016 70.559 60.084 80.000 30.000 30.000 10.036 30.000 20.125 30.670 70.000 10.102 10.071 50.164 60.406 20.386 40.046 80.068 80.159 60.117 20.284 70.111 80.094 70.000 30.000 80.197 80.000 10.044 60.013 60.002 60.228 80.307 80.588 40.025 80.545 10.134 80.000 10.655 20.302 60.282 80.000 10.060 10.000 30.035 80.000 40.000 30.097 80.000 30.000 20.005 30.000 10.000 10.096 80.000 10.334 70.000 10.000 40.274 70.000 10.513 80.000 10.000 10.280 30.194 40.897 50.000 40.000 30.000 10.000 40.000 10.108 50.279 80.189 70.141 80.059 70.272 20.307 80.445 20.003 40.000 10.353 70.000 10.026 10.000 40.581 70.001 10.000 10.000 10.093 80.002 10.000 20.000 30.000 1
C. Choy, J. Gwak, S. Savarese: 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks. CVPR 2019