The 3D semantic labeling task involves predicting a semantic labeling of a 3D scan mesh.

Evaluation and metrics

Our evaluation ranks all methods according to the PASCAL VOC intersection-over-union metric (IoU). IoU = TP/(TP+FP+FN), where TP, FP, and FN are the numbers of true positive, false positive, and false negative vertices, respectively. Predicted labels are evaluated per vertex over the respective 3D scan mesh; 3D approaches that operate on other representations, such as grids or points, should map their predicted labels onto the mesh vertices (an example of mapping grid predictions to mesh vertices is provided in the evaluation helpers).
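As an illustration of the metric and of the mapping step, the following is a minimal sketch rather than the benchmark's own evaluation code. The function names, the `ignore_label` convention for unannotated vertices, and the brute-force nearest-neighbour transfer are all assumptions made for this example; the official evaluation helpers may differ.

```python
import numpy as np


def per_class_iou(pred, gt, num_classes, ignore_label=-1):
    """Per-class IoU = TP / (TP + FP + FN) over per-vertex labels.

    ignore_label is an assumed convention for unannotated vertices.
    Classes absent from both prediction and ground truth get NaN.
    """
    pred = np.asarray(pred)
    gt = np.asarray(gt)
    valid = gt != ignore_label  # drop vertices without ground truth
    pred, gt = pred[valid], gt[valid]
    ious = np.full(num_classes, np.nan)
    for c in range(num_classes):
        tp = np.sum((pred == c) & (gt == c))
        fp = np.sum((pred == c) & (gt != c))
        fn = np.sum((pred != c) & (gt == c))
        denom = tp + fp + fn
        if denom > 0:
            ious[c] = tp / denom
    return ious


def nearest_point_labels(vertices, points, point_labels):
    """Give each mesh vertex the label of its nearest predicted point.

    Brute-force O(V * P) search, fine for a small example only;
    a spatial index would be needed for full scans.
    """
    vertices = np.asarray(vertices, dtype=float)
    points = np.asarray(points, dtype=float)
    # Squared distances between every vertex and every point.
    d2 = ((vertices[:, None, :] - points[None, :, :]) ** 2).sum(axis=2)
    return np.asarray(point_labels)[d2.argmin(axis=1)]


# Toy usage: five vertices, three classes.
gt_labels = [0, 0, 1, 1, 2]
pred_labels = [0, 1, 1, 1, 0]
ious = per_class_iou(pred_labels, gt_labels, num_classes=3)
print(ious, np.nanmean(ious))
```

In the toy example, class 0 has one true positive, one false positive, and one false negative, so its IoU is 1/3. Averaging with `np.nanmean` skips classes that never occur, which is one possible choice and not necessarily how the benchmark aggregates its scores.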



This table lists the benchmark results for the ScanNet200 3D semantic label scenario. Each cell gives the IoU followed by the method's rank for that column.




Method | Info | avg iou | head iou | common iou | tail iou | wall | chair | floor | table | door | couch | cabinet | shelf | desk | office chair | bed | pillow | sink | picture | window | toilet | bookshelf | monitor | curtain | book | armchair | coffee table | box | refrigerator | lamp | kitchen cabinet | towel | clothes | tv | nightstand | counter | dresser | stool | cushion | plant | ceiling | bathtub | end table | dining table | keyboard | bag | backpack | toilet paper | printer | tv stand | whiteboard | blanket | shower curtain | trash can | closet | stairs | microwave | stove | shoe | computer tower | bottle | bin | ottoman | bench | board | washing machine | mirror | copier | basket | sofa chair | file cabinet | fan | laptop | shower | paper | person | paper towel dispenser | oven | blinds | rack | plate | blackboard | piano | suitcase | rail | radiator | recycling bin | container | wardrobe | soap dispenser | telephone | bucket | clock | stand | light | laundry basket | pipe | clothes dryer | guitar | toilet paper holder | seat | speaker | column | bicycle | ladder | bathroom stall | shower wall | cup | jacket | storage bin | coffee maker | dishwasher | paper towel roll | machine | mat | windowsill | bar | toaster | bulletin board | ironing board | fireplace | soap dish | kitchen counter | doorframe | toilet paper dispenser | mini fridge | fire extinguisher | ball | hat | shower curtain rod | water cooler | paper cutter | tray | shower door | pillar | ledge | toaster oven | mouse | toilet seat cover dispenser | furniture | cart | storage container | scale | tissue box | light switch | crate | power outlet | decoration | sign | projector | closet door | vacuum cleaner | candle | plunger | stuffed animal | headphones | dish rack | broom | guitar case | range hood | dustpan | hair dryer | water bottle | handicap bar | purse | vent | shower floor | water pitcher | mailbox | bowl | paper bag | alarm clock | music stand | projector screen | divider | laundry detergent | bathroom counter | object | bathroom vanity | closet wall | laundry hamper | bathroom stall door | ceiling light | trash bin | dumbbell | stair rail | tube | bathroom cabinet | cd case | closet rod | coffee kettle | structure | shower head | keyboard piano | case of water bottles | coat rack | storage organizer | folded chair | fire alarm | power strip | calendar | poster | potted plant | luggage | mattress
PonderV2 ScanNet200 | | 0.346 2 | 0.552 4 | 0.270 3 | 0.175 3 | 0.810 4 | 0.682 3 | 0.950 2 | 0.560 4 | 0.641 5 | 0.761 1 | 0.398 7 | 0.357 6 | 0.570 4 | 0.113 2 | 0.804 3 | 0.603 3 | 0.750 3 | 0.283 2 | 0.681 4 | 0.952 2 | 0.548 2 | 0.874 3 | 0.852 6 | 0.290 6 | 0.700 2 | 0.356 6 | 0.792 3 | 0.445 5 | 0.545 6 | 0.436 5 | 0.351 6 | 0.787 5 | 0.611 4 | 0.050 5 | 0.290 7 | 0.519 7 | 0.000 1 | 0.825 3 | 0.888 2 | 0.842 2 | 0.259 2 | 0.100 2 | 0.558 3 | 0.070 7 | 0.497 4 | 0.247 7 | 0.457 6 | 0.889 2 | 0.248 5 | 0.106 6 | 0.817 6 | 0.691 3 | 0.094 4 | 0.729 1 | 0.636 3 | 0.620 7 | 0.503 6 | 0.660 8 | 0.243 4 | 0.000 3 | 0.212 4 | 0.590 3 | 0.860 6 | 0.400 3 | 0.881 2 | 0.000 2 | 0.202 1 | 0.622 5 | 0.408 4 | 0.499 5 | 0.261 5 | 0.000 1 | 0.385 5 | 0.636 4 | 0.000 4 | 0.000 5 | 0.000 1 | 0.000 3 | 0.433 10 | 0.843 3 | 0.660 3 | 0.574 7 | 0.481 2 | 0.336 3 | 0.677 3 | 0.486 2 | 0.000 3 | 0.030 1 | 0.000 1 | 0.034 4 | 0.000 3 | 0.080 5 | 0.869 6 | 0.000 1 | 0.000 7 | 0.000 7 | 0.540 4 | 0.727 2 | 0.232 10 | 0.115 4 | 0.186 4 | 0.193 4 | 0.000 9 | 0.403 5 | 0.326 3 | 0.103 7 | 0.000 3 | 0.290 2 | 0.392 5 | 0.000 1 | 0.346 4 | 0.062 6 | 0.424 2 | 0.375 4 | 0.431 3 | 0.667 2 | 0.115 7 | 0.082 6 | 0.239 4 | 0.000 1 | 0.504 8 | 0.606 3 | 0.584 5 | 0.000 1 | 0.002 4 | 0.186 4 | 0.104 5 | 0.000 5 | 0.394 2 | 0.384 5 | 0.083 3 | 0.000 3 | 0.007 4 | 0.000 1 | 0.000 1 | 0.880 4 | 0.000 1 | 0.377 6 | 0.000 1 | 0.263 2 | 0.565 2 | 0.000 1 | 0.608 5 | 0.000 1 | 0.000 1 | 0.304 3 | 0.009 5 | 0.924 1 | 0.000 4 | 0.000 4 | 0.000 1 | 0.000 4 | 0.000 1 | 0.128 2 | 0.584 1 | 0.475 4 | 0.412 5 | 0.076 7 | 0.269 3 | 0.621 3 | 0.509 3 | 0.010 3 | 0.000 1 | 0.491 5 | 0.063 1 | 0.000 2 | 0.472 3 | 0.880 1 | 0.000 2 | 0.000 1 | 0.000 1 | 0.179 3 | 0.125 1 | 0.000 2 | 0.441 4 | 0.000 1
Haoyi Zhu, Honghui Yang, Xiaoyang Wu, Di Huang, Sha Zhang, Xianglong He, Tong He, Hengshuang Zhao, Chunhua Shen, Yu Qiao, Wanli Ouyang: PonderV2: Pave the Way for 3D Foundation Model with A Universal Pre-training Paradigm.
PTv3 ScanNet200 | | 0.393 1 | 0.592 1 | 0.330 1 | 0.216 1 | 0.851 1 | 0.687 2 | 0.971 1 | 0.586 1 | 0.755 1 | 0.752 3 | 0.505 1 | 0.404 4 | 0.575 1 | 0.000 9 | 0.848 1 | 0.616 1 | 0.761 1 | 0.349 1 | 0.738 1 | 0.978 1 | 0.546 3 | 0.860 6 | 0.926 1 | 0.346 1 | 0.654 3 | 0.384 3 | 0.828 1 | 0.523 3 | 0.699 1 | 0.583 2 | 0.387 4 | 0.822 1 | 0.688 1 | 0.118 3 | 0.474 1 | 0.603 4 | 0.000 1 | 0.832 1 | 0.903 1 | 0.753 6 | 0.140 6 | 0.000 6 | 0.650 1 | 0.109 2 | 0.520 1 | 0.457 1 | 0.497 5 | 0.871 3 | 0.281 1 | 0.192 2 | 0.887 1 | 0.748 1 | 0.168 1 | 0.727 2 | 0.733 1 | 0.740 1 | 0.644 1 | 0.714 3 | 0.190 6 | 0.000 3 | 0.256 1 | 0.449 5 | 0.914 1 | 0.514 1 | 0.759 8 | 0.337 1 | 0.172 3 | 0.692 2 | 0.617 1 | 0.636 1 | 0.325 3 | 0.000 1 | 0.641 1 | 0.782 1 | 0.000 4 | 0.065 2 | 0.000 1 | 0.000 3 | 0.842 1 | 0.903 1 | 0.661 1 | 0.662 2 | 0.612 1 | 0.405 2 | 0.731 1 | 0.566 1 | 0.000 3 | 0.000 4 | 0.000 1 | 0.017 9 | 0.301 1 | 0.088 4 | 0.941 1 | 0.000 1 | 0.077 2 | 0.000 7 | 0.717 2 | 0.790 1 | 0.310 9 | 0.026 10 | 0.264 1 | 0.349 1 | 0.220 2 | 0.397 6 | 0.366 1 | 0.115 6 | 0.000 3 | 0.337 1 | 0.463 3 | 0.000 1 | 0.531 1 | 0.218 1 | 0.593 1 | 0.455 1 | 0.469 1 | 0.708 1 | 0.210 1 | 0.592 1 | 0.108 10 | 0.000 1 | 0.728 1 | 0.682 1 | 0.671 4 | 0.000 1 | 0.000 6 | 0.407 1 | 0.136 1 | 0.022 2 | 0.575 1 | 0.436 3 | 0.259 1 | 0.428 1 | 0.048 1 | 0.000 1 | 0.000 1 | 0.879 5 | 0.000 1 | 0.480 1 | 0.000 1 | 0.133 3 | 0.597 1 | 0.000 1 | 0.690 1 | 0.000 1 | 0.000 1 | 0.009 9 | 0.000 8 | 0.921 2 | 0.000 4 | 0.151 1 | 0.000 1 | 0.000 4 | 0.000 1 | 0.109 6 | 0.494 7 | 0.622 2 | 0.394 6 | 0.073 8 | 0.141 7 | 0.798 1 | 0.528 2 | 0.026 1 | 0.000 1 | 0.551 2 | 0.000 2 | 0.000 2 | 0.134 5 | 0.717 4 | 0.000 2 | 0.000 1 | 0.000 1 | 0.188 2 | 0.000 3 | 0.000 2 | 0.791 1 | 0.000 1
OA-CNN-L_ScanNet200 | | 0.333 4 | 0.558 2 | 0.269 4 | 0.124 6 | 0.821 2 | 0.703 1 | 0.946 3 | 0.569 2 | 0.662 2 | 0.748 4 | 0.487 2 | 0.455 1 | 0.572 3 | 0.000 9 | 0.789 4 | 0.534 5 | 0.736 4 | 0.271 3 | 0.713 2 | 0.949 3 | 0.498 9 | 0.877 2 | 0.860 4 | 0.332 3 | 0.706 1 | 0.474 1 | 0.788 5 | 0.406 6 | 0.637 3 | 0.495 4 | 0.355 5 | 0.805 3 | 0.592 8 | 0.015 9 | 0.396 2 | 0.602 5 | 0.000 1 | 0.799 4 | 0.876 3 | 0.713 10 | 0.276 1 | 0.000 6 | 0.493 6 | 0.080 5 | 0.448 8 | 0.363 2 | 0.661 2 | 0.833 4 | 0.262 3 | 0.125 3 | 0.823 5 | 0.665 5 | 0.076 6 | 0.720 3 | 0.557 5 | 0.637 5 | 0.517 5 | 0.672 7 | 0.227 5 | 0.000 3 | 0.158 6 | 0.496 4 | 0.843 7 | 0.352 6 | 0.835 6 | 0.000 2 | 0.103 8 | 0.711 1 | 0.527 2 | 0.526 3 | 0.320 4 | 0.000 1 | 0.568 3 | 0.625 5 | 0.067 1 | 0.000 5 | 0.000 1 | 0.001 2 | 0.806 3 | 0.836 4 | 0.621 5 | 0.591 4 | 0.373 5 | 0.314 4 | 0.668 4 | 0.398 4 | 0.003 2 | 0.000 4 | 0.000 1 | 0.016 10 | 0.024 2 | 0.043 8 | 0.906 3 | 0.000 1 | 0.052 4 | 0.000 7 | 0.384 5 | 0.330 7 | 0.342 5 | 0.100 5 | 0.223 3 | 0.183 6 | 0.112 4 | 0.476 4 | 0.313 4 | 0.130 5 | 0.196 2 | 0.112 5 | 0.370 7 | 0.000 1 | 0.234 5 | 0.071 5 | 0.160 3 | 0.403 3 | 0.398 7 | 0.492 9 | 0.197 2 | 0.076 7 | 0.272 3 | 0.000 1 | 0.200 10 | 0.560 4 | 0.735 3 | 0.000 1 | 0.000 6 | 0.000 5 | 0.110 3 | 0.002 4 | 0.021 4 | 0.412 4 | 0.000 5 | 0.000 3 | 0.000 6 | 0.000 1 | 0.000 1 | 0.794 6 | 0.000 1 | 0.445 2 | 0.000 1 | 0.022 4 | 0.509 5 | 0.000 1 | 0.517 9 | 0.000 1 | 0.000 1 | 0.001 10 | 0.245 2 | 0.915 4 | 0.024 2 | 0.089 2 | 0.000 1 | 0.262 1 | 0.000 1 | 0.103 8 | 0.524 3 | 0.392 6 | 0.515 2 | 0.013 10 | 0.251 4 | 0.411 8 | 0.662 1 | 0.001 7 | 0.000 1 | 0.473 6 | 0.000 2 | 0.000 2 | 0.150 4 | 0.699 5 | 0.000 2 | 0.000 1 | 0.000 1 | 0.166 4 | 0.000 3 | 0.024 1 | 0.000 5 | 0.000 1
PPT-SpUNet-F.T. | | 0.332 5 | 0.556 3 | 0.270 2 | 0.123 7 | 0.816 3 | 0.682 3 | 0.946 3 | 0.549 5 | 0.657 4 | 0.756 2 | 0.459 4 | 0.376 5 | 0.550 5 | 0.001 8 | 0.807 2 | 0.616 1 | 0.727 5 | 0.267 4 | 0.691 3 | 0.942 6 | 0.530 6 | 0.872 4 | 0.874 3 | 0.330 4 | 0.542 7 | 0.374 4 | 0.792 3 | 0.400 7 | 0.673 2 | 0.572 3 | 0.433 1 | 0.793 4 | 0.623 3 | 0.008 10 | 0.351 4 | 0.594 6 | 0.000 1 | 0.783 6 | 0.876 3 | 0.833 3 | 0.213 3 | 0.000 6 | 0.537 4 | 0.091 3 | 0.519 2 | 0.304 3 | 0.620 4 | 0.942 1 | 0.264 2 | 0.124 4 | 0.855 2 | 0.695 2 | 0.086 5 | 0.646 5 | 0.506 9 | 0.658 3 | 0.535 3 | 0.715 2 | 0.314 1 | 0.000 3 | 0.241 2 | 0.608 2 | 0.897 2 | 0.359 5 | 0.858 4 | 0.000 2 | 0.076 10 | 0.611 6 | 0.392 5 | 0.509 4 | 0.378 2 | 0.000 1 | 0.579 2 | 0.565 9 | 0.000 4 | 0.000 5 | 0.000 1 | 0.000 3 | 0.755 4 | 0.806 6 | 0.661 1 | 0.572 8 | 0.350 6 | 0.181 6 | 0.660 5 | 0.300 7 | 0.000 3 | 0.000 4 | 0.000 1 | 0.023 6 | 0.000 3 | 0.042 9 | 0.930 2 | 0.000 1 | 0.000 7 | 0.077 4 | 0.584 3 | 0.392 5 | 0.339 6 | 0.185 3 | 0.171 6 | 0.308 2 | 0.006 8 | 0.563 3 | 0.256 5 | 0.150 1 | 0.000 3 | 0.002 9 | 0.345 8 | 0.000 1 | 0.045 7 | 0.197 2 | 0.063 5 | 0.323 7 | 0.453 2 | 0.600 5 | 0.163 6 | 0.037 8 | 0.349 2 | 0.000 1 | 0.672 2 | 0.679 2 | 0.753 1 | 0.000 1 | 0.000 6 | 0.000 5 | 0.117 2 | 0.000 5 | 0.000 5 | 0.291 7 | 0.000 5 | 0.000 3 | 0.039 2 | 0.000 1 | 0.000 1 | 0.899 2 | 0.000 1 | 0.374 7 | 0.000 1 | 0.000 6 | 0.545 4 | 0.000 1 | 0.634 2 | 0.000 1 | 0.000 1 | 0.074 6 | 0.223 3 | 0.914 5 | 0.000 4 | 0.021 3 | 0.000 1 | 0.000 4 | 0.000 1 | 0.112 4 | 0.498 6 | 0.649 1 | 0.383 7 | 0.095 1 | 0.135 9 | 0.449 7 | 0.432 6 | 0.008 5 | 0.000 1 | 0.518 3 | 0.000 2 | 0.000 2 | 0.000 6 | 0.796 2 | 0.000 2 | 0.000 1 | 0.000 1 | 0.138 7 | 0.000 3 | 0.000 2 | 0.000 5 | 0.000 1
Xiaoyang Wu, Zhuotao Tian, Xin Wen, Bohao Peng, Xihui Liu, Kaicheng Yu, Hengshuang Zhao: Towards Large-scale 3D Representation Learning with Multi-dataset Point Prompt Training.
OctFormer ScanNet200 | permissive | 0.326 6 | 0.539 6 | 0.265 5 | 0.131 5 | 0.806 5 | 0.670 6 | 0.943 5 | 0.535 6 | 0.662 2 | 0.705 9 | 0.423 5 | 0.407 3 | 0.505 7 | 0.003 7 | 0.765 6 | 0.582 4 | 0.686 8 | 0.227 9 | 0.680 5 | 0.943 5 | 0.601 1 | 0.854 7 | 0.892 2 | 0.335 2 | 0.417 10 | 0.357 5 | 0.724 7 | 0.453 4 | 0.632 4 | 0.596 1 | 0.432 2 | 0.783 6 | 0.512 10 | 0.021 8 | 0.244 8 | 0.637 1 | 0.000 1 | 0.787 5 | 0.873 5 | 0.743 8 | 0.000 10 | 0.000 6 | 0.534 5 | 0.110 1 | 0.499 3 | 0.289 4 | 0.626 3 | 0.620 8 | 0.168 10 | 0.204 1 | 0.849 3 | 0.679 4 | 0.117 2 | 0.633 6 | 0.684 2 | 0.650 4 | 0.552 2 | 0.684 6 | 0.312 2 | 0.000 3 | 0.175 5 | 0.429 6 | 0.865 3 | 0.413 2 | 0.837 5 | 0.000 2 | 0.145 5 | 0.626 4 | 0.451 3 | 0.487 6 | 0.513 1 | 0.000 1 | 0.529 4 | 0.613 6 | 0.000 4 | 0.033 3 | 0.000 1 | 0.000 3 | 0.828 2 | 0.871 2 | 0.622 4 | 0.587 5 | 0.411 4 | 0.137 8 | 0.645 7 | 0.343 5 | 0.000 3 | 0.000 4 | 0.000 1 | 0.022 7 | 0.000 3 | 0.026 10 | 0.829 7 | 0.000 1 | 0.022 5 | 0.089 3 | 0.842 1 | 0.253 9 | 0.318 8 | 0.296 1 | 0.178 5 | 0.291 3 | 0.224 1 | 0.584 2 | 0.200 8 | 0.132 4 | 0.000 3 | 0.128 4 | 0.227 9 | 0.000 1 | 0.230 6 | 0.047 7 | 0.149 4 | 0.331 6 | 0.412 5 | 0.618 4 | 0.164 5 | 0.102 5 | 0.522 1 | 0.000 1 | 0.655 3 | 0.378 6 | 0.469 8 | 0.000 1 | 0.000 6 | 0.000 5 | 0.105 4 | 0.000 5 | 0.000 5 | 0.483 2 | 0.000 5 | 0.000 3 | 0.028 3 | 0.000 1 | 0.000 1 | 0.906 1 | 0.000 1 | 0.339 8 | 0.000 1 | 0.000 6 | 0.457 6 | 0.000 1 | 0.612 4 | 0.000 1 | 0.000 1 | 0.408 1 | 0.000 8 | 0.900 6 | 0.000 4 | 0.000 4 | 0.000 1 | 0.029 3 | 0.000 1 | 0.074 10 | 0.455 8 | 0.479 3 | 0.427 4 | 0.079 6 | 0.140 8 | 0.496 5 | 0.414 7 | 0.022 2 | 0.000 1 | 0.471 7 | 0.000 2 | 0.000 2 | 0.000 6 | 0.722 3 | 0.000 2 | 0.000 1 | 0.000 1 | 0.138 7 | 0.000 3 | 0.000 2 | 0.000 5 | 0.000 1
Peng-Shuai Wang: OctFormer: Octree-based Transformers for 3D Point Clouds. SIGGRAPH 2023
CeCo | | 0.340 3 | 0.551 5 | 0.247 6 | 0.181 2 | 0.784 6 | 0.661 7 | 0.939 6 | 0.564 3 | 0.624 6 | 0.721 5 | 0.484 3 | 0.429 2 | 0.575 1 | 0.027 5 | 0.774 5 | 0.503 7 | 0.753 2 | 0.242 6 | 0.656 6 | 0.945 4 | 0.534 4 | 0.865 5 | 0.860 4 | 0.177 10 | 0.616 5 | 0.400 2 | 0.818 2 | 0.579 1 | 0.615 5 | 0.367 7 | 0.408 3 | 0.726 8 | 0.633 2 | 0.162 1 | 0.360 3 | 0.619 2 | 0.000 1 | 0.828 2 | 0.873 5 | 0.924 1 | 0.109 7 | 0.083 3 | 0.564 2 | 0.057 10 | 0.475 6 | 0.266 5 | 0.781 1 | 0.767 5 | 0.257 4 | 0.100 7 | 0.825 4 | 0.663 6 | 0.048 10 | 0.620 8 | 0.551 6 | 0.595 8 | 0.532 4 | 0.692 5 | 0.246 3 | 0.000 3 | 0.213 3 | 0.615 1 | 0.861 5 | 0.376 4 | 0.900 1 | 0.000 2 | 0.102 9 | 0.660 3 | 0.321 8 | 0.547 2 | 0.226 6 | 0.000 1 | 0.311 6 | 0.742 2 | 0.011 3 | 0.006 4 | 0.000 1 | 0.000 3 | 0.546 9 | 0.824 5 | 0.345 7 | 0.665 1 | 0.450 3 | 0.435 1 | 0.683 2 | 0.411 3 | 0.338 1 | 0.000 4 | 0.000 1 | 0.030 5 | 0.000 3 | 0.068 6 | 0.892 4 | 0.000 1 | 0.063 3 | 0.000 7 | 0.257 6 | 0.304 8 | 0.387 3 | 0.079 7 | 0.228 2 | 0.190 5 | 0.000 9 | 0.586 1 | 0.347 2 | 0.133 3 | 0.000 3 | 0.037 6 | 0.377 6 | 0.000 1 | 0.384 3 | 0.006 9 | 0.003 7 | 0.421 2 | 0.410 6 | 0.643 3 | 0.171 4 | 0.121 3 | 0.142 8 | 0.000 1 | 0.510 7 | 0.447 5 | 0.474 7 | 0.000 1 | 0.000 6 | 0.286 2 | 0.083 6 | 0.000 5 | 0.000 5 | 0.603 1 | 0.096 2 | 0.063 2 | 0.000 6 | 0.000 1 | 0.000 1 | 0.898 3 | 0.000 1 | 0.429 3 | 0.000 1 | 0.400 1 | 0.550 3 | 0.000 1 | 0.633 3 | 0.000 1 | 0.000 1 | 0.377 2 | 0.000 8 | 0.916 3 | 0.000 4 | 0.000 4 | 0.000 1 | 0.000 4 | 0.000 1 | 0.102 9 | 0.499 5 | 0.296 7 | 0.463 3 | 0.089 4 | 0.304 1 | 0.740 2 | 0.401 9 | 0.010 3 | 0.000 1 | 0.560 1 | 0.000 2 | 0.000 2 | 0.709 1 | 0.652 6 | 0.000 2 | 0.000 1 | 0.000 1 | 0.143 5 | 0.000 3 | 0.000 2 | 0.609 2 | 0.000 1
Understanding Imbalanced Semantic Segmentation Through Neural Collapse.
AWCS | | 0.305 7 | 0.508 7 | 0.225 7 | 0.142 4 | 0.782 7 | 0.634 10 | 0.937 7 | 0.489 8 | 0.578 7 | 0.721 5 | 0.364 8 | 0.355 7 | 0.515 6 | 0.023 6 | 0.764 7 | 0.523 6 | 0.707 7 | 0.264 5 | 0.633 7 | 0.922 7 | 0.507 8 | 0.886 1 | 0.804 8 | 0.179 8 | 0.436 9 | 0.300 7 | 0.656 9 | 0.529 2 | 0.501 8 | 0.394 6 | 0.296 9 | 0.820 2 | 0.603 5 | 0.131 2 | 0.179 10 | 0.619 2 | 0.000 1 | 0.707 9 | 0.865 7 | 0.773 4 | 0.171 4 | 0.010 5 | 0.484 7 | 0.063 8 | 0.463 7 | 0.254 6 | 0.332 9 | 0.649 7 | 0.220 7 | 0.100 7 | 0.729 8 | 0.613 8 | 0.071 8 | 0.582 9 | 0.628 4 | 0.702 2 | 0.424 8 | 0.749 1 | 0.137 8 | 0.000 3 | 0.142 7 | 0.360 7 | 0.863 4 | 0.305 7 | 0.877 3 | 0.000 2 | 0.173 2 | 0.606 7 | 0.337 7 | 0.478 7 | 0.154 8 | 0.000 1 | 0.253 7 | 0.664 3 | 0.000 4 | 0.000 5 | 0.000 1 | 0.000 3 | 0.626 7 | 0.782 7 | 0.302 9 | 0.602 3 | 0.185 9 | 0.282 5 | 0.651 6 | 0.317 6 | 0.000 3 | 0.000 4 | 0.000 1 | 0.022 7 | 0.000 3 | 0.154 1 | 0.876 5 | 0.000 1 | 0.014 6 | 0.063 6 | 0.029 10 | 0.553 3 | 0.467 2 | 0.084 6 | 0.124 7 | 0.157 9 | 0.049 7 | 0.373 7 | 0.252 6 | 0.097 8 | 0.000 3 | 0.219 3 | 0.542 1 | 0.000 1 | 0.392 2 | 0.172 4 | 0.000 9 | 0.339 5 | 0.417 4 | 0.533 8 | 0.093 8 | 0.115 4 | 0.195 6 | 0.000 1 | 0.516 6 | 0.288 9 | 0.741 2 | 0.000 1 | 0.001 5 | 0.233 3 | 0.056 7 | 0.000 5 | 0.159 3 | 0.334 6 | 0.077 4 | 0.000 3 | 0.000 6 | 0.000 1 | 0.000 1 | 0.749 7 | 0.000 1 | 0.411 4 | 0.000 1 | 0.008 5 | 0.452 7 | 0.000 1 | 0.595 6 | 0.000 1 | 0.000 1 | 0.220 5 | 0.006 6 | 0.894 8 | 0.006 3 | 0.000 4 | 0.000 1 | 0.000 4 | 0.000 1 | 0.112 4 | 0.504 4 | 0.404 5 | 0.551 1 | 0.093 3 | 0.129 10 | 0.484 6 | 0.381 10 | 0.000 8 | 0.000 1 | 0.396 8 | 0.000 2 | 0.000 2 | 0.620 2 | 0.402 10 | 0.000 2 | 0.000 1 | 0.000 1 | 0.142 6 | 0.000 3 | 0.000 2 | 0.512 3 | 0.000 1
LGround | permissive | 0.272 8 | 0.485 8 | 0.184 8 | 0.106 8 | 0.778 8 | 0.676 5 | 0.932 8 | 0.479 10 | 0.572 8 | 0.718 7 | 0.399 6 | 0.265 8 | 0.453 9 | 0.085 3 | 0.745 8 | 0.446 8 | 0.726 6 | 0.232 8 | 0.622 8 | 0.901 8 | 0.512 7 | 0.826 8 | 0.786 9 | 0.178 9 | 0.549 6 | 0.277 8 | 0.659 8 | 0.381 8 | 0.518 7 | 0.295 10 | 0.323 7 | 0.777 7 | 0.599 6 | 0.028 6 | 0.321 5 | 0.363 9 | 0.000 1 | 0.708 8 | 0.858 8 | 0.746 7 | 0.063 8 | 0.022 4 | 0.457 8 | 0.077 6 | 0.476 5 | 0.243 8 | 0.402 7 | 0.397 10 | 0.233 6 | 0.077 10 | 0.720 10 | 0.610 9 | 0.103 3 | 0.629 7 | 0.437 10 | 0.626 6 | 0.446 7 | 0.702 4 | 0.190 6 | 0.005 1 | 0.058 9 | 0.322 8 | 0.702 9 | 0.244 8 | 0.768 7 | 0.000 2 | 0.134 7 | 0.552 8 | 0.279 9 | 0.395 8 | 0.147 9 | 0.000 1 | 0.207 8 | 0.612 7 | 0.000 4 | 0.000 5 | 0.000 1 | 0.000 3 | 0.658 6 | 0.566 8 | 0.323 8 | 0.525 10 | 0.229 8 | 0.179 7 | 0.467 10 | 0.154 9 | 0.000 3 | 0.002 2 | 0.000 1 | 0.051 1 | 0.000 3 | 0.127 2 | 0.703 8 | 0.000 1 | 0.000 7 | 0.216 1 | 0.112 9 | 0.358 6 | 0.547 1 | 0.187 2 | 0.092 9 | 0.156 10 | 0.055 6 | 0.296 8 | 0.252 6 | 0.143 2 | 0.000 3 | 0.014 7 | 0.398 4 | 0.000 1 | 0.028 9 | 0.173 3 | 0.000 9 | 0.265 9 | 0.348 8 | 0.415 10 | 0.179 3 | 0.019 9 | 0.218 5 | 0.000 1 | 0.597 5 | 0.274 10 | 0.565 6 | 0.000 1 | 0.012 3 | 0.000 5 | 0.039 9 | 0.022 2 | 0.000 5 | 0.117 8 | 0.000 5 | 0.000 3 | 0.000 6 | 0.000 1 | 0.000 1 | 0.324 9 | 0.000 1 | 0.384 5 | 0.000 1 | 0.000 6 | 0.251 10 | 0.000 1 | 0.566 7 | 0.000 1 | 0.000 1 | 0.066 7 | 0.404 1 | 0.886 9 | 0.199 1 | 0.000 4 | 0.000 1 | 0.059 2 | 0.000 1 | 0.136 1 | 0.540 2 | 0.127 10 | 0.295 8 | 0.085 5 | 0.143 6 | 0.514 4 | 0.413 8 | 0.000 8 | 0.000 1 | 0.498 4 | 0.000 2 | 0.000 2 | 0.000 6 | 0.623 7 | 0.000 2 | 0.000 1 | 0.000 1 | 0.132 9 | 0.000 3 | 0.000 2 | 0.000 5 | 0.000 1
David Rozenberszki, Or Litany, Angela Dai: Language-Grounded Indoor 3D Semantic Segmentation in the Wild. arXiv
CSC-Pretrain | permissive | 0.249 10 | 0.455 10 | 0.171 9 | 0.079 10 | 0.766 10 | 0.659 8 | 0.930 10 | 0.494 7 | 0.542 10 | 0.700 10 | 0.314 10 | 0.215 10 | 0.430 10 | 0.121 1 | 0.697 10 | 0.441 9 | 0.683 9 | 0.235 7 | 0.609 10 | 0.895 9 | 0.476 10 | 0.816 9 | 0.770 10 | 0.186 7 | 0.634 4 | 0.216 10 | 0.734 6 | 0.340 9 | 0.471 9 | 0.307 9 | 0.293 10 | 0.591 10 | 0.542 9 | 0.076 4 | 0.205 9 | 0.464 8 | 0.000 1 | 0.484 10 | 0.832 10 | 0.766 5 | 0.052 9 | 0.000 6 | 0.413 9 | 0.059 9 | 0.418 9 | 0.222 9 | 0.318 10 | 0.609 9 | 0.206 9 | 0.112 5 | 0.743 7 | 0.625 7 | 0.076 6 | 0.579 10 | 0.548 7 | 0.590 9 | 0.371 9 | 0.552 10 | 0.081 9 | 0.003 2 | 0.142 7 | 0.201 10 | 0.638 10 | 0.233 9 | 0.686 10 | 0.000 2 | 0.142 6 | 0.444 10 | 0.375 6 | 0.247 10 | 0.198 7 | 0.000 1 | 0.128 10 | 0.454 10 | 0.019 2 | 0.097 1 | 0.000 1 | 0.000 3 | 0.553 8 | 0.557 9 | 0.373 6 | 0.545 9 | 0.164 10 | 0.014 10 | 0.547 9 | 0.174 8 | 0.000 3 | 0.002 2 | 0.000 1 | 0.037 2 | 0.000 3 | 0.063 7 | 0.664 10 | 0.000 1 | 0.000 7 | 0.130 2 | 0.170 7 | 0.152 10 | 0.335 7 | 0.079 7 | 0.110 8 | 0.175 7 | 0.098 5 | 0.175 10 | 0.166 9 | 0.045 10 | 0.207 1 | 0.014 7 | 0.465 2 | 0.000 1 | 0.001 10 | 0.001 10 | 0.046 6 | 0.299 8 | 0.327 9 | 0.537 7 | 0.033 9 | 0.012 10 | 0.186 7 | 0.000 1 | 0.205 9 | 0.377 7 | 0.463 9 | 0.000 1 | 0.058 2 | 0.000 5 | 0.055 8 | 0.041 1 | 0.000 5 | 0.105 9 | 0.000 5 | 0.000 3 | 0.000 6 | 0.000 1 | 0.000 1 | 0.398 8 | 0.000 1 | 0.308 10 | 0.000 1 | 0.000 6 | 0.319 8 | 0.000 1 | 0.543 8 | 0.000 1 | 0.000 1 | 0.062 8 | 0.004 7 | 0.862 10 | 0.000 4 | 0.000 4 | 0.000 1 | 0.000 4 | 0.000 1 | 0.123 3 | 0.316 9 | 0.225 8 | 0.250 9 | 0.094 2 | 0.180 5 | 0.332 9 | 0.441 5 | 0.000 8 | 0.000 1 | 0.310 10 | 0.000 2 | 0.000 2 | 0.000 6 | 0.592 8 | 0.000 2 | 0.000 1 | 0.000 1 | 0.203 1 | 0.000 3 | 0.000 2 | 0.000 5 | 0.000 1
Ji Hou, Benjamin Graham, Matthias Nießner, Saining Xie: Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts. CVPR 2021
Minkowski 34D | permissive | 0.253 9 | 0.463 9 | 0.154 10 | 0.102 9 | 0.771 9 | 0.650 9 | 0.932 8 | 0.483 9 | 0.571 9 | 0.710 8 | 0.331 9 | 0.250 9 | 0.492 8 | 0.044 4 | 0.703 9 | 0.419 10 | 0.606 10 | 0.227 9 | 0.621 9 | 0.865 10 | 0.531 5 | 0.771 10 | 0.813 7 | 0.291 5 | 0.484 8 | 0.242 9 | 0.612 10 | 0.282 10 | 0.440 10 | 0.351 8 | 0.299 8 | 0.622 9 | 0.593 7 | 0.027 7 | 0.293 6 | 0.310 10 | 0.000 1 | 0.757 7 | 0.858 8 | 0.737 9 | 0.150 5 | 0.164 1 | 0.368 10 | 0.084 4 | 0.381 10 | 0.142 10 | 0.357 8 | 0.720 6 | 0.214 8 | 0.092 9 | 0.724 9 | 0.596 10 | 0.056 9 | 0.655 4 | 0.525 8 | 0.581 10 | 0.352 10 | 0.594 9 | 0.056 10 | 0.000 3 | 0.014 10 | 0.224 9 | 0.772 8 | 0.205 10 | 0.720 9 | 0.000 2 | 0.159 4 | 0.531 9 | 0.163 10 | 0.294 9 | 0.136 10 | 0.000 1 | 0.169 9 | 0.589 8 | 0.000 4 | 0.000 5 | 0.000 1 | 0.002 1 | 0.663 5 | 0.466 10 | 0.265 10 | 0.582 6 | 0.337 7 | 0.016 9 | 0.559 8 | 0.084 10 | 0.000 3 | 0.000 4 | 0.000 1 | 0.036 3 | 0.000 3 | 0.125 3 | 0.670 9 | 0.000 1 | 0.102 1 | 0.071 5 | 0.164 8 | 0.406 4 | 0.386 4 | 0.046 9 | 0.068 10 | 0.159 8 | 0.117 3 | 0.284 9 | 0.111 10 | 0.094 9 | 0.000 3 | 0.000 10 | 0.197 10 | 0.000 1 | 0.044 8 | 0.013 8 | 0.002 8 | 0.228 10 | 0.307 10 | 0.588 6 | 0.025 10 | 0.545 2 | 0.134 9 | 0.000 1 | 0.655 3 | 0.302 8 | 0.282 10 | 0.000 1 | 0.060 1 | 0.000 5 | 0.035 10 | 0.000 5 | 0.000 5 | 0.097 10 | 0.000 5 | 0.000 3 | 0.005 5 | 0.000 1 | 0.000 1 | 0.096 10 | 0.000 1 | 0.334 9 | 0.000 1 | 0.000 6 | 0.274 9 | 0.000 1 | 0.513 10 | 0.000 1 | 0.000 1 | 0.280 4 | 0.194 4 | 0.897 7 | 0.000 4 | 0.000 4 | 0.000 1 | 0.000 4 | 0.000 1 | 0.108 7 | 0.279 10 | 0.189 9 | 0.141 10 | 0.059 9 | 0.272 2 | 0.307 10 | 0.445 4 | 0.003 6 | 0.000 1 | 0.353 9 | 0.000 2 | 0.026 1 | 0.000 6 | 0.581 9 | 0.001 1 | 0.000 1 | 0.000 1 | 0.093 10 | 0.002 2 | 0.000 2 | 0.000 5 | 0.000 1
C. Choy, J. Gwak, S. Savarese: 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks. CVPR 2019