The 3D semantic labeling task involves predicting a semantic labeling of a 3D scan mesh.

Evaluation and metrics

Our evaluation ranks all methods by the PASCAL VOC intersection-over-union (IoU) metric: IoU = TP/(TP+FP+FN), where TP, FP, and FN are the numbers of true positive, false positive, and false negative vertices, respectively. Predicted labels are evaluated per-vertex over the respective 3D scan mesh; 3D approaches that operate on other representations, such as grids or points, should map their predicted labels onto the mesh vertices (an example of mapping grid predictions to mesh vertices is provided in the evaluation helpers).
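As a rough illustration of the metric, the sketch below computes per-class IoU from per-vertex label lists and shows one simple way to carry point predictions over to mesh vertices. It is not the benchmark's evaluation script or its helper: the function names, the `ignore_label` convention, and the brute-force nearest-point mapping are all assumptions made for this example.

```python
from collections import Counter
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float, float]


def per_class_iou(pred: Sequence[int], gt: Sequence[int],
                  ignore_label: int = -1) -> Dict[int, float]:
    """Per-class IoU = TP / (TP + FP + FN), counted over mesh vertices.

    Assumption: vertices whose ground-truth label equals `ignore_label`
    are skipped, and only classes present in the ground truth are reported.
    """
    if len(pred) != len(gt):
        raise ValueError("pred and gt must hold one label per vertex")
    tp, fp, fn = Counter(), Counter(), Counter()
    for p, g in zip(pred, gt):
        if g == ignore_label:
            continue
        if p == g:
            tp[g] += 1
        else:
            fp[p] += 1  # class p was predicted on a vertex that is not p
            fn[g] += 1  # a vertex of class g was missed
    classes = set(tp) | set(fn)  # classes that occur in the ground truth
    return {c: tp[c] / (tp[c] + fp[c] + fn[c]) for c in classes}


def mean_iou(ious: Dict[int, float]) -> float:
    """Unweighted average of the per-class IoU values."""
    return sum(ious.values()) / len(ious)


def map_points_to_vertices(vertices: Sequence[Point],
                           points: Sequence[Point],
                           point_labels: Sequence[int]) -> List[int]:
    """Give each mesh vertex the label of its nearest predicted point.

    Brute force, O(len(vertices) * len(points)); a spatial index would be
    needed for real scans.
    """
    def sq_dist(a: Point, b: Point) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))

    labels = []
    for v in vertices:
        nearest = min(range(len(points)), key=lambda i: sq_dist(v, points[i]))
        labels.append(point_labels[nearest])
    return labels


if __name__ == "__main__":
    gt = [0, 0, 1, 1, 2]
    pred = [0, 1, 1, 1, 2]
    ious = per_class_iou(pred, gt)
    print(ious, mean_iou(ious))
```

In the toy example, class 0 has one true positive and one false negative, class 1 has two true positives and one false positive, and class 2 is labeled perfectly; the mean is the plain average of the three per-class values.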



This table lists the benchmark results for the ScanNet200 3D semantic label scenario.




Columns (each result cell gives the IoU followed by the method's rank for that column):
Method, Info, avg iou, head iou, common iou, tail iou, wall, chair, floor, table, door, couch, cabinet, shelf, desk, office chair, bed, pillow, sink, picture, window, toilet, bookshelf, monitor, curtain, book, armchair, coffee table, box, refrigerator, lamp, kitchen cabinet, towel, clothes, tv, nightstand, counter, dresser, stool, cushion, plant, ceiling, bathtub, end table, dining table, keyboard, bag, backpack, toilet paper, printer, tv stand, whiteboard, blanket, shower curtain, trash can, closet, stairs, microwave, stove, shoe, computer tower, bottle, bin, ottoman, bench, board, washing machine, mirror, copier, basket, sofa chair, file cabinet, fan, laptop, shower, paper, person, paper towel dispenser, oven, blinds, rack, plate, blackboard, piano, suitcase, rail, radiator, recycling bin, container, wardrobe, soap dispenser, telephone, bucket, clock, stand, light, laundry basket, pipe, clothes dryer, guitar, toilet paper holder, seat, speaker, column, bicycle, ladder, bathroom stall, shower wall, cup, jacket, storage bin, coffee maker, dishwasher, paper towel roll, machine, mat, windowsill, bar, toaster, bulletin board, ironing board, fireplace, soap dish, kitchen counter, doorframe, toilet paper dispenser, mini fridge, fire extinguisher, ball, hat, shower curtain rod, water cooler, paper cutter, tray, shower door, pillar, ledge, toaster oven, mouse, toilet seat cover dispenser, furniture, cart, storage container, scale, tissue box, light switch, crate, power outlet, decoration, sign, projector, closet door, vacuum cleaner, candle, plunger, stuffed animal, headphones, dish rack, broom, guitar case, range hood, dustpan, hair dryer, water bottle, handicap bar, purse, vent, shower floor, water pitcher, mailbox, bowl, paper bag, alarm clock, music stand, projector screen, divider, laundry detergent, bathroom counter, object, bathroom vanity, closet wall, laundry hamper, bathroom stall door, ceiling light, trash bin, dumbbell, stair rail, tube, bathroom cabinet, cd case, closet rod, coffee kettle, structure, shower head, keyboard piano, case of water bottles, coat rack, storage organizer, folded chair, fire alarm, power strip, calendar, poster, potted plant, luggage, mattress
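Minkowski 34D, permissive, 0.253 (3), 0.463 (3), 0.154 (4), 0.102 (3), 0.771 (3), 0.650 (4), 0.932 (2), 0.483 (3), 0.571 (3), 0.710 (3), 0.331 (3), 0.250 (3), 0.492 (2), 0.044 (3), 0.703 (3), 0.419 (4), 0.606 (4), 0.227 (4), 0.621 (3), 0.865 (4), 0.531 (2), 0.771 (4), 0.813 (2), 0.291 (1), 0.484 (4), 0.242 (3), 0.612 (4), 0.282 (4), 0.440 (4), 0.351 (2), 0.299 (3), 0.622 (3), 0.593 (3), 0.027 (4), 0.293 (3), 0.310 (4), 0.000 (1), 0.757 (2), 0.858 (2), 0.737 (4), 0.150 (1), 0.164 (1), 0.368 (4), 0.084 (1), 0.381 (4), 0.142 (4), 0.357 (3), 0.720 (2), 0.214 (3), 0.092 (3), 0.724 (3), 0.596 (4), 0.056 (3), 0.655 (1), 0.525 (3), 0.581 (4), 0.352 (4), 0.594 (3), 0.056 (4), 0.000 (3), 0.014 (4), 0.224 (3), 0.772 (2), 0.205 (4), 0.720 (3), 0.000 (1), 0.159 (1), 0.531 (3), 0.163 (4), 0.294 (3), 0.136 (4), 0.000 (1), 0.169 (3), 0.589 (3), 0.000 (3), 0.000 (3), 0.000 (1), 0.002 (1), 0.663 (1), 0.466 (4), 0.265 (4), 0.582 (2), 0.337 (2), 0.016 (3), 0.559 (2), 0.084 (4), 0.000 (2), 0.000 (3), 0.000 (1), 0.036 (3), 0.000 (1), 0.125 (2), 0.670 (3), 0.000 (1), 0.102 (1), 0.071 (3), 0.164 (3), 0.406 (1), 0.386 (3), 0.046 (4), 0.068 (4), 0.159 (3), 0.117 (1), 0.284 (3), 0.111 (4), 0.094 (3), 0.000 (2), 0.000 (4), 0.197 (4), 0.000 (1), 0.044 (2), 0.013 (2), 0.002 (3), 0.228 (4), 0.307 (4), 0.588 (2), 0.025 (4), 0.545 (1), 0.134 (4), 0.000 (1), 0.655 (1), 0.302 (3), 0.282 (4), 0.000 (1), 0.060 (1), 0.000 (2), 0.035 (4), 0.000 (3), 0.000 (1), 0.097 (4), 0.000 (2), 0.000 (2), 0.005 (1), 0.000 (1), 0.000 (1), 0.096 (4), 0.000 (1), 0.334 (3), 0.000 (1), 0.000 (2), 0.274 (3), 0.000 (1), 0.513 (4), 0.000 (1), 0.000 (1), 0.280 (2), 0.194 (2), 0.897 (2), 0.000 (2), 0.000 (1), 0.000 (1), 0.000 (2), 0.000 (1), 0.108 (3), 0.279 (4), 0.189 (3), 0.141 (4), 0.059 (4), 0.272 (2), 0.307 (4), 0.445 (1), 0.003 (2), 0.000 (1), 0.353 (3), 0.000 (1), 0.026 (1), 0.000 (2), 0.581 (4), 0.001 (1), 0.000 (1), 0.000 (1), 0.093 (4), 0.002 (1), 0.000 (1), 0.000 (2), 0.000 (1)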
C. Choy, J. Gwak, S. Savarese: 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks. CVPR 2019
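CeCo, , 0.340 (1), 0.551 (1), 0.247 (1), 0.181 (1), 0.784 (1), 0.661 (2), 0.939 (1), 0.564 (1), 0.624 (1), 0.721 (1), 0.484 (1), 0.429 (1), 0.575 (1), 0.027 (4), 0.774 (1), 0.503 (1), 0.753 (1), 0.242 (1), 0.656 (1), 0.945 (1), 0.534 (1), 0.865 (1), 0.860 (1), 0.177 (4), 0.616 (2), 0.400 (1), 0.818 (1), 0.579 (1), 0.615 (1), 0.367 (1), 0.408 (1), 0.726 (2), 0.633 (1), 0.162 (1), 0.360 (1), 0.619 (1), 0.000 (1), 0.828 (1), 0.873 (1), 0.924 (1), 0.109 (2), 0.083 (2), 0.564 (1), 0.057 (4), 0.475 (2), 0.266 (1), 0.781 (1), 0.767 (1), 0.257 (1), 0.100 (2), 0.825 (1), 0.663 (1), 0.048 (4), 0.620 (3), 0.551 (1), 0.595 (2), 0.532 (1), 0.692 (2), 0.246 (1), 0.000 (3), 0.213 (1), 0.615 (1), 0.861 (1), 0.376 (1), 0.900 (1), 0.000 (1), 0.102 (4), 0.660 (1), 0.321 (2), 0.547 (1), 0.226 (1), 0.000 (1), 0.311 (1), 0.742 (1), 0.011 (2), 0.006 (2), 0.000 (1), 0.000 (2), 0.546 (4), 0.824 (1), 0.345 (2), 0.665 (1), 0.450 (1), 0.435 (1), 0.683 (1), 0.411 (1), 0.338 (1), 0.000 (3), 0.000 (1), 0.030 (4), 0.000 (1), 0.068 (3), 0.892 (1), 0.000 (1), 0.063 (2), 0.000 (4), 0.257 (1), 0.304 (3), 0.387 (2), 0.079 (2), 0.228 (1), 0.190 (1), 0.000 (4), 0.586 (1), 0.347 (1), 0.133 (2), 0.000 (2), 0.037 (1), 0.377 (3), 0.000 (1), 0.384 (1), 0.006 (3), 0.003 (2), 0.421 (1), 0.410 (1), 0.643 (1), 0.171 (2), 0.121 (2), 0.142 (3), 0.000 (1), 0.510 (3), 0.447 (1), 0.474 (2), 0.000 (1), 0.000 (4), 0.286 (1), 0.083 (1), 0.000 (3), 0.000 (1), 0.603 (1), 0.096 (1), 0.063 (1), 0.000 (2), 0.000 (1), 0.000 (1), 0.898 (1), 0.000 (1), 0.429 (1), 0.000 (1), 0.400 (1), 0.550 (1), 0.000 (1), 0.633 (1), 0.000 (1), 0.000 (1), 0.377 (1), 0.000 (4), 0.916 (1), 0.000 (2), 0.000 (1), 0.000 (1), 0.000 (2), 0.000 (1), 0.102 (4), 0.499 (2), 0.296 (1), 0.463 (1), 0.089 (2), 0.304 (1), 0.740 (1), 0.401 (4), 0.010 (1), 0.000 (1), 0.560 (1), 0.000 (1), 0.000 (2), 0.709 (1), 0.652 (1), 0.000 (2), 0.000 (1), 0.000 (1), 0.143 (2), 0.000 (2), 0.000 (1), 0.609 (1), 0.000 (1)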
Understanding Imbalanced Semantic Segmentation Through Neural Collapse.
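LGround, permissive, 0.272 (2), 0.485 (2), 0.184 (2), 0.106 (2), 0.778 (2), 0.676 (1), 0.932 (2), 0.479 (4), 0.572 (2), 0.718 (2), 0.399 (2), 0.265 (2), 0.453 (3), 0.085 (2), 0.745 (2), 0.446 (2), 0.726 (2), 0.232 (3), 0.622 (2), 0.901 (2), 0.512 (3), 0.826 (2), 0.786 (3), 0.178 (3), 0.549 (3), 0.277 (2), 0.659 (3), 0.381 (2), 0.518 (2), 0.295 (4), 0.323 (2), 0.777 (1), 0.599 (2), 0.028 (3), 0.321 (2), 0.363 (3), 0.000 (1), 0.708 (3), 0.858 (2), 0.746 (3), 0.063 (3), 0.022 (3), 0.457 (2), 0.077 (2), 0.476 (1), 0.243 (2), 0.402 (2), 0.397 (4), 0.233 (2), 0.077 (4), 0.720 (4), 0.610 (3), 0.103 (1), 0.629 (2), 0.437 (4), 0.626 (1), 0.446 (2), 0.702 (1), 0.190 (2), 0.005 (1), 0.058 (3), 0.322 (2), 0.702 (3), 0.244 (2), 0.768 (2), 0.000 (1), 0.134 (3), 0.552 (2), 0.279 (3), 0.395 (2), 0.147 (3), 0.000 (1), 0.207 (2), 0.612 (2), 0.000 (3), 0.000 (3), 0.000 (1), 0.000 (2), 0.658 (2), 0.566 (2), 0.323 (3), 0.525 (4), 0.229 (3), 0.179 (2), 0.467 (4), 0.154 (3), 0.000 (2), 0.002 (1), 0.000 (1), 0.051 (1), 0.000 (1), 0.127 (1), 0.703 (2), 0.000 (1), 0.000 (3), 0.216 (1), 0.112 (4), 0.358 (2), 0.547 (1), 0.187 (1), 0.092 (3), 0.156 (4), 0.055 (3), 0.296 (2), 0.252 (2), 0.143 (1), 0.000 (2), 0.014 (2), 0.398 (2), 0.000 (1), 0.028 (3), 0.173 (1), 0.000 (4), 0.265 (3), 0.348 (2), 0.415 (4), 0.179 (1), 0.019 (3), 0.218 (1), 0.000 (1), 0.597 (2), 0.274 (4), 0.565 (1), 0.000 (1), 0.012 (3), 0.000 (2), 0.039 (3), 0.022 (2), 0.000 (1), 0.117 (2), 0.000 (2), 0.000 (2), 0.000 (2), 0.000 (1), 0.000 (1), 0.324 (3), 0.000 (1), 0.384 (2), 0.000 (1), 0.000 (2), 0.251 (4), 0.000 (1), 0.566 (2), 0.000 (1), 0.000 (1), 0.066 (3), 0.404 (1), 0.886 (3), 0.199 (1), 0.000 (1), 0.000 (1), 0.059 (1), 0.000 (1), 0.136 (1), 0.540 (1), 0.127 (4), 0.295 (2), 0.085 (3), 0.143 (4), 0.514 (2), 0.413 (3), 0.000 (3), 0.000 (1), 0.498 (2), 0.000 (1), 0.000 (2), 0.000 (2), 0.623 (2), 0.000 (2), 0.000 (1), 0.000 (1), 0.132 (3), 0.000 (2), 0.000 (1), 0.000 (2), 0.000 (1)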
David Rozenberszki, Or Litany, Angela Dai: Language-Grounded Indoor 3D Semantic Segmentation in the Wild. arXiv
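CSC-Pretrain, permissive, 0.249 (4), 0.455 (4), 0.171 (3), 0.079 (4), 0.766 (4), 0.659 (3), 0.930 (4), 0.494 (2), 0.542 (4), 0.700 (4), 0.314 (4), 0.215 (4), 0.430 (4), 0.121 (1), 0.697 (4), 0.441 (3), 0.683 (3), 0.235 (2), 0.609 (4), 0.895 (3), 0.476 (4), 0.816 (3), 0.770 (4), 0.186 (2), 0.634 (1), 0.216 (4), 0.734 (2), 0.340 (3), 0.471 (3), 0.307 (3), 0.293 (4), 0.591 (4), 0.542 (4), 0.076 (2), 0.205 (4), 0.464 (2), 0.000 (1), 0.484 (4), 0.832 (4), 0.766 (2), 0.052 (4), 0.000 (4), 0.413 (3), 0.059 (3), 0.418 (3), 0.222 (3), 0.318 (4), 0.609 (3), 0.206 (4), 0.112 (1), 0.743 (2), 0.625 (2), 0.076 (2), 0.579 (4), 0.548 (2), 0.590 (3), 0.371 (3), 0.552 (4), 0.081 (3), 0.003 (2), 0.142 (2), 0.201 (4), 0.638 (4), 0.233 (3), 0.686 (4), 0.000 (1), 0.142 (2), 0.444 (4), 0.375 (1), 0.247 (4), 0.198 (2), 0.000 (1), 0.128 (4), 0.454 (4), 0.019 (1), 0.097 (1), 0.000 (1), 0.000 (2), 0.553 (3), 0.557 (3), 0.373 (1), 0.545 (3), 0.164 (4), 0.014 (4), 0.547 (3), 0.174 (2), 0.000 (2), 0.002 (1), 0.000 (1), 0.037 (2), 0.000 (1), 0.063 (4), 0.664 (4), 0.000 (1), 0.000 (3), 0.130 (2), 0.170 (2), 0.152 (4), 0.335 (4), 0.079 (2), 0.110 (2), 0.175 (2), 0.098 (2), 0.175 (4), 0.166 (3), 0.045 (4), 0.207 (1), 0.014 (2), 0.465 (1), 0.000 (1), 0.001 (4), 0.001 (4), 0.046 (1), 0.299 (2), 0.327 (3), 0.537 (3), 0.033 (3), 0.012 (4), 0.186 (2), 0.000 (1), 0.205 (4), 0.377 (2), 0.463 (3), 0.000 (1), 0.058 (2), 0.000 (2), 0.055 (2), 0.041 (1), 0.000 (1), 0.105 (3), 0.000 (2), 0.000 (2), 0.000 (2), 0.000 (1), 0.000 (1), 0.398 (2), 0.000 (1), 0.308 (4), 0.000 (1), 0.000 (2), 0.319 (2), 0.000 (1), 0.543 (3), 0.000 (1), 0.000 (1), 0.062 (4), 0.004 (3), 0.862 (4), 0.000 (2), 0.000 (1), 0.000 (1), 0.000 (2), 0.000 (1), 0.123 (2), 0.316 (3), 0.225 (2), 0.250 (3), 0.094 (1), 0.180 (3), 0.332 (3), 0.441 (2), 0.000 (3), 0.000 (1), 0.310 (4), 0.000 (1), 0.000 (2), 0.000 (2), 0.592 (3), 0.000 (2), 0.000 (1), 0.000 (1), 0.203 (1), 0.000 (2), 0.000 (1), 0.000 (2), 0.000 (1)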
Ji Hou, Benjamin Graham, Matthias Nießner, Saining Xie: Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts. CVPR 2021