The 3D semantic instance prediction task involves detecting and segmenting the objects in a 3D scan mesh.

Evaluation and metrics

As in the ScanNet benchmark, the ScanNet200 evaluation ranks all methods by the average precision for each class. We report the mean average precision (AP) at overlap 0.25 (AP 25%), at overlap 0.5 (AP 50%), and averaged over overlaps in the range [0.5:0.95:0.05] (AP) for all 200 categories. Note that multiple predictions of the same ground truth instance are penalized as false positives.
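To make the false-positive rule concrete, here is a minimal sketch of per-class AP at a single overlap threshold. It is not the official evaluation script: the function names (`iou`, `average_precision`), the boolean per-vertex mask representation, the strict `>` comparison against the threshold, and the all-point area under the precision-recall envelope are all assumptions chosen for illustration.

```python
import numpy as np


def iou(a, b):
    """Intersection over union of two boolean per-vertex masks."""
    union = np.logical_or(a, b).sum()
    return np.logical_and(a, b).sum() / union if union > 0 else 0.0


def average_precision(pred_masks, pred_scores, gt_masks, overlap):
    """AP for one class at one overlap threshold (illustrative simplification)."""
    if len(gt_masks) == 0:
        return float("nan")  # class absent from the ground truth
    if len(pred_masks) == 0:
        return 0.0
    # Visit predictions from most to least confident.
    order = np.argsort(-np.asarray(pred_scores, dtype=float), kind="stable")
    matched = [False] * len(gt_masks)
    tp = []
    for i in order:
        ious = [iou(pred_masks[i], g) for g in gt_masks]
        best = int(np.argmax(ious))
        if ious[best] > overlap and not matched[best]:
            matched[best] = True
            tp.append(1.0)  # first sufficiently overlapping hit on this instance
        else:
            tp.append(0.0)  # too little overlap, or a duplicate detection
    tp = np.array(tp)
    cum_tp = np.cumsum(tp)
    cum_fp = np.cumsum(1.0 - tp)
    recall = cum_tp / len(gt_masks)
    precision = cum_tp / (cum_tp + cum_fp)
    # Monotone precision envelope, then area under the precision-recall curve.
    envelope = np.maximum.accumulate(precision[::-1])[::-1]
    recall_prev = np.concatenate(([0.0], recall[:-1]))
    return float(np.sum((recall - recall_prev) * envelope))


def mask(n, indices):
    """Boolean mask over n vertices with the given indices set."""
    m = np.zeros(n, dtype=bool)
    m[list(indices)] = True
    return m


# Toy scene with 10 vertices and two ground truth instances of one class.
gt = [mask(10, range(0, 4)), mask(10, range(5, 9))]
preds = [mask(10, range(0, 4)),   # exact hit on the first instance
         mask(10, range(0, 3)),   # second prediction of the same instance
         mask(10, [5, 6, 9])]     # IoU 0.4 with the second instance
scores = [0.9, 0.8, 0.7]

print(round(average_precision(preds, scores, gt, 0.25), 4))  # 0.8333
print(round(average_precision(preds, scores, gt, 0.5), 4))   # 0.5
```

In this toy scene the second prediction overlaps an already matched instance, so it counts as a false positive and pulls AP 25% below 1.0. The averaged AP metric would repeat the same computation for each overlap in the stated range and take the mean, followed by a mean over classes.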



This table lists the AP 25% benchmark results for the ScanNet200 3D semantic instance scenario. Each score is followed by the method's rank in that column.




Method | Info | avg ap 25% | head ap 25% | common ap 25% | tail ap 25% | chair | table | door | couch | cabinet | shelf | desk | office chair | bed | pillow | sink | picture | window | toilet | bookshelf | monitor | curtain | book | armchair | coffee table | box | refrigerator | lamp | kitchen cabinet | towel | clothes | tv | nightstand | counter | dresser | stool | cushion | plant | ceiling | bathtub | end table | dining table | keyboard | bag | backpack | toilet paper | printer | tv stand | whiteboard | blanket | shower curtain | trash can | closet | stairs | microwave | stove | shoe | computer tower | bottle | bin | ottoman | bench | board | washing machine | mirror | copier | basket | sofa chair | file cabinet | fan | laptop | shower | paper | person | paper towel dispenser | oven | blinds | rack | plate | blackboard | piano | suitcase | rail | radiator | recycling bin | container | wardrobe | soap dispenser | telephone | bucket | clock | stand | light | laundry basket | pipe | clothes dryer | guitar | toilet paper holder | seat | speaker | column | bicycle | ladder | bathroom stall | shower wall | cup | jacket | storage bin | coffee maker | dishwasher | paper towel roll | machine | mat | windowsill | bar | toaster | bulletin board | ironing board | fireplace | soap dish | kitchen counter | doorframe | toilet paper dispenser | mini fridge | fire extinguisher | ball | hat | shower curtain rod | water cooler | paper cutter | tray | shower door | pillar | ledge | toaster oven | mouse | toilet seat cover dispenser | furniture | cart | storage container | scale | tissue box | light switch | crate | power outlet | decoration | sign | projector | closet door | vacuum cleaner | candle | plunger | stuffed animal | headphones | dish rack | broom | guitar case | range hood | dustpan | hair dryer | water bottle | handicap bar | purse | vent | shower floor | water pitcher | mailbox | bowl | paper bag | alarm clock | music stand | projector screen | divider | laundry detergent | bathroom counter | object | bathroom vanity | closet wall | laundry hamper | bathroom stall door | ceiling light | trash bin | dumbbell | stair rail | tube | bathroom cabinet | cd case | closet rod | coffee kettle | structure | shower head | keyboard piano | case of water bottles | coat rack | storage organizer | folded chair | fire alarm | power strip | calendar | poster | potted plant | luggage | mattress
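ODIN - Ins200 | permissive | 0.451 (1) | 0.637 (2) | 0.407 (1) | 0.277 (1) | 0.742 (6) | 0.699 (3) | 0.855 (1) | 0.826 (6) | 0.626 (1) | 0.441 (1) | 0.742 (3) | 0.003 (3) | 0.941 (3) | 0.637 (1) | 0.910 (2) | 0.616 (5) | 0.679 (3) | 0.944 (5) | 0.695 (3) | 0.877 (3) | 0.763 (1) | 0.357 (2) | 0.723 (5) | 0.475 (1) | 0.779 (5) | 0.494 (1) | 0.782 (2) | 0.795 (1) | 0.334 (1) | 0.824 (1) | 0.867 (3) | 0.108 (3) | 0.701 (1) | 0.638 (1) | 0.000 (3) | 0.873 (1) | 0.749 (2) | 0.667 (6) | 0.203 (1) | 0.500 (3) | 0.886 (1) | 0.116 (1) | 0.583 (5) | 0.571 (1) | 0.688 (1) | 1.000 (1) | 0.760 (1) | 0.162 (3) | 1.000 (1) | 0.852 (2) | 0.078 (3) | 0.833 (5) | 0.887 (1) | 0.778 (1) | 0.577 (1) | 0.859 (4) | 0.550 (1) | 0.000 (3) | 0.542 (3) | 0.028 (5) | 0.667 (3) | 0.874 (1) | 1.000 (1) | 0.125 (1) | 0.232 (4) | 0.870 (2) | 0.406 (2) | 0.337 (3) | 0.167 (2) | 0.000 (2) | 0.671 (1) | 0.742 (2) | 0.500 (1) | 0.000 (2) | 0.000 (1) | 0.528 (1) | 1.000 (1) | 0.417 (4) | 0.597 (1) | 0.872 (1) | 0.275 (1) | 0.000 (4) | 0.800 (2) | 0.850 (1) | 0.000 (2) | 0.528 (1) | 0.000 (3) | 0.215 (3) | 0.000 (1) | 0.238 (1) | 0.667 (1) | 0.000 (3) | 0.019 (3) | 0.250 (4) | 1.000 (1) | 0.429 (4) | 0.599 (2) | 0.778 (2) | 0.221 (1) | 0.370 (1) | 0.284 (1) | 0.278 (6) | 0.400 (3) | 0.125 (1) | 0.000 (1) | 0.200 (3) | 0.404 (2) | 0.000 (1) | 0.250 (3) | 0.714 (1) | 0.500 (1) | 0.504 (3) | 0.769 (1) | 0.677 (3) | 0.750 (1) | 0.963 (1) | 0.500 (1) | 0.000 (1) | 0.500 (5) | 0.333 (5) | 1.000 (1) | 0.000 (1) | 0.000 (4) | 0.438 (1) | 0.500 (1) | 0.000 (3) | 1.000 (1) | 0.333 (3) | 0.226 (2) | 0.250 (3) | 0.250 (1) | 0.000 (3) | 0.000 (1) | 0.668 (3) | 0.000 (1) | 0.494 (5) | 0.000 (1) | 0.000 (3) | 0.750 (1) | 0.000 (1) | 0.833 (2) | 0.000 (1) | 0.000 (1) | 0.777 (3) | 0.333 (2) | 0.944 (2) | 0.000 (1) | 0.333 (1) | 0.000 (1) | 1.000 (1) | 0.000 (1) | 0.089 (3) | 0.407 (4) | 0.600 (1) | 0.823 (2) | 0.080 (2) | 0.264 (4) | 0.469 (4) | 0.717 (1) | 0.000 (2) | 0.000 (1) | 0.500 (2) | 0.000 (1) | 0.000 (1) | 0.000 (2) | 1.000 (1) | 0.125 (1) | 0.333 (1) | 0.000 (2) | 0.200 (3) | 0.000 (2) | 0.000 (2) | 1.000 (1) | 0.000 (1)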
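Mask3D Scannet200 | | 0.445 (2) | 0.653 (1) | 0.392 (2) | 0.254 (2) | 0.844 (2) | 0.746 (2) | 0.818 (2) | 0.888 (4) | 0.556 (2) | 0.262 (2) | 0.890 (1) | 0.025 (2) | 1.000 (1) | 0.608 (2) | 0.930 (1) | 0.694 (3) | 0.721 (1) | 0.930 (6) | 0.686 (4) | 0.966 (1) | 0.615 (5) | 0.440 (1) | 0.725 (4) | 0.201 (2) | 0.890 (3) | 0.414 (5) | 0.827 (1) | 0.552 (2) | 0.158 (6) | 0.806 (2) | 0.924 (1) | 0.042 (4) | 0.512 (3) | 0.412 (6) | 0.226 (1) | 0.604 (4) | 0.830 (1) | 1.000 (1) | 0.125 (2) | 0.792 (1) | 0.815 (2) | 0.097 (2) | 0.648 (1) | 0.551 (3) | 0.354 (5) | 1.000 (1) | 0.630 (2) | 0.241 (2) | 1.000 (1) | 0.853 (1) | 0.204 (1) | 0.974 (4) | 0.841 (2) | 0.778 (1) | 0.358 (3) | 0.927 (1) | 0.300 (2) | 0.045 (1) | 0.640 (1) | 0.363 (1) | 0.745 (2) | 0.710 (2) | 1.000 (1) | 0.000 (2) | 0.330 (2) | 0.943 (1) | 0.315 (3) | 0.600 (1) | 1.000 (1) | 0.027 (1) | 0.080 (6) | 0.556 (6) | 0.500 (1) | 0.409 (1) | 0.000 (1) | 0.194 (2) | 1.000 (1) | 0.500 (1) | 0.493 (3) | 0.761 (3) | 0.053 (5) | 0.042 (3) | 0.780 (3) | 0.454 (2) | 0.009 (1) | 0.333 (2) | 0.050 (1) | 0.321 (1) | 0.000 (1) | 0.084 (2) | 0.552 (3) | 0.008 (2) | 0.027 (2) | 0.750 (1) | 0.500 (2) | 0.442 (3) | 0.657 (1) | 0.765 (3) | 0.120 (3) | 0.183 (4) | 0.021 (3) | 1.000 (1) | 0.510 (2) | 0.016 (2) | 0.000 (1) | 0.400 (1) | 0.619 (1) | 0.000 (1) | 0.396 (1) | 0.290 (2) | 0.000 (2) | 0.741 (1) | 0.699 (2) | 1.000 (1) | 0.260 (2) | 0.017 (4) | 0.125 (6) | 0.000 (1) | 0.792 (4) | 0.399 (4) | 1.000 (1) | 0.000 (1) | 0.049 (3) | 0.265 (2) | 0.063 (4) | 0.000 (3) | 1.000 (1) | 0.335 (2) | 0.381 (1) | 0.500 (1) | 0.250 (1) | 0.004 (2) | 0.000 (1) | 0.727 (2) | 0.000 (1) | 0.538 (3) | 0.000 (1) | 0.188 (1) | 0.677 (3) | 0.000 (1) | 0.930 (1) | 0.000 (1) | 0.000 (1) | 0.966 (1) | 0.391 (1) | 0.908 (3) | 0.000 (1) | 0.028 (2) | 0.000 (1) | 1.000 (1) | 0.000 (1) | 0.152 (1) | 0.451 (2) | 0.458 (2) | 0.971 (1) | 0.573 (1) | 0.606 (1) | 0.167 (6) | 0.625 (2) | 0.004 (1) | 0.000 (1) | 0.058 (6) | 0.000 (1) | 0.000 (1) | 1.000 (1) | 1.000 (1) | 0.000 (2) | 0.056 (2) | 0.000 (2) | 0.200 (3) | 0.309 (1) | 0.000 (2) | 1.000 (1) | 0.000 (1)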
Jonas Schult, Francis Engelmann, Alexander Hermans, Or Litany, Siyu Tang, Bastian Leibe: Mask3D for 3D Semantic Instance Segmentation. ICRA 2023
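TD3D Scannet200 | permissive | 0.379 (3) | 0.603 (3) | 0.306 (3) | 0.190 (3) | 0.885 (1) | 0.755 (1) | 0.800 (3) | 0.958 (1) | 0.390 (3) | 0.260 (3) | 0.866 (2) | 0.232 (1) | 0.979 (2) | 0.523 (4) | 0.869 (4) | 0.559 (6) | 0.689 (2) | 1.000 (1) | 0.795 (1) | 0.905 (2) | 0.748 (2) | 0.173 (6) | 0.825 (1) | 0.173 (3) | 0.970 (1) | 0.457 (2) | 0.615 (3) | 0.456 (3) | 0.200 (2) | 0.621 (5) | 0.906 (2) | 0.553 (1) | 0.517 (2) | 0.510 (2) | 0.220 (2) | 0.715 (2) | 0.706 (3) | 1.000 (1) | 0.113 (3) | 0.792 (1) | 0.717 (3) | 0.073 (3) | 0.635 (2) | 0.557 (2) | 0.638 (2) | 1.000 (1) | 0.205 (6) | 0.146 (4) | 1.000 (1) | 0.769 (6) | 0.186 (2) | 1.000 (1) | 0.710 (6) | 0.778 (1) | 0.415 (2) | 0.834 (5) | 0.226 (3) | 0.021 (2) | 0.590 (2) | 0.356 (2) | 0.817 (1) | 0.477 (6) | 1.000 (1) | 0.000 (2) | 0.635 (1) | 0.843 (3) | 0.427 (1) | 0.270 (5) | 0.125 (3) | 0.000 (2) | 0.102 (4) | 1.000 (1) | 0.125 (3) | 0.000 (2) | 0.000 (1) | 0.000 (3) | 0.000 (4) | 0.125 (5) | 0.370 (4) | 0.622 (6) | 0.221 (2) | 0.196 (2) | 0.836 (1) | 0.288 (3) | 0.000 (2) | 0.093 (3) | 0.020 (2) | 0.294 (2) | 0.000 (1) | 0.075 (3) | 0.667 (1) | 0.038 (1) | 0.111 (1) | 0.250 (4) | 0.000 (5) | 0.526 (2) | 0.495 (4) | 0.908 (1) | 0.111 (4) | 0.259 (2) | 0.003 (4) | 0.667 (2) | 0.045 (6) | 0.000 (3) | 0.000 (1) | 0.400 (1) | 0.274 (4) | 0.000 (1) | 0.274 (2) | 0.226 (3) | 0.000 (2) | 0.520 (2) | 0.302 (6) | 0.731 (2) | 0.103 (4) | 0.458 (2) | 0.500 (1) | 0.000 (1) | 1.000 (1) | 0.472 (1) | 0.792 (4) | 0.000 (1) | 0.088 (2) | 0.061 (3) | 0.250 (2) | 0.009 (2) | 0.250 (3) | 0.333 (3) | 0.181 (3) | 0.396 (2) | 0.051 (3) | 0.012 (1) | 0.000 (1) | 0.458 (5) | 0.000 (1) | 0.424 (6) | 0.000 (1) | 0.101 (2) | 0.390 (6) | 0.000 (1) | 0.833 (2) | 0.000 (1) | 0.000 (1) | 0.857 (2) | 0.222 (4) | 1.000 (1) | 0.000 (1) | 0.003 (3) | 0.000 (1) | 0.000 (3) | 0.000 (1) | 0.102 (2) | 0.275 (6) | 0.400 (3) | 0.735 (3) | 0.061 (4) | 0.433 (3) | 0.533 (3) | 0.625 (2) | 0.000 (2) | 0.000 (1) | 0.259 (5) | 0.000 (1) | 0.000 (1) | 0.000 (2) | 0.500 (3) | 0.000 (2) | 0.000 (3) | 1.000 (1) | 0.600 (1) | 0.000 (2) | 0.250 (1) | 0.000 (3) | 0.000 (1)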
Maksim Kolodiazhnyi, Anna Vorontsova, Anton Konushin, Danila Rukhovich: Top-Down Beats Bottom-Up in 3D Instance Segmentation. WACV 2024
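Minkowski 34D Inst. | permissive | 0.280 (5) | 0.488 (5) | 0.192 (6) | 0.124 (5) | 0.804 (4) | 0.518 (5) | 0.772 (6) | 0.904 (3) | 0.337 (6) | 0.191 (5) | 0.443 (5) | 0.000 (4) | 0.861 (5) | 0.502 (5) | 0.868 (5) | 0.669 (4) | 0.587 (5) | 0.997 (3) | 0.467 (6) | 0.828 (6) | 0.732 (3) | 0.342 (4) | 0.745 (3) | 0.119 (6) | 0.918 (2) | 0.404 (6) | 0.419 (5) | 0.398 (4) | 0.172 (4) | 0.618 (6) | 0.743 (5) | 0.167 (2) | 0.077 (6) | 0.500 (3) | 0.000 (3) | 0.568 (5) | 0.506 (6) | 1.000 (1) | 0.044 (5) | 0.000 (4) | 0.502 (5) | 0.010 (5) | 0.593 (4) | 0.284 (6) | 0.305 (6) | 0.903 (6) | 0.213 (5) | 0.142 (5) | 0.981 (4) | 0.790 (5) | 0.000 (5) | 1.000 (1) | 0.715 (5) | 0.538 (6) | 0.346 (5) | 0.830 (6) | 0.067 (4) | 0.000 (3) | 0.400 (4) | 0.074 (4) | 0.333 (5) | 0.551 (3) | 1.000 (1) | 0.000 (2) | 0.292 (3) | 0.777 (5) | 0.118 (6) | 0.317 (4) | 0.100 (5) | 0.000 (2) | 0.191 (3) | 0.648 (4) | 0.000 (4) | 0.000 (2) | 0.000 (1) | 0.000 (3) | 0.000 (4) | 0.500 (1) | 0.213 (6) | 0.825 (2) | 0.021 (6) | 0.333 (1) | 0.648 (6) | 0.098 (5) | 0.000 (2) | 0.000 (4) | 0.000 (3) | 0.077 (4) | 0.000 (1) | 0.000 (6) | 0.150 (6) | 0.000 (3) | 0.000 (4) | 0.000 (6) | 0.225 (3) | 0.281 (5) | 0.447 (5) | 0.000 (6) | 0.090 (5) | 0.148 (5) | 0.000 (5) | 0.479 (5) | 0.542 (1) | 0.000 (3) | 0.000 (1) | 0.200 (3) | 0.131 (6) | 0.000 (1) | 0.250 (3) | 0.000 (5) | 0.000 (2) | 0.159 (6) | 0.396 (5) | 0.677 (3) | 0.021 (5) | 0.000 (5) | 0.500 (1) | 0.000 (1) | 1.000 (1) | 0.442 (3) | 0.125 (6) | 0.000 (1) | 0.000 (4) | 0.000 (4) | 0.000 (5) | 0.333 (1) | 0.000 (4) | 0.528 (1) | 0.000 (4) | 0.000 (4) | 0.000 (4) | 0.000 (3) | 0.000 (1) | 0.200 (6) | 0.000 (1) | 0.516 (4) | 0.000 (1) | 0.000 (3) | 0.500 (4) | 0.000 (1) | 0.833 (2) | 0.000 (1) | 0.000 (1) | 0.286 (5) | 0.083 (5) | 0.750 (4) | 0.000 (1) | 0.000 (4) | 0.000 (1) | 0.000 (3) | 0.000 (1) | 0.059 (6) | 0.445 (3) | 0.200 (4) | 0.535 (5) | 0.070 (3) | 0.167 (5) | 0.385 (5) | 0.375 (4) | 0.000 (2) | 0.000 (1) | 0.333 (4) | 0.000 (1) | 0.000 (1) | 0.000 (2) | 0.500 (3) | 0.000 (2) | 0.000 (3) | 0.000 (2) | 0.200 (3) | 0.000 (2) | 0.000 (2) | 0.000 (3) | 0.000 (1)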
C. Choy, J. Gwak, S. Savarese: 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks. CVPR 2019
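CSC-Pretrain Inst. | permissive | 0.275 (6) | 0.466 (6) | 0.218 (5) | 0.110 (6) | 0.783 (5) | 0.383 (6) | 0.783 (5) | 0.829 (5) | 0.367 (5) | 0.168 (6) | 0.305 (6) | 0.000 (4) | 0.661 (6) | 0.413 (6) | 0.869 (3) | 0.719 (1) | 0.546 (6) | 0.997 (3) | 0.685 (5) | 0.841 (5) | 0.555 (6) | 0.277 (5) | 0.768 (2) | 0.132 (4) | 0.779 (5) | 0.448 (4) | 0.364 (6) | 0.212 (6) | 0.161 (5) | 0.768 (3) | 0.692 (6) | 0.000 (5) | 0.395 (4) | 0.500 (3) | 0.000 (3) | 0.450 (6) | 0.591 (4) | 1.000 (1) | 0.020 (6) | 0.000 (4) | 0.423 (6) | 0.007 (6) | 0.625 (3) | 0.420 (4) | 0.505 (4) | 1.000 (1) | 0.353 (3) | 0.119 (6) | 0.571 (5) | 0.819 (3) | 0.014 (4) | 1.000 (1) | 0.774 (3) | 0.689 (5) | 0.311 (6) | 0.866 (2) | 0.067 (4) | 0.000 (3) | 0.400 (4) | 0.000 (6) | 0.278 (6) | 0.501 (4) | 1.000 (1) | 0.000 (2) | 0.162 (6) | 0.584 (6) | 0.286 (4) | 0.206 (6) | 0.125 (3) | 0.000 (2) | 0.084 (5) | 0.649 (3) | 0.000 (4) | 0.000 (2) | 0.000 (1) | 0.000 (3) | 0.000 (4) | 0.125 (5) | 0.312 (5) | 0.727 (4) | 0.221 (3) | 0.000 (4) | 0.667 (5) | 0.114 (4) | 0.000 (2) | 0.000 (4) | 0.000 (3) | 0.065 (6) | 0.000 (1) | 0.004 (5) | 0.278 (4) | 0.000 (3) | 0.000 (4) | 0.500 (2) | 0.000 (5) | 0.571 (1) | 0.000 (6) | 0.250 (5) | 0.019 (6) | 0.145 (6) | 0.000 (5) | 0.667 (2) | 0.200 (5) | 0.000 (3) | 0.000 (1) | 0.200 (3) | 0.258 (5) | 0.000 (1) | 0.000 (5) | 0.000 (5) | 0.000 (2) | 0.369 (5) | 0.429 (4) | 0.613 (5) | 0.000 (6) | 0.000 (5) | 0.500 (1) | 0.000 (1) | 0.500 (5) | 0.333 (5) | 0.500 (5) | 0.000 (1) | 0.106 (1) | 0.000 (4) | 0.000 (5) | 0.000 (3) | 0.000 (4) | 0.333 (3) | 0.000 (4) | 0.000 (4) | 0.000 (4) | 0.000 (3) | 0.000 (1) | 0.918 (1) | 0.000 (1) | 0.638 (1) | 0.000 (1) | 0.000 (3) | 0.750 (1) | 0.000 (1) | 0.833 (2) | 0.000 (1) | 0.000 (1) | 0.143 (6) | 0.000 (6) | 0.750 (4) | 0.000 (1) | 0.000 (4) | 0.000 (1) | 0.000 (3) | 0.000 (1) | 0.063 (5) | 0.377 (5) | 0.200 (4) | 0.222 (6) | 0.055 (5) | 0.500 (2) | 0.677 (2) | 0.250 (5) | 0.000 (2) | 0.000 (1) | 0.500 (2) | 0.000 (1) | 0.000 (1) | 0.000 (2) | 0.500 (3) | 0.000 (2) | 0.000 (3) | 0.000 (2) | 0.115 (6) | 0.000 (2) | 0.000 (2) | 0.000 (3) | 0.000 (1)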
Ji Hou, Benjamin Graham, Matthias Nießner, Saining Xie: Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts. CVPR 2021
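LGround Inst. | permissive | 0.314 (4) | 0.529 (4) | 0.225 (4) | 0.155 (4) | 0.810 (3) | 0.625 (4) | 0.798 (4) | 0.940 (2) | 0.372 (4) | 0.217 (4) | 0.484 (4) | 0.000 (4) | 0.927 (4) | 0.528 (3) | 0.826 (6) | 0.694 (2) | 0.605 (4) | 1.000 (1) | 0.731 (2) | 0.846 (4) | 0.716 (4) | 0.350 (3) | 0.589 (6) | 0.123 (5) | 0.857 (4) | 0.457 (3) | 0.578 (4) | 0.376 (5) | 0.183 (3) | 0.765 (4) | 0.800 (4) | 0.000 (5) | 0.278 (5) | 0.500 (3) | 0.000 (3) | 0.659 (3) | 0.569 (5) | 1.000 (1) | 0.093 (4) | 0.000 (4) | 0.539 (4) | 0.010 (4) | 0.578 (6) | 0.378 (5) | 0.571 (3) | 1.000 (1) | 0.337 (4) | 0.252 (1) | 0.530 (6) | 0.814 (4) | 0.000 (5) | 0.744 (6) | 0.743 (4) | 0.746 (4) | 0.346 (4) | 0.863 (3) | 0.067 (4) | 0.000 (3) | 0.400 (4) | 0.167 (3) | 0.667 (3) | 0.488 (5) | 1.000 (1) | 0.000 (2) | 0.208 (5) | 0.783 (4) | 0.166 (5) | 0.375 (2) | 0.071 (6) | 0.000 (2) | 0.200 (2) | 0.607 (5) | 0.000 (4) | 0.000 (2) | 0.000 (1) | 0.000 (3) | 1.000 (1) | 0.500 (1) | 0.517 (2) | 0.716 (5) | 0.221 (3) | 0.000 (4) | 0.706 (4) | 0.085 (6) | 0.000 (2) | 0.000 (4) | 0.000 (3) | 0.077 (5) | 0.000 (1) | 0.063 (4) | 0.278 (4) | 0.000 (3) | 0.000 (4) | 0.500 (2) | 0.083 (4) | 0.181 (6) | 0.515 (3) | 0.286 (4) | 0.144 (2) | 0.219 (3) | 0.042 (2) | 0.582 (4) | 0.400 (3) | 0.000 (3) | 0.000 (1) | 0.000 (6) | 0.305 (3) | 0.000 (1) | 0.000 (5) | 0.036 (4) | 0.000 (2) | 0.413 (4) | 0.500 (3) | 0.533 (6) | 0.250 (3) | 0.200 (3) | 0.500 (1) | 0.000 (1) | 1.000 (1) | 0.472 (1) | 1.000 (1) | 0.000 (1) | 0.000 (4) | 0.000 (4) | 0.250 (2) | 0.000 (3) | 0.000 (4) | 0.333 (3) | 0.000 (4) | 0.000 (4) | 0.000 (4) | 0.000 (3) | 0.000 (1) | 0.600 (4) | 0.000 (1) | 0.594 (2) | 0.000 (1) | 0.000 (3) | 0.500 (4) | 0.000 (1) | 0.647 (6) | 0.000 (1) | 0.000 (1) | 0.429 (4) | 0.333 (2) | 0.500 (6) | 0.000 (1) | 0.000 (4) | 0.000 (1) | 0.000 (3) | 0.000 (1) | 0.069 (4) | 0.696 (1) | 0.050 (6) | 0.556 (4) | 0.031 (6) | 0.042 (6) | 0.750 (1) | 0.250 (5) | 0.000 (2) | 0.000 (1) | 0.630 (1) | 0.000 (1) | 0.000 (1) | 0.000 (2) | 0.500 (3) | 0.000 (2) | 0.000 (3) | 0.000 (2) | 0.400 (2) | 0.000 (2) | 0.000 (2) | 0.000 (3) | 0.000 (1)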
David Rozenberszki, Or Litany, Angela Dai: Language-Grounded Indoor 3D Semantic Segmentation in the Wild.