The 3D semantic instance prediction task involves detecting and segmenting the objects in a 3D scan mesh.

Evaluation and metrics

As in the ScanNet benchmark, the ScanNet200 evaluation ranks all methods by the average precision of each class. We report the mean average precision at overlap 0.25 (AP 25%), at overlap 0.5 (AP 50%), and averaged over overlaps in the range [0.5:0.95:0.05] (AP) for all 200 categories. Note that multiple predictions of the same ground-truth instance are penalized as false positives.
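As a rough illustration of the metrics described above, the sketch below computes AP 25%, AP 50%, and AP for a single class from precomputed overlaps. It is not the official evaluation script: the function names `average_precision` and `class_metrics`, the strict `>` comparison against the threshold, and the all-point interpolation of the precision-recall curve are assumptions made for this example. The benchmark value for a method would then be the mean of such per-class values.

```python
def average_precision(scores, ious, num_gt, threshold):
    """AP for one class at one overlap threshold (illustrative sketch).

    scores: confidence of each predicted instance.
    ious:   ious[p][g] is the overlap of prediction p with ground-truth instance g.
    """
    if num_gt == 0:
        return None  # class absent from the ground truth
    # Visit predictions from most to least confident.
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    matched = [False] * num_gt
    hits = []
    for p in order:
        best = None
        for g in range(num_gt):
            if ious[p][g] > threshold and not matched[g]:
                if best is None or ious[p][g] > ious[p][best]:
                    best = g
        if best is None:
            # Either no sufficient overlap, or a duplicate of an already
            # matched instance: both count as false positives.
            hits.append(0)
        else:
            matched[best] = True
            hits.append(1)
    # Precision and recall after each prediction.
    precisions, recalls, tp = [], [], 0
    for k, hit in enumerate(hits, start=1):
        tp += hit
        precisions.append(tp / k)
        recalls.append(tp / num_gt)
    # Make precision monotonically non-increasing (assumed interpolation).
    for k in range(len(precisions) - 2, -1, -1):
        precisions[k] = max(precisions[k], precisions[k + 1])
    # Area under the precision-recall curve.
    ap, prev_recall = 0.0, 0.0
    for prec, rec in zip(precisions, recalls):
        ap += (rec - prev_recall) * prec
        prev_recall = rec
    return ap


def class_metrics(scores, ious, num_gt):
    """The three reported numbers for one class."""
    sweep = [0.5 + 0.05 * i for i in range(10)]  # 0.5, 0.55, ..., 0.95
    per_threshold = [average_precision(scores, ious, num_gt, t) for t in sweep]
    return {
        "AP 25%": average_precision(scores, ious, num_gt, 0.25),
        "AP 50%": average_precision(scores, ious, num_gt, 0.5),
        "AP": sum(per_threshold) / len(per_threshold),
    }


# Two ground-truth instances, three predictions. The second prediction
# overlaps the same instance as the first, so it is a duplicate.
metrics = class_metrics(
    scores=[0.9, 0.8, 0.7],
    ious=[[0.82, 0.0], [0.62, 0.0], [0.0, 0.3]],
    num_gt=2,
)
print(metrics)
```

In this toy case the duplicate prediction lowers precision at every threshold where the first prediction is accepted, which is the penalty the note above refers to.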



This table lists the benchmark results for the ScanNet200 3D semantic instance scenario.




Each result cell gives the score followed by the method's rank in that column.
Method | Info | avg ap 25% | head ap 25% | common ap 25% | tail ap 25% | alarm clock | armchair | backpack | bag | ball | bar | basket | bathroom cabinet | bathroom counter | bathroom stall | bathroom stall door | bathroom vanity | bathtub | bed | bench | bicycle | bin | blackboard | blanket | blinds | board | book | bookshelf | bottle | bowl | box | broom | bucket | bulletin board | cabinet | calendar | candle | cart | case of water bottles | cd case | ceiling | ceiling light | chair | clock | closet | closet door | closet rod | closet wall | clothes | clothes dryer | coat rack | coffee kettle | coffee maker | coffee table | column | computer tower | container | copier | couch | counter | crate | cup | curtain | cushion | decoration | desk | dining table | dish rack | dishwasher | divider | door | doorframe | dresser | dumbbell | dustpan | end table | fan | file cabinet | fire alarm | fire extinguisher | fireplace | folded chair | furniture | guitar | guitar case | hair dryer | handicap bar | hat | headphones | ironing board | jacket | keyboard | keyboard piano | kitchen cabinet | kitchen counter | ladder | lamp | laptop | laundry basket | laundry detergent | laundry hamper | ledge | light | light switch | luggage | machine | mailbox | mat | mattress | microwave | mini fridge | mirror | monitor | mouse | music stand | nightstand | object | office chair | ottoman | oven | paper | paper bag | paper cutter | paper towel dispenser | paper towel roll | person | piano | picture | pillar | pillow | pipe | plant | plate | plunger | poster | potted plant | power outlet | power strip | printer | projector | projector screen | purse | rack | radiator | rail | range hood | recycling bin | refrigerator | scale | seat | shelf | shoe | shower | shower curtain | shower curtain rod | shower door | shower floor | shower head | shower wall | sign | sink | soap dish | soap dispenser | sofa chair | speaker | stair rail | stairs | stand | stool | storage bin | storage container | storage organizer | stove | structure | stuffed animal | suitcase | table | telephone | tissue box | toaster | toaster oven | toilet | toilet paper | toilet paper dispenser | toilet paper holder | toilet seat cover dispenser | towel | trash bin | trash can | tray | tube | tv | tv stand | vacuum cleaner | vent | wardrobe | washing machine | water bottle | water cooler | water pitcher | whiteboard | window | windowsill
TD3D Scannet200 | permissive | 0.379 5 | 0.603 5 | 0.306 5 | 0.190 5 | 0.635 4 | 0.073 4 | 0.500 1 | 0.000 2 | 0.000 3 | 0.000 1 | 0.495 6 | 0.735 4 | 0.275 8 | 1.000 1 | 0.979 4 | 0.590 3 | 0.000 7 | 0.021 4 | 0.000 5 | 0.146 6 | 0.000 3 | 0.356 3 | 0.173 8 | 0.795 1 | 0.226 5 | 0.000 3 | 0.173 5 | 0.000 1 | 0.000 2 | 0.226 5 | 0.390 4 | 0.000 5 | 0.000 1 | 0.250 3 | 0.000 3 | 0.706 5 | 0.061 6 | 0.885 3 | 0.093 4 | 0.186 4 | 0.259 7 | 0.200 4 | 0.667 3 | 0.000 4 | 0.000 1 | 0.667 4 | 0.825 1 | 0.250 6 | 0.834 7 | 1.000 1 | 0.958 1 | 0.553 1 | 0.111 6 | 0.748 4 | 0.220 2 | 0.051 5 | 0.866 4 | 0.792 1 | 0.390 8 | 0.045 8 | 0.800 5 | 0.302 8 | 0.517 4 | 0.533 4 | 0.113 5 | 0.427 3 | 0.843 3 | 0.000 3 | 0.458 3 | 0.600 1 | 0.000 2 | 0.101 4 | 0.000 4 | 0.259 3 | 0.717 5 | 0.500 5 | 0.615 5 | 0.520 4 | 0.526 3 | 0.457 4 | 0.270 7 | 0.000 1 | 0.000 1 | 0.400 3 | 0.088 4 | 0.294 3 | 0.181 3 | 0.000 2 | 1.000 1 | 0.400 2 | 0.710 8 | 0.103 6 | 0.477 8 | 0.905 4 | 0.061 4 | 0.000 1 | 0.906 3 | 0.102 4 | 0.232 2 | 0.125 5 | 0.000 3 | 0.003 5 | 0.792 6 | 1.000 1 | 0.000 5 | 0.102 6 | 0.125 7 | 0.559 8 | 0.523 6 | 0.075 5 | 0.715 3 | 0.000 5 | 0.424 8 | 0.000 1 | 0.396 3 | 0.250 1 | 0.638 4 | 0.000 3 | 0.000 5 | 0.000 2 | 0.622 8 | 0.833 4 | 0.221 2 | 0.970 1 | 0.250 3 | 0.038 1 | 0.260 5 | 0.415 4 | 0.125 4 | 1.000 1 | 1.000 1 | 0.857 2 | 0.000 3 | 0.908 1 | 0.012 1 | 0.869 6 | 0.836 3 | 0.635 1 | 0.111 3 | 0.625 2 | 1.000 1 | 0.020 2 | 0.510 4 | 0.003 6 | 0.009 3 | 1.000 1 | 0.778 2 | 0.000 2 | 0.000 1 | 0.370 6 | 0.755 2 | 0.288 5 | 0.333 5 | 0.274 4 | 1.000 1 | 0.557 3 | 0.731 4 | 0.456 5 | 0.433 3 | 0.769 8 | 0.000 1 | 0.000 4 | 0.621 7 | 1.000 1 | 0.458 7 | 0.000 1 | 0.196 3 | 0.817 1 | 0.000 1 | 0.472 3 | 0.222 5 | 0.205 8 | 0.689 4 | 0.274 6
Maksim Kolodiazhnyi, Anna Vorontsova, Anton Konushin, Danila Rukhovich: Top-Down Beats Bottom-Up in 3D Instance Segmentation. WACV 2024
LGround Inst. | permissive | 0.314 6 | 0.529 6 | 0.225 6 | 0.155 6 | 0.578 8 | 0.010 6 | 0.500 1 | 0.000 2 | 0.000 3 | 0.000 1 | 0.515 5 | 0.556 6 | 0.696 2 | 1.000 1 | 0.927 6 | 0.400 5 | 0.083 6 | 0.000 5 | 1.000 1 | 0.252 3 | 0.000 3 | 0.167 5 | 0.350 5 | 0.731 2 | 0.067 6 | 0.000 3 | 0.123 7 | 0.000 1 | 0.000 2 | 0.036 6 | 0.372 6 | 0.000 5 | 0.000 1 | 0.250 3 | 0.000 3 | 0.569 7 | 0.031 8 | 0.810 5 | 0.000 6 | 0.000 7 | 0.630 2 | 0.183 5 | 0.278 6 | 0.000 4 | 0.000 1 | 0.582 6 | 0.589 8 | 0.500 2 | 0.863 5 | 1.000 1 | 0.940 2 | 0.000 7 | 0.144 4 | 0.716 6 | 0.000 4 | 0.000 6 | 0.484 6 | 0.000 6 | 0.500 5 | 0.400 5 | 0.798 6 | 0.500 5 | 0.278 7 | 0.750 2 | 0.093 6 | 0.166 7 | 0.783 5 | 0.000 3 | 0.200 4 | 0.400 2 | 0.000 2 | 0.000 5 | 0.000 4 | 0.219 5 | 0.539 6 | 0.500 5 | 0.578 6 | 0.413 6 | 0.181 8 | 0.457 5 | 0.375 4 | 0.000 1 | 0.000 1 | 0.050 8 | 0.000 6 | 0.077 7 | 0.000 4 | 0.000 2 | 0.500 8 | 0.000 8 | 0.743 6 | 0.250 5 | 0.488 7 | 0.846 6 | 0.000 5 | 0.000 1 | 0.800 6 | 0.069 6 | 0.000 6 | 0.000 6 | 0.000 3 | 0.000 6 | 1.000 1 | 0.607 7 | 0.000 5 | 0.200 4 | 0.500 1 | 0.694 4 | 0.528 5 | 0.063 6 | 0.659 4 | 0.000 5 | 0.594 3 | 0.000 1 | 0.000 6 | 0.000 2 | 0.571 5 | 0.000 3 | 0.000 5 | 0.000 2 | 0.716 7 | 0.647 8 | 0.221 3 | 0.857 5 | 0.000 5 | 0.000 3 | 0.217 6 | 0.346 6 | 0.071 7 | 0.530 8 | 1.000 1 | 0.429 6 | 0.000 3 | 0.286 6 | 0.000 3 | 0.826 8 | 0.706 6 | 0.208 7 | 0.000 6 | 0.250 7 | 0.744 6 | 0.000 4 | 0.500 5 | 0.042 2 | 0.000 4 | 0.000 2 | 0.746 5 | 0.000 2 | 0.000 1 | 0.517 3 | 0.625 6 | 0.085 8 | 0.333 5 | 0.000 7 | 1.000 1 | 0.378 7 | 0.533 8 | 0.376 7 | 0.042 8 | 0.814 6 | 0.000 1 | 0.000 4 | 0.765 6 | 1.000 1 | 0.600 6 | 0.000 1 | 0.000 6 | 0.667 5 | 0.000 1 | 0.472 3 | 0.333 3 | 0.337 6 | 0.605 6 | 0.305 5
David Rozenberszki, Or Litany, Angela Dai: Language-Grounded Indoor 3D Semantic Segmentation in the Wild.
CompetitorFormer-200 |  | 0.469 2 | 0.676 2 | 0.401 3 | 0.296 2 | 0.692 2 | 0.057 5 | 0.500 1 | 0.083 1 | 0.000 3 | 0.000 1 | 0.534 4 | 0.701 5 | 0.410 5 | 0.903 7 | 0.998 3 | 0.878 1 | 0.500 3 | 0.068 1 | 0.250 4 | 0.424 1 | 1.000 1 | 0.244 4 | 0.556 1 | 0.696 3 | 0.270 4 | 1.000 1 | 0.240 3 | 0.000 1 | 0.000 2 | 0.587 3 | 0.380 5 | 1.000 1 | 0.000 1 | 0.500 1 | 0.000 3 | 0.900 1 | 0.257 3 | 0.901 1 | 0.085 5 | 0.207 2 | 0.863 1 | 0.224 3 | 1.000 1 | 0.109 2 | 0.000 1 | 0.724 3 | 0.806 2 | 0.500 2 | 0.869 3 | 1.000 1 | 0.829 6 | 0.247 2 | 0.474 1 | 0.759 3 | 0.021 3 | 0.269 2 | 0.873 3 | 0.125 5 | 0.467 7 | 0.542 1 | 0.885 2 | 0.829 1 | 0.711 2 | 0.285 7 | 0.118 4 | 0.482 2 | 0.770 7 | 0.025 2 | 0.018 5 | 0.400 2 | 0.000 2 | 0.677 2 | 0.500 1 | 0.222 4 | 0.916 1 | 1.000 1 | 0.818 3 | 0.827 1 | 0.342 6 | 0.650 2 | 0.452 3 | 0.000 1 | 0.000 1 | 0.330 4 | 0.173 2 | 0.278 4 | 0.000 4 | 0.083 1 | 1.000 1 | 0.336 4 | 0.748 5 | 0.508 3 | 0.698 4 | 0.989 1 | 0.286 2 | 0.000 1 | 0.933 1 | 0.175 1 | 0.400 1 | 0.663 1 | 0.015 2 | 0.103 2 | 1.000 1 | 0.829 2 | 0.125 2 | 0.293 3 | 0.500 1 | 0.847 2 | 0.711 2 | 0.295 1 | 0.543 7 | 0.385 3 | 0.581 4 | 0.000 1 | 0.500 1 | 0.000 2 | 0.747 2 | 0.050 2 | 1.000 1 | 0.013 1 | 0.850 2 | 0.886 3 | 0.214 5 | 0.918 2 | 0.125 4 | 0.000 3 | 0.320 3 | 0.610 2 | 0.025 8 | 0.933 6 | 1.000 1 | 0.820 4 | 0.250 2 | 0.901 2 | 0.000 3 | 0.980 1 | 0.878 1 | 0.325 3 | 0.160 2 | 0.574 4 | 0.703 7 | 0.009 3 | 0.540 3 | 0.011 5 | 0.000 4 | 0.000 2 | 0.700 6 | 0.056 1 | 0.000 1 | 0.491 5 | 0.729 4 | 0.617 3 | 0.489 3 | 0.565 1 | 1.000 1 | 0.410 6 | 0.750 3 | 0.629 3 | 0.292 4 | 0.839 4 | 0.000 1 | 0.157 1 | 0.839 1 | 1.000 1 | 0.834 3 | 0.000 1 | 0.131 4 | 0.794 2 | 0.000 1 | 0.667 1 | 0.144 6 | 0.664 2 | 0.854 1 | 0.500 3
ODIN - Ins200 | permissive | 0.451 3 | 0.637 4 | 0.407 2 | 0.277 3 | 0.583 7 | 0.116 2 | 0.500 1 | 0.000 2 | 0.125 1 | 0.000 1 | 0.599 2 | 0.823 2 | 0.407 6 | 0.667 8 | 0.941 5 | 0.542 4 | 1.000 1 | 0.000 5 | 1.000 1 | 0.162 5 | 0.000 3 | 0.028 7 | 0.357 4 | 0.695 4 | 0.550 1 | 0.000 3 | 0.475 2 | 0.000 1 | 0.000 2 | 0.714 1 | 0.626 2 | 1.000 1 | 0.000 1 | 0.500 1 | 0.125 2 | 0.749 4 | 0.080 4 | 0.742 8 | 0.528 2 | 0.078 5 | 0.500 3 | 0.334 2 | 0.667 3 | 0.333 1 | 0.000 1 | 0.278 8 | 0.723 6 | 0.250 6 | 0.859 6 | 1.000 1 | 0.826 8 | 0.108 5 | 0.221 3 | 0.763 2 | 0.000 4 | 0.250 3 | 0.742 5 | 0.500 4 | 0.750 1 | 0.400 5 | 0.855 3 | 0.769 2 | 0.701 3 | 0.469 5 | 0.203 2 | 0.406 4 | 0.870 2 | 0.000 3 | 0.963 1 | 0.200 4 | 0.000 2 | 0.000 5 | 0.500 1 | 0.370 1 | 0.886 2 | 1.000 1 | 0.782 4 | 0.504 5 | 0.429 5 | 0.494 3 | 0.337 5 | 0.000 1 | 0.000 1 | 0.600 1 | 0.000 6 | 0.215 5 | 0.226 2 | 0.000 2 | 0.944 3 | 0.200 5 | 0.887 2 | 0.750 1 | 0.874 1 | 0.877 5 | 0.438 1 | 0.000 1 | 0.867 4 | 0.089 5 | 0.003 4 | 0.500 2 | 0.000 3 | 0.333 1 | 1.000 1 | 0.742 4 | 0.125 2 | 0.671 2 | 0.417 6 | 0.616 7 | 0.637 3 | 0.238 2 | 0.873 2 | 0.528 2 | 0.494 7 | 0.000 1 | 0.250 4 | 0.000 2 | 0.688 3 | 0.000 3 | 1.000 1 | 0.000 2 | 0.872 1 | 0.833 4 | 0.275 1 | 0.779 7 | 1.000 1 | 0.000 3 | 0.441 2 | 0.577 3 | 0.167 2 | 1.000 1 | 0.500 7 | 0.777 5 | 0.000 3 | 0.778 4 | 0.000 3 | 0.910 4 | 0.800 4 | 0.232 6 | 0.019 5 | 0.717 1 | 0.833 5 | 0.000 4 | 0.638 2 | 0.284 1 | 0.000 4 | 0.000 2 | 0.778 2 | 0.000 2 | 0.000 1 | 0.597 2 | 0.699 5 | 0.850 2 | 0.333 5 | 0.250 5 | 0.944 6 | 0.571 2 | 0.677 5 | 0.795 2 | 0.264 5 | 0.852 3 | 0.000 1 | 0.000 4 | 0.824 2 | 1.000 1 | 0.668 5 | 0.000 1 | 0.000 6 | 0.667 5 | 0.000 1 | 0.333 7 | 0.333 3 | 0.760 1 | 0.679 5 | 0.404 4
Mask3D Scannet200 |  | 0.445 4 | 0.653 3 | 0.392 4 | 0.254 4 | 0.648 3 | 0.097 3 | 0.125 8 | 0.000 2 | 0.000 3 | 0.000 1 | 0.657 1 | 0.971 1 | 0.451 3 | 1.000 1 | 1.000 1 | 0.640 2 | 0.500 3 | 0.045 3 | 1.000 1 | 0.241 4 | 0.409 2 | 0.363 2 | 0.440 3 | 0.686 5 | 0.300 3 | 0.000 3 | 0.201 4 | 0.000 1 | 0.009 1 | 0.290 4 | 0.556 3 | 1.000 1 | 0.000 1 | 0.063 6 | 0.000 3 | 0.830 3 | 0.573 1 | 0.844 4 | 0.333 3 | 0.204 3 | 0.058 8 | 0.158 8 | 0.552 5 | 0.056 3 | 0.000 1 | 1.000 1 | 0.725 5 | 0.750 1 | 0.927 1 | 1.000 1 | 0.888 5 | 0.042 6 | 0.120 5 | 0.615 7 | 0.226 1 | 0.250 3 | 0.890 2 | 0.792 1 | 0.677 4 | 0.510 4 | 0.818 4 | 0.699 4 | 0.512 5 | 0.167 8 | 0.125 3 | 0.315 5 | 0.943 1 | 0.309 1 | 0.017 6 | 0.200 4 | 0.000 2 | 0.188 3 | 0.000 4 | 0.183 6 | 0.815 4 | 1.000 1 | 0.827 2 | 0.741 3 | 0.442 4 | 0.414 7 | 0.600 2 | 0.000 1 | 0.000 1 | 0.458 2 | 0.049 5 | 0.321 2 | 0.381 1 | 0.000 2 | 0.908 4 | 0.400 2 | 0.841 3 | 0.260 4 | 0.710 3 | 0.966 2 | 0.265 3 | 0.000 1 | 0.924 2 | 0.152 2 | 0.025 3 | 0.500 2 | 0.027 1 | 0.028 3 | 1.000 1 | 0.556 8 | 0.016 4 | 0.080 8 | 0.500 1 | 0.694 5 | 0.608 4 | 0.084 4 | 0.604 5 | 0.194 4 | 0.538 5 | 0.000 1 | 0.500 1 | 0.000 2 | 0.354 7 | 0.000 3 | 1.000 1 | 0.000 2 | 0.761 5 | 0.930 2 | 0.053 7 | 0.890 4 | 1.000 1 | 0.008 2 | 0.262 4 | 0.358 5 | 1.000 1 | 1.000 1 | 0.792 6 | 0.966 1 | 1.000 1 | 0.765 5 | 0.004 2 | 0.930 3 | 0.780 5 | 0.330 2 | 0.027 4 | 0.625 2 | 0.974 4 | 0.050 1 | 0.412 8 | 0.021 4 | 0.000 4 | 0.000 2 | 0.778 2 | 0.000 2 | 0.000 1 | 0.493 4 | 0.746 3 | 0.454 4 | 0.335 4 | 0.396 2 | 0.930 8 | 0.551 4 | 1.000 1 | 0.552 4 | 0.606 1 | 0.853 2 | 0.000 1 | 0.004 3 | 0.806 3 | 1.000 1 | 0.727 4 | 0.000 1 | 0.042 5 | 0.745 4 | 0.000 1 | 0.399 6 | 0.391 2 | 0.630 4 | 0.721 3 | 0.619 1
Jonas Schult, Francis Engelmann, Alexander Hermans, Or Litany, Siyu Tang, Bastian Leibe: Mask3D for 3D Semantic Instance Segmentation. ICRA 2023
CSC-Pretrain Inst. | permissive | 0.275 8 | 0.466 8 | 0.218 7 | 0.110 8 | 0.625 5 | 0.007 8 | 0.500 1 | 0.000 2 | 0.000 3 | 0.000 1 | 0.000 8 | 0.222 8 | 0.377 7 | 1.000 1 | 0.661 8 | 0.400 5 | 0.000 7 | 0.000 5 | 0.000 5 | 0.119 8 | 0.000 3 | 0.000 8 | 0.277 7 | 0.685 6 | 0.067 6 | 0.000 3 | 0.132 6 | 0.000 1 | 0.000 2 | 0.000 7 | 0.367 7 | 0.000 5 | 0.000 1 | 0.000 7 | 0.000 3 | 0.591 6 | 0.055 7 | 0.783 7 | 0.000 6 | 0.014 6 | 0.500 3 | 0.161 7 | 0.278 6 | 0.000 4 | 0.000 1 | 0.667 4 | 0.768 3 | 0.500 2 | 0.866 4 | 1.000 1 | 0.829 7 | 0.000 7 | 0.019 8 | 0.555 8 | 0.000 4 | 0.000 6 | 0.305 8 | 0.000 6 | 0.750 1 | 0.200 7 | 0.783 7 | 0.429 6 | 0.395 6 | 0.677 3 | 0.020 8 | 0.286 6 | 0.584 8 | 0.000 3 | 0.000 7 | 0.115 8 | 0.000 2 | 0.000 5 | 0.000 4 | 0.145 8 | 0.423 8 | 0.500 5 | 0.364 8 | 0.369 7 | 0.571 2 | 0.448 6 | 0.206 8 | 0.000 1 | 0.000 1 | 0.200 5 | 0.106 3 | 0.065 8 | 0.000 4 | 0.000 2 | 0.750 5 | 0.200 5 | 0.774 4 | 0.000 8 | 0.501 6 | 0.841 7 | 0.000 5 | 0.000 1 | 0.692 8 | 0.063 7 | 0.000 6 | 0.000 6 | 0.000 3 | 0.000 6 | 0.500 7 | 0.649 5 | 0.000 5 | 0.084 7 | 0.125 7 | 0.719 3 | 0.413 8 | 0.004 7 | 0.450 8 | 0.000 5 | 0.638 2 | 0.000 1 | 0.000 6 | 0.000 2 | 0.505 6 | 0.000 3 | 0.000 5 | 0.000 2 | 0.727 6 | 0.833 4 | 0.221 3 | 0.779 7 | 0.000 5 | 0.000 3 | 0.168 8 | 0.311 8 | 0.125 4 | 0.571 7 | 0.500 7 | 0.143 8 | 0.000 3 | 0.250 7 | 0.000 3 | 0.869 5 | 0.667 7 | 0.162 8 | 0.000 6 | 0.250 7 | 1.000 1 | 0.000 4 | 0.500 5 | 0.000 7 | 0.000 4 | 0.000 2 | 0.689 7 | 0.000 2 | 0.000 1 | 0.312 7 | 0.383 8 | 0.114 6 | 0.333 5 | 0.000 7 | 0.997 4 | 0.420 5 | 0.613 7 | 0.212 8 | 0.500 2 | 0.819 5 | 0.000 1 | 0.000 4 | 0.768 5 | 1.000 1 | 0.918 1 | 0.000 1 | 0.000 6 | 0.278 8 | 0.000 1 | 0.333 7 | 0.000 8 | 0.353 5 | 0.546 8 | 0.258 7
Ji Hou, Benjamin Graham, Matthias Nießner, Saining Xie: Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts. CVPR 2021
DINO3D-Scannet200 | copyleft | 0.511 1 | 0.685 1 | 0.484 1 | 0.331 1 | 0.864 1 | 0.220 1 | 0.500 1 | 0.000 2 | 0.042 2 | 0.000 1 | 0.576 3 | 0.746 3 | 0.744 1 | 1.000 1 | 1.000 1 | 0.355 8 | 1.000 1 | 0.048 2 | 0.000 5 | 0.327 2 | 0.000 3 | 0.494 1 | 0.532 2 | 0.596 7 | 0.496 2 | 0.250 2 | 0.481 1 | 0.000 1 | 0.000 2 | 0.714 1 | 0.629 1 | 1.000 1 | 0.000 1 | 0.250 3 | 0.663 1 | 0.861 2 | 0.436 2 | 0.892 2 | 0.667 1 | 0.244 1 | 0.385 5 | 0.421 1 | 1.000 1 | 0.000 4 | 0.000 1 | 0.764 2 | 0.719 7 | 0.500 2 | 0.889 2 | 1.000 1 | 0.907 3 | 0.111 4 | 0.378 2 | 0.778 1 | 0.000 4 | 0.595 1 | 0.905 1 | 0.708 3 | 0.750 1 | 0.542 1 | 0.890 1 | 0.754 3 | 0.761 1 | 0.798 1 | 0.220 1 | 0.683 1 | 0.817 4 | 0.000 3 | 0.600 2 | 0.200 4 | 0.500 1 | 0.944 1 | 0.125 3 | 0.334 2 | 0.856 3 | 0.792 4 | 0.873 1 | 0.756 2 | 0.777 1 | 0.803 1 | 0.675 1 | 0.000 1 | 0.000 1 | 0.200 5 | 0.298 1 | 0.412 1 | 0.000 4 | 0.000 2 | 0.719 7 | 0.800 1 | 0.923 1 | 0.750 1 | 0.798 2 | 0.960 3 | 0.000 5 | 0.000 1 | 0.856 5 | 0.142 3 | 0.001 5 | 0.417 4 | 0.000 3 | 0.014 4 | 1.000 1 | 0.824 3 | 0.559 1 | 0.700 1 | 0.500 1 | 0.863 1 | 0.816 1 | 0.163 3 | 0.944 1 | 0.764 1 | 0.714 1 | 0.000 1 | 0.250 4 | 0.000 2 | 1.000 1 | 0.063 1 | 1.000 1 | 0.000 2 | 0.789 4 | 0.974 1 | 0.079 6 | 0.851 6 | 0.000 5 | 0.000 3 | 0.468 1 | 0.702 1 | 0.167 2 | 1.000 1 | 1.000 1 | 0.857 2 | 0.000 3 | 0.867 3 | 0.000 3 | 0.968 2 | 0.845 2 | 0.264 5 | 0.419 1 | 0.500 5 | 0.667 8 | 0.000 4 | 0.677 1 | 0.028 3 | 0.194 2 | 0.000 2 | 0.857 1 | 0.000 2 | 0.000 1 | 0.699 1 | 0.821 1 | 0.930 1 | 0.850 1 | 0.346 3 | 0.944 6 | 0.579 1 | 0.866 2 | 0.850 1 | 0.221 6 | 0.911 1 | 0.000 1 | 0.011 2 | 0.806 4 | 0.764 8 | 0.860 2 | 0.000 1 | 0.472 1 | 0.794 2 | 0.000 1 | 0.667 1 | 0.655 1 | 0.655 3 | 0.811 2 | 0.528 2
Jinyuan Qu, Hongyang Li, Xingyu Chen, Shilong Liu, Yukai Shi, Tianhe Ren, Ruitao Jing and Lei Zhang: SegDINO3D: 3D Instance Segmentation Empowered by Both Image-Level and Object-Level 2D Features. AAAI 2026
Minkowski 34D Inst. | permissive | 0.280 7 | 0.488 7 | 0.192 8 | 0.124 7 | 0.593 6 | 0.010 7 | 0.500 1 | 0.000 2 | 0.000 3 | 0.000 1 | 0.447 7 | 0.535 7 | 0.445 4 | 1.000 1 | 0.861 7 | 0.400 5 | 0.225 5 | 0.000 5 | 0.000 5 | 0.142 7 | 0.000 3 | 0.074 6 | 0.342 6 | 0.467 8 | 0.067 6 | 0.000 3 | 0.119 8 | 0.000 1 | 0.000 2 | 0.000 7 | 0.337 8 | 0.000 5 | 0.000 1 | 0.000 7 | 0.000 3 | 0.506 8 | 0.070 5 | 0.804 6 | 0.000 6 | 0.000 7 | 0.333 6 | 0.172 6 | 0.150 8 | 0.000 4 | 0.000 1 | 0.479 7 | 0.745 4 | 0.000 8 | 0.830 8 | 1.000 1 | 0.904 4 | 0.167 3 | 0.090 7 | 0.732 5 | 0.000 4 | 0.000 6 | 0.443 7 | 0.000 6 | 0.500 5 | 0.542 1 | 0.772 8 | 0.396 7 | 0.077 8 | 0.385 6 | 0.044 7 | 0.118 8 | 0.777 6 | 0.000 3 | 0.000 7 | 0.200 4 | 0.000 2 | 0.000 5 | 0.000 4 | 0.148 7 | 0.502 7 | 0.500 5 | 0.419 7 | 0.159 8 | 0.281 7 | 0.404 8 | 0.317 6 | 0.000 1 | 0.000 1 | 0.200 5 | 0.000 6 | 0.077 6 | 0.000 4 | 0.000 2 | 0.750 5 | 0.200 5 | 0.715 7 | 0.021 7 | 0.551 5 | 0.828 8 | 0.000 5 | 0.000 1 | 0.743 7 | 0.059 8 | 0.000 6 | 0.000 6 | 0.000 3 | 0.000 6 | 0.125 8 | 0.648 6 | 0.000 5 | 0.191 5 | 0.500 1 | 0.669 6 | 0.502 7 | 0.000 8 | 0.568 6 | 0.000 5 | 0.516 6 | 0.000 1 | 0.000 6 | 0.000 2 | 0.305 8 | 0.000 3 | 0.000 5 | 0.000 2 | 0.825 3 | 0.833 4 | 0.021 8 | 0.918 2 | 0.000 5 | 0.000 3 | 0.191 7 | 0.346 7 | 0.100 6 | 0.981 5 | 1.000 1 | 0.286 7 | 0.000 3 | 0.000 8 | 0.000 3 | 0.868 7 | 0.648 8 | 0.292 4 | 0.000 6 | 0.375 6 | 1.000 1 | 0.000 4 | 0.500 5 | 0.000 7 | 0.333 1 | 0.000 2 | 0.538 8 | 0.000 2 | 0.000 1 | 0.213 8 | 0.518 7 | 0.098 7 | 0.528 2 | 0.250 5 | 0.997 4 | 0.284 8 | 0.677 5 | 0.398 6 | 0.167 7 | 0.790 7 | 0.000 1 | 0.000 4 | 0.618 8 | 0.903 7 | 0.200 8 | 0.000 1 | 0.333 2 | 0.333 7 | 0.000 1 | 0.442 5 | 0.083 7 | 0.213 7 | 0.587 7 | 0.131 8
C. Choy, J. Gwak, S. Savarese: 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks. CVPR 2019