Presenting the ScanNet200 Benchmark

We present the ScanNet200 benchmark, which studies an order of magnitude more class categories than the previous version of ScanNet. The scene geometry is shared between the two tasks, but the parsing of surface annotations allows for a larger vocabulary and a more realistic setting for in-the-wild 3D understanding methods.

The ScanNet200 benchmark includes both finer-grained categories and a large number of previously unaddressed classes. This makes for a much more challenging setting: the label set covers the diversity of semantic classes that occur naturally in the raw ScanNet RGB-D observations, and the data reflects the class imbalances encountered in practice. The difference in category frequencies between ScanNet and ScanNet200 can be seen in the figure above.
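The avg iou, head iou, common iou and tail iou columns in the tables below summarise per-class intersection-over-union. As a minimal sketch (not the benchmark's official evaluation script), the following shows how per-class IoU and group means could be computed from point-wise labels. The function names, the toy labels, and the assignment of classes to the head and tail groups are illustrative assumptions; the actual ScanNet200 class split is defined by the benchmark.

```python
from collections import Counter


def per_class_iou(gt, pred, class_ids):
    """Return {class_id: IoU} for every class that occurs in gt or pred."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gt, pred):
        if g == p:
            tp[g] += 1          # correctly labelled point
        else:
            fn[g] += 1          # ground-truth class was missed
            fp[p] += 1          # predicted class was wrong
    ious = {}
    for c in class_ids:
        denom = tp[c] + fp[c] + fn[c]
        if denom > 0:           # skip classes absent from both gt and pred
            ious[c] = tp[c] / denom
    return ious


def group_mean(ious, group):
    """Mean IoU over the classes of one group (NaN if none were evaluated)."""
    vals = [ious[c] for c in group if c in ious]
    return sum(vals) / len(vals) if vals else float("nan")


# Toy point labels; the head/tail assignment below is hypothetical.
gt = ["wall", "wall", "chair", "chair", "clock", "clock"]
pred = ["wall", "wall", "chair", "wall", "clock", "chair"]
groups = {"head": ["wall", "chair"], "tail": ["clock"]}

ious = per_class_iou(gt, pred, ["wall", "chair", "clock"])
avg_iou = group_mean(ious, ious.keys())
head_iou = group_mean(ious, groups["head"])
tail_iou = group_mean(ious, groups["tail"])
print(ious, avg_iou, head_iou, tail_iou)
```

Averaging per class rather than per point keeps a frequent class such as wall from dominating the score, which is why separate group means are informative under strong class imbalance.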

ScanNet200 Benchmark

This table lists the benchmark results for the ScanNet200 3D semantic label scenario. Each cell gives a method's score followed by its rank in that column.




Method | Info | avg iou | head iou | common iou | tail iou | alarm clock | armchair | backpack | bag | ball | bar | basket | bathroom cabinet | bathroom counter | bathroom stall | bathroom stall door | bathroom vanity | bathtub | bed | bench | bicycle | bin | blackboard | blanket | blinds | board | book | bookshelf | bottle | bowl | box | broom | bucket | bulletin board | cabinet | calendar | candle | cart | case of water bottles | cd case | ceiling | ceiling light | chair | clock | closet | closet door | closet rod | closet wall | clothes | clothes dryer | coat rack | coffee kettle | coffee maker | coffee table | column | computer tower | container | copier | couch | counter | crate | cup | curtain | cushion | decoration | desk | dining table | dish rack | dishwasher | divider | door | doorframe | dresser | dumbbell | dustpan | end table | fan | file cabinet | fire alarm | fire extinguisher | fireplace | floor | folded chair | furniture | guitar | guitar case | hair dryer | handicap bar | hat | headphones | ironing board | jacket | keyboard | keyboard piano | kitchen cabinet | kitchen counter | ladder | lamp | laptop | laundry basket | laundry detergent | laundry hamper | ledge | light | light switch | luggage | machine | mailbox | mat | mattress | microwave | mini fridge | mirror | monitor | mouse | music stand | nightstand | object | office chair | ottoman | oven | paper | paper bag | paper cutter | paper towel dispenser | paper towel roll | person | piano | picture | pillar | pillow | pipe | plant | plate | plunger | poster | potted plant | power outlet | power strip | printer | projector | projector screen | purse | rack | radiator | rail | range hood | recycling bin | refrigerator | scale | seat | shelf | shoe | shower | shower curtain | shower curtain rod | shower door | shower floor | shower head | shower wall | sign | sink | soap dish | soap dispenser | sofa chair | speaker | stair rail | stairs | stand | stool | storage bin | storage container | storage organizer | stove | structure | stuffed animal | suitcase | table | telephone | tissue box | toaster | toaster oven | toilet | toilet paper | toilet paper dispenser | toilet paper holder | toilet seat cover dispenser | towel | trash bin | trash can | tray | tube | tv | tv stand | vacuum cleaner | vent | wall | wardrobe | washing machine | water bottle | water cooler | water pitcher | whiteboard | window | windowsill
PPT-SpUNet-F.T.0.332 30.556 20.270 10.123 50.519 10.091 20.349 20.000 10.000 10.000 10.339 60.383 50.498 50.833 20.807 10.241 10.584 20.000 30.755 30.124 30.000 40.608 20.330 30.530 40.314 10.000 40.374 30.000 10.000 30.197 10.459 30.000 30.000 10.117 10.000 20.876 10.095 10.682 20.000 30.086 30.518 20.433 10.930 10.000 10.000 10.563 30.542 50.077 40.715 20.858 30.756 10.008 80.171 40.874 20.000 10.039 10.550 30.000 50.545 20.256 30.657 30.453 10.351 30.449 50.213 20.392 30.611 40.000 20.037 60.946 10.138 50.000 10.000 40.063 30.308 10.537 20.796 10.673 10.323 50.392 30.400 50.509 30.000 20.000 10.649 10.000 50.023 50.000 30.000 30.914 30.002 70.506 70.163 50.359 30.872 30.000 30.000 10.623 20.112 30.001 70.000 40.000 10.021 20.753 10.565 70.150 10.579 10.806 40.267 20.616 10.042 70.783 40.000 30.374 50.000 10.000 20.000 20.620 40.000 10.000 40.000 10.572 60.634 10.350 40.792 20.000 30.000 10.376 40.535 20.378 20.855 10.672 10.074 50.000 40.185 30.000 10.727 30.660 30.076 80.000 60.432 40.646 30.000 10.594 50.006 70.000 40.000 10.658 20.000 20.000 10.661 10.549 30.300 50.291 50.045 50.942 40.304 20.600 30.572 20.135 70.695 10.000 10.008 30.793 30.942 10.899 20.000 10.816 20.181 40.897 10.000 10.679 10.223 30.264 10.691 20.345 6
Xiaoyang Wu, Zhuotao Tian, Xin Wen, Bohao Peng, Xihui Liu, Kaicheng Yu, Hengshuang Zhao: Towards Large-scale 3D Representation Learning with Multi-dataset Point Prompt Training.
CeCo0.340 10.551 30.247 40.181 10.475 40.057 80.142 70.000 10.000 10.000 10.387 30.463 30.499 40.924 10.774 30.213 20.257 40.000 30.546 80.100 50.006 30.615 10.177 80.534 20.246 30.000 40.400 20.000 10.338 10.006 70.484 20.609 10.000 10.083 40.000 20.873 30.089 40.661 50.000 30.048 80.560 10.408 30.892 30.000 10.000 10.586 10.616 30.000 70.692 40.900 10.721 30.162 10.228 10.860 30.000 10.000 40.575 10.083 20.550 10.347 10.624 40.410 40.360 20.740 10.109 50.321 60.660 20.000 20.121 20.939 40.143 30.000 10.400 10.003 50.190 30.564 10.652 40.615 40.421 10.304 60.579 10.547 10.000 20.000 10.296 50.000 50.030 40.096 10.000 30.916 10.037 40.551 40.171 30.376 20.865 40.286 10.000 10.633 10.102 70.027 40.011 30.000 10.000 30.474 50.742 10.133 30.311 40.824 30.242 40.503 50.068 40.828 10.000 30.429 20.000 10.063 10.000 20.781 10.000 10.000 40.000 10.665 10.633 20.450 10.818 10.000 30.000 10.429 20.532 30.226 40.825 30.510 60.377 20.709 10.079 60.000 10.753 10.683 10.102 70.063 20.401 70.620 60.000 10.619 20.000 80.000 40.000 10.595 60.000 20.000 10.345 50.564 20.411 10.603 10.384 20.945 20.266 40.643 10.367 50.304 10.663 40.000 10.010 20.726 60.767 30.898 30.000 10.784 40.435 10.861 40.000 10.447 30.000 70.257 30.656 40.377 4
Understanding Imbalanced Semantic Segmentation Through Neural Collapse.
OctFormer ScanNet200permissive0.326 40.539 40.265 30.131 30.499 20.110 10.522 10.000 10.000 10.000 10.318 80.427 40.455 60.743 60.765 40.175 30.842 10.000 30.828 10.204 10.033 20.429 40.335 10.601 10.312 20.000 40.357 40.000 10.000 30.047 50.423 40.000 30.000 10.105 30.000 20.873 30.079 60.670 40.000 30.117 10.471 50.432 20.829 50.000 10.000 10.584 20.417 80.089 30.684 50.837 40.705 70.021 60.178 30.892 10.000 10.028 20.505 50.000 50.457 40.200 60.662 10.412 30.244 60.496 30.000 80.451 20.626 30.000 20.102 40.943 30.138 50.000 10.000 40.149 20.291 20.534 30.722 20.632 30.331 40.253 70.453 30.487 40.000 20.000 10.479 20.000 50.022 60.000 30.000 30.900 40.128 20.684 10.164 40.413 10.854 50.000 30.000 10.512 80.074 80.003 60.000 40.000 10.000 30.469 60.613 40.132 40.529 30.871 10.227 70.582 20.026 80.787 30.000 30.339 60.000 10.000 20.000 20.626 30.000 10.029 30.000 10.587 40.612 30.411 20.724 50.000 30.000 10.407 30.552 10.513 10.849 20.655 20.408 10.000 40.296 10.000 10.686 60.645 50.145 30.022 40.414 50.633 40.000 10.637 10.224 10.000 40.000 10.650 30.000 20.000 10.622 20.535 40.343 30.483 20.230 40.943 30.289 30.618 20.596 10.140 60.679 20.000 10.022 10.783 40.620 60.906 10.000 10.806 30.137 60.865 20.000 10.378 40.000 70.168 80.680 30.227 7
Peng-Shuai Wang: OctFormer: Octree-based Transformers for 3D Point Clouds. SIGGRAPH 2023
AWCS0.305 50.508 50.225 50.142 20.463 50.063 60.195 50.000 10.000 10.000 10.467 20.551 10.504 30.773 30.764 50.142 50.029 80.000 30.626 60.100 50.000 40.360 50.179 60.507 60.137 60.006 30.300 50.000 10.000 30.172 30.364 60.512 20.000 10.056 50.000 20.865 50.093 30.634 80.000 30.071 60.396 60.296 70.876 40.000 10.000 10.373 50.436 70.063 60.749 10.877 20.721 30.131 20.124 50.804 60.000 10.000 40.515 40.010 40.452 50.252 40.578 50.417 20.179 80.484 40.171 30.337 50.606 50.000 20.115 30.937 50.142 40.000 10.008 30.000 70.157 70.484 50.402 80.501 60.339 30.553 10.529 20.478 50.000 20.000 10.404 30.001 40.022 60.077 20.000 30.894 60.219 10.628 20.093 60.305 50.886 10.233 20.000 10.603 30.112 30.023 50.000 40.000 10.000 30.741 20.664 20.097 60.253 50.782 50.264 30.523 40.154 10.707 70.000 30.411 30.000 10.000 20.000 20.332 70.000 10.000 40.000 10.602 20.595 40.185 70.656 70.159 10.000 10.355 50.424 60.154 60.729 60.516 50.220 40.620 20.084 50.000 10.707 50.651 40.173 10.014 50.381 80.582 70.000 10.619 20.049 60.000 40.000 10.702 10.000 20.000 10.302 70.489 60.317 40.334 40.392 10.922 50.254 50.533 60.394 40.129 80.613 60.000 10.000 60.820 10.649 50.749 50.000 10.782 50.282 30.863 30.000 10.288 70.006 50.220 50.633 50.542 1
LGroundpermissive0.272 60.485 60.184 60.106 60.476 30.077 50.218 40.000 10.000 10.000 10.547 10.295 60.540 10.746 50.745 60.058 70.112 70.005 10.658 50.077 80.000 40.322 60.178 70.512 50.190 50.199 10.277 60.000 10.000 30.173 20.399 50.000 30.000 10.039 70.000 20.858 60.085 50.676 30.002 10.103 20.498 30.323 50.703 60.000 10.000 10.296 60.549 40.216 10.702 30.768 60.718 50.028 40.092 70.786 70.000 10.000 40.453 70.022 30.251 80.252 40.572 60.348 60.321 40.514 20.063 60.279 70.552 60.000 20.019 70.932 60.132 70.000 10.000 40.000 70.156 80.457 60.623 50.518 50.265 70.358 40.381 60.395 60.000 20.000 10.127 80.012 30.051 10.000 30.000 30.886 70.014 50.437 80.179 20.244 60.826 60.000 30.000 10.599 40.136 10.085 20.000 40.000 10.000 30.565 40.612 50.143 20.207 60.566 60.232 60.446 60.127 20.708 60.000 30.384 40.000 10.000 20.000 20.402 50.000 10.059 20.000 10.525 80.566 50.229 60.659 60.000 30.000 10.265 60.446 50.147 70.720 80.597 40.066 60.000 40.187 20.000 10.726 40.467 80.134 50.000 60.413 60.629 50.000 10.363 70.055 50.022 20.000 10.626 50.000 20.000 10.323 60.479 80.154 70.117 60.028 70.901 60.243 60.415 80.295 80.143 50.610 70.000 10.000 60.777 50.397 80.324 70.000 10.778 60.179 50.702 70.000 10.274 80.404 10.233 40.622 60.398 3
David Rozenberszki, Or Litany, Angela Dai: Language-Grounded Indoor 3D Semantic Segmentation in the Wild. arXiv
CSC-Pretrainpermissive0.249 80.455 80.171 70.079 80.418 70.059 70.186 60.000 10.000 10.000 10.335 70.250 70.316 70.766 40.697 80.142 50.170 50.003 20.553 70.112 40.097 10.201 80.186 50.476 80.081 70.000 40.216 80.000 10.000 30.001 80.314 80.000 30.000 10.055 60.000 20.832 80.094 20.659 60.002 10.076 40.310 80.293 80.664 80.000 10.000 10.175 80.634 20.130 20.552 80.686 80.700 80.076 30.110 60.770 80.000 10.000 40.430 80.000 50.319 60.166 70.542 80.327 70.205 70.332 70.052 70.375 40.444 80.000 20.012 80.930 80.203 10.000 10.000 40.046 40.175 50.413 70.592 60.471 70.299 60.152 80.340 70.247 80.000 20.000 10.225 60.058 20.037 20.000 30.207 10.862 80.014 50.548 50.033 70.233 70.816 70.000 30.000 10.542 70.123 20.121 10.019 20.000 10.000 30.463 70.454 80.045 80.128 80.557 70.235 50.441 70.063 50.484 80.000 30.308 80.000 10.000 20.000 20.318 80.000 10.000 40.000 10.545 70.543 60.164 80.734 40.000 30.000 10.215 80.371 70.198 50.743 50.205 70.062 70.000 40.079 60.000 10.683 70.547 70.142 40.000 60.441 30.579 80.000 10.464 60.098 40.041 10.000 10.590 70.000 20.000 10.373 40.494 50.174 60.105 70.001 80.895 70.222 70.537 50.307 70.180 40.625 50.000 10.000 60.591 80.609 70.398 60.000 10.766 80.014 80.638 80.000 10.377 50.004 60.206 70.609 80.465 2
Ji Hou, Benjamin Graham, Matthias Nießner, Saining Xie: Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts. CVPR 2021
OA-CNN-L_ScanNet2000.333 20.558 10.269 20.124 40.448 60.080 40.272 30.000 10.000 10.000 10.342 50.515 20.524 20.713 80.789 20.158 40.384 30.000 30.806 20.125 20.000 40.496 30.332 20.498 70.227 40.024 20.474 10.000 10.003 20.071 40.487 10.000 30.000 10.110 20.000 20.876 10.013 80.703 10.000 30.076 40.473 40.355 40.906 20.000 10.000 10.476 40.706 10.000 70.672 60.835 50.748 20.015 70.223 20.860 30.000 10.000 40.572 20.000 50.509 30.313 20.662 10.398 50.396 10.411 60.276 10.527 10.711 10.000 20.076 50.946 10.166 20.000 10.022 20.160 10.183 40.493 40.699 30.637 20.403 20.330 50.406 40.526 20.024 10.000 10.392 40.000 50.016 80.000 30.196 20.915 20.112 30.557 30.197 10.352 40.877 20.000 30.000 10.592 60.103 60.000 80.067 10.000 10.089 10.735 30.625 30.130 50.568 20.836 20.271 10.534 30.043 60.799 20.001 20.445 10.000 10.000 20.024 10.661 20.000 10.262 10.000 10.591 30.517 70.373 30.788 30.021 20.000 10.455 10.517 40.320 30.823 40.200 80.001 80.150 30.100 40.000 10.736 20.668 20.103 60.052 30.662 10.720 10.000 10.602 40.112 30.002 30.000 10.637 40.000 20.000 10.621 30.569 10.398 20.412 30.234 30.949 10.363 10.492 70.495 30.251 30.665 30.000 10.001 50.805 20.833 20.794 40.000 10.821 10.314 20.843 50.000 10.560 20.245 20.262 20.713 10.370 5
Minkowski 34Dpermissive0.253 70.463 70.154 80.102 70.381 80.084 30.134 80.000 10.000 10.000 10.386 40.141 80.279 80.737 70.703 70.014 80.164 60.000 30.663 40.092 70.000 40.224 70.291 40.531 30.056 80.000 40.242 70.000 10.000 30.013 60.331 70.000 30.000 10.035 80.001 10.858 60.059 70.650 70.000 30.056 70.353 70.299 60.670 70.000 10.000 10.284 70.484 60.071 50.594 70.720 70.710 60.027 50.068 80.813 50.000 10.005 30.492 60.164 10.274 70.111 80.571 70.307 80.293 50.307 80.150 40.163 80.531 70.002 10.545 10.932 60.093 80.000 10.000 40.002 60.159 60.368 80.581 70.440 80.228 80.406 20.282 80.294 70.000 20.000 10.189 70.060 10.036 30.000 30.000 30.897 50.000 80.525 60.025 80.205 80.771 80.000 30.000 10.593 50.108 50.044 30.000 40.000 10.000 30.282 80.589 60.094 70.169 70.466 80.227 70.419 80.125 30.757 50.002 10.334 70.000 10.000 20.000 20.357 60.000 10.000 40.000 10.582 50.513 80.337 50.612 80.000 30.000 10.250 70.352 80.136 80.724 70.655 20.280 30.000 40.046 80.000 10.606 80.559 60.159 20.102 10.445 20.655 20.000 10.310 80.117 20.000 40.000 10.581 80.026 10.000 10.265 80.483 70.084 80.097 80.044 60.865 80.142 80.588 40.351 60.272 20.596 80.000 10.003 40.622 70.720 40.096 80.000 10.771 70.016 70.772 60.000 10.302 60.194 40.214 60.621 70.197 8
C. Choy, J. Gwak, S. Savarese: 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks. CVPR 2019


This table lists the benchmark results for the ScanNet200 3D semantic instance scenario. Each cell gives a method's score followed by its rank in that column.




Method | Info | avg ap | head ap | common ap | tail ap | alarm clock | armchair | backpack | bag | ball | bar | basket | bathroom cabinet | bathroom counter | bathroom stall | bathroom stall door | bathroom vanity | bathtub | bed | bench | bicycle | bin | blackboard | blanket | blinds | board | book | bookshelf | bottle | bowl | box | broom | bucket | bulletin board | cabinet | calendar | candle | cart | case of water bottles | cd case | ceiling | ceiling light | chair | clock | closet | closet door | closet rod | closet wall | clothes | clothes dryer | coat rack | coffee kettle | coffee maker | coffee table | column | computer tower | container | copier | couch | counter | crate | cup | curtain | cushion | decoration | desk | dining table | dish rack | dishwasher | divider | door | doorframe | dresser | dumbbell | dustpan | end table | fan | file cabinet | fire alarm | fire extinguisher | fireplace | folded chair | furniture | guitar | guitar case | hair dryer | handicap bar | hat | headphones | ironing board | jacket | keyboard | keyboard piano | kitchen cabinet | kitchen counter | ladder | lamp | laptop | laundry basket | laundry detergent | laundry hamper | ledge | light | light switch | luggage | machine | mailbox | mat | mattress | microwave | mini fridge | mirror | monitor | mouse | music stand | nightstand | object | office chair | ottoman | oven | paper | paper bag | paper cutter | paper towel dispenser | paper towel roll | person | piano | picture | pillar | pillow | pipe | plant | plate | plunger | poster | potted plant | power outlet | power strip | printer | projector | projector screen | purse | rack | radiator | rail | range hood | recycling bin | refrigerator | scale | seat | shelf | shoe | shower | shower curtain | shower curtain rod | shower door | shower floor | shower head | shower wall | sign | sink | soap dish | soap dispenser | sofa chair | speaker | stair rail | stairs | stand | stool | storage bin | storage container | storage organizer | stove | structure | stuffed animal | suitcase | table | telephone | tissue box | toaster | toaster oven | toilet | toilet paper | toilet paper dispenser | toilet paper holder | toilet seat cover dispenser | towel | trash bin | trash can | tray | tube | tv | tv stand | vacuum cleaner | vent | wardrobe | washing machine | water bottle | water cooler | water pitcher | whiteboard | window | windowsill
Mask3D Scannet2000.278 10.383 10.263 10.168 10.506 10.068 10.083 50.000 10.000 10.000 10.023 20.149 40.302 10.778 30.647 10.569 10.500 10.031 10.014 20.027 20.173 10.311 10.195 10.351 30.258 10.000 10.082 10.000 10.003 10.037 20.391 11.000 10.000 10.014 20.000 10.572 10.573 10.661 20.000 10.003 10.005 40.082 40.349 10.028 10.000 10.605 10.515 30.509 10.711 11.000 10.665 30.015 20.107 10.402 40.201 10.083 10.304 10.759 10.491 10.378 10.572 10.119 10.277 10.013 50.089 10.283 20.411 20.267 10.006 30.156 20.000 10.116 10.000 10.105 30.556 10.514 10.396 10.275 10.323 10.215 20.380 10.000 10.000 10.356 10.005 20.208 10.325 10.000 10.050 40.400 10.561 10.258 10.179 10.722 10.147 10.000 10.586 10.063 10.015 20.139 10.016 10.028 10.708 10.418 20.016 10.048 30.500 10.489 10.349 10.001 20.475 20.086 10.365 10.000 10.500 10.000 20.323 30.000 10.222 10.000 10.497 10.626 10.044 30.795 10.556 10.008 20.121 40.265 10.667 10.789 10.568 20.579 10.444 10.176 10.004 20.474 10.752 10.233 20.014 20.002 40.570 20.007 10.377 50.000 10.000 20.000 20.337 10.000 10.000 10.384 10.465 10.287 10.085 10.048 20.816 50.467 10.810 10.377 10.415 10.744 10.000 10.004 10.724 10.778 20.590 10.000 10.032 20.441 10.000 10.377 20.391 10.427 10.321 10.192 1
Jonas Schult, Francis Engelmann, Alexander Hermans, Or Litany, Siyu Tang, Bastian Leibe: Mask3D for 3D Semantic Instance Segmentation. ICRA 2023
CSC-Pretrain Inst.permissive0.123 50.223 50.082 50.046 40.308 30.004 30.278 10.000 10.000 10.000 10.000 50.032 50.105 30.537 40.348 50.378 40.000 20.000 30.000 30.000 50.000 20.000 50.037 50.323 40.000 40.000 10.013 50.000 10.000 20.000 30.235 20.000 20.000 10.000 30.000 10.231 30.045 30.564 40.000 10.000 20.006 30.078 50.065 30.000 20.000 10.259 30.516 20.000 40.600 41.000 10.578 50.000 30.000 50.184 50.000 30.000 30.034 50.000 30.211 40.089 30.394 50.018 50.064 40.171 40.001 50.144 30.172 40.000 20.000 40.044 40.000 10.000 30.000 10.064 50.126 40.278 20.093 50.000 40.094 40.214 30.011 50.000 10.000 10.000 30.000 30.022 50.000 30.000 10.275 30.000 40.275 40.000 50.098 40.407 40.000 30.000 10.250 50.007 50.000 30.000 30.000 20.000 30.333 40.376 40.000 20.000 50.042 50.285 30.119 40.000 30.224 50.000 20.184 30.000 10.000 30.000 20.244 40.000 10.000 20.000 10.377 30.378 20.051 20.424 50.000 30.000 30.116 50.030 40.125 20.441 40.444 50.063 50.000 20.042 30.000 30.297 20.483 30.096 50.000 30.028 20.338 40.000 30.444 30.000 10.000 20.000 20.189 40.000 10.000 10.141 40.152 50.017 40.000 50.000 30.838 40.193 30.111 50.105 50.198 30.588 30.000 10.000 20.542 30.343 50.267 30.000 10.000 30.108 50.000 10.333 40.000 50.228 20.202 50.022 4
Ji Hou, Benjamin Graham, Matthias Nießner, Saining Xie: Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts. CVPR 2021
LGround Inst.permissive0.154 30.275 30.108 30.060 30.295 50.002 40.278 10.000 10.000 10.000 10.006 40.272 20.064 50.815 20.503 30.333 50.000 20.000 30.556 10.001 40.000 20.148 30.078 20.448 10.007 30.000 10.024 30.000 10.000 20.000 30.190 40.000 20.000 10.000 30.000 10.209 50.031 50.573 30.000 10.000 20.041 20.099 30.037 40.000 20.000 10.327 20.364 50.181 20.642 21.000 10.654 40.000 30.023 30.429 30.000 30.000 30.097 30.000 30.278 20.267 20.434 30.048 20.092 30.257 20.030 30.097 40.189 30.000 20.089 20.000 50.000 10.000 30.000 10.115 20.166 30.222 50.222 30.003 30.127 30.213 40.169 20.000 10.000 10.000 30.000 30.044 30.000 30.000 10.000 50.000 40.268 50.222 20.130 20.494 30.000 30.000 10.363 30.015 30.000 30.000 30.000 20.000 30.611 20.400 30.000 20.056 20.278 30.242 40.180 30.000 30.383 40.000 20.209 20.000 10.000 30.000 20.364 20.000 10.000 20.000 10.323 40.302 30.019 40.654 20.000 30.000 30.141 20.045 30.000 50.427 50.514 30.143 30.000 20.028 40.000 30.252 30.402 40.156 40.000 30.028 20.470 30.000 30.444 30.000 10.000 20.000 20.205 30.000 10.000 10.203 30.381 30.026 30.037 30.000 30.881 30.099 40.135 40.239 30.000 40.585 40.000 10.000 20.616 20.778 20.322 20.000 10.000 30.407 30.000 10.333 40.148 30.177 30.242 30.028 3
David Rozenberszki, Or Litany, Angela Dai: Language-Grounded Indoor 3D Semantic Segmentation in the Wild.
Minkowski 34D Inst.permissive0.130 40.246 40.083 40.043 50.299 40.000 50.278 10.000 10.000 10.000 10.022 30.175 30.122 20.537 40.521 20.400 30.000 20.000 30.000 30.008 30.000 20.048 40.076 30.182 50.000 40.000 10.022 40.000 10.000 20.000 30.141 50.000 20.000 10.000 30.000 10.210 40.063 20.547 50.000 10.000 20.000 50.100 20.026 50.000 20.000 10.241 40.488 40.000 40.564 51.000 10.672 20.000 30.021 40.486 10.000 30.000 30.067 40.000 30.194 50.033 40.415 40.026 40.025 50.271 10.004 40.094 50.142 50.000 20.000 40.111 30.000 10.000 30.000 10.088 40.083 50.278 20.110 40.000 40.082 50.199 50.137 30.000 10.000 10.000 30.000 30.041 40.000 30.000 10.308 20.067 30.280 30.016 40.101 30.373 50.000 30.000 10.319 40.007 40.000 30.000 30.000 20.000 30.028 50.355 50.000 20.101 10.444 20.289 20.114 50.000 30.394 30.000 20.032 50.000 10.000 30.000 20.201 50.000 10.000 20.000 10.384 20.248 40.000 50.529 40.000 30.000 30.133 30.020 50.089 30.720 30.500 40.099 40.000 20.000 50.000 30.238 40.334 50.190 30.000 30.000 50.317 50.000 30.472 10.000 10.000 20.000 20.094 50.000 10.000 10.082 50.236 40.004 50.019 40.000 30.883 20.061 50.262 20.217 40.000 40.557 50.000 10.000 20.460 40.761 40.156 50.000 10.000 30.259 40.000 10.394 10.019 40.084 40.232 40.000 5
C. Choy, J. Gwak, S. Savarese: 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks. CVPR 2019
TD3D Scannet2000.211 20.332 20.177 20.103 20.337 20.036 20.222 40.000 10.000 10.000 10.031 10.342 10.093 40.852 10.452 40.559 20.000 20.004 20.000 30.039 10.000 20.309 20.047 40.380 20.028 20.000 10.080 20.000 10.000 20.147 10.192 30.000 20.000 10.083 10.000 10.395 20.039 40.662 10.000 10.000 20.074 10.135 10.296 20.000 20.000 10.231 50.646 10.139 30.633 31.000 10.705 10.048 10.088 20.439 20.184 20.039 20.266 20.551 20.260 30.026 50.463 20.046 30.252 20.249 30.083 20.372 10.411 10.000 20.414 10.323 10.000 10.052 20.000 10.157 10.278 20.278 20.237 20.015 20.321 20.253 10.060 40.000 10.000 10.272 20.008 10.169 20.032 20.000 10.404 10.356 20.283 20.073 30.028 50.617 20.038 20.000 10.494 20.037 20.215 10.083 20.000 20.003 20.486 30.694 10.000 20.040 40.083 40.219 50.209 20.007 10.483 10.000 20.125 40.000 10.150 20.014 10.544 10.000 10.000 20.000 10.260 50.143 50.200 10.610 30.028 20.032 10.145 10.059 20.046 40.740 20.806 10.543 20.000 20.108 20.008 10.222 50.669 20.456 10.074 10.224 10.586 10.006 20.451 20.000 10.002 10.889 10.282 20.000 10.000 10.252 20.413 20.111 20.074 20.240 10.893 10.266 20.144 30.293 20.281 20.604 20.000 10.000 20.379 50.963 10.250 40.000 10.160 10.420 20.000 10.343 30.207 20.079 50.315 20.052 2


ScanNet Benchmark

This table lists the benchmark results for the 3D semantic label scenario. Each cell gives a method's score followed by its rank in that column.


Method | Info | avg iou | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | floor | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | wall | window
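Mix3D | permissive | 0.781 (1) | 0.964 (1) | 0.855 (1) | 0.843 (13) | 0.781 (4) | 0.858 (9) | 0.575 (4) | 0.831 (26) | 0.685 (9) | 0.714 (2) | 0.979 (1) | 0.594 (4) | 0.310 (20) | 0.801 (1) | 0.892 (12) | 0.841 (2) | 0.819 (3) | 0.723 (3) | 0.940 (8) | 0.887 (2) | 0.725 (18)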
Alexey Nekrasov, Jonas Schult, Or Litany, Bastian Leibe, Francis Engelmann: Mix3D: Out-of-Context Data Augmentation for 3D Scenes. 3DV 2021 (Oral)
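Swin3D | permissive | 0.779 (2) | 0.861 (17) | 0.818 (11) | 0.836 (16) | 0.790 (2) | 0.875 (2) | 0.576 (3) | 0.905 (3) | 0.704 (3) | 0.739 (1) | 0.969 (6) | 0.611 (1) | 0.349 (6) | 0.756 (17) | 0.958 (1) | 0.702 (36) | 0.805 (12) | 0.708 (6) | 0.916 (24) | 0.898 (1) | 0.801 (1)
PPT-SpUNet-Joint |  | 0.766 (3) | 0.932 (2) | 0.794 (26) | 0.829 (20) | 0.751 (16) | 0.854 (11) | 0.540 (14) | 0.903 (4) | 0.630 (27) | 0.672 (10) | 0.963 (9) | 0.565 (16) | 0.357 (4) | 0.788 (2) | 0.900 (8) | 0.737 (19) | 0.802 (13) | 0.685 (12) | 0.950 (2) | 0.887 (2) | 0.780 (2)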
Xiaoyang Wu, Zhuotao Tian, Xin Wen, Bohao Peng, Xihui Liu, Kaicheng Yu, Hengshuang Zhao: Towards Large-scale 3D Representation Learning with Multi-dataset Point Prompt Training.
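OctFormer | permissive | 0.766 (3) | 0.925 (4) | 0.808 (17) | 0.849 (7) | 0.786 (3) | 0.846 (20) | 0.566 (6) | 0.876 (10) | 0.690 (7) | 0.674 (9) | 0.960 (11) | 0.576 (12) | 0.226 (59) | 0.753 (19) | 0.904 (6) | 0.777 (7) | 0.815 (5) | 0.722 (4) | 0.923 (20) | 0.877 (8) | 0.776 (4)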
Peng-Shuai Wang: OctFormer: Octree-based Transformers for 3D Point Clouds. SIGGRAPH 2023
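OccuSeg+Semantic |  | 0.764 (5) | 0.758 (53) | 0.796 (24) | 0.839 (15) | 0.746 (18) | 0.907 (1) | 0.562 (7) | 0.850 (19) | 0.680 (11) | 0.672 (10) | 0.978 (2) | 0.610 (2) | 0.335 (11) | 0.777 (5) | 0.819 (38) | 0.847 (1) | 0.830 (1) | 0.691 (10) | 0.972 (1) | 0.885 (4) | 0.727 (16)
CU-Hybrid Net |  | 0.764 (5) | 0.924 (5) | 0.819 (9) | 0.840 (14) | 0.757 (11) | 0.853 (13) | 0.580 (1) | 0.848 (20) | 0.709 (2) | 0.643 (17) | 0.958 (14) | 0.587 (8) | 0.295 (26) | 0.753 (19) | 0.884 (16) | 0.758 (13) | 0.815 (5) | 0.725 (2) | 0.927 (18) | 0.867 (15) | 0.743 (10)
O-CNN | permissive | 0.762 (7) | 0.924 (5) | 0.823 (5) | 0.844 (12) | 0.770 (6) | 0.852 (14) | 0.577 (2) | 0.847 (22) | 0.711 (1) | 0.640 (21) | 0.958 (14) | 0.592 (5) | 0.217 (65) | 0.762 (13) | 0.888 (13) | 0.758 (13) | 0.813 (8) | 0.726 (1) | 0.932 (16) | 0.868 (14) | 0.744 (9)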
Peng-Shuai Wang, Yang Liu, Yu-Xiao Guo, Chun-Yu Sun, Xin Tong: O-CNN: Octree-based Convolutional Neural Networks for 3D Shape Analysis. SIGGRAPH 2017
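OA-CNN-L_ScanNet20 |  | 0.756 (8) | 0.783 (39) | 0.826 (4) | 0.858 (4) | 0.776 (5) | 0.837 (26) | 0.548 (11) | 0.896 (7) | 0.649 (20) | 0.675 (8) | 0.962 (10) | 0.586 (9) | 0.335 (11) | 0.771 (8) | 0.802 (42) | 0.770 (9) | 0.787 (28) | 0.691 (10) | 0.936 (11) | 0.880 (7) | 0.761 (6)
ConDaFormer |  | 0.755 (9) | 0.927 (3) | 0.822 (6) | 0.836 (16) | 0.801 (1) | 0.849 (17) | 0.516 (24) | 0.864 (15) | 0.651 (19) | 0.680 (7) | 0.958 (14) | 0.584 (11) | 0.282 (33) | 0.759 (15) | 0.855 (26) | 0.728 (21) | 0.802 (13) | 0.678 (14) | 0.880 (52) | 0.873 (13) | 0.756 (7)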
Lunhao Duan, Shanshan Zhao, Nan Xue, Mingming Gong, Guisong Xia, Dacheng Tao: ConDaFormer: Disassembled Transformer with Local Structure Enhancement for 3D Point Cloud Understanding. NeurIPS 2023
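PointTransformerV2 |  | 0.752 (10) | 0.742 (60) | 0.809 (16) | 0.872 (1) | 0.758 (10) | 0.860 (8) | 0.552 (9) | 0.891 (8) | 0.610 (35) | 0.687 (3) | 0.960 (11) | 0.559 (19) | 0.304 (23) | 0.766 (10) | 0.926 (3) | 0.767 (10) | 0.797 (19) | 0.644 (26) | 0.942 (6) | 0.876 (11) | 0.722 (20)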
Xiaoyang Wu, Yixing Lao, Li Jiang, Xihui Liu, Hengshuang Zhao: Point Transformer V2: Grouped Vector Attention and Partition-based Pooling. NeurIPS 2022
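DMF-Net |  | 0.752 (10) | 0.906 (9) | 0.793 (28) | 0.802 (35) | 0.689 (33) | 0.825 (37) | 0.556 (8) | 0.867 (12) | 0.681 (10) | 0.602 (37) | 0.960 (11) | 0.555 (21) | 0.365 (3) | 0.779 (4) | 0.859 (22) | 0.747 (16) | 0.795 (23) | 0.717 (5) | 0.917 (23) | 0.856 (23) | 0.764 (5)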
C. Yang, Y. Yan, W. Zhao, J. Ye, X. Yang, A. Hussain, B. Dong, K. Huang: Towards Deeper and Better Multi-view Feature Fusion for 3D Semantic Segmentation. ICONIP 2023
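PointConvFormer |  | 0.749 (12) | 0.793 (36) | 0.790 (29) | 0.807 (31) | 0.750 (17) | 0.856 (10) | 0.524 (20) | 0.881 (9) | 0.588 (47) | 0.642 (20) | 0.977 (4) | 0.591 (6) | 0.274 (38) | 0.781 (3) | 0.929 (2) | 0.804 (3) | 0.796 (20) | 0.642 (27) | 0.947 (4) | 0.885 (4) | 0.715 (23)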
Wenxuan Wu, Qi Shan, Li Fuxin: PointConvFormer: Revenge of the Point-based Convolution.
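BPNet | copyleft | 0.749 (12) | 0.909 (7) | 0.818 (11) | 0.811 (28) | 0.752 (14) | 0.839 (25) | 0.485 (40) | 0.842 (23) | 0.673 (12) | 0.644 (16) | 0.957 (18) | 0.528 (30) | 0.305 (22) | 0.773 (7) | 0.859 (22) | 0.788 (4) | 0.818 (4) | 0.693 (9) | 0.916 (24) | 0.856 (23) | 0.723 (19)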
Wenbo Hu, Hengshuang Zhao, Li Jiang, Jiaya Jia, Tien-Tsin Wong: Bidirectional Projection Network for Cross Dimension Scene Understanding. CVPR 2021 (Oral)
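MSP |  | 0.748 (14) | 0.623 (86) | 0.804 (19) | 0.859 (3) | 0.745 (19) | 0.824 (39) | 0.501 (30) | 0.912 (2) | 0.690 (7) | 0.685 (5) | 0.956 (20) | 0.567 (15) | 0.320 (17) | 0.768 (9) | 0.918 (4) | 0.720 (26) | 0.802 (13) | 0.676 (16) | 0.921 (21) | 0.881 (6) | 0.779 (3)
StratifiedFormer | permissive | 0.747 (15) | 0.901 (10) | 0.803 (20) | 0.845 (11) | 0.757 (11) | 0.846 (20) | 0.512 (25) | 0.825 (29) | 0.696 (6) | 0.645 (15) | 0.956 (20) | 0.576 (12) | 0.262 (50) | 0.744 (24) | 0.861 (21) | 0.742 (17) | 0.770 (37) | 0.705 (7) | 0.899 (37) | 0.860 (20) | 0.734 (11)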
Xin Lai*, Jianhui Liu*, Li Jiang, Liwei Wang, Hengshuang Zhao, Shu Liu, Xiaojuan Qi, Jiaya Jia: Stratified Transformer for 3D Point Cloud Segmentation. CVPR 2022
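VMNet | permissive | 0.746 (16) | 0.870 (15) | 0.838 (2) | 0.858 (4) | 0.729 (24) | 0.850 (16) | 0.501 (30) | 0.874 (11) | 0.587 (48) | 0.658 (13) | 0.956 (20) | 0.564 (17) | 0.299 (24) | 0.765 (11) | 0.900 (8) | 0.716 (29) | 0.812 (9) | 0.631 (32) | 0.939 (9) | 0.858 (21) | 0.709 (24)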
Zeyu Hu, Xuyang Bai, Jiaxiang Shang, Runze Zhang, Jiayu Dong, Xin Wang, Guangyuan Sun, Hongbo Fu, Chiew-Lan Tai: VMNet: Voxel-Mesh Network for Geodesic-Aware 3D Semantic Segmentation. ICCV 2021 (Oral)
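Virtual MVFusion |  | 0.746 (16) | 0.771 (47) | 0.819 (9) | 0.848 (9) | 0.702 (30) | 0.865 (7) | 0.397 (77) | 0.899 (5) | 0.699 (4) | 0.664 (12) | 0.948 (48) | 0.588 (7) | 0.330 (13) | 0.746 (23) | 0.851 (30) | 0.764 (11) | 0.796 (20) | 0.704 (8) | 0.935 (12) | 0.866 (16) | 0.728 (14)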
Abhijit Kundu, Xiaoqi Yin, Alireza Fathi, David Ross, Brian Brewington, Thomas Funkhouser, Caroline Pantofaru: Virtual Multi-view Fusion for 3D Semantic Segmentation. ECCV 2020
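Retro-FPN |  | 0.744 (18) | 0.842 (23) | 0.800 (21) | 0.767 (48) | 0.740 (20) | 0.836 (28) | 0.541 (13) | 0.914 (1) | 0.672 (13) | 0.626 (26) | 0.958 (14) | 0.552 (22) | 0.272 (40) | 0.777 (5) | 0.886 (15) | 0.696 (37) | 0.801 (17) | 0.674 (18) | 0.941 (7) | 0.858 (21) | 0.717 (21)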
Peng Xiang*, Xin Wen*, Yu-Shen Liu, Hui Zhang, Yi Fang, Zhizhong Han: Retrospective Feature Pyramid Network for Point Cloud Semantic Segmentation. ICCV 2023
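EQ-Net |  | 0.743 (19) | 0.620 (87) | 0.799 (23) | 0.849 (7) | 0.730 (23) | 0.822 (42) | 0.493 (37) | 0.897 (6) | 0.664 (14) | 0.681 (6) | 0.955 (23) | 0.562 (18) | 0.378 (1) | 0.760 (14) | 0.903 (7) | 0.738 (18) | 0.801 (17) | 0.673 (19) | 0.907 (29) | 0.877 (8) | 0.745 (8)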
Zetong Yang*, Li Jiang*, Yanan Sun, Bernt Schiele, Jiaya Jia: A Unified Query-based Paradigm for Point Cloud Understanding. CVPR 2022
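LRPNet |  | 0.742 (20) | 0.816 (31) | 0.806 (18) | 0.807 (31) | 0.752 (14) | 0.828 (35) | 0.575 (4) | 0.839 (25) | 0.699 (4) | 0.637 (22) | 0.954 (29) | 0.520 (32) | 0.320 (17) | 0.755 (18) | 0.834 (34) | 0.760 (12) | 0.772 (34) | 0.676 (16) | 0.915 (26) | 0.862 (18) | 0.717 (21)
SAT |  | 0.742 (20) | 0.860 (18) | 0.765 (42) | 0.819 (23) | 0.769 (7) | 0.848 (18) | 0.533 (16) | 0.829 (27) | 0.663 (15) | 0.631 (24) | 0.955 (23) | 0.586 (9) | 0.274 (38) | 0.753 (19) | 0.896 (10) | 0.729 (20) | 0.760 (44) | 0.666 (21) | 0.921 (21) | 0.855 (25) | 0.733 (12)
LargeKernel3D |  | 0.739 (22) | 0.909 (7) | 0.820 (8) | 0.806 (33) | 0.740 (20) | 0.852 (14) | 0.545 (12) | 0.826 (28) | 0.594 (46) | 0.643 (17) | 0.955 (23) | 0.541 (24) | 0.263 (49) | 0.723 (27) | 0.858 (24) | 0.775 (8) | 0.767 (38) | 0.678 (14) | 0.933 (14) | 0.848 (31) | 0.694 (30)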
Yukang Chen*, Jianhui Liu*, Xiangyu Zhang, Xiaojuan Qi, Jiaya Jia: LargeKernel3D: Scaling up Kernels in 3D Sparse CNNs. CVPR 2023
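RPN |  | 0.736 (23) | 0.776 (43) | 0.790 (29) | 0.851 (6) | 0.754 (13) | 0.854 (11) | 0.491 (39) | 0.866 (13) | 0.596 (45) | 0.686 (4) | 0.955 (23) | 0.536 (25) | 0.342 (8) | 0.624 (43) | 0.869 (18) | 0.787 (5) | 0.802 (13) | 0.628 (33) | 0.927 (18) | 0.875 (12) | 0.704 (27)
MinkowskiNet | permissive | 0.736 (23) | 0.859 (19) | 0.818 (11) | 0.832 (19) | 0.709 (28) | 0.840 (24) | 0.521 (22) | 0.853 (18) | 0.660 (17) | 0.643 (17) | 0.951 (38) | 0.544 (23) | 0.286 (31) | 0.731 (25) | 0.893 (11) | 0.675 (45) | 0.772 (34) | 0.683 (13) | 0.874 (58) | 0.852 (29) | 0.727 (16)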
C. Choy, J. Gwak, S. Savarese: 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks. CVPR 2019
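IPCA |  | 0.731 (25) | 0.890 (11) | 0.837 (3) | 0.864 (2) | 0.726 (25) | 0.873 (3) | 0.530 (19) | 0.824 (30) | 0.489 (79) | 0.647 (14) | 0.978 (2) | 0.609 (3) | 0.336 (10) | 0.624 (43) | 0.733 (52) | 0.758 (13) | 0.776 (32) | 0.570 (58) | 0.949 (3) | 0.877 (8) | 0.728 (14)
SparseConvNet |  | 0.725 (26) | 0.647 (83) | 0.821 (7) | 0.846 (10) | 0.721 (26) | 0.869 (4) | 0.533 (16) | 0.754 (50) | 0.603 (41) | 0.614 (30) | 0.955 (23) | 0.572 (14) | 0.325 (15) | 0.710 (28) | 0.870 (17) | 0.724 (24) | 0.823 (2) | 0.628 (33) | 0.934 (13) | 0.865 (17) | 0.683 (33)
PointTransformer++ |  | 0.725 (26) | 0.727 (68) | 0.811 (15) | 0.819 (23) | 0.765 (8) | 0.841 (23) | 0.502 (29) | 0.814 (35) | 0.621 (30) | 0.623 (28) | 0.955 (23) | 0.556 (20) | 0.284 (32) | 0.620 (45) | 0.866 (19) | 0.781 (6) | 0.757 (47) | 0.648 (24) | 0.932 (16) | 0.862 (18) | 0.709 (24)
MatchingNet |  | 0.724 (28) | 0.812 (33) | 0.812 (14) | 0.810 (29) | 0.735 (22) | 0.834 (30) | 0.495 (36) | 0.860 (17) | 0.572 (54) | 0.602 (37) | 0.954 (29) | 0.512 (34) | 0.280 (35) | 0.757 (16) | 0.845 (32) | 0.725 (23) | 0.780 (30) | 0.606 (44) | 0.937 (10) | 0.851 (30) | 0.700 (29)
PNE |  | 0.721 (29) | 0.840 (24) | 0.789 (31) | 0.833 (18) | 0.690 (31) | 0.823 (41) | 0.509 (26) | 0.864 (15) | 0.618 (31) | 0.629 (25) | 0.957 (18) | 0.500 (37) | 0.266 (47) | 0.763 (12) | 0.797 (44) | 0.674 (49) | 0.791 (26) | 0.621 (38) | 0.892 (42) | 0.855 (25) | 0.708 (26)
INS-Conv-semantic |  | 0.717 (30) | 0.751 (56) | 0.759 (45) | 0.812 (27) | 0.704 (29) | 0.868 (5) | 0.537 (15) | 0.842 (23) | 0.609 (37) | 0.608 (33) | 0.953 (32) | 0.534 (27) | 0.293 (27) | 0.616 (46) | 0.864 (20) | 0.719 (28) | 0.793 (24) | 0.640 (28) | 0.933 (14) | 0.845 (35) | 0.663 (38)
PointMetaBase |  | 0.714 (31) | 0.835 (25) | 0.785 (32) | 0.821 (21) | 0.684 (35) | 0.846 (20) | 0.531 (18) | 0.865 (14) | 0.614 (32) | 0.596 (41) | 0.953 (32) | 0.500 (37) | 0.246 (55) | 0.674 (29) | 0.888 (13) | 0.692 (38) | 0.764 (40) | 0.624 (35) | 0.849 (73) | 0.844 (36) | 0.675 (35)
contrastBoundary | permissive | 0.705 (32) | 0.769 (50) | 0.775 (37) | 0.809 (30) | 0.687 (34) | 0.820 (45) | 0.439 (65) | 0.812 (36) | 0.661 (16) | 0.591 (43) | 0.945 (56) | 0.515 (33) | 0.171 (83) | 0.633 (40) | 0.856 (25) | 0.720 (26) | 0.796 (20) | 0.668 (20) | 0.889 (45) | 0.847 (32) | 0.689 (31)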
Liyao Tang, Yibing Zhan, Zhe Chen, Baosheng Yu, Dacheng Tao: Contrastive Boundary Learning for Point Cloud Segmentation. CVPR 2022
ClickSeg_Semantic0.703 330.774 450.800 210.793 390.760 90.847 190.471 430.802 390.463 860.634 230.968 80.491 410.271 420.726 260.910 50.706 330.815 50.551 690.878 530.833 370.570 69
RFCR0.702 340.889 120.745 540.813 260.672 380.818 490.493 370.815 340.623 280.610 310.947 500.470 490.249 540.594 490.848 310.705 340.779 310.646 250.892 420.823 430.611 52
Jingyu Gong, Jiachen Xu, Xin Tan, Haichuan Song, Yanyun Qu, Yuan Xie, Lizhuang Ma: Omni-Supervised Point Cloud Segmentation via Gradual Receptive Field Component Reasoning. CVPR 2021
One Thing One Click0.701 350.825 290.796 240.723 550.716 270.832 310.433 670.816 320.634 250.609 320.969 60.418 740.344 70.559 610.833 350.715 300.808 110.560 630.902 340.847 320.680 34
JSENetpermissive0.699 360.881 140.762 430.821 210.667 390.800 620.522 210.792 420.613 330.607 340.935 760.492 400.205 700.576 540.853 280.691 390.758 460.652 230.872 610.828 400.649 42
Zeyu Hu, Mingmin Zhen, Xuyang Bai, Hongbo Fu, Chiew-Lan Tai: JSENet: Joint Semantic Segmentation and Edge Detection Network for 3D Point Clouds. ECCV 2020
One-Thing-One-Click0.693 370.743 590.794 260.655 790.684 350.822 420.497 350.719 600.622 290.617 290.977 40.447 610.339 90.750 220.664 680.703 350.790 270.596 480.946 50.855 250.647 43
Zhengzhe Liu, Xiaojuan Qi, Chi-Wing Fu: One Thing One Click: A Self-Training Approach for Weakly Supervised 3D Semantic Segmentation. CVPR 2021
PicassoNet-IIpermissive0.692 380.732 640.772 380.786 400.677 370.866 60.517 230.848 200.509 710.626 260.952 360.536 250.225 610.545 670.704 590.689 420.810 100.564 620.903 330.854 280.729 13
Huan Lei, Naveed Akhtar, Mubarak Shah, and Ajmal Mian: Geometric feature learning for 3D meshes.
Feature_GeometricNetpermissive0.690 390.884 130.754 490.795 380.647 450.818 490.422 690.802 390.612 340.604 350.945 560.462 520.189 780.563 600.853 280.726 220.765 390.632 310.904 310.821 460.606 56
Kangcheng Liu, Ben M. Chen: https://arxiv.org/abs/2012.09439. arXiv Preprint
FusionNet0.688 400.704 730.741 580.754 520.656 410.829 330.501 300.741 550.609 370.548 500.950 420.522 310.371 20.633 400.756 470.715 300.771 360.623 360.861 690.814 480.658 39
Feihu Zhang, Jin Fang, Benjamin Wah, Philip Torr: Deep FusionNet for Point Cloud Semantic Segmentation. ECCV 2020
Feature-Geometry Netpermissive0.685 410.866 160.748 510.819 230.645 470.794 650.450 540.802 390.587 480.604 350.945 560.464 510.201 730.554 630.840 330.723 250.732 560.602 460.907 290.822 450.603 59
KP-FCNN0.684 420.847 220.758 470.784 420.647 450.814 520.473 420.772 450.605 390.594 420.935 760.450 590.181 810.587 500.805 410.690 400.785 290.614 400.882 490.819 470.632 48
H. Thomas, C. Qi, J. Deschaud, B. Marcotegui, F. Goulette, L. Guibas: KPConv: Flexible and Deformable Convolution for Point Clouds. ICCV 2019
DGNet0.684 420.712 720.784 330.782 440.658 400.835 290.499 340.823 310.641 220.597 400.950 420.487 420.281 340.575 550.619 710.647 590.764 400.620 390.871 640.846 340.688 32
VACNN++0.684 420.728 670.757 480.776 450.690 310.804 590.464 480.816 320.577 530.587 440.945 560.508 360.276 370.671 300.710 570.663 510.750 500.589 530.881 500.832 390.653 41
Superpoint Network0.683 450.851 210.728 620.800 370.653 430.806 570.468 450.804 370.572 540.602 370.946 530.453 580.239 580.519 720.822 360.689 420.762 430.595 500.895 400.827 410.630 49
PointContrast_LA_SEM0.683 450.757 540.784 330.786 400.639 490.824 390.408 720.775 440.604 400.541 520.934 800.532 280.269 440.552 640.777 450.645 620.793 240.640 280.913 270.824 420.671 36
VI-PointConv0.676 470.770 490.754 490.783 430.621 530.814 520.552 90.758 480.571 560.557 480.954 290.529 290.268 460.530 700.682 630.675 450.719 590.603 450.888 460.833 370.665 37
Xingyi Li, Wenxuan Wu, Xiaoli Z. Fern, Li Fuxin: The Devils in the Point Clouds: Studying the Robustness of Point Cloud Convolutions.
ROSMRF3D0.673 480.789 370.748 510.763 500.635 510.814 520.407 740.747 520.581 520.573 450.950 420.484 430.271 420.607 470.754 480.649 560.774 330.596 480.883 480.823 430.606 56
SALANet0.670 490.816 310.770 400.768 470.652 440.807 560.451 510.747 520.659 180.545 510.924 860.473 480.149 930.571 570.811 400.635 650.746 510.623 360.892 420.794 600.570 69
PointConvpermissive0.666 500.781 400.759 450.699 640.644 480.822 420.475 410.779 430.564 590.504 680.953 320.428 680.203 720.586 520.754 480.661 520.753 480.588 540.902 340.813 500.642 44
Wenxuan Wu, Zhongang Qi, Li Fuxin: PointConv: Deep Convolutional Networks on 3D Point Clouds. CVPR 2019
PointASNLpermissive0.666 500.703 740.781 350.751 540.655 420.830 320.471 430.769 460.474 820.537 540.951 380.475 470.279 360.635 380.698 620.675 450.751 490.553 680.816 800.806 520.703 28
Xu Yan, Chaoda Zheng, Zhen Li, Sheng Wang, Shuguang Cui: PointASNL: Robust Point Clouds Processing using Nonlocal Neural Networks with Adaptive Sampling. CVPR 2020
PPCNN++permissive0.663 520.746 570.708 660.722 560.638 500.820 450.451 510.566 870.599 430.541 520.950 420.510 350.313 190.648 350.819 380.616 700.682 750.590 520.869 650.810 510.656 40
Pyunghwan Ahn, Juyoung Yang, Eojindl Yi, Chanho Lee, Junmo Kim: Projection-based Point Convolution for Efficient Point Cloud Segmentation. IEEE Access
DCM-Net0.658 530.778 410.702 690.806 330.619 540.813 550.468 450.693 680.494 740.524 600.941 670.449 600.298 250.510 740.821 370.675 450.727 580.568 600.826 780.803 540.637 46
Jonas Schult*, Francis Engelmann*, Theodora Kontogianni, Bastian Leibe: DualConvMesh-Net: Joint Geodesic and Euclidean Convolutions on 3D Meshes. CVPR 2020 [Oral]
HPGCNN0.656 540.698 750.743 560.650 800.564 710.820 450.505 280.758 480.631 260.479 730.945 560.480 450.226 590.572 560.774 460.690 400.735 540.614 400.853 720.776 750.597 62
Jisheng Dang, Qingyong Hu, Yulan Guo, Jun Yang: HPGCNN.
SAFNet-segpermissive0.654 550.752 550.734 600.664 770.583 660.815 510.399 760.754 500.639 230.535 560.942 650.470 490.309 210.665 310.539 770.650 550.708 640.635 300.857 710.793 620.642 44
Linqing Zhao, Jiwen Lu, Jie Zhou: Similarity-Aware Fusion Network for 3D Semantic Segmentation. IROS 2021
RandLA-Netpermissive0.645 560.778 410.731 610.699 640.577 670.829 330.446 560.736 560.477 810.523 620.945 560.454 560.269 440.484 810.749 510.618 680.738 520.599 470.827 770.792 650.621 51
MVPNetpermissive0.641 570.831 260.715 640.671 740.590 620.781 710.394 780.679 700.642 210.553 490.937 730.462 520.256 510.649 340.406 900.626 660.691 720.666 210.877 540.792 650.608 55
Maximilian Jaritz, Jiayuan Gu, Hao Su: Multi-view PointNet for 3D Scene Understanding. GMDL Workshop, ICCV 2019
PointConv-SFPN0.641 570.776 430.703 680.721 570.557 740.826 360.451 510.672 720.563 600.483 720.943 640.425 710.162 880.644 360.726 530.659 530.709 630.572 570.875 560.786 700.559 75
PointMRNet0.640 590.717 710.701 700.692 670.576 680.801 610.467 470.716 610.563 600.459 780.953 320.429 670.169 850.581 530.854 270.605 710.710 610.550 700.894 410.793 620.575 67
FPConvpermissive0.639 600.785 380.760 440.713 620.603 570.798 630.392 790.534 920.603 410.524 600.948 480.457 540.250 530.538 680.723 550.598 750.696 700.614 400.872 610.799 550.567 72
Yiqun Lin, Zizheng Yan, Haibin Huang, Dong Du, Ligang Liu, Shuguang Cui, Xiaoguang Han: FPConv: Learning Local Flattening for Point Convolution. CVPR 2020
PD-Net0.638 610.797 350.769 410.641 850.590 620.820 450.461 490.537 910.637 240.536 550.947 500.388 820.206 690.656 320.668 660.647 590.732 560.585 550.868 660.793 620.473 94
PointSPNet0.637 620.734 630.692 770.714 610.576 680.797 640.446 560.743 540.598 440.437 830.942 650.403 780.150 920.626 420.800 430.649 560.697 690.557 660.846 740.777 740.563 73
SConv0.636 630.830 270.697 730.752 530.572 700.780 730.445 580.716 610.529 650.530 570.951 380.446 620.170 840.507 760.666 670.636 640.682 750.541 750.886 470.799 550.594 63
Supervoxel-CNN0.635 640.656 810.711 650.719 580.613 550.757 820.444 610.765 470.534 640.566 460.928 840.478 460.272 400.636 370.531 790.664 500.645 850.508 830.864 680.792 650.611 52
joint point-basedpermissive0.634 650.614 880.778 360.667 760.633 520.825 370.420 700.804 370.467 840.561 470.951 380.494 390.291 280.566 580.458 850.579 820.764 400.559 650.838 750.814 480.598 61
Hung-Yueh Chiang, Yen-Liang Lin, Yueh-Cheng Liu, Winston H. Hsu: A Unified Point-Based Framework for 3D Segmentation. 3DV 2019
PointMTL0.632 660.731 650.688 800.675 710.591 610.784 700.444 610.565 880.610 350.492 700.949 460.456 550.254 520.587 500.706 580.599 740.665 810.612 430.868 660.791 690.579 66
3DSM_DMMF0.631 670.626 850.745 540.801 360.607 560.751 830.506 270.729 590.565 580.491 710.866 1000.434 630.197 760.595 480.630 700.709 320.705 660.560 630.875 560.740 850.491 89
APCF-Net0.631 670.742 600.687 820.672 720.557 740.792 680.408 720.665 730.545 620.508 650.952 360.428 680.186 790.634 390.702 600.620 670.706 650.555 670.873 590.798 570.581 65
Haojia Lin: Adaptive Pyramid Context Fusion for Point Cloud Perception. GRSL
PointNet2-SFPN0.631 670.771 470.692 770.672 720.524 790.837 260.440 640.706 660.538 630.446 800.944 620.421 730.219 640.552 640.751 500.591 780.737 530.543 740.901 360.768 770.557 76
FusionAwareConv0.630 700.604 900.741 580.766 490.590 620.747 840.501 300.734 570.503 730.527 580.919 900.454 560.323 160.550 660.420 890.678 440.688 730.544 720.896 390.795 590.627 50
Jiazhao Zhang, Chenyang Zhu, Lintao Zheng, Kai Xu: Fusion-Aware Point Convolution for Online Semantic 3D Scene Segmentation. CVPR 2020
DenSeR0.628 710.800 340.625 920.719 580.545 770.806 570.445 580.597 820.448 890.519 630.938 720.481 440.328 140.489 800.499 840.657 540.759 450.592 510.881 500.797 580.634 47
SegGroup_sempermissive0.627 720.818 300.747 530.701 630.602 580.764 790.385 830.629 790.490 770.508 650.931 830.409 760.201 730.564 590.725 540.618 680.692 710.539 760.873 590.794 600.548 79
An Tao, Yueqi Duan, Yi Wei, Jiwen Lu, Jie Zhou: SegGroup: Seg-Level Supervision for 3D Instance and Semantic Segmentation. TIP 2022
SIConv0.625 730.830 270.694 750.757 510.563 720.772 770.448 550.647 760.520 670.509 640.949 460.431 660.191 770.496 780.614 720.647 590.672 790.535 780.876 550.783 710.571 68
HPEIN0.618 740.729 660.668 830.647 820.597 600.766 780.414 710.680 690.520 670.525 590.946 530.432 640.215 660.493 790.599 730.638 630.617 900.570 580.897 380.806 520.605 58
Li Jiang, Hengshuang Zhao, Shu Liu, Xiaoyong Shen, Chi-Wing Fu, Jiaya Jia: Hierarchical Point-Edge Interaction Network for Point Cloud Semantic Segmentation. ICCV 2019
SPH3D-GCNpermissive0.610 750.858 200.772 380.489 970.532 780.792 680.404 750.643 780.570 570.507 670.935 760.414 750.046 1020.510 740.702 600.602 730.705 660.549 710.859 700.773 760.534 82
Huan Lei, Naveed Akhtar, and Ajmal Mian: Spherical Kernel for Efficient Graph Convolution on 3D Point Clouds. TPAMI 2020
AttAN0.609 760.760 520.667 840.649 810.521 800.793 660.457 500.648 750.528 660.434 850.947 500.401 790.153 910.454 830.721 560.648 580.717 600.536 770.904 310.765 780.485 90
Gege Zhang, Qinghua Ma, Licheng Jiao, Fang Liu and Qigong Sun: AttAN: Attention Adversarial Networks for 3D Point Cloud Semantic Segmentation. IJCAI 2020
wsss-transformer0.600 770.634 840.743 560.697 660.601 590.781 710.437 660.585 850.493 750.446 800.933 810.394 800.011 1040.654 330.661 690.603 720.733 550.526 790.832 760.761 800.480 91
dtc_net0.596 780.683 760.725 630.715 600.549 760.803 600.444 610.647 760.493 750.495 690.941 670.409 760.000 1060.424 880.544 760.598 750.703 680.522 800.912 280.792 650.520 85
LAP-D0.594 790.720 690.692 770.637 860.456 890.773 760.391 810.730 580.587 480.445 820.940 700.381 830.288 290.434 860.453 870.591 780.649 830.581 560.777 840.749 840.610 54
DPC0.592 800.720 690.700 710.602 900.480 850.762 810.380 840.713 640.585 510.437 830.940 700.369 850.288 290.434 860.509 830.590 800.639 880.567 610.772 850.755 820.592 64
Francis Engelmann, Theodora Kontogianni, Bastian Leibe: Dilated Point Convolutions: On the Receptive Field Size of Point Convolutions on 3D Point Clouds. ICRA 2020
CCRFNet0.589 810.766 510.659 870.683 690.470 880.740 860.387 820.620 810.490 770.476 740.922 880.355 880.245 560.511 730.511 820.571 830.643 860.493 870.872 610.762 790.600 60
ROSMRF0.580 820.772 460.707 670.681 700.563 720.764 790.362 860.515 930.465 850.465 770.936 750.427 700.207 680.438 840.577 740.536 860.675 780.486 880.723 910.779 720.524 84
SD-DETR0.576 830.746 570.609 960.445 1010.517 810.643 970.366 850.714 630.456 870.468 760.870 990.432 640.264 480.558 620.674 640.586 810.688 730.482 890.739 890.733 870.537 81
SQN_0.1%0.569 840.676 780.696 740.657 780.497 820.779 740.424 680.548 890.515 690.376 900.902 970.422 720.357 40.379 910.456 860.596 770.659 820.544 720.685 940.665 980.556 77
TextureNetpermissive0.566 850.672 800.664 850.671 740.494 830.719 870.445 580.678 710.411 950.396 880.935 760.356 870.225 610.412 890.535 780.565 840.636 890.464 910.794 830.680 950.568 71
Jingwei Huang, Haotian Zhang, Li Yi, Thomas Funkhouser, Matthias Niessner, Leonidas Guibas: TextureNet: Consistent Local Parametrizations for Learning from High-Resolution Signals on Meshes. CVPR
DVVNet0.562 860.648 820.700 710.770 460.586 650.687 910.333 900.650 740.514 700.475 750.906 940.359 860.223 630.340 930.442 880.422 970.668 800.501 840.708 920.779 720.534 82
Pointnet++ & Featurepermissive0.557 870.735 620.661 860.686 680.491 840.744 850.392 790.539 900.451 880.375 910.946 530.376 840.205 700.403 900.356 930.553 850.643 860.497 850.824 790.756 810.515 86
GMLPs0.538 880.495 980.693 760.647 820.471 870.793 660.300 930.477 940.505 720.358 920.903 960.327 910.081 990.472 820.529 800.448 950.710 610.509 810.746 870.737 860.554 78
PanopticFusion-label0.529 890.491 990.688 800.604 890.386 940.632 980.225 1030.705 670.434 920.293 980.815 1010.348 890.241 570.499 770.669 650.507 880.649 830.442 970.796 820.602 1010.561 74
Gaku Narita, Takashi Seno, Tomoya Ishikawa, Yohsuke Kaji: PanopticFusion: Online Volumetric Semantic Mapping at the Level of Stuff and Things. IROS 2019 (to appear)
subcloud_weak0.516 900.676 780.591 990.609 870.442 900.774 750.335 890.597 820.422 940.357 930.932 820.341 900.094 980.298 950.528 810.473 930.676 770.495 860.602 1000.721 900.349 101
Online SegFusion0.515 910.607 890.644 900.579 920.434 910.630 990.353 870.628 800.440 900.410 860.762 1040.307 930.167 860.520 710.403 910.516 870.565 930.447 950.678 950.701 920.514 87
Davide Menini, Suryansh Kumar, Martin R. Oswald, Erik Sandstroem, Cristian Sminchisescu, Luc van Gool: A Real-Time Learning Framework for Joint 3D Reconstruction and Semantic Segmentation. Robotics and Automation Letters Submission
3DMV, FTSDF0.501 920.558 940.608 970.424 1030.478 860.690 900.246 990.586 840.468 830.450 790.911 920.394 800.160 890.438 840.212 1000.432 960.541 980.475 900.742 880.727 880.477 92
PCNN0.498 930.559 930.644 900.560 940.420 930.711 890.229 1010.414 950.436 910.352 940.941 670.324 920.155 900.238 1000.387 920.493 890.529 990.509 810.813 810.751 830.504 88
3DMV0.484 940.484 1000.538 1010.643 840.424 920.606 1020.310 910.574 860.433 930.378 890.796 1020.301 940.214 670.537 690.208 1010.472 940.507 1020.413 1000.693 930.602 1010.539 80
Angela Dai, Matthias Niessner: 3DMV: Joint 3D-Multi-View Prediction for 3D Semantic Scene Segmentation. ECCV'18
PointCNN with RGBpermissive0.458 950.577 920.611 950.356 1050.321 1020.715 880.299 950.376 990.328 1020.319 960.944 620.285 960.164 870.216 1030.229 980.484 910.545 970.456 930.755 860.709 910.475 93
Yangyan Li, Rui Bu, Mingchao Sun, Baoquan Chen: PointCNN. NeurIPS 2018
FCPNpermissive0.447 960.679 770.604 980.578 930.380 950.682 920.291 960.106 1050.483 800.258 1030.920 890.258 1000.025 1030.231 1020.325 940.480 920.560 950.463 920.725 900.666 970.231 105
Dario Rethage, Johanna Wald, Jürgen Sturm, Nassir Navab, Federico Tombari: Fully-Convolutional Point Networks for Large-Scale Point Clouds. ECCV 2018
DGCNN_reproducecopyleft0.446 970.474 1010.623 930.463 990.366 970.651 950.310 910.389 980.349 1000.330 950.937 730.271 980.126 950.285 960.224 990.350 1020.577 920.445 960.625 980.723 890.394 97
Yue Wang, Yongbin Sun, Ziwei Liu, Sanjay E. Sarma, Michael M. Bronstein, Justin M. Solomon: Dynamic Graph CNN for Learning on Point Clouds. TOG 2019
PNET20.442 980.548 950.548 1000.597 910.363 980.628 1000.300 930.292 1000.374 970.307 970.881 980.268 990.186 790.238 1000.204 1020.407 980.506 1030.449 940.667 960.620 1000.462 95
SurfaceConvPF0.442 980.505 970.622 940.380 1040.342 1000.654 940.227 1020.397 970.367 980.276 1000.924 860.240 1010.198 750.359 920.262 960.366 990.581 910.435 980.640 970.668 960.398 96
Hao Pan, Shilin Liu, Yang Liu, Xin Tong: Convolutional Neural Networks on 3D Surfaces Using Parallel Frames.
Tangent Convolutionspermissive0.438 1000.437 1030.646 890.474 980.369 960.645 960.353 870.258 1020.282 1040.279 990.918 910.298 950.147 940.283 970.294 950.487 900.562 940.427 990.619 990.633 990.352 100
Maxim Tatarchenko, Jaesik Park, Vladlen Koltun, Qian-Yi Zhou: Tangent Convolutions for Dense Prediction in 3D. CVPR 2018
3DWSSS0.425 1010.525 960.647 880.522 950.324 1010.488 1050.077 1060.712 650.353 990.401 870.636 1060.281 970.176 820.340 930.565 750.175 1060.551 960.398 1010.370 1060.602 1010.361 99
SPLAT Netcopyleft0.393 1020.472 1020.511 1020.606 880.311 1030.656 930.245 1000.405 960.328 1020.197 1040.927 850.227 1030.000 1060.001 1070.249 970.271 1050.510 1000.383 1030.593 1010.699 930.267 103
Hang Su, Varun Jampani, Deqing Sun, Subhransu Maji, Evangelos Kalogerakis, Ming-Hsuan Yang, Jan Kautz: SPLATNet: Sparse Lattice Networks for Point Cloud Processing. CVPR 2018
ScanNet+FTSDF0.383 1030.297 1050.491 1030.432 1020.358 990.612 1010.274 970.116 1040.411 950.265 1010.904 950.229 1020.079 1000.250 980.185 1030.320 1030.510 1000.385 1020.548 1020.597 1040.394 97
PointNet++permissive0.339 1040.584 910.478 1040.458 1000.256 1050.360 1060.250 980.247 1030.278 1050.261 1020.677 1050.183 1040.117 960.212 1040.145 1050.364 1000.346 1060.232 1060.548 1020.523 1050.252 104
Charles R. Qi, Li Yi, Hao Su, Leonidas J. Guibas: PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space.
SSC-UNetpermissive0.308 1050.353 1040.290 1060.278 1060.166 1060.553 1030.169 1050.286 1010.147 1060.148 1060.908 930.182 1050.064 1010.023 1060.018 1070.354 1010.363 1040.345 1040.546 1040.685 940.278 102
ScanNetpermissive0.306 1060.203 1060.366 1050.501 960.311 1030.524 1040.211 1040.002 1070.342 1010.189 1050.786 1030.145 1060.102 970.245 990.152 1040.318 1040.348 1050.300 1050.460 1050.437 1060.182 106
Angela Dai, Angel X. Chang, Manolis Savva, Maciej Halber, Thomas Funkhouser, Matthias Nießner: ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes. CVPR'17
ERROR0.054 1070.000 1070.041 1070.172 1070.030 1070.062 1070.001 1070.035 1060.004 1070.051 1070.143 1070.019 1070.003 1050.041 1050.050 1060.003 1070.054 1070.018 1070.005 1070.264 1070.082 107


This table lists the benchmark results for the 3D semantic instance scenario. In each row, every column holds the method's score followed by its rank for that column.




Method | Info | avg ap | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | window
MAFT0.596 10.889 60.721 10.448 110.460 30.768 10.251 40.558 150.408 10.504 10.539 20.616 10.618 50.858 20.482 30.684 70.551 60.931 50.450 1
Queryformer0.583 20.926 20.702 20.393 220.504 10.733 70.276 20.527 200.373 50.479 20.534 40.533 90.697 30.720 160.436 80.745 20.592 10.958 30.363 8
PBNetpermissive0.573 30.926 20.575 110.619 10.472 20.736 50.239 60.487 260.383 30.459 40.506 70.533 80.585 70.767 80.404 90.717 30.559 50.969 20.381 5
W. Zhao, Y. Yan, C. Yang, J. Ye, X. Yang, K. Huang: Divide and Conquer: 3D Instance Segmentation With Point-Wise Binarization. ICCV 2023
Mask3D0.566 40.926 20.597 60.408 190.420 50.737 40.239 50.598 80.386 20.458 50.549 10.568 60.716 20.601 270.480 40.646 100.575 30.922 60.364 7
Jonas Schult, Francis Engelmann, Alexander Hermans, Or Litany, Siyu Tang, Bastian Leibe: Mask3D for 3D Semantic Instance Segmentation. ICRA 2023
ISBNetpermissive0.559 50.926 20.597 70.390 230.436 40.722 80.276 30.556 160.380 40.450 60.505 80.583 30.730 10.575 280.455 60.603 160.573 40.979 10.332 14
Tuan Duc Ngo, Binh-Son Hua, Khoi Nguyen: ISBNet: a 3D Point Cloud Instance Segmentation Network with Instance-aware Sampling and Box-aware Dynamic Convolution. CVPR 2023
GraphCut0.552 61.000 10.611 50.438 140.392 80.714 90.139 90.598 90.327 80.389 80.510 60.598 20.427 240.754 110.463 50.761 10.588 20.903 100.329 15
SPFormerpermissive0.549 70.745 150.640 30.484 60.395 70.739 30.311 10.566 130.335 70.468 30.492 90.555 70.478 150.747 130.436 70.712 40.540 70.893 140.343 13
Sun Jiahao, Qing Chunmei, Tan Junpeng, Xu Xiangmin: Superpoint Transformer for 3D Scene Instance Segmentation. AAAI 2023 [Oral]
DKNet0.532 80.815 100.624 40.517 30.377 100.749 20.107 110.509 230.304 100.437 70.475 100.581 40.539 100.775 70.339 140.640 120.506 100.901 110.385 4
Yizheng Wu, Min Shi, Shuaiyuan Du, Hao Lu, Zhiguo Cao, Weicai Zhong: 3D Instances as 1D Kernels. ECCV 2022
IPCA-Inst0.520 90.889 60.551 150.548 20.418 60.665 190.064 200.585 100.260 180.277 210.471 120.500 100.644 40.785 50.369 100.591 190.511 80.878 190.362 9
SoftGroup++0.513 100.704 210.578 100.398 210.363 150.704 100.061 210.647 40.297 150.378 110.537 30.343 120.614 60.828 40.295 190.710 60.505 120.875 210.394 2
SSTNetpermissive0.506 110.738 180.549 160.497 50.316 200.693 130.178 80.377 340.198 230.330 130.463 140.576 50.515 120.857 30.494 10.637 130.457 160.943 40.290 24
Zhihao Liang, Zhihao Li, Songcen Xu, Mingkui Tan, Kui Jia: Instance Segmentation in 3D Scenes using Semantic Superpoint Tree Networks. ICCV 2021
SoftGrouppermissive0.504 120.667 280.579 80.372 260.381 90.694 120.072 170.677 20.303 110.387 90.531 50.319 160.582 80.754 100.318 150.643 110.492 130.907 90.388 3
Thang Vu, Kookhoi Kim, Tung M. Luu, Xuan Thanh Nguyen, Chang D. Yoo: SoftGroup for 3D Instance Segmentation on Point Clouds. CVPR 2022 [Oral]
TD3D0.489 130.852 80.511 240.434 150.322 190.735 60.101 140.512 220.355 60.349 120.468 130.283 200.514 130.676 220.268 240.671 80.510 90.908 80.329 16
OccuSeg+instance0.486 140.802 120.536 180.428 170.369 120.702 110.205 70.331 390.301 120.379 100.474 110.327 130.437 200.862 10.485 20.601 170.394 260.846 300.273 26
Lei Han, Tian Zheng, Lan Xu, Lu Fang: OccuSeg: Occupancy-aware 3D Instance Segmentation. CVPR 2020
TopoSeg0.479 150.704 210.564 120.467 90.366 130.633 270.068 180.554 170.262 170.328 140.447 160.323 140.534 110.722 150.288 210.614 140.482 140.912 70.358 11
DualGroup0.469 160.815 100.552 140.398 200.374 110.683 150.130 100.539 190.310 90.327 150.407 190.276 210.447 190.535 320.342 130.659 90.455 170.900 130.301 20
SSEC0.465 170.667 280.578 90.502 40.362 160.641 260.035 300.605 60.291 160.323 160.451 150.296 180.417 270.677 210.245 280.501 360.506 110.900 120.366 6
HAISpermissive0.457 180.704 210.561 130.457 100.364 140.673 160.046 290.547 180.194 240.308 170.426 170.288 190.454 180.711 170.262 250.563 260.434 200.889 160.344 12
Shaoyu Chen, Jiemin Fang, Qian Zhang, Wenyu Liu, Xinggang Wang: Hierarchical Aggregation for 3D Instance Segmentation. ICCV 2021
DD-UNet+Group0.436 190.630 360.508 270.480 70.310 210.624 300.065 190.638 50.174 250.256 250.384 220.194 330.428 220.759 90.289 200.574 230.400 240.849 280.291 23
H. Liu, R. Liu, K. Yang, J. Zhang, K. Peng, R. Stiefelhagen: HIDA: Towards Holistic Indoor Understanding for the Visually Impaired via Semantic Instance Segmentation with a Wearable Solid-State LiDAR Sensor. ICCVW 2021
INS-Conv-instance0.435 200.716 200.495 290.355 280.331 170.689 140.102 130.394 330.208 220.280 190.395 210.250 240.544 90.741 140.309 170.536 320.391 270.842 330.258 30
Mask-Group0.434 210.778 130.516 220.471 80.330 180.658 200.029 320.526 210.249 190.256 240.400 200.309 170.384 310.296 480.368 110.575 220.425 210.877 200.362 10
Min Zhong, Xinghao Chen, Xiaokang Chen, Gang Zeng, Yunhe Wang: MaskGroup: Hierarchical Point Grouping and Masking for 3D Instance Segmentation. ICME 2022
Box2Mask0.433 220.741 160.463 340.433 160.283 240.625 290.103 120.298 440.125 330.260 230.424 180.322 150.472 160.701 190.363 120.711 50.309 420.882 170.272 28
Julian Chibane, Francis Engelmann, Tuan Anh Tran, Gerard Pons-Moll: Box2Mask: Weakly Supervised 3D Semantic Instance Segmentation Using Bounding Boxes. ECCV 2022
RPGN0.428 230.630 360.508 260.367 270.249 310.658 210.016 390.673 30.131 310.234 280.383 230.270 220.434 210.748 120.274 230.609 150.406 230.842 320.267 29
Shichao Dong, Guosheng Lin, Tzu-Yi Hung: Learning Regional Purity for Instance Segmentation on 3D Point Clouds. ECCV 2022
DENet0.413 240.741 160.520 200.237 390.284 230.523 380.097 150.691 10.138 280.209 380.229 400.238 270.390 290.707 180.310 160.448 430.470 150.892 150.310 18
PointGroup0.407 250.639 350.496 280.415 180.243 330.645 250.021 370.570 120.114 340.211 360.359 250.217 310.428 230.660 230.256 260.562 270.341 340.860 240.291 22
Li Jiang, Hengshuang Zhao, Shaoshuai Shi, Shu Liu, Chi-Wing Fu, Jiaya Jia: PointGroup: Dual-Set Point Grouping for 3D Instance Segmentation. CVPR 2020 [oral]
CSC-Pretrained0.405 260.738 180.465 330.331 320.205 370.655 220.051 250.601 70.092 380.211 370.329 280.198 320.459 170.775 60.195 350.524 340.400 250.878 180.184 39
PE0.396 270.667 280.467 320.446 130.243 320.624 310.022 360.577 110.106 350.219 310.340 260.239 260.487 140.475 390.225 300.541 310.350 320.818 350.273 27
Biao Zhang, Peter Wonka: Point Cloud Instance Segmentation using Probabilistic Embeddings. CVPR 2021
Dyco3Dcopyleft0.395 280.642 340.518 210.447 120.259 300.666 180.050 260.251 480.166 260.231 290.362 240.232 280.331 340.535 310.229 290.587 200.438 190.850 260.317 17
Tong He, Chunhua Shen, Anton van den Hengel: DyCo3D: Robust Instance Segmentation of 3D Point Clouds through Dynamic Convolution. CVPR 2021
OSIS0.392 290.778 130.530 190.220 410.278 250.567 350.083 160.330 400.299 130.270 220.310 310.143 380.260 380.624 250.277 220.568 250.361 300.865 230.301 19
AOIA0.387 300.704 210.515 230.385 240.225 360.669 170.005 450.482 270.126 320.181 410.269 370.221 300.426 250.478 380.218 310.592 180.371 280.851 250.242 32
SSEN0.384 310.852 80.494 300.192 420.226 350.648 240.022 350.398 320.299 140.277 200.317 300.231 290.194 450.514 350.196 330.586 210.444 180.843 310.184 38
Dongsu Zhang, Junha Chun, Sang Kyun Cha, Young Min Kim: Spatial Semantic Embedding Network: Fast 3D Instance Segmentation with Deep Metric Learning. arXiv
PCJC0.375 320.704 210.542 170.284 360.197 390.649 230.006 430.426 280.138 290.242 260.304 320.183 360.388 300.629 240.141 450.546 300.344 330.738 410.283 25
ClickSeg_Instance0.366 330.654 320.375 380.184 430.302 220.592 330.050 270.300 430.093 370.283 180.277 340.249 250.426 260.615 260.299 180.504 350.367 290.832 340.191 37
SphereSeg0.357 340.651 330.411 360.345 290.264 290.630 280.059 220.289 460.212 200.240 270.336 270.158 370.305 350.557 290.159 410.455 420.341 350.726 430.294 21
3D-MPA0.355 350.457 470.484 310.299 340.277 260.591 340.047 280.332 370.212 210.217 320.278 330.193 340.413 280.410 420.195 340.574 240.352 310.849 270.213 35
Francis Engelmann, Martin Bokeloh, Alireza Fathi, Bastian Leibe, Matthias Nießner: 3D-MPA: Multi Proposal Aggregation for 3D Semantic Instance Segmentation. CVPR 2020
NeuralBF0.353 360.593 380.511 250.375 250.264 280.597 320.008 410.332 380.160 270.229 300.274 360.000 580.206 420.678 200.155 420.485 380.422 220.816 360.254 31
Weiwei Sun, Daniel Rebain, Renjie Liao, Vladimir Tankovich, Soroosh Yazdani, Kwang Moo Yi, Andrea Tagliasacchi: NeuralBF: Neural Bilateral Filtering for Top-down Instance Segmentation on Point Clouds. WACV 2023
RWSeg0.348 370.475 440.456 350.320 330.275 270.476 400.020 380.491 250.056 450.212 350.320 290.261 230.302 360.520 330.182 370.557 280.285 440.867 220.197 36
GICN0.341 380.580 390.371 390.344 300.198 380.469 410.052 240.564 140.093 360.212 340.212 420.127 400.347 330.537 300.206 320.525 330.329 370.729 420.241 33
One_Thing_One_Clickpermissive0.326 390.472 450.361 400.232 400.183 400.555 360.000 510.498 240.038 470.195 390.226 410.362 110.168 460.469 400.251 270.553 290.335 360.846 290.117 47
Zhengzhe Liu, Xiaojuan Qi, Chi-Wing Fu: One Thing One Click: A Self-Training Approach for Weakly Supervised 3D Semantic Segmentation. CVPR 2021
Occipital-SCS0.320 400.679 270.352 410.334 310.229 340.436 420.025 330.412 310.058 430.161 460.240 390.085 420.262 370.496 370.187 360.467 400.328 380.775 370.231 34
Sparse R-CNN0.292 410.704 210.213 510.153 450.154 420.551 370.053 230.212 490.132 300.174 430.274 350.070 440.363 320.441 410.176 380.424 450.234 460.758 390.161 43
MTML0.282 420.577 400.380 370.182 440.107 480.430 430.001 480.422 290.057 440.179 420.162 450.070 450.229 400.511 360.161 390.491 370.313 390.650 480.162 41
Jean Lahoud, Bernard Ghanem, Marc Pollefeys, Martin R. Oswald: 3D Instance Segmentation via Multi-task Metric Learning. ICCV 2019 [oral]
SALoss-ResNet0.262 430.667 280.335 420.067 520.123 460.427 440.022 340.280 470.058 420.216 330.211 430.039 480.142 480.519 340.106 490.338 490.310 410.721 440.138 44
Zhidong Liang, Ming Yang, Hao Li, Chunxiang Wang: 3D Instance Embedding Learning With a Structure-Aware Loss Function for Point Cloud Segmentation. IEEE Robotics and Automation Letters (IROS 2020)
MASCpermissive0.254 440.463 460.249 500.113 460.167 410.412 460.000 500.374 350.073 390.173 440.243 380.130 390.228 410.368 440.160 400.356 470.208 470.711 450.136 45
Chen Liu, Yasutaka Furukawa: MASC: Multi-scale Affinity with Sparse Convolution for 3D Instance Segmentation.
3D-BoNet0.253 450.519 420.324 450.251 380.137 450.345 510.031 310.419 300.069 400.162 450.131 470.052 460.202 440.338 460.147 440.301 520.303 430.651 470.178 40
Bo Yang, Jianan Wang, Ronald Clark, Qingyong Hu, Sen Wang, Andrew Markham, Niki Trigoni: Learning Object Bounding Boxes for 3D Instance Segmentation on Point Clouds. NeurIPS 2019 Spotlight
SPG_WSIS0.251 460.380 490.274 480.289 350.144 430.413 450.000 510.311 410.065 410.113 480.130 480.029 500.204 430.388 430.108 480.459 410.311 400.769 380.127 46
SegGroup_inspermissive0.246 470.556 410.335 430.062 540.115 470.490 390.000 510.297 450.018 510.186 400.142 460.083 430.233 390.216 500.153 430.469 390.251 450.744 400.083 50
An Tao, Yueqi Duan, Yi Wei, Jiwen Lu, Jie Zhou: SegGroup: Seg-Level Supervision for 3D Instance and Semantic Segmentation. TIP 2022
PanopticFusion-inst0.214 480.250 530.330 440.275 370.103 490.228 570.000 510.345 360.024 490.088 500.203 440.186 350.167 470.367 450.125 460.221 550.112 570.666 460.162 42
Gaku Narita, Takashi Seno, Tomoya Ishikawa, Yohsuke Kaji: PanopticFusion: Online Volumetric Semantic Mapping at the Level of Stuff and Things. IROS 2019 (to appear)
UNet-backbone0.161 490.519 420.259 490.084 480.059 510.325 530.002 460.093 540.009 530.077 520.064 510.045 470.044 550.161 520.045 510.331 500.180 490.566 490.033 58
3D-SISpermissive0.161 490.407 480.155 550.068 510.043 550.346 500.001 470.134 510.005 540.088 490.106 500.037 490.135 500.321 470.028 540.339 480.116 560.466 520.093 49
Ji Hou, Angela Dai, Matthias Niessner: 3D-SIS: 3D Semantic Instance Segmentation of RGB-D Scans. CVPR 2019
R-PointNet0.158 510.356 500.173 530.113 470.140 440.359 470.012 400.023 560.039 460.134 470.123 490.008 540.089 510.149 530.117 470.221 540.128 540.563 500.094 48
Region-18class0.146 520.175 570.321 460.080 490.062 500.357 480.000 510.307 420.002 550.066 530.044 530.000 580.018 570.036 570.054 500.447 440.133 520.472 510.060 53
SemRegionNet-20cls0.121 530.296 520.203 520.071 500.058 520.349 490.000 510.150 500.019 500.054 540.034 550.017 530.052 530.042 560.013 570.209 560.183 480.371 530.057 54
3D-BEVIS0.117 540.250 530.308 470.020 580.009 590.269 560.006 440.008 570.029 480.037 570.014 580.003 560.036 560.147 540.042 520.381 460.118 550.362 540.069 52
Cathrin Elich, Francis Engelmann, Jonas Schult, Theodora Kontogianni, Bastian Leibe: 3D-BEVIS: Birds-Eye-View Instance Segmentation.
Hier3Dcopyleft0.117 540.222 550.161 540.054 560.027 560.289 540.000 510.124 520.001 570.079 510.061 520.027 510.141 490.240 490.005 580.310 510.129 530.153 580.081 51
Tan: HCFS3D: Hierarchical Coupled Feature Selection Network for 3D Semantic and Instance Segmentation.
tmp0.113 560.333 510.151 560.056 550.053 530.344 520.000 510.105 530.016 520.049 550.035 540.020 520.053 520.048 550.013 560.183 570.173 500.344 550.054 55
ASIS0.085 570.037 580.080 580.066 530.047 540.282 550.000 510.052 550.002 560.047 560.026 560.001 570.046 540.194 510.031 530.264 530.140 510.167 570.047 57
Sgpn_scannet0.049 580.023 590.134 570.031 570.013 580.144 580.006 420.008 580.000 580.028 580.017 570.003 550.009 590.000 580.021 550.122 580.095 580.175 560.054 56
MaskRCNN 2d->3d Proj0.022 590.185 560.000 590.000 590.015 570.000 590.000 490.006 590.000 580.010 590.006 590.107 410.012 580.000 580.002 590.027 590.004 590.022 590.001 59


This table lists the benchmark results for the 2D semantic label scenario.


Method Info | avg iou | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | floor | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | wall | window
Virtual MVFusion (R) | 0.745 (1) | 0.861 (1) | 0.839 (1) | 0.881 (1) | 0.672 (1) | 0.512 (1) | 0.422 (15) | 0.898 (1) | 0.723 (1) | 0.714 (1) | 0.954 (2) | 0.454 (1) | 0.509 (1) | 0.773 (1) | 0.895 (1) | 0.756 (1) | 0.820 (1) | 0.653 (1) | 0.935 (1) | 0.891 (1) | 0.728 (1)
Abhijit Kundu, Xiaoqi Yin, Alireza Fathi, David Ross, Brian Brewington, Thomas Funkhouser, Caroline Pantofaru: Virtual Multi-view Fusion for 3D Semantic Segmentation. ECCV 2020
BPNet_2D (copyleft) | 0.670 (2) | 0.822 (3) | 0.795 (3) | 0.836 (2) | 0.659 (2) | 0.481 (2) | 0.451 (11) | 0.769 (3) | 0.656 (3) | 0.567 (3) | 0.931 (3) | 0.395 (4) | 0.390 (4) | 0.700 (3) | 0.534 (3) | 0.689 (9) | 0.770 (2) | 0.574 (3) | 0.865 (6) | 0.831 (3) | 0.675 (4)
Wenbo Hu, Hengshuang Zhao, Li Jiang, Jiaya Jia and Tien-Tsin Wong: Bidirectional Projection Network for Cross Dimension Scene Understanding. CVPR 2021 (Oral)
CU-Hybrid-2D Net | 0.636 (3) | 0.825 (2) | 0.820 (2) | 0.179 (20) | 0.648 (3) | 0.463 (3) | 0.549 (2) | 0.742 (6) | 0.676 (2) | 0.628 (2) | 0.961 (1) | 0.420 (2) | 0.379 (5) | 0.684 (6) | 0.381 (15) | 0.732 (2) | 0.723 (3) | 0.599 (2) | 0.827 (13) | 0.851 (2) | 0.634 (6)
CMX | 0.613 (4) | 0.681 (7) | 0.725 (9) | 0.502 (12) | 0.634 (5) | 0.297 (15) | 0.478 (9) | 0.830 (2) | 0.651 (4) | 0.537 (6) | 0.924 (4) | 0.375 (5) | 0.315 (12) | 0.686 (5) | 0.451 (12) | 0.714 (4) | 0.543 (18) | 0.504 (5) | 0.894 (4) | 0.823 (4) | 0.688 (3)
DMMF_3d | 0.605 (5) | 0.651 (8) | 0.744 (7) | 0.782 (3) | 0.637 (4) | 0.387 (4) | 0.536 (3) | 0.732 (7) | 0.590 (6) | 0.540 (5) | 0.856 (18) | 0.359 (9) | 0.306 (13) | 0.596 (11) | 0.539 (2) | 0.627 (18) | 0.706 (4) | 0.497 (7) | 0.785 (18) | 0.757 (16) | 0.476 (19)
MCA-Net | 0.595 (6) | 0.533 (17) | 0.756 (6) | 0.746 (4) | 0.590 (8) | 0.334 (7) | 0.506 (6) | 0.670 (12) | 0.587 (7) | 0.500 (10) | 0.905 (8) | 0.366 (8) | 0.352 (8) | 0.601 (10) | 0.506 (6) | 0.669 (15) | 0.648 (7) | 0.501 (6) | 0.839 (12) | 0.769 (12) | 0.516 (18)
RFBNet | 0.592 (7) | 0.616 (9) | 0.758 (5) | 0.659 (5) | 0.581 (9) | 0.330 (8) | 0.469 (10) | 0.655 (15) | 0.543 (12) | 0.524 (7) | 0.924 (4) | 0.355 (10) | 0.336 (10) | 0.572 (14) | 0.479 (8) | 0.671 (13) | 0.648 (7) | 0.480 (9) | 0.814 (16) | 0.814 (5) | 0.614 (9)
FAN_NV_RVC | 0.586 (8) | 0.510 (18) | 0.764 (4) | 0.079 (23) | 0.620 (7) | 0.330 (8) | 0.494 (7) | 0.753 (4) | 0.573 (8) | 0.556 (4) | 0.884 (13) | 0.405 (3) | 0.303 (14) | 0.718 (2) | 0.452 (11) | 0.672 (12) | 0.658 (5) | 0.509 (4) | 0.898 (3) | 0.813 (6) | 0.727 (2)
DCRedNet | 0.583 (9) | 0.682 (6) | 0.723 (10) | 0.542 (11) | 0.510 (17) | 0.310 (12) | 0.451 (11) | 0.668 (13) | 0.549 (11) | 0.520 (8) | 0.920 (6) | 0.375 (5) | 0.446 (2) | 0.528 (17) | 0.417 (13) | 0.670 (14) | 0.577 (15) | 0.478 (10) | 0.862 (7) | 0.806 (7) | 0.628 (8)
MIX6D_RVC | 0.582 (10) | 0.695 (4) | 0.687 (14) | 0.225 (18) | 0.632 (6) | 0.328 (10) | 0.550 (1) | 0.748 (5) | 0.623 (5) | 0.494 (13) | 0.890 (11) | 0.350 (12) | 0.254 (20) | 0.688 (4) | 0.454 (10) | 0.716 (3) | 0.597 (14) | 0.489 (8) | 0.881 (5) | 0.768 (13) | 0.575 (12)
SSMA (copyleft) | 0.577 (11) | 0.695 (4) | 0.716 (12) | 0.439 (14) | 0.563 (11) | 0.314 (11) | 0.444 (13) | 0.719 (8) | 0.551 (10) | 0.503 (9) | 0.887 (12) | 0.346 (13) | 0.348 (9) | 0.603 (9) | 0.353 (17) | 0.709 (5) | 0.600 (12) | 0.457 (12) | 0.901 (2) | 0.786 (8) | 0.599 (11)
Abhinav Valada, Rohit Mohan, Wolfram Burgard: Self-Supervised Model Adaptation for Multimodal Semantic Segmentation. International Journal of Computer Vision, 2019
UNIV_CNP_RVC_UE | 0.566 (12) | 0.569 (16) | 0.686 (16) | 0.435 (15) | 0.524 (14) | 0.294 (16) | 0.421 (16) | 0.712 (9) | 0.543 (12) | 0.463 (15) | 0.872 (14) | 0.320 (14) | 0.363 (7) | 0.611 (8) | 0.477 (9) | 0.686 (10) | 0.627 (9) | 0.443 (15) | 0.862 (7) | 0.775 (11) | 0.639 (5)
EMSAFormer | 0.564 (13) | 0.581 (13) | 0.736 (8) | 0.564 (10) | 0.546 (13) | 0.219 (20) | 0.517 (4) | 0.675 (11) | 0.486 (17) | 0.427 (19) | 0.904 (9) | 0.352 (11) | 0.320 (11) | 0.589 (12) | 0.528 (4) | 0.708 (6) | 0.464 (21) | 0.413 (19) | 0.847 (11) | 0.786 (8) | 0.611 (10)
SN_RN152pyrx8_RVC (copyleft) | 0.546 (14) | 0.572 (14) | 0.663 (18) | 0.638 (7) | 0.518 (15) | 0.298 (14) | 0.366 (21) | 0.633 (18) | 0.510 (15) | 0.446 (17) | 0.864 (16) | 0.296 (17) | 0.267 (17) | 0.542 (16) | 0.346 (18) | 0.704 (7) | 0.575 (16) | 0.431 (16) | 0.853 (10) | 0.766 (14) | 0.630 (7)
UDSSEG_RVC | 0.545 (15) | 0.610 (11) | 0.661 (19) | 0.588 (8) | 0.556 (12) | 0.268 (18) | 0.482 (8) | 0.642 (17) | 0.572 (9) | 0.475 (14) | 0.836 (20) | 0.312 (15) | 0.367 (6) | 0.630 (7) | 0.189 (20) | 0.639 (17) | 0.495 (20) | 0.452 (13) | 0.826 (14) | 0.756 (17) | 0.541 (14)
segfomer with 6d | 0.542 (16) | 0.594 (12) | 0.687 (14) | 0.146 (21) | 0.579 (10) | 0.308 (13) | 0.515 (5) | 0.703 (10) | 0.472 (18) | 0.498 (11) | 0.868 (15) | 0.369 (7) | 0.282 (15) | 0.589 (12) | 0.390 (14) | 0.701 (8) | 0.556 (17) | 0.416 (18) | 0.860 (9) | 0.759 (15) | 0.539 (16)
FuseNet (permissive) | 0.535 (17) | 0.570 (15) | 0.681 (17) | 0.182 (19) | 0.512 (16) | 0.290 (17) | 0.431 (14) | 0.659 (14) | 0.504 (16) | 0.495 (12) | 0.903 (10) | 0.308 (16) | 0.428 (3) | 0.523 (18) | 0.365 (16) | 0.676 (11) | 0.621 (11) | 0.470 (11) | 0.762 (19) | 0.779 (10) | 0.541 (14)
Caner Hazirbas, Lingni Ma, Csaba Domokos, Daniel Cremers: FuseNet: Incorporating Depth into Semantic Segmentation via Fusion-based CNN Architecture. ACCV 2016
AdapNet++ (copyleft) | 0.503 (18) | 0.613 (10) | 0.722 (11) | 0.418 (16) | 0.358 (23) | 0.337 (6) | 0.370 (20) | 0.479 (21) | 0.443 (19) | 0.368 (21) | 0.907 (7) | 0.207 (20) | 0.213 (22) | 0.464 (21) | 0.525 (5) | 0.618 (19) | 0.657 (6) | 0.450 (14) | 0.788 (17) | 0.721 (20) | 0.408 (22)
Abhinav Valada, Rohit Mohan, Wolfram Burgard: Self-Supervised Model Adaptation for Multimodal Semantic Segmentation. International Journal of Computer Vision, 2019
3DMV (2d proj) | 0.498 (19) | 0.481 (21) | 0.612 (20) | 0.579 (9) | 0.456 (19) | 0.343 (5) | 0.384 (18) | 0.623 (19) | 0.525 (14) | 0.381 (20) | 0.845 (19) | 0.254 (19) | 0.264 (19) | 0.557 (15) | 0.182 (21) | 0.581 (21) | 0.598 (13) | 0.429 (17) | 0.760 (20) | 0.661 (22) | 0.446 (21)
Angela Dai, Matthias Niessner: 3DMV: Joint 3D-Multi-View Prediction for 3D Semantic Scene Segmentation. ECCV'18
MSeg1080_RVC (permissive) | 0.485 (20) | 0.505 (19) | 0.709 (13) | 0.092 (22) | 0.427 (20) | 0.241 (19) | 0.411 (17) | 0.654 (16) | 0.385 (23) | 0.457 (16) | 0.861 (17) | 0.053 (23) | 0.279 (16) | 0.503 (19) | 0.481 (7) | 0.645 (16) | 0.626 (10) | 0.365 (21) | 0.748 (21) | 0.725 (19) | 0.529 (17)
John Lambert*, Zhuang Liu*, Ozan Sener, James Hays, Vladlen Koltun: MSeg: A Composite Dataset for Multi-domain Semantic Segmentation. CVPR 2020
ILC-PSPNet | 0.475 (21) | 0.490 (20) | 0.581 (21) | 0.289 (17) | 0.507 (18) | 0.067 (23) | 0.379 (19) | 0.610 (20) | 0.417 (21) | 0.435 (18) | 0.822 (22) | 0.278 (18) | 0.267 (17) | 0.503 (19) | 0.228 (19) | 0.616 (20) | 0.533 (19) | 0.375 (20) | 0.820 (15) | 0.729 (18) | 0.560 (13)
Enet (reimpl) | 0.376 (22) | 0.264 (23) | 0.452 (23) | 0.452 (13) | 0.365 (21) | 0.181 (21) | 0.143 (23) | 0.456 (22) | 0.409 (22) | 0.346 (22) | 0.769 (23) | 0.164 (21) | 0.218 (21) | 0.359 (22) | 0.123 (23) | 0.403 (23) | 0.381 (23) | 0.313 (23) | 0.571 (22) | 0.685 (21) | 0.472 (20)
Re-implementation of Adam Paszke, Abhishek Chaurasia, Sangpil Kim, Eugenio Culurciello: ENet: A Deep Neural Network Architecture for Real-Time Semantic Segmentation.
ScanNet (2d proj) (permissive) | 0.330 (23) | 0.293 (22) | 0.521 (22) | 0.657 (6) | 0.361 (22) | 0.161 (22) | 0.250 (22) | 0.004 (23) | 0.440 (20) | 0.183 (23) | 0.836 (20) | 0.125 (22) | 0.060 (23) | 0.319 (23) | 0.132 (22) | 0.417 (22) | 0.412 (22) | 0.344 (22) | 0.541 (23) | 0.427 (23) | 0.109 (23)
Angela Dai, Angel X. Chang, Manolis Savva, Maciej Halber, Thomas Funkhouser, Matthias Nießner: ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes. CVPR'17
DMMF | 0.003 (24) | 0.000 (24) | 0.005 (24) | 0.000 (24) | 0.000 (24) | 0.037 (24) | 0.001 (24) | 0.000 (24) | 0.001 (24) | 0.005 (24) | 0.003 (24) | 0.000 (24) | 0.000 (24) | 0.000 (24) | 0.000 (24) | 0.000 (24) | 0.002 (24) | 0.001 (24) | 0.000 (24) | 0.006 (24) | 0.000 (24)


This table lists the benchmark results for the 2D semantic instance scenario.




Method Info | avg ap | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | window
UniDet_RVC | 0.205 (1) | 0.381 (1) | 0.323 (1) | 0.037 (1) | 0.226 (1) | 0.177 (1) | 0.063 (1) | 0.277 (1) | 0.120 (1) | 0.067 (1) | 0.131 (1) | 0.074 (2) | 0.317 (1) | 0.080 (1) | 0.235 (1) | 0.289 (1) | 0.141 (1) | 0.678 (1) | 0.080 (1)
MaskRCNN_ScanNet (permissive) | 0.119 (2) | 0.129 (2) | 0.212 (2) | 0.002 (2) | 0.112 (2) | 0.148 (2) | 0.014 (2) | 0.205 (2) | 0.044 (2) | 0.066 (2) | 0.078 (2) | 0.095 (1) | 0.142 (2) | 0.030 (2) | 0.128 (2) | 0.139 (2) | 0.080 (2) | 0.459 (2) | 0.057 (2)
Re-implementation of Kaiming He, Georgia Gkioxari, Piotr Dollár, Ross Girshick: Mask R-CNN. ICCV'17


This table lists the benchmark results for the scene type classification scenario.




Method Info | avg recall | apartment | bathroom | bedroom / hotel | bookstore / library | conference room | copy/mail room | hallway | kitchen | laundry room | living room / lounge | misc | office | storage / basement / garage
multi-task (permissive) | 0.700 (1) | 0.500 (1) | 1.000 (1) | 0.882 (2) | 0.500 (2) | 1.000 (1) | 1.000 (1) | 0.500 (2) | 1.000 (1) | 1.000 (1) | 0.778 (1) | 0.000 (2) | 0.938 (1) | 0.000 (2)
Shengyu Huang, Mikhail Usvyatsov, Konrad Schindler: Indoor Scene Recognition in 3D. IROS 2020
3DASPP-SCE | 0.691 (2) | 0.500 (1) | 0.938 (2) | 0.824 (3) | 1.000 (1) | 1.000 (1) | 0.500 (2) | 1.000 (1) | 0.857 (2) | 0.500 (2) | 0.556 (3) | 0.000 (2) | 0.812 (2) | 0.500 (1)
SE-ResNeXt-SSMA | 0.498 (3) | 0.000 (4) | 0.812 (3) | 0.941 (1) | 0.500 (2) | 0.500 (3) | 0.500 (2) | 0.500 (2) | 0.429 (4) | 0.500 (2) | 0.667 (2) | 0.500 (1) | 0.625 (3) | 0.000 (2)
Abhinav Valada, Rohit Mohan, Wolfram Burgard: Self-Supervised Model Adaptation for Multimodal Semantic Segmentation. arXiv
resnet50_scannet | 0.353 (4) | 0.250 (3) | 0.812 (3) | 0.529 (4) | 0.500 (2) | 0.500 (3) | 0.000 (4) | 0.500 (2) | 0.571 (3) | 0.000 (4) | 0.556 (3) | 0.000 (2) | 0.375 (4) | 0.000 (2)
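
The avg recall column in the scene type classification table above appears consistent with an unweighted mean over the 13 per-class recall values. The following Python sketch checks that reading against the listed numbers. It is an observation drawn from the table, not the benchmark's official evaluation code; the names `PER_CLASS_RECALL`, `REPORTED_AVG_RECALL`, and `macro_average` are hypothetical and introduced here only for illustration.

```python
from statistics import mean

# Per-class recall values copied from the scene type classification table,
# in column order (apartment ... storage / basement / garage).
PER_CLASS_RECALL = {
    "multi-task": [0.500, 1.000, 0.882, 0.500, 1.000, 1.000, 0.500,
                   1.000, 1.000, 0.778, 0.000, 0.938, 0.000],
    "3DASPP-SCE": [0.500, 0.938, 0.824, 1.000, 1.000, 0.500, 1.000,
                   0.857, 0.500, 0.556, 0.000, 0.812, 0.500],
    "SE-ResNeXt-SSMA": [0.000, 0.812, 0.941, 0.500, 0.500, 0.500, 0.500,
                        0.429, 0.500, 0.667, 0.500, 0.625, 0.000],
    "resnet50_scannet": [0.250, 0.812, 0.529, 0.500, 0.500, 0.000, 0.500,
                         0.571, 0.000, 0.556, 0.000, 0.375, 0.000],
}

# The avg recall column as reported in the same table.
REPORTED_AVG_RECALL = {
    "multi-task": 0.700,
    "3DASPP-SCE": 0.691,
    "SE-ResNeXt-SSMA": 0.498,
    "resnet50_scannet": 0.353,
}


def macro_average(values):
    """Unweighted mean over classes, rounded to three decimals like the table."""
    return round(mean(values), 3)


for method, recalls in PER_CLASS_RECALL.items():
    computed = macro_average(recalls)
    reported = REPORTED_AVG_RECALL[method]
    # Allow a small tolerance because the table entries are already rounded.
    status = "matches" if abs(computed - reported) < 1e-3 else "differs"
    print(f"{method}: computed {computed:.3f}, reported {reported:.3f} ({status})")
```

Because every class contributes equally under this reading, a class with few test scenes (for example misc) moves the average as much as a frequent one, which is worth keeping in mind when comparing methods that differ mainly on rare scene types.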