Presenting the ScanNet200 Benchmark

We present the ScanNet200 benchmark, which covers an order of magnitude more class categories than the previous version of ScanNet. The scene geometry is shared across the two benchmarks, but the parsing of the surface annotations allows for a larger vocabulary and a more realistic setting for in-the-wild 3D understanding methods.

The ScanNet200 benchmark includes both finer-grained categories and a large number of previously unaddressed classes. This makes for a much more challenging setting that reflects the diversity of semantic classes naturally observed in the raw ScanNet RGB-D data, including the class imbalances that occur naturally in that data. The difference in category frequencies between ScanNet and ScanNet200 can be seen in the figure above.

ScanNet200 Benchmark

This table lists the benchmark results for the ScanNet200 3D semantic label scenario.




Method | Info | avg IoU | head IoU | common IoU | tail IoU | alarm clock | armchair | backpack | bag | ball | bar | basket | bathroom cabinet | bathroom counter | bathroom stall | bathroom stall door | bathroom vanity | bathtub | bed | bench | bicycle | bin | blackboard | blanket | blinds | board | book | bookshelf | bottle | bowl | box | broom | bucket | bulletin board | cabinet | calendar | candle | cart | case of water bottles | cd case | ceiling | ceiling light | chair | clock | closet | closet door | closet rod | closet wall | clothes | clothes dryer | coat rack | coffee kettle | coffee maker | coffee table | column | computer tower | container | copier | couch | counter | crate | cup | curtain | cushion | decoration | desk | dining table | dish rack | dishwasher | divider | door | doorframe | dresser | dumbbell | dustpan | end table | fan | file cabinet | fire alarm | fire extinguisher | fireplace | floor | folded chair | furniture | guitar | guitar case | hair dryer | handicap bar | hat | headphones | ironing board | jacket | keyboard | keyboard piano | kitchen cabinet | kitchen counter | ladder | lamp | laptop | laundry basket | laundry detergent | laundry hamper | ledge | light | light switch | luggage | machine | mailbox | mat | mattress | microwave | mini fridge | mirror | monitor | mouse | music stand | nightstand | object | office chair | ottoman | oven | paper | paper bag | paper cutter | paper towel dispenser | paper towel roll | person | piano | picture | pillar | pillow | pipe | plant | plate | plunger | poster | potted plant | power outlet | power strip | printer | projector | projector screen | purse | rack | radiator | rail | range hood | recycling bin | refrigerator | scale | seat | shelf | shoe | shower | shower curtain | shower curtain rod | shower door | shower floor | shower head | shower wall | sign | sink | soap dish | soap dispenser | sofa chair | speaker | stair rail | stairs | stand | stool | storage bin | storage container | storage organizer | stove | structure | stuffed animal | suitcase | table | telephone | tissue box | toaster | toaster oven | toilet | toilet paper | toilet paper dispenser | toilet paper holder | toilet seat cover dispenser | towel | trash bin | trash can | tray | tube | tv | tv stand | vacuum cleaner | vent | wall | wardrobe | washing machine | water bottle | water cooler | water pitcher | whiteboard | window | windowsill
IMFSegNet0.334 90.532 130.251 110.179 70.486 90.041 160.139 130.003 10.283 40.000 10.274 150.191 150.457 140.704 140.795 70.197 90.830 60.000 30.710 90.055 160.064 40.518 60.305 100.458 170.216 120.027 50.284 130.000 10.000 30.044 120.406 100.561 70.000 10.080 120.000 30.873 90.021 150.683 80.000 70.076 90.494 100.363 90.648 160.000 10.000 20.425 90.649 40.000 100.668 120.908 70.740 110.010 140.206 80.862 100.000 10.000 110.560 90.000 70.359 130.237 110.631 120.408 110.411 40.322 150.246 40.439 100.599 130.047 40.213 70.940 100.139 110.000 10.369 50.124 100.188 120.495 110.624 110.626 80.320 140.595 40.495 80.496 100.000 40.000 10.340 120.014 60.032 70.135 50.000 40.903 80.277 60.612 80.196 70.344 120.848 130.260 60.000 10.574 130.073 160.062 40.000 40.000 10.091 60.839 30.776 30.123 120.392 90.756 120.274 50.518 120.029 160.842 40.000 60.357 130.000 10.035 70.000 30.444 120.793 20.245 50.000 10.512 160.512 150.159 150.713 130.000 100.000 10.336 130.484 120.569 20.852 90.615 60.120 120.068 100.228 80.000 10.733 100.773 20.190 40.000 100.608 60.792 40.000 10.597 70.000 140.025 20.000 10.573 170.000 20.000 10.508 110.555 80.363 100.139 120.610 20.947 80.305 70.594 90.527 90.009 170.633 130.000 10.060 30.820 50.604 150.799 90.000 10.799 110.034 140.784 130.000 10.618 60.424 20.134 160.646 130.214 14
OA-CNN-L_ScanNet2000.333 110.558 50.269 90.124 130.448 140.080 90.272 50.000 30.000 70.000 10.342 80.515 40.524 70.713 130.789 90.158 120.384 120.000 30.806 60.125 70.000 90.496 80.332 70.498 140.227 80.024 60.474 30.000 10.003 20.071 90.487 30.000 110.000 10.110 80.000 30.876 70.013 170.703 30.000 70.076 90.473 120.355 110.906 60.000 10.000 20.476 60.706 10.000 100.672 100.835 130.748 90.015 130.223 70.860 110.000 10.000 110.572 70.000 70.509 70.313 70.662 40.398 130.396 80.411 130.276 20.527 40.711 50.000 70.076 130.946 60.166 60.000 10.022 100.160 70.183 130.493 130.699 90.637 60.403 60.330 120.406 130.526 60.024 20.000 10.392 110.000 110.016 160.000 120.196 30.915 50.112 120.557 100.197 60.352 100.877 30.000 120.000 10.592 120.103 110.000 140.067 10.000 10.089 70.735 70.625 110.130 90.568 60.836 70.271 80.534 90.043 130.799 110.001 50.445 50.000 10.000 80.024 20.661 40.000 50.262 30.000 10.591 80.517 130.373 80.788 70.021 80.000 10.455 40.517 90.320 80.823 120.200 160.001 170.150 50.100 120.000 10.736 90.668 100.103 140.052 60.662 40.720 80.000 10.602 60.112 70.002 60.000 10.637 90.000 20.000 10.621 100.569 50.398 90.412 50.234 120.949 60.363 50.492 140.495 110.251 40.665 90.000 10.001 110.805 70.833 60.794 110.000 10.821 50.314 50.843 110.000 10.560 100.245 70.262 60.713 40.370 11
CSC-Pretrainpermissive0.249 170.455 170.171 160.079 170.418 150.059 140.186 100.000 30.000 70.000 10.335 100.250 130.316 160.766 70.697 170.142 130.170 140.003 20.553 140.112 90.097 10.201 160.186 140.476 150.081 160.000 90.216 170.000 10.000 30.001 170.314 170.000 110.000 10.055 150.000 30.832 160.094 30.659 150.002 50.076 90.310 160.293 170.664 140.000 10.000 20.175 170.634 60.130 20.552 170.686 170.700 170.076 70.110 150.770 170.000 10.000 110.430 170.000 70.319 150.166 150.542 170.327 160.205 160.332 140.052 160.375 130.444 170.000 70.012 170.930 170.203 30.000 10.000 120.046 120.175 140.413 160.592 140.471 160.299 150.152 160.340 160.247 170.000 40.000 10.225 150.058 30.037 40.000 120.207 20.862 150.014 140.548 130.033 160.233 160.816 160.000 120.000 10.542 150.123 50.121 10.019 20.000 10.000 110.463 160.454 170.045 170.128 170.557 150.235 140.441 160.063 110.484 170.000 60.308 170.000 10.000 80.000 30.318 170.000 50.000 90.000 10.545 140.543 120.164 140.734 90.000 100.000 10.215 170.371 160.198 140.743 140.205 150.062 150.000 110.079 140.000 10.683 160.547 160.142 90.000 100.441 110.579 150.000 10.464 140.098 90.041 10.000 10.590 140.000 20.000 10.373 130.494 140.174 150.105 160.001 170.895 160.222 160.537 120.307 160.180 50.625 140.000 10.000 120.591 170.609 140.398 150.000 10.766 170.014 160.638 170.000 10.377 130.004 130.206 130.609 170.465 5
Ji Hou, Benjamin Graham, Matthias Nießner, Saining Xie: Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts. CVPR 2021
LGroundpermissive0.272 150.485 150.184 150.106 150.476 110.077 100.218 80.000 30.000 70.000 10.547 20.295 110.540 50.746 100.745 150.058 160.112 160.005 10.658 110.077 150.000 90.322 140.178 160.512 110.190 130.199 20.277 150.000 10.000 30.173 70.399 120.000 110.000 10.039 160.000 30.858 140.085 70.676 110.002 50.103 60.498 80.323 140.703 120.000 10.000 20.296 150.549 120.216 10.702 60.768 140.718 140.028 100.092 160.786 160.000 10.000 110.453 160.022 50.251 170.252 90.572 150.348 140.321 110.514 70.063 150.279 160.552 150.000 70.019 160.932 150.132 150.000 10.000 120.000 150.156 170.457 150.623 120.518 140.265 160.358 110.381 150.395 140.000 40.000 10.127 170.012 80.051 10.000 120.000 40.886 130.014 140.437 170.179 80.244 150.826 150.000 120.000 10.599 100.136 10.085 30.000 40.000 10.000 110.565 130.612 130.143 50.207 150.566 140.232 150.446 150.127 40.708 150.000 60.384 90.000 10.000 80.000 30.402 140.000 50.059 70.000 10.525 150.566 110.229 120.659 150.000 100.000 10.265 150.446 140.147 160.720 170.597 80.066 140.000 110.187 90.000 10.726 130.467 170.134 120.000 100.413 150.629 120.000 10.363 160.055 100.022 30.000 10.626 110.000 20.000 10.323 150.479 170.154 160.117 150.028 160.901 150.243 150.415 160.295 170.143 60.610 160.000 10.000 120.777 120.397 170.324 160.000 10.778 150.179 80.702 160.000 10.274 160.404 40.233 100.622 150.398 7
David Rozenberszki, Or Litany, Angela Dai: Language-Grounded Indoor 3D Semantic Segmentation in the Wild. arXiv
AWCS0.305 140.508 140.225 140.142 110.463 130.063 130.195 90.000 30.000 70.000 10.467 30.551 30.504 80.773 60.764 140.142 130.029 170.000 30.626 130.100 110.000 90.360 130.179 150.507 130.137 150.006 80.300 120.000 10.000 30.172 80.364 150.512 90.000 10.056 140.000 30.865 130.093 40.634 170.000 70.071 130.396 140.296 160.876 90.000 10.000 20.373 130.436 160.063 90.749 20.877 100.721 120.131 30.124 140.804 150.000 10.000 110.515 120.010 60.452 100.252 90.578 140.417 80.179 170.484 100.171 70.337 140.606 120.000 70.115 100.937 140.142 90.000 10.008 110.000 150.157 160.484 140.402 170.501 150.339 90.553 70.529 30.478 120.000 40.000 10.404 100.001 100.022 130.077 90.000 40.894 120.219 70.628 70.093 150.305 140.886 10.233 90.000 10.603 90.112 60.023 90.000 40.000 10.000 110.741 60.664 80.097 150.253 140.782 100.264 110.523 110.154 20.707 160.000 60.411 80.000 10.000 80.000 30.332 160.000 50.000 90.000 10.602 70.595 100.185 130.656 160.159 60.000 10.355 110.424 150.154 150.729 150.516 100.220 100.620 30.084 130.000 10.707 140.651 130.173 50.014 90.381 170.582 140.000 10.619 30.049 120.000 70.000 10.702 40.000 20.000 10.302 160.489 150.317 130.334 70.392 70.922 140.254 130.533 130.394 130.129 140.613 150.000 10.000 120.820 50.649 110.749 130.000 10.782 140.282 60.863 60.000 10.288 150.006 120.220 110.633 140.542 3
Long-Tailed 3D Semantic Segmentation with Adaptive Weight Constraint and Sampling. ICRA 2024
CeCo0.340 70.551 90.247 130.181 60.475 120.057 150.142 120.000 30.000 70.000 10.387 60.463 60.499 90.924 20.774 110.213 60.257 130.000 30.546 150.100 110.006 80.615 20.177 170.534 70.246 60.000 90.400 50.000 10.338 10.006 160.484 50.609 50.000 10.083 110.000 30.873 90.089 50.661 140.000 70.048 150.560 40.408 60.892 80.000 10.000 20.586 10.616 80.000 100.692 80.900 80.721 120.162 10.228 60.860 110.000 10.000 110.575 50.083 30.550 40.347 40.624 130.410 100.360 90.740 30.109 130.321 150.660 80.000 70.121 90.939 130.143 80.000 10.400 20.003 130.190 110.564 60.652 100.615 110.421 50.304 130.579 10.547 50.000 40.000 10.296 140.000 110.030 90.096 70.000 40.916 40.037 130.551 120.171 90.376 70.865 70.286 50.000 10.633 50.102 120.027 80.011 30.000 10.000 110.474 140.742 50.133 70.311 130.824 80.242 130.503 140.068 90.828 90.000 60.429 70.000 10.063 50.000 30.781 20.000 50.000 90.000 10.665 20.633 60.450 60.818 20.000 100.000 10.429 50.532 70.226 130.825 110.510 110.377 50.709 20.079 140.000 10.753 50.683 80.102 150.063 50.401 160.620 130.000 10.619 30.000 140.000 70.000 10.595 130.000 20.000 10.345 140.564 60.411 80.603 10.384 80.945 90.266 110.643 50.367 140.304 10.663 100.000 10.010 70.726 150.767 70.898 30.000 10.784 130.435 10.861 70.000 10.447 110.000 150.257 70.656 110.377 10
Zhisheng Zhong, Jiequan Cui, Yibo Yang, Xiaoyang Wu, Xiaojuan Qi, Xiangyu Zhang, Jiaya Jia: Understanding Imbalanced Semantic Segmentation Through Neural Collapse. CVPR 2023
OctFormer ScanNet200permissive0.326 130.539 100.265 100.131 120.499 60.110 40.522 30.000 30.000 70.000 10.318 110.427 70.455 150.743 110.765 130.175 110.842 40.000 30.828 50.204 40.033 60.429 110.335 60.601 20.312 30.000 90.357 100.000 10.000 30.047 110.423 90.000 110.000 10.105 90.000 30.873 90.079 90.670 120.000 70.117 50.471 130.432 30.829 110.000 10.000 20.584 20.417 170.089 60.684 90.837 120.705 160.021 120.178 110.892 60.000 10.028 80.505 130.000 70.457 90.200 140.662 40.412 90.244 150.496 80.000 170.451 80.626 90.000 70.102 110.943 90.138 130.000 10.000 120.149 80.291 30.534 90.722 70.632 70.331 100.253 140.453 110.487 110.000 40.000 10.479 60.000 110.022 130.000 120.000 40.900 100.128 110.684 30.164 100.413 40.854 100.000 120.000 10.512 160.074 150.003 110.000 40.000 10.000 110.469 150.613 120.132 80.529 70.871 30.227 160.582 70.026 170.787 120.000 60.339 150.000 10.000 80.000 30.626 70.000 50.029 80.000 10.587 90.612 80.411 70.724 100.000 100.000 10.407 60.552 50.513 30.849 100.655 40.408 40.000 110.296 20.000 10.686 150.645 140.145 80.022 80.414 140.633 110.000 10.637 20.224 30.000 70.000 10.650 80.000 20.000 10.622 90.535 120.343 120.483 30.230 130.943 100.289 100.618 70.596 50.140 80.679 80.000 10.022 60.783 110.620 120.906 10.000 10.806 80.137 100.865 50.000 10.378 120.000 150.168 150.680 80.227 13
Peng-Shuai Wang: OctFormer: Octree-based Transformers for 3D Point Clouds. SIGGRAPH 2023
PPT-SpUNet-F.T.0.332 120.556 60.270 70.123 140.519 40.091 70.349 40.000 30.000 70.000 10.339 90.383 100.498 100.833 40.807 40.241 40.584 90.000 30.755 70.124 80.000 90.608 30.330 80.530 90.314 20.000 90.374 80.000 10.000 30.197 50.459 70.000 110.000 10.117 60.000 30.876 70.095 20.682 90.000 70.086 80.518 70.433 20.930 40.000 10.000 20.563 30.542 140.077 70.715 40.858 110.756 50.008 160.171 120.874 80.000 10.039 70.550 110.000 70.545 50.256 80.657 80.453 40.351 100.449 110.213 60.392 120.611 110.000 70.037 150.946 60.138 130.000 10.000 120.063 110.308 20.537 80.796 50.673 40.323 110.392 100.400 140.509 70.000 40.000 10.649 10.000 110.023 120.000 120.000 40.914 60.002 160.506 160.163 110.359 80.872 50.000 120.000 10.623 70.112 60.001 120.000 40.000 10.021 90.753 50.565 150.150 40.579 40.806 90.267 90.616 40.042 140.783 130.000 60.374 110.000 10.000 80.000 30.620 80.000 50.000 90.000 10.572 130.634 50.350 90.792 50.000 100.000 10.376 90.535 60.378 60.855 70.672 30.074 130.000 110.185 100.000 10.727 120.660 120.076 170.000 100.432 120.646 100.000 10.594 80.006 130.000 70.000 10.658 70.000 20.000 10.661 40.549 100.300 140.291 80.045 140.942 110.304 80.600 80.572 70.135 120.695 50.000 10.008 90.793 90.942 20.899 20.000 10.816 60.181 70.897 20.000 10.679 40.223 80.264 50.691 50.345 12
Xiaoyang Wu, Zhuotao Tian, Xin Wen, Bohao Peng, Xihui Liu, Kaicheng Yu, Hengshuang Zhao: Towards Large-scale 3D Representation Learning with Multi-dataset Point Prompt Training. CVPR 2024
L3DETR-ScanNet_2000.336 80.533 110.279 60.155 100.508 50.073 110.101 170.000 30.058 60.000 10.294 140.233 140.548 40.927 10.788 100.264 20.463 110.000 30.638 120.098 130.014 70.411 120.226 130.525 100.225 90.010 70.397 60.000 10.000 30.192 60.380 140.598 60.000 10.117 60.000 30.883 60.082 80.689 40.000 70.032 170.549 60.417 40.910 50.000 10.000 20.448 80.613 90.000 100.697 70.960 30.759 40.158 20.293 30.883 70.000 10.312 30.583 40.079 40.422 110.068 170.660 70.418 70.298 120.430 120.114 110.526 50.776 30.051 30.679 30.946 60.152 70.000 10.183 80.000 150.211 80.511 100.409 160.565 120.355 80.448 80.512 50.557 30.000 40.000 10.420 90.000 110.007 170.104 60.000 40.125 170.330 30.514 150.146 120.321 130.860 80.174 110.000 10.629 60.075 140.000 140.000 40.000 10.002 100.671 80.712 70.141 60.339 120.856 40.261 120.529 100.067 100.835 60.000 60.369 120.000 10.259 20.000 30.629 60.000 50.487 10.000 10.579 110.646 40.107 170.720 110.122 70.000 10.333 140.505 100.303 90.908 30.503 130.565 20.074 80.324 10.000 10.740 80.661 110.109 130.000 100.427 130.563 170.000 10.579 110.108 80.000 70.000 10.664 60.000 20.000 10.641 70.539 110.416 70.515 20.256 110.940 120.312 60.209 170.620 30.138 110.636 110.000 10.000 120.775 130.861 50.765 120.000 10.801 90.119 110.860 80.000 10.687 20.001 140.192 140.679 90.699 1
Yanmin Wu, Qiankun Gao, Renrui Zhang, Jian Zhang: Language-Assisted 3D Scene Understanding. arXiv23.12
GSTran0.334 100.533 120.250 120.179 80.487 80.041 160.139 130.003 10.273 50.000 10.273 160.189 160.465 120.704 140.794 80.198 80.831 50.000 30.712 80.055 160.063 50.518 60.306 90.459 160.217 100.028 40.282 140.000 10.000 30.044 120.405 110.558 80.000 10.080 120.000 30.873 90.020 160.684 70.000 70.075 120.496 90.363 90.651 150.000 10.000 20.425 90.648 50.000 100.669 110.914 60.741 100.009 150.200 90.864 90.000 10.000 110.560 90.000 70.357 140.233 120.633 110.408 110.411 40.320 160.242 50.440 90.598 140.047 40.205 80.940 100.139 110.000 10.372 40.138 90.191 100.495 110.618 130.624 90.321 120.595 40.496 70.499 80.000 40.000 10.340 120.014 60.032 70.136 40.000 40.903 80.279 50.601 90.198 50.345 110.849 110.260 60.000 10.573 140.072 170.060 50.000 40.000 10.089 70.838 40.775 40.125 110.381 110.752 130.274 50.517 130.032 150.841 50.000 60.354 140.000 10.047 60.000 30.439 130.787 30.252 40.000 10.512 160.507 160.158 160.717 120.000 100.000 10.337 120.483 130.570 10.853 80.614 70.121 110.070 90.229 70.000 10.732 110.773 20.193 30.000 100.606 70.791 50.000 10.593 90.000 140.010 50.000 10.574 160.000 20.000 10.507 120.554 90.361 110.136 130.608 30.948 70.304 80.593 100.533 80.011 160.634 120.000 10.060 30.821 40.613 130.797 100.000 10.799 110.036 130.782 140.000 10.609 70.423 30.133 170.647 120.213 15
PTv3 ScanNet2000.393 30.592 30.330 20.216 30.520 30.109 50.108 160.000 30.337 10.000 10.310 120.394 90.494 110.753 90.848 20.256 30.717 80.000 30.842 40.192 50.065 30.449 100.346 40.546 60.190 130.000 90.384 70.000 10.000 30.218 40.505 20.791 30.000 10.136 40.000 30.903 20.073 120.687 60.000 70.168 20.551 50.387 70.941 30.000 10.000 20.397 120.654 30.000 100.714 50.759 150.752 70.118 40.264 40.926 30.000 10.048 60.575 50.000 70.597 20.366 20.755 10.469 20.474 30.798 20.140 100.617 30.692 70.000 70.592 40.971 20.188 40.000 10.133 90.593 20.349 10.650 30.717 80.699 30.455 20.790 20.523 40.636 10.301 10.000 10.622 20.000 110.017 150.259 30.000 40.921 30.337 10.733 20.210 40.514 20.860 80.407 10.000 10.688 20.109 80.000 140.000 40.000 10.151 50.671 80.782 20.115 130.641 20.903 20.349 10.616 40.088 70.832 80.000 60.480 20.000 10.428 10.000 30.497 100.000 50.000 90.000 10.662 30.690 20.612 10.828 10.575 10.000 10.404 70.644 20.325 70.887 40.728 10.009 160.134 70.026 170.000 10.761 30.731 40.172 60.077 40.528 80.727 70.000 10.603 50.220 50.022 30.000 10.740 10.000 20.000 10.661 40.586 20.566 40.436 40.531 50.978 30.457 20.708 30.583 60.141 70.748 30.000 10.026 50.822 30.871 40.879 50.000 10.851 20.405 20.914 10.000 10.682 30.000 150.281 40.738 30.463 6
Xiaoyang Wu, Li Jiang, Peng-Shuai Wang, Zhijian Liu, Xihui Liu, Yu Qiao, Wanli Ouyang, Tong He, Hengshuang Zhao: Point Transformer V3: Simpler, Faster, Stronger. CVPR 2024 (Oral)
PonderV2 ScanNet2000.346 60.552 80.270 80.175 90.497 70.070 120.239 70.000 30.000 70.000 10.232 170.412 80.584 20.842 30.804 50.212 70.540 100.000 30.433 160.106 100.000 90.590 50.290 120.548 50.243 70.000 90.356 110.000 10.000 30.062 100.398 130.441 100.000 10.104 100.000 30.888 50.076 110.682 90.030 30.094 70.491 110.351 120.869 100.000 10.063 10.403 110.700 20.000 100.660 130.881 90.761 30.050 80.186 100.852 130.000 10.007 90.570 80.100 20.565 30.326 60.641 100.431 60.290 140.621 60.259 30.408 110.622 100.125 20.082 120.950 50.179 50.000 10.263 60.424 50.193 90.558 70.880 40.545 130.375 70.727 30.445 120.499 80.000 40.000 10.475 70.002 90.034 60.083 80.000 40.924 20.290 40.636 60.115 140.400 50.874 40.186 100.000 10.611 80.128 30.113 20.000 40.000 10.000 110.584 120.636 100.103 140.385 100.843 60.283 40.603 60.080 80.825 100.000 60.377 100.000 10.000 80.000 30.457 110.000 50.000 90.000 10.574 120.608 90.481 40.792 50.394 50.000 10.357 100.503 110.261 100.817 130.504 120.304 70.472 40.115 110.000 10.750 70.677 90.202 20.000 100.509 90.729 60.000 10.519 120.000 140.000 70.000 10.620 120.000 20.000 10.660 60.560 70.486 60.384 60.346 100.952 50.247 140.667 40.436 120.269 30.691 60.000 10.010 70.787 100.889 30.880 40.000 10.810 70.336 40.860 80.000 10.606 80.009 110.248 90.681 70.392 9
Haoyi Zhu, Honghui Yang, Xiaoyang Wu, Di Huang, Sha Zhang, Xianglong He, Tong He, Hengshuang Zhao, Chunhua Shen, Yu Qiao, Wanli Ouyang: PonderV2: Pave the Way for 3D Foundation Model with A Universal Pre-training Paradigm.
ODIN - Sem200permissive0.368 40.562 40.297 40.207 40.380 170.196 10.828 20.000 30.321 20.000 10.400 50.775 10.460 130.501 170.769 120.065 150.870 30.000 30.913 10.213 30.000 90.000 170.389 20.554 40.312 30.000 90.591 10.000 10.000 30.491 10.487 30.894 20.000 10.378 20.303 10.796 170.088 60.669 130.081 10.216 10.256 170.334 130.898 70.000 10.000 20.370 140.599 100.000 100.581 160.988 20.749 80.090 60.242 50.921 40.000 10.202 50.609 20.000 70.655 10.214 130.654 90.346 150.408 70.485 90.169 80.631 20.704 60.000 70.814 10.940 100.127 160.000 10.000 120.462 40.227 60.641 40.885 30.657 50.434 30.000 170.550 20.393 150.000 40.000 10.590 40.000 110.048 20.077 90.000 40.784 160.131 100.557 100.316 20.359 80.833 140.373 20.000 10.661 40.108 90.001 120.000 40.000 10.301 30.612 110.565 150.129 100.482 80.468 160.274 50.561 80.376 10.912 20.181 10.440 60.000 10.166 40.000 30.641 50.000 50.426 20.000 10.642 50.626 70.259 110.787 80.429 40.000 10.589 10.523 80.246 110.857 60.000 170.228 90.000 110.265 40.000 10.752 60.832 10.090 160.157 10.791 10.578 160.000 10.373 150.539 10.000 70.000 10.685 50.000 20.000 10.632 80.575 30.663 10.152 110.358 90.926 130.397 30.454 150.610 40.119 150.685 70.000 10.000 120.803 80.740 90.441 140.000 10.800 100.000 170.871 30.000 10.220 170.487 10.862 10.682 60.054 17
Ayush Jain, Pushkal Katara, Nikolaos Gkanatsios, Adam W. Harley, Gabriel Sarch, Kriti Aggarwal, Vishrav Chaudhary, Katerina Fragkiadaki: ODIN: A Single Model for 2D and 3D Segmentation. CVPR 2024
ALS-MinkowskiNetcopyleft0.414 20.610 20.322 30.271 20.542 20.153 30.159 110.000 30.000 70.000 10.404 40.503 50.532 60.672 160.804 50.285 10.888 20.000 30.900 20.226 20.087 20.598 40.342 50.671 10.217 100.087 30.449 40.000 10.000 30.253 30.477 61.000 10.000 10.118 50.000 30.905 10.071 130.710 20.076 20.047 160.665 10.376 80.981 10.000 10.000 20.466 70.632 70.113 40.769 10.956 40.795 20.031 90.314 10.936 10.000 10.390 20.601 30.000 70.458 80.366 20.719 30.440 50.564 10.699 40.314 10.464 70.784 20.200 10.283 60.973 10.142 90.000 10.250 70.285 60.220 70.718 10.752 60.723 20.460 10.248 150.475 100.463 130.000 40.000 10.446 80.021 50.025 110.285 10.000 40.972 10.149 80.769 10.230 30.535 10.879 20.252 80.000 10.693 10.129 20.000 140.000 40.000 10.447 10.958 10.662 90.159 20.598 30.780 110.344 20.646 30.106 60.893 30.135 30.455 30.000 10.194 30.259 10.726 30.475 40.000 90.000 10.741 10.865 10.571 20.817 30.445 30.000 10.506 20.630 30.230 120.916 20.728 10.635 11.000 10.252 60.000 10.804 20.697 70.137 110.043 70.717 20.807 30.000 10.510 130.245 20.000 70.000 10.709 30.000 20.000 10.703 20.572 40.646 20.223 100.531 50.984 10.397 30.813 10.798 10.135 120.800 10.000 10.097 20.832 20.752 80.842 70.000 10.852 10.149 90.846 100.000 10.666 50.359 50.252 80.777 10.690 2
Guangda Ji, Silvan Weder, Francis Engelmann, Marc Pollefeys, Hermann Blum: ARKit LabelMaker: A New Scale for Indoor 3D Scene Understanding. CVPR 2025
BFANet ScanNet200permissive0.360 50.553 70.293 50.193 50.483 100.096 60.266 60.000 30.000 70.000 10.298 130.255 120.661 10.810 50.810 30.194 100.785 70.000 30.000 170.161 60.000 90.494 90.382 30.574 30.258 50.000 90.372 90.000 10.000 30.043 140.436 80.000 110.000 10.239 30.000 30.901 30.105 10.689 40.025 40.128 40.614 20.436 10.493 170.000 10.000 20.526 40.546 130.109 50.651 140.953 50.753 60.101 50.143 130.897 50.000 10.431 10.469 150.000 70.522 60.337 50.661 60.459 30.409 60.666 50.102 140.508 60.757 40.000 70.060 140.970 30.497 10.000 10.376 30.511 30.262 40.688 20.921 20.617 100.321 120.590 60.491 90.556 40.000 40.000 10.481 50.093 10.043 30.284 20.000 40.875 140.135 90.669 40.124 130.394 60.849 110.298 40.000 10.476 170.088 130.042 70.000 40.000 10.254 40.653 100.741 60.215 10.573 50.852 50.266 100.654 20.056 120.835 60.000 60.492 10.000 10.000 80.000 30.612 90.000 50.000 90.000 10.616 60.469 170.460 50.698 140.516 20.000 10.378 80.563 40.476 40.863 50.574 90.330 60.000 110.282 30.000 10.760 40.710 50.233 10.000 100.641 50.814 20.000 10.585 100.053 110.000 70.000 10.629 100.000 20.000 10.678 30.528 130.534 50.129 140.596 40.973 40.264 120.772 20.526 100.139 90.707 40.000 10.000 120.764 140.591 160.848 60.000 10.827 40.338 30.806 120.000 10.568 90.151 100.358 20.659 100.510 4
Weiguang Zhao, Rui Zhang, Qiufeng Wang, Guangliang Cheng, Kaizhu Huang: BFANet: Revisiting 3D Semantic Segmentation with Boundary Feature Analysis. CVPR 2025
DITR0.449 10.629 10.392 10.289 10.650 10.168 20.862 10.000 30.313 30.000 10.580 10.568 20.564 30.766 70.867 10.238 50.949 10.000 30.866 30.300 10.000 90.664 10.482 10.508 120.317 10.420 10.551 20.000 10.000 30.486 20.519 10.662 40.000 10.385 10.000 30.901 30.079 90.727 10.000 70.160 30.606 30.417 40.967 20.000 10.000 20.498 50.596 110.130 20.728 30.998 10.805 10.000 170.314 10.934 20.000 10.278 40.636 10.000 70.403 120.367 10.741 20.484 10.500 21.000 10.113 120.828 10.815 10.000 70.733 20.969 40.374 20.000 10.579 11.000 10.230 50.617 50.983 10.729 10.423 40.855 10.508 60.622 20.018 30.000 10.591 30.034 40.028 100.066 110.869 10.904 70.334 20.651 50.716 10.514 20.871 60.315 30.000 10.664 30.128 30.014 100.000 40.000 10.392 20.851 20.817 10.153 30.823 10.991 10.318 30.680 10.134 30.913 10.157 20.448 40.000 10.000 80.000 30.826 10.978 10.091 60.000 10.660 40.647 30.571 20.804 40.001 90.000 10.480 30.700 10.421 50.947 10.433 140.411 30.148 60.262 50.000 10.849 10.709 60.138 100.150 20.714 30.889 10.000 10.698 10.222 40.000 70.000 10.720 20.000 20.000 10.805 10.600 10.642 30.268 90.904 10.982 20.477 10.632 60.718 20.139 90.776 20.000 10.178 10.886 10.962 10.839 80.000 10.851 20.043 120.869 40.000 10.710 10.315 60.348 30.753 20.397 8
Karim Abou Zeid, Kadir Yilmaz, Daan de Geus, Alexander Hermans, David Adrian, Timm Linder, Bastian Leibe: DINO in the Room: Leveraging 2D Foundation Models for 3D Segmentation.
Minkowski 34Dpermissive0.253 160.463 160.154 170.102 160.381 160.084 80.134 150.000 30.000 70.000 10.386 70.141 170.279 170.737 120.703 160.014 170.164 150.000 30.663 100.092 140.000 90.224 150.291 110.531 80.056 170.000 90.242 160.000 10.000 30.013 150.331 160.000 110.000 10.035 170.001 20.858 140.059 140.650 160.000 70.056 140.353 150.299 150.670 130.000 10.000 20.284 160.484 150.071 80.594 150.720 160.710 150.027 110.068 170.813 140.000 10.005 100.492 140.164 10.274 160.111 160.571 160.307 170.293 130.307 170.150 90.163 170.531 160.002 60.545 50.932 150.093 170.000 10.000 120.002 140.159 150.368 170.581 150.440 170.228 170.406 90.282 170.294 160.000 40.000 10.189 160.060 20.036 50.000 120.000 40.897 110.000 170.525 140.025 170.205 170.771 170.000 120.000 10.593 110.108 90.044 60.000 40.000 10.000 110.282 170.589 140.094 160.169 160.466 170.227 160.419 170.125 50.757 140.002 40.334 160.000 10.000 80.000 30.357 150.000 50.000 90.000 10.582 100.513 140.337 100.612 170.000 100.000 10.250 160.352 170.136 170.724 160.655 40.280 80.000 110.046 160.000 10.606 170.559 150.159 70.102 30.445 100.655 90.000 10.310 170.117 60.000 70.000 10.581 150.026 10.000 10.265 170.483 160.084 170.097 170.044 150.865 170.142 170.588 110.351 150.272 20.596 170.000 10.003 100.622 160.720 100.096 170.000 10.771 160.016 150.772 150.000 10.302 140.194 90.214 120.621 160.197 16
C. Choy, J. Gwak, S. Savarese: 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks. CVPR 2019
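
The IoU columns in the table above report intersection-over-union per class, averaged over all classes and over the head, common, and tail groups. The following is a minimal sketch of that kind of computation, not the benchmark's official evaluation script. The toy labels, the group assignment in `groups`, and the choice to score a class with no ground-truth or predicted points as 0 are all assumptions made for illustration.

```python
from collections import Counter


def per_class_iou(gt, pred, classes):
    """Return {class: IoU}, where IoU = TP / (TP + FP + FN).

    gt and pred are parallel sequences of per-point labels.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gt, pred):
        if g == p:
            tp[g] += 1
        else:
            fn[g] += 1  # ground-truth class g was missed at this point
            fp[p] += 1  # class p was predicted where it does not belong
    ious = {}
    for c in classes:
        denom = tp[c] + fp[c] + fn[c]
        # Assumption: a class that never appears scores 0.0.
        ious[c] = tp[c] / denom if denom else 0.0
    return ious


def mean_iou(ious, group):
    """Average the per-class IoU over one group of class names."""
    return sum(ious[c] for c in group) / len(group)


# Toy per-point labels (hypothetical data).
gt = ["wall", "wall", "chair", "chair", "clock", "clock"]
pred = ["wall", "wall", "chair", "wall", "chair", "clock"]

# Hypothetical grouping, only to show how group averages are formed.
groups = {"head": ["wall", "chair"], "tail": ["clock"]}
classes = [c for names in groups.values() for c in names]

ious = per_class_iou(gt, pred, classes)
avg_iou = mean_iou(ious, classes)
group_iou = {name: mean_iou(ious, names) for name, names in groups.items()}
print(ious, avg_iou, group_iou)
```

Averaging per class rather than per point means that a rare class such as `clock` counts as much as a frequent one such as `wall`, which is why class imbalance makes the tail group hard to score well on.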


This table lists the benchmark results for the ScanNet200 3D semantic instance scenario.




Method | Info | avg AP 25% | head AP 25% | common AP 25% | tail AP 25% | alarm clock | armchair | backpack | bag | ball | bar | basket | bathroom cabinet | bathroom counter | bathroom stall | bathroom stall door | bathroom vanity | bathtub | bed | bench | bicycle | bin | blackboard | blanket | blinds | board | book | bookshelf | bottle | bowl | box | broom | bucket | bulletin board | cabinet | calendar | candle | cart | case of water bottles | cd case | ceiling | ceiling light | chair | clock | closet | closet door | closet rod | closet wall | clothes | clothes dryer | coat rack | coffee kettle | coffee maker | coffee table | column | computer tower | container | copier | couch | counter | crate | cup | curtain | cushion | decoration | desk | dining table | dish rack | dishwasher | divider | door | doorframe | dresser | dumbbell | dustpan | end table | fan | file cabinet | fire alarm | fire extinguisher | fireplace | folded chair | furniture | guitar | guitar case | hair dryer | handicap bar | hat | headphones | ironing board | jacket | keyboard | keyboard piano | kitchen cabinet | kitchen counter | ladder | lamp | laptop | laundry basket | laundry detergent | laundry hamper | ledge | light | light switch | luggage | machine | mailbox | mat | mattress | microwave | mini fridge | mirror | monitor | mouse | music stand | nightstand | object | office chair | ottoman | oven | paper | paper bag | paper cutter | paper towel dispenser | paper towel roll | person | piano | picture | pillar | pillow | pipe | plant | plate | plunger | poster | potted plant | power outlet | power strip | printer | projector | projector screen | purse | rack | radiator | rail | range hood | recycling bin | refrigerator | scale | seat | shelf | shoe | shower | shower curtain | shower curtain rod | shower door | shower floor | shower head | shower wall | sign | sink | soap dish | soap dispenser | sofa chair | speaker | stair rail | stairs | stand | stool | storage bin | storage container | storage organizer | stove | structure | stuffed animal | suitcase | table | telephone | tissue box | toaster | toaster oven | toilet | toilet paper | toilet paper dispenser | toilet paper holder | toilet seat cover dispenser | towel | trash bin | trash can | tray | tube | tv | tv stand | vacuum cleaner | vent | wardrobe | washing machine | water bottle | water cooler | water pitcher | whiteboard | window | windowsill
ODIN - Ins200permissive0.451 10.637 20.407 10.277 10.583 50.116 10.500 10.000 10.125 10.000 10.599 20.823 20.407 40.667 60.941 30.542 31.000 10.000 31.000 10.162 30.000 20.028 50.357 20.695 30.550 10.000 10.475 10.000 10.000 20.714 10.626 11.000 10.000 10.500 10.125 10.749 20.080 20.742 60.528 10.078 30.500 20.334 10.667 10.333 10.000 10.278 60.723 50.250 40.859 41.000 10.826 60.108 30.221 10.763 10.000 30.250 10.742 30.500 30.750 10.400 30.855 10.769 10.701 10.469 40.203 10.406 20.870 20.000 20.963 10.200 30.000 10.000 30.500 10.370 10.886 11.000 10.782 20.504 30.429 40.494 10.337 30.000 10.000 10.600 10.000 40.215 30.226 20.000 10.944 20.200 30.887 10.750 10.874 10.877 30.438 10.000 10.867 30.089 30.003 30.500 10.000 20.333 11.000 10.742 20.125 10.671 10.417 40.616 50.637 10.238 10.873 10.528 10.494 50.000 10.250 30.000 20.688 10.000 11.000 10.000 10.872 10.833 20.275 10.779 51.000 10.000 30.441 10.577 10.167 21.000 10.500 50.777 30.000 20.778 20.000 30.910 20.800 20.232 40.019 30.717 10.833 50.000 30.638 10.284 10.000 30.000 20.778 10.000 10.000 10.597 10.699 30.850 10.333 30.250 30.944 50.571 10.677 30.795 10.264 40.852 20.000 10.000 20.824 11.000 10.668 30.000 10.000 40.667 30.000 10.333 50.333 20.760 10.679 30.404 2
TD3D Scannet200permissive0.379 30.603 30.306 30.190 30.635 20.073 30.500 10.000 10.000 20.000 10.495 40.735 30.275 61.000 10.979 20.590 20.000 50.021 20.000 40.146 40.000 20.356 20.173 60.795 10.226 30.000 10.173 30.000 10.000 20.226 30.390 30.000 30.000 10.250 20.000 20.706 30.061 40.885 10.093 30.186 20.259 50.200 20.667 10.000 30.000 10.667 20.825 10.250 40.834 51.000 10.958 10.553 10.111 40.748 20.220 20.051 30.866 20.792 10.390 60.045 60.800 30.302 60.517 20.533 30.113 30.427 10.843 30.000 20.458 20.600 10.000 10.101 20.000 20.259 20.717 30.500 30.615 30.520 20.526 20.457 20.270 50.000 10.000 10.400 30.088 20.294 20.181 30.000 11.000 10.400 10.710 60.103 40.477 60.905 20.061 30.000 10.906 20.102 20.232 10.125 30.000 20.003 30.792 41.000 10.000 30.102 40.125 50.559 60.523 40.075 30.715 20.000 30.424 60.000 10.396 20.250 10.638 20.000 10.000 30.000 10.622 60.833 20.221 20.970 10.250 30.038 10.260 30.415 20.125 31.000 11.000 10.857 20.000 20.908 10.012 10.869 40.836 10.635 10.111 10.625 21.000 10.020 20.510 20.003 40.009 21.000 10.778 10.000 10.000 10.370 40.755 10.288 30.333 30.274 21.000 10.557 20.731 20.456 30.433 30.769 60.000 10.000 20.621 51.000 10.458 50.000 10.196 20.817 10.000 10.472 10.222 40.205 60.689 20.274 4
Maksim Kolodiazhnyi, Anna Vorontsova, Anton Konushin, Danila Rukhovich: Top-Down Beats Bottom-Up in 3D Instance Segmentation. WACV 2024
Mask3D Scannet2000.445 20.653 10.392 20.254 20.648 10.097 20.125 60.000 10.000 20.000 10.657 10.971 10.451 21.000 11.000 10.640 10.500 20.045 11.000 10.241 20.409 10.363 10.440 10.686 40.300 20.000 10.201 20.000 10.009 10.290 20.556 21.000 10.000 10.063 40.000 20.830 10.573 10.844 20.333 20.204 10.058 60.158 60.552 30.056 20.000 11.000 10.725 40.750 10.927 11.000 10.888 40.042 40.120 30.615 50.226 10.250 10.890 10.792 10.677 30.510 20.818 20.699 20.512 30.167 60.125 20.315 30.943 10.309 10.017 40.200 30.000 10.188 10.000 20.183 40.815 21.000 10.827 10.741 10.442 30.414 50.600 10.000 10.000 10.458 20.049 30.321 10.381 10.000 10.908 30.400 10.841 20.260 20.710 20.966 10.265 20.000 10.924 10.152 10.025 20.500 10.027 10.028 21.000 10.556 60.016 20.080 60.500 10.694 30.608 20.084 20.604 40.194 20.538 30.000 10.500 10.000 20.354 50.000 11.000 10.000 10.761 30.930 10.053 50.890 31.000 10.008 20.262 20.358 31.000 11.000 10.792 40.966 11.000 10.765 30.004 20.930 10.780 30.330 20.027 20.625 20.974 40.050 10.412 60.021 30.000 30.000 20.778 10.000 10.000 10.493 30.746 20.454 20.335 20.396 10.930 60.551 31.000 10.552 20.606 10.853 10.000 10.004 10.806 21.000 10.727 20.000 10.042 30.745 20.000 10.399 40.391 10.630 20.721 10.619 1
Jonas Schult, Francis Engelmann, Alexander Hermans, Or Litany, Siyu Tang, Bastian Leibe: Mask3D for 3D Semantic Instance Segmentation. ICRA 2023
Minkowski 34D Inst.permissive0.280 50.488 50.192 60.124 50.593 40.010 50.500 10.000 10.000 20.000 10.447 50.535 50.445 31.000 10.861 50.400 40.225 30.000 30.000 40.142 50.000 20.074 40.342 40.467 60.067 40.000 10.119 60.000 10.000 20.000 50.337 60.000 30.000 10.000 50.000 20.506 60.070 30.804 40.000 40.000 50.333 40.172 40.150 60.000 30.000 10.479 50.745 30.000 60.830 61.000 10.904 30.167 20.090 50.732 30.000 30.000 40.443 50.000 40.500 40.542 10.772 60.396 50.077 60.385 50.044 50.118 60.777 50.000 20.000 50.200 30.000 10.000 30.000 20.148 50.502 50.500 30.419 50.159 60.281 50.404 60.317 40.000 10.000 10.200 40.000 40.077 40.000 40.000 10.750 40.200 30.715 50.021 50.551 30.828 60.000 40.000 10.743 50.059 60.000 40.000 40.000 20.000 40.125 60.648 40.000 30.191 30.500 10.669 40.502 50.000 60.568 50.000 30.516 40.000 10.000 40.000 20.305 60.000 10.000 30.000 10.825 20.833 20.021 60.918 20.000 40.000 30.191 50.346 50.100 50.981 41.000 10.286 50.000 20.000 60.000 30.868 50.648 60.292 30.000 40.375 41.000 10.000 30.500 30.000 50.333 10.000 20.538 60.000 10.000 10.213 60.518 50.098 50.528 10.250 30.997 30.284 60.677 30.398 40.167 50.790 50.000 10.000 20.618 60.903 60.200 60.000 10.333 10.333 50.000 10.442 30.083 50.213 50.587 50.131 6
C. Choy, J. Gwak, S. Savarese: 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks. CVPR 2019
CSC-Pretrain Inst.permissive0.275 60.466 60.218 50.110 60.625 30.007 60.500 10.000 10.000 20.000 10.000 60.222 60.377 51.000 10.661 60.400 40.000 50.000 30.000 40.119 60.000 20.000 60.277 50.685 50.067 40.000 10.132 40.000 10.000 20.000 50.367 50.000 30.000 10.000 50.000 20.591 40.055 50.783 50.000 40.014 40.500 20.161 50.278 40.000 30.000 10.667 20.768 20.500 20.866 21.000 10.829 50.000 50.019 60.555 60.000 30.000 40.305 60.000 40.750 10.200 50.783 50.429 40.395 40.677 20.020 60.286 40.584 60.000 20.000 50.115 60.000 10.000 30.000 20.145 60.423 60.500 30.364 60.369 50.571 10.448 40.206 60.000 10.000 10.200 40.106 10.065 60.000 40.000 10.750 40.200 30.774 30.000 60.501 40.841 50.000 40.000 10.692 60.063 50.000 40.000 40.000 20.000 40.500 50.649 30.000 30.084 50.125 50.719 10.413 60.004 50.450 60.000 30.638 10.000 10.000 40.000 20.505 40.000 10.000 30.000 10.727 40.833 20.221 30.779 50.000 40.000 30.168 60.311 60.125 30.571 50.500 50.143 60.000 20.250 50.000 30.869 30.667 50.162 60.000 40.250 51.000 10.000 30.500 30.000 50.000 30.000 20.689 50.000 10.000 10.312 50.383 60.114 40.333 30.000 50.997 30.420 40.613 50.212 60.500 20.819 30.000 10.000 20.768 31.000 10.918 10.000 10.000 40.278 60.000 10.333 50.000 60.353 30.546 60.258 5
Ji Hou, Benjamin Graham, Matthias Nießner, Saining Xie: Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts. CVPR 2021
LGround Inst.permissive0.314 40.529 40.225 40.155 40.578 60.010 40.500 10.000 10.000 20.000 10.515 30.556 40.696 11.000 10.927 40.400 40.083 40.000 31.000 10.252 10.000 20.167 30.350 30.731 20.067 40.000 10.123 50.000 10.000 20.036 40.372 40.000 30.000 10.250 20.000 20.569 50.031 60.810 30.000 40.000 50.630 10.183 30.278 40.000 30.000 10.582 40.589 60.500 20.863 31.000 10.940 20.000 50.144 20.716 40.000 30.000 40.484 40.000 40.500 40.400 30.798 40.500 30.278 50.750 10.093 40.166 50.783 40.000 20.200 30.400 20.000 10.000 30.000 20.219 30.539 40.500 30.578 40.413 40.181 60.457 30.375 20.000 10.000 10.050 60.000 40.077 50.000 40.000 10.500 60.000 60.743 40.250 30.488 50.846 40.000 40.000 10.800 40.069 40.000 40.000 40.000 20.000 41.000 10.607 50.000 30.200 20.500 10.694 20.528 30.063 40.659 30.000 30.594 20.000 10.000 40.000 20.571 30.000 10.000 30.000 10.716 50.647 60.221 30.857 40.000 40.000 30.217 40.346 40.071 60.530 61.000 10.429 40.000 20.286 40.000 30.826 60.706 40.208 50.000 40.250 50.744 60.000 30.500 30.042 20.000 30.000 20.746 40.000 10.000 10.517 20.625 40.085 60.333 30.000 51.000 10.378 50.533 60.376 50.042 60.814 40.000 10.000 20.765 41.000 10.600 40.000 10.000 40.667 30.000 10.472 10.333 20.337 40.605 40.305 3
David Rozenberszki, Or Litany, Angela Dai: Language-Grounded Indoor 3D Semantic Segmentation in the Wild.


ScanNet Benchmark

This table lists the benchmark results for the 3D semantic label scenario.
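As a rough illustration of how scores like the "avg iou" column are typically obtained, the sketch below computes per-class intersection-over-union as TP / (TP + FP + FN) and averages it over classes. It assumes integer class labels per point and that unlabeled points have already been filtered out. The function names `per_class_iou` and `mean_iou` are hypothetical and are not taken from the benchmark's evaluation scripts.

```python
def per_class_iou(pred, gt, num_classes):
    """Per-class IoU = TP / (TP + FP + FN); None for classes that never occur."""
    tp = [0] * num_classes
    fp = [0] * num_classes
    fn = [0] * num_classes
    for p, g in zip(pred, gt):
        if p == g:
            tp[g] += 1      # correctly labeled point of class g
        else:
            fp[p] += 1      # point wrongly assigned to class p
            fn[g] += 1      # point of class g that was missed
    ious = []
    for c in range(num_classes):
        denom = tp[c] + fp[c] + fn[c]
        ious.append(tp[c] / denom if denom else None)
    return ious


def mean_iou(ious):
    """Average IoU over classes that have a defined score (assumed convention)."""
    valid = [v for v in ious if v is not None]
    return sum(valid) / len(valid) if valid else float("nan")


# Toy example: six points, three classes (hypothetical data).
gt = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
scores = per_class_iou(pred, gt, num_classes=3)
print(scores, mean_iou(scores))
```

Skipping classes with no points is one possible convention. An official evaluation may handle absent classes differently.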


Method, Info, avg iou, bathtub, bed, bookshelf, cabinet, chair, counter, curtain, desk, door, floor, otherfurniture, picture, refrigerator, shower curtain, sink, sofa, table, toilet, wall, window
PTv3-PPT-ALCcopyleft0.798 10.911 110.812 230.854 80.770 120.856 150.555 170.943 10.660 260.735 20.979 10.606 70.492 10.792 40.934 40.841 20.819 60.716 90.947 100.906 10.822 1
Guangda Ji, Silvan Weder, Francis Engelmann, Marc Pollefeys, Hermann Blum: ARKit LabelMaker: A New Scale for Indoor 3D Scene Understanding. CVPR 2025
DITR ScanNet0.797 20.727 770.869 10.882 10.785 60.868 70.578 50.943 10.744 10.727 30.979 10.627 20.364 90.824 10.949 20.779 150.844 10.757 10.982 10.905 20.802 3
Karim Abou Zeid, Kadir Yilmaz, Daan de Geus, Alexander Hermans, David Adrian, Timm Linder, Bastian Leibe: DINO in the Room: Leveraging 2D Foundation Models for 3D Segmentation.
PTv3 ScanNet0.794 30.941 30.813 220.851 110.782 70.890 20.597 10.916 60.696 110.713 50.979 10.635 10.384 30.793 30.907 100.821 50.790 370.696 140.967 40.903 30.805 2
Xiaoyang Wu, Li Jiang, Peng-Shuai Wang, Zhijian Liu, Xihui Liu, Yu Qiao, Wanli Ouyang, Tong He, Hengshuang Zhao: Point Transformer V3: Simpler, Faster, Stronger. CVPR 2024 (Oral)
PonderV20.785 40.978 10.800 310.833 300.788 40.853 200.545 210.910 90.713 30.705 60.979 10.596 90.390 20.769 150.832 450.821 50.792 360.730 20.975 20.897 60.785 7
Haoyi Zhu, Honghui Yang, Xiaoyang Wu, Di Huang, Sha Zhang, Xianglong He, Tong He, Hengshuang Zhao, Chunhua Shen, Yu Qiao, Wanli Ouyang: PonderV2: Pave the Way for 3D Foundation Model with A Universal Pre-training Paradigm.
Mix3Dpermissive0.781 50.964 20.855 20.843 200.781 80.858 130.575 80.831 400.685 170.714 40.979 10.594 100.310 310.801 20.892 190.841 20.819 60.723 60.940 150.887 80.725 29
Alexey Nekrasov, Jonas Schult, Or Litany, Bastian Leibe, Francis Engelmann: Mix3D: Out-of-Context Data Augmentation for 3D Scenes. 3DV 2021 (Oral)
Swin3Dpermissive0.779 60.861 240.818 170.836 270.790 30.875 40.576 70.905 100.704 70.739 10.969 120.611 30.349 120.756 250.958 10.702 520.805 200.708 100.916 390.898 50.801 4
TTT-KD0.773 70.646 980.818 170.809 420.774 100.878 30.581 30.943 10.687 150.704 70.978 60.607 60.336 200.775 110.912 80.838 40.823 40.694 150.967 40.899 40.794 6
Lisa Weijler, Muhammad Jehanzeb Mirza, Leon Sick, Can Ekkazan, Pedro Hermosilla: TTT-KD: Test-Time Training for 3D Semantic Segmentation through Knowledge Distillation from Foundation Models.
ResLFE_HDS0.772 80.939 40.824 70.854 80.771 110.840 350.564 130.900 120.686 160.677 140.961 180.537 360.348 130.769 150.903 120.785 130.815 90.676 260.939 160.880 130.772 11
PPT-SpUNet-Joint0.766 90.932 50.794 370.829 320.751 260.854 180.540 250.903 110.630 390.672 180.963 160.565 260.357 100.788 50.900 140.737 310.802 210.685 200.950 80.887 80.780 8
Xiaoyang Wu, Zhuotao Tian, Xin Wen, Bohao Peng, Xihui Liu, Kaicheng Yu, Hengshuang Zhao: Towards Large-scale 3D Representation Learning with Multi-dataset Point Prompt Training. CVPR 2024
OctFormerpermissive0.766 90.925 70.808 270.849 130.786 50.846 300.566 120.876 190.690 130.674 170.960 190.576 220.226 740.753 270.904 110.777 160.815 90.722 70.923 310.877 170.776 10
Peng-Shuai Wang: OctFormer: Octree-based Transformers for 3D Point Clouds. SIGGRAPH 2023
CU-Hybrid Net0.764 110.924 80.819 140.840 230.757 210.853 200.580 40.848 320.709 50.643 280.958 240.587 160.295 390.753 270.884 230.758 230.815 90.725 50.927 270.867 280.743 20
OccuSeg+Semantic0.764 110.758 620.796 350.839 240.746 300.907 10.562 140.850 310.680 190.672 180.978 60.610 40.335 220.777 90.819 490.847 10.830 30.691 170.972 30.885 100.727 27
O-CNNpermissive0.762 130.924 80.823 80.844 190.770 120.852 220.577 60.847 340.711 40.640 320.958 240.592 110.217 800.762 200.888 200.758 230.813 130.726 40.932 250.868 270.744 19
Peng-Shuai Wang, Yang Liu, Yu-Xiao Guo, Chun-Yu Sun, Xin Tong: O-CNN: Octree-based Convolutional Neural Networks for 3D Shape Analysis. SIGGRAPH 2017
DiffSegNet0.758 140.725 790.789 420.843 200.762 170.856 150.562 140.920 40.657 290.658 220.958 240.589 140.337 190.782 60.879 240.787 110.779 420.678 220.926 290.880 130.799 5
DTC0.757 150.843 300.820 120.847 160.791 20.862 110.511 390.870 230.707 60.652 240.954 410.604 80.279 500.760 210.942 30.734 320.766 510.701 130.884 620.874 230.736 21
OA-CNN-L_ScanNet200.756 160.783 480.826 60.858 60.776 90.837 400.548 200.896 150.649 310.675 160.962 170.586 170.335 220.771 140.802 540.770 190.787 390.691 170.936 200.880 130.761 14
PNE0.755 170.786 460.835 50.834 290.758 190.849 250.570 100.836 390.648 320.668 200.978 60.581 200.367 70.683 400.856 330.804 80.801 250.678 220.961 60.889 70.716 36
P. Hermosilla: Point Neighborhood Embeddings.
LSK3DNetpermissive0.755 170.899 170.823 80.843 200.764 160.838 380.584 20.845 350.717 20.638 340.956 310.580 210.229 730.640 500.900 140.750 260.813 130.729 30.920 350.872 250.757 15
Tuo Feng, Wenguan Wang, Fan Ma, Yi Yang: LSK3DNet: Towards Effective and Efficient 3D Perception with Large Sparse Kernels. CVPR 2024
ConDaFormer0.755 170.927 60.822 100.836 270.801 10.849 250.516 360.864 280.651 300.680 130.958 240.584 190.282 470.759 230.855 350.728 340.802 210.678 220.880 670.873 240.756 17
Lunhao Duan, Shanshan Zhao, Nan Xue, Mingming Gong, Guisong Xia, Dacheng Tao: ConDaFormer: Disassembled Transformer with Local Structure Enhancement for 3D Point Cloud Understanding. NeurIPS 2023
DMF-Net0.752 200.906 150.793 390.802 480.689 470.825 530.556 160.867 240.681 180.602 510.960 190.555 320.365 80.779 80.859 300.747 270.795 330.717 80.917 380.856 360.764 13
C.Yang, Y.Yan, W.Zhao, J.Ye, X.Yang, A.Hussain, B.Dong, K.Huang: Towards Deeper and Better Multi-view Feature Fusion for 3D Semantic Segmentation. ICONIP 2023
PointTransformerV20.752 200.742 690.809 260.872 20.758 190.860 120.552 180.891 170.610 460.687 80.960 190.559 300.304 340.766 180.926 60.767 200.797 290.644 390.942 130.876 200.722 32
Xiaoyang Wu, Yixing Lao, Li Jiang, Xihui Liu, Hengshuang Zhao: Point Transformer V2: Grouped Vector Attention and Partition-based Pooling. NeurIPS 2022
PointConvFormer0.749 220.793 440.790 400.807 440.750 280.856 150.524 320.881 180.588 590.642 310.977 100.591 120.274 530.781 70.929 50.804 80.796 300.642 400.947 100.885 100.715 37
Wenxuan Wu, Qi Shan, Li Fuxin: PointConvFormer: Revenge of the Point-based Convolution.
BPNetcopyleft0.749 220.909 130.818 170.811 400.752 240.839 370.485 540.842 360.673 210.644 270.957 290.528 430.305 330.773 120.859 300.788 100.818 80.693 160.916 390.856 360.723 31
Wenbo Hu, Hengshuang Zhao, Li Jiang, Jiaya Jia, Tien-Tsin Wong: Bidirectional Projection Network for Cross Dimension Scene Understanding. CVPR 2021 (Oral)
MSP0.748 240.623 1010.804 290.859 50.745 310.824 550.501 430.912 80.690 130.685 100.956 310.567 250.320 280.768 170.918 70.720 390.802 210.676 260.921 330.881 120.779 9
StratifiedFormerpermissive0.747 250.901 160.803 300.845 180.757 210.846 300.512 380.825 430.696 110.645 260.956 310.576 220.262 640.744 330.861 290.742 290.770 490.705 110.899 510.860 330.734 22
Xin Lai*, Jianhui Liu*, Li Jiang, Liwei Wang, Hengshuang Zhao, Shu Liu, Xiaojuan Qi, Jiaya Jia: Stratified Transformer for 3D Point Cloud Segmentation. CVPR 2022
Virtual MVFusion0.746 260.771 560.819 140.848 150.702 430.865 100.397 920.899 130.699 90.664 210.948 630.588 150.330 240.746 320.851 390.764 210.796 300.704 120.935 210.866 290.728 25
Abhijit Kundu, Xiaoqi Yin, Alireza Fathi, David Ross, Brian Brewington, Thomas Funkhouser, Caroline Pantofaru: Virtual Multi-view Fusion for 3D Semantic Segmentation. ECCV 2020
VMNetpermissive0.746 260.870 220.838 30.858 60.729 360.850 240.501 430.874 200.587 600.658 220.956 310.564 270.299 360.765 190.900 140.716 420.812 150.631 450.939 160.858 340.709 38
Zeyu Hu, Xuyang Bai, Jiaxiang Shang, Runze Zhang, Jiayu Dong, Xin Wang, Guangyuan Sun, Hongbo Fu, Chiew-Lan Tai: VMNet: Voxel-Mesh Network for Geodesic-Aware 3D Semantic Segmentation. ICCV 2021 (Oral)
DiffSeg3D20.745 280.725 790.814 210.837 250.751 260.831 470.514 370.896 150.674 200.684 110.960 190.564 270.303 350.773 120.820 480.713 450.798 280.690 190.923 310.875 210.757 15
ODINpermissive0.744 290.658 940.752 650.870 30.714 400.843 330.569 110.919 50.703 80.622 410.949 600.591 120.343 150.736 340.784 560.816 70.838 20.672 310.918 370.854 400.725 29
Ayush Jain, Pushkal Katara, Nikolaos Gkanatsios, Adam W. Harley, Gabriel Sarch, Kriti Aggarwal, Vishrav Chaudhary, Katerina Fragkiadaki: ODIN: A Single Model for 2D and 3D Segmentation. CVPR 2024
Retro-FPN0.744 290.842 310.800 310.767 620.740 320.836 420.541 230.914 70.672 220.626 380.958 240.552 330.272 550.777 90.886 220.696 530.801 250.674 290.941 140.858 340.717 34
Peng Xiang*, Xin Wen*, Yu-Shen Liu, Hui Zhang, Yi Fang, Zhizhong Han: Retrospective Feature Pyramid Network for Point Cloud Semantic Segmentation. ICCV 2023
EQ-Net0.743 310.620 1020.799 340.849 130.730 350.822 570.493 510.897 140.664 230.681 120.955 350.562 290.378 40.760 210.903 120.738 300.801 250.673 300.907 430.877 170.745 18
Zetong Yang*, Li Jiang*, Yanan Sun, Bernt Schiele, Jiaya Jia: A Unified Query-based Paradigm for Point Cloud Understanding. CVPR 2022
SAT0.742 320.860 250.765 560.819 350.769 140.848 270.533 270.829 410.663 240.631 370.955 350.586 170.274 530.753 270.896 170.729 330.760 570.666 330.921 330.855 380.733 23
LRPNet0.742 320.816 390.806 280.807 440.752 240.828 510.575 80.839 380.699 90.637 350.954 410.520 470.320 280.755 260.834 430.760 220.772 460.676 260.915 410.862 310.717 34
LargeKernel3D0.739 340.909 130.820 120.806 460.740 320.852 220.545 210.826 420.594 580.643 280.955 350.541 350.263 630.723 380.858 320.775 180.767 500.678 220.933 230.848 440.694 43
Yukang Chen*, Jianhui Liu*, Xiangyu Zhang, Xiaojuan Qi, Jiaya Jia: LargeKernel3D: Scaling up Kernels in 3D Sparse CNNs. CVPR 2023
RPN0.736 350.776 520.790 400.851 110.754 230.854 180.491 530.866 260.596 570.686 90.955 350.536 370.342 160.624 570.869 260.787 110.802 210.628 460.927 270.875 210.704 40
MinkowskiNetpermissive0.736 350.859 260.818 170.832 310.709 410.840 350.521 340.853 300.660 260.643 280.951 520.544 340.286 450.731 360.893 180.675 620.772 460.683 210.874 740.852 420.727 27
C. Choy, J. Gwak, S. Savarese: 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks. CVPR 2019
IPCA0.731 370.890 180.837 40.864 40.726 370.873 50.530 310.824 440.489 940.647 250.978 60.609 50.336 200.624 570.733 640.758 230.776 440.570 720.949 90.877 170.728 25
MS-SFA-net0.730 380.910 120.819 140.837 250.698 440.838 380.532 290.872 210.605 500.676 150.959 230.535 390.341 170.649 460.598 880.708 470.810 160.664 350.895 540.879 160.771 12
online3d0.727 390.715 840.777 490.854 80.748 290.858 130.497 480.872 210.572 670.639 330.957 290.523 440.297 380.750 300.803 530.744 280.810 160.587 680.938 180.871 260.719 33
SparseConvNet0.725 400.647 970.821 110.846 170.721 380.869 60.533 270.754 650.603 530.614 430.955 350.572 240.325 260.710 390.870 250.724 370.823 40.628 460.934 220.865 300.683 46
PointTransformer++0.725 400.727 770.811 250.819 350.765 150.841 340.502 420.814 490.621 420.623 400.955 350.556 310.284 460.620 590.866 270.781 140.757 610.648 370.932 250.862 310.709 38
MatchingNet0.724 420.812 410.812 230.810 410.735 340.834 440.495 500.860 290.572 670.602 510.954 410.512 490.280 490.757 240.845 410.725 360.780 410.606 560.937 190.851 430.700 42
INS-Conv-semantic0.717 430.751 650.759 590.812 390.704 420.868 70.537 260.842 360.609 480.608 470.953 450.534 400.293 400.616 600.864 280.719 410.793 340.640 410.933 230.845 480.663 52
PointMetaBase0.714 440.835 320.785 440.821 330.684 490.846 300.531 300.865 270.614 430.596 550.953 450.500 520.246 690.674 410.888 200.692 540.764 530.624 480.849 890.844 490.675 48
contrastBoundarypermissive0.705 450.769 590.775 500.809 420.687 480.820 600.439 800.812 500.661 250.591 570.945 710.515 480.171 990.633 540.856 330.720 390.796 300.668 320.889 590.847 450.689 44
Liyao Tang, Yibing Zhan, Zhe Chen, Baosheng Yu, Dacheng Tao: Contrastive Boundary Learning for Point Cloud Segmentation. CVPR 2022
ClickSeg_Semantic0.703 460.774 540.800 310.793 530.760 180.847 290.471 580.802 530.463 1010.634 360.968 140.491 550.271 570.726 370.910 90.706 480.815 90.551 840.878 680.833 500.570 84
RFCR0.702 470.889 190.745 710.813 380.672 520.818 640.493 510.815 480.623 400.610 450.947 650.470 640.249 680.594 640.848 400.705 490.779 420.646 380.892 570.823 560.611 67
Jingyu Gong, Jiachen Xu, Xin Tan, Haichuan Song, Yanyun Qu, Yuan Xie, Lizhuang Ma: Omni-Supervised Point Cloud Segmentation via Gradual Receptive Field Component Reasoning. CVPR 2021
One Thing One Click0.701 480.825 360.796 350.723 690.716 390.832 460.433 820.816 460.634 370.609 460.969 120.418 900.344 140.559 760.833 440.715 430.808 190.560 780.902 480.847 450.680 47
JSENetpermissive0.699 490.881 210.762 570.821 330.667 530.800 770.522 330.792 560.613 440.607 480.935 910.492 540.205 860.576 690.853 370.691 560.758 590.652 360.872 770.828 530.649 56
Zeyu Hu, Mingmin Zhen, Xuyang Bai, Hongbo Fu, Chiew-Lan Tai: JSENet: Joint Semantic Segmentation and Edge Detection Network for 3D Point Clouds. ECCV 2020
One-Thing-One-Click0.693 500.743 680.794 370.655 920.684 490.822 570.497 480.719 750.622 410.617 420.977 100.447 770.339 180.750 300.664 810.703 510.790 370.596 610.946 120.855 380.647 57
Zhengzhe Liu, Xiaojuan Qi, Chi-Wing Fu: One Thing One Click: A Self-Training Approach for Weakly Supervised 3D Semantic Segmentation. CVPR 2021
PicassoNet-IIpermissive0.692 510.732 730.772 510.786 540.677 510.866 90.517 350.848 320.509 870.626 380.952 500.536 370.225 760.545 820.704 710.689 590.810 160.564 770.903 470.854 400.729 24
Huan Lei, Naveed Akhtar, Mubarak Shah, and Ajmal Mian: Geometric feature learning for 3D meshes.
Feature_GeometricNetpermissive0.690 520.884 200.754 630.795 510.647 600.818 640.422 840.802 530.612 450.604 490.945 710.462 670.189 940.563 750.853 370.726 350.765 520.632 440.904 450.821 590.606 71
Kangcheng Liu, Ben M. Chen: https://arxiv.org/abs/2012.09439. arXiv Preprint
FusionNet0.688 530.704 860.741 750.754 660.656 550.829 490.501 430.741 700.609 480.548 650.950 560.522 460.371 50.633 540.756 590.715 430.771 480.623 490.861 850.814 620.658 53
Feihu Zhang, Jin Fang, Benjamin Wah, Philip Torr: Deep FusionNet for Point Cloud Semantic Segmentation. ECCV 2020
Feature-Geometry Netpermissive0.685 540.866 230.748 680.819 350.645 620.794 800.450 700.802 530.587 600.604 490.945 710.464 660.201 890.554 780.840 420.723 380.732 720.602 590.907 430.822 580.603 74
VACNN++0.684 550.728 760.757 620.776 590.690 450.804 750.464 630.816 460.577 660.587 580.945 710.508 510.276 520.671 420.710 690.663 670.750 650.589 660.881 650.832 520.653 55
KP-FCNN0.684 550.847 290.758 610.784 560.647 600.814 670.473 570.772 590.605 500.594 560.935 910.450 750.181 970.587 650.805 520.690 570.785 400.614 520.882 640.819 600.632 63
H. Thomas, C. Qi, J. Deschaud, B. Marcotegui, F. Goulette, L. Guibas: KPConv: Flexible and Deformable Convolution for Point Clouds. ICCV 2019
DGNet0.684 550.712 850.784 450.782 580.658 540.835 430.499 470.823 450.641 340.597 540.950 560.487 570.281 480.575 700.619 850.647 750.764 530.620 510.871 800.846 470.688 45
PointContrast_LA_SEM0.683 580.757 630.784 450.786 540.639 640.824 550.408 870.775 580.604 520.541 670.934 950.532 410.269 590.552 790.777 570.645 780.793 340.640 410.913 420.824 550.671 49
Superpoint Network0.683 580.851 280.728 790.800 500.653 570.806 730.468 600.804 510.572 670.602 510.946 680.453 740.239 720.519 870.822 460.689 590.762 560.595 630.895 540.827 540.630 64
VI-PointConv0.676 600.770 580.754 630.783 570.621 680.814 670.552 180.758 630.571 700.557 630.954 410.529 420.268 610.530 850.682 750.675 620.719 750.603 580.888 600.833 500.665 51
Xingyi Li, Wenxuan Wu, Xiaoli Z. Fern, Li Fuxin: The Devils in the Point Clouds: Studying the Robustness of Point Cloud Convolutions.
ROSMRF3D0.673 610.789 450.748 680.763 640.635 660.814 670.407 890.747 670.581 640.573 600.950 560.484 580.271 570.607 610.754 600.649 720.774 450.596 610.883 630.823 560.606 71
SALANet0.670 620.816 390.770 540.768 610.652 580.807 720.451 670.747 670.659 280.545 660.924 1010.473 630.149 1090.571 720.811 510.635 820.746 660.623 490.892 570.794 760.570 84
O3DSeg0.668 630.822 370.771 530.496 1130.651 590.833 450.541 230.761 620.555 760.611 440.966 150.489 560.370 60.388 1060.580 890.776 170.751 630.570 720.956 70.817 610.646 58
PointConvpermissive0.666 640.781 490.759 590.699 770.644 630.822 570.475 560.779 570.564 730.504 840.953 450.428 840.203 880.586 670.754 600.661 680.753 620.588 670.902 480.813 640.642 59
Wenxuan Wu, Zhongang Qi, Li Fuxin: PointConv: Deep Convolutional Networks on 3D Point Clouds. CVPR 2019
PointASNLpermissive0.666 640.703 870.781 470.751 680.655 560.830 480.471 580.769 600.474 970.537 690.951 520.475 620.279 500.635 520.698 740.675 620.751 630.553 830.816 960.806 660.703 41
Xu Yan, Chaoda Zheng, Zhen Li, Sheng Wang, Shuguang Cui: PointASNL: Robust Point Clouds Processing using Nonlocal Neural Networks with Adaptive Sampling. CVPR 2020
PPCNN++permissive0.663 660.746 660.708 820.722 700.638 650.820 600.451 670.566 1030.599 550.541 670.950 560.510 500.313 300.648 480.819 490.616 870.682 900.590 650.869 810.810 650.656 54
Pyunghwan Ahn, Juyoung Yang, Eojindl Yi, Chanho Lee, Junmo Kim: Projection-based Point Convolution for Efficient Point Cloud Segmentation. IEEE Access
MVF-GNN0.658 670.558 1090.751 660.655 920.690 450.722 1020.453 660.867 240.579 650.576 590.893 1130.523 440.293 400.733 350.571 910.692 540.659 970.606 560.875 710.804 680.668 50
DCM-Net0.658 670.778 500.702 850.806 460.619 690.813 700.468 600.693 830.494 900.524 750.941 830.449 760.298 370.510 890.821 470.675 620.727 740.568 750.826 940.803 690.637 61
Jonas Schult*, Francis Engelmann*, Theodora Kontogianni, Bastian Leibe: DualConvMesh-Net: Joint Geodesic and Euclidean Convolutions on 3D Meshes. CVPR 2020 (Oral)
HPGCNN0.656 690.698 890.743 730.650 940.564 860.820 600.505 410.758 630.631 380.479 880.945 710.480 600.226 740.572 710.774 580.690 570.735 700.614 520.853 880.776 910.597 77
Jisheng Dang, Qingyong Hu, Yulan Guo, Jun Yang: HPGCNN.
SAFNet-segpermissive0.654 700.752 640.734 770.664 900.583 810.815 660.399 910.754 650.639 350.535 710.942 810.470 640.309 320.665 430.539 930.650 710.708 800.635 430.857 870.793 780.642 59
Linqing Zhao, Jiwen Lu, Jie Zhou: Similarity-Aware Fusion Network for 3D Semantic Segmentation. IROS 2021
RandLA-Netpermissive0.645 710.778 500.731 780.699 770.577 820.829 490.446 720.736 710.477 960.523 770.945 710.454 710.269 590.484 960.749 630.618 850.738 680.599 600.827 930.792 810.621 66
PointConv-SFPN0.641 720.776 520.703 840.721 710.557 890.826 520.451 670.672 880.563 740.483 870.943 800.425 870.162 1040.644 490.726 650.659 690.709 790.572 710.875 710.786 860.559 90
MVPNetpermissive0.641 720.831 330.715 800.671 870.590 770.781 860.394 930.679 850.642 330.553 640.937 880.462 670.256 650.649 460.406 1060.626 830.691 870.666 330.877 690.792 810.608 70
Maximilian Jaritz, Jiayuan Gu, Hao Su: Multi-view PointNet for 3D Scene Understanding. GMDL Workshop, ICCV 2019
PointMRNet0.640 740.717 830.701 860.692 800.576 830.801 760.467 620.716 760.563 740.459 940.953 450.429 830.169 1010.581 680.854 360.605 880.710 770.550 850.894 560.793 780.575 82
FPConvpermissive0.639 750.785 470.760 580.713 750.603 720.798 780.392 950.534 1080.603 530.524 750.948 630.457 690.250 670.538 830.723 670.598 920.696 850.614 520.872 770.799 710.567 87
Yiqun Lin, Zizheng Yan, Haibin Huang, Dong Du, Ligang Liu, Shuguang Cui, Xiaoguang Han: FPConv: Learning Local Flattening for Point Convolution. CVPR 2020
PD-Net0.638 760.797 430.769 550.641 990.590 770.820 600.461 640.537 1070.637 360.536 700.947 650.388 970.206 850.656 440.668 790.647 750.732 720.585 690.868 820.793 780.473 110
PointSPNet0.637 770.734 720.692 930.714 740.576 830.797 790.446 720.743 690.598 560.437 990.942 810.403 930.150 1080.626 560.800 550.649 720.697 840.557 810.846 900.777 900.563 88
SConv0.636 780.830 340.697 890.752 670.572 850.780 880.445 740.716 760.529 800.530 720.951 520.446 780.170 1000.507 910.666 800.636 810.682 900.541 910.886 610.799 710.594 78
Supervoxel-CNN0.635 790.656 950.711 810.719 720.613 700.757 970.444 770.765 610.534 790.566 610.928 990.478 610.272 550.636 510.531 950.664 660.645 1010.508 990.864 840.792 810.611 67
joint point-basedpermissive0.634 800.614 1030.778 480.667 890.633 670.825 530.420 850.804 510.467 990.561 620.951 520.494 530.291 420.566 730.458 1010.579 980.764 530.559 800.838 910.814 620.598 76
Hung-Yueh Chiang, Yen-Liang Lin, Yueh-Cheng Liu, Winston H. Hsu: A Unified Point-Based Framework for 3D Segmentation. 3DV 2019
PointMTL0.632 810.731 740.688 960.675 840.591 760.784 850.444 770.565 1040.610 460.492 850.949 600.456 700.254 660.587 650.706 700.599 910.665 960.612 550.868 820.791 840.579 81
PointNet2-SFPN0.631 820.771 560.692 930.672 850.524 950.837 400.440 790.706 810.538 780.446 960.944 770.421 890.219 790.552 790.751 620.591 940.737 690.543 900.901 500.768 930.557 91
APCF-Net0.631 820.742 690.687 980.672 850.557 890.792 830.408 870.665 900.545 770.508 810.952 500.428 840.186 950.634 530.702 720.620 840.706 810.555 820.873 750.798 730.581 80
Haojia Lin: Adaptive Pyramid Context Fusion for Point Cloud Perception. GRSL
3DSM_DMMF0.631 820.626 1000.745 710.801 490.607 710.751 980.506 400.729 740.565 720.491 860.866 1160.434 790.197 920.595 630.630 840.709 460.705 820.560 780.875 710.740 1010.491 105
FusionAwareConv0.630 850.604 1050.741 750.766 630.590 770.747 990.501 430.734 720.503 890.527 730.919 1050.454 710.323 270.550 810.420 1050.678 610.688 880.544 880.896 530.795 750.627 65
Jiazhao Zhang, Chenyang Zhu, Lintao Zheng, Kai Xu: Fusion-Aware Point Convolution for Online Semantic 3D Scene Segmentation. CVPR 2020
DenSeR0.628 860.800 420.625 1080.719 720.545 920.806 730.445 740.597 980.448 1040.519 790.938 870.481 590.328 250.489 950.499 1000.657 700.759 580.592 640.881 650.797 740.634 62
SegGroup_sempermissive0.627 870.818 380.747 700.701 760.602 730.764 940.385 990.629 950.490 920.508 810.931 980.409 920.201 890.564 740.725 660.618 850.692 860.539 920.873 750.794 760.548 94
An Tao, Yueqi Duan, Yi Wei, Jiwen Lu, Jie Zhou: SegGroup: Seg-Level Supervision for 3D Instance and Semantic Segmentation. TIP 2022
SIConv0.625 880.830 340.694 910.757 650.563 870.772 920.448 710.647 930.520 830.509 800.949 600.431 820.191 930.496 930.614 860.647 750.672 940.535 950.876 700.783 870.571 83
dtc_net0.625 880.703 870.751 660.794 520.535 930.848 270.480 550.676 870.528 810.469 910.944 770.454 710.004 1210.464 980.636 830.704 500.758 590.548 870.924 300.787 850.492 104
Weakly-Openseg v30.625 880.924 80.787 430.620 1010.555 910.811 710.393 940.666 890.382 1120.520 780.953 450.250 1160.208 830.604 620.670 770.644 790.742 670.538 930.919 360.803 690.513 102
HPEIN0.618 910.729 750.668 990.647 960.597 750.766 930.414 860.680 840.520 830.525 740.946 680.432 800.215 810.493 940.599 870.638 800.617 1060.570 720.897 520.806 660.605 73
Li Jiang, Hengshuang Zhao, Shu Liu, Xiaoyong Shen, Chi-Wing Fu, Jiaya Jia: Hierarchical Point-Edge Interaction Network for Point Cloud Semantic Segmentation. ICCV 2019
SPH3D-GCNpermissive0.610 920.858 270.772 510.489 1140.532 940.792 830.404 900.643 940.570 710.507 830.935 910.414 910.046 1180.510 890.702 720.602 900.705 820.549 860.859 860.773 920.534 97
Huan Lei, Naveed Akhtar, and Ajmal Mian: Spherical Kernel for Efficient Graph Convolution on 3D Point Clouds. TPAMI 2020
AttAN0.609 930.760 610.667 1000.649 950.521 960.793 810.457 650.648 920.528 810.434 1010.947 650.401 940.153 1070.454 990.721 680.648 740.717 760.536 940.904 450.765 940.485 106
Gege Zhang, Qinghua Ma, Licheng Jiao, Fang Liu and Qigong Sun: AttAN: Attention Adversarial Networks for 3D Point Cloud Semantic Segmentation. IJCAI 2020
wsss-transformer0.600 940.634 990.743 730.697 790.601 740.781 860.437 810.585 1010.493 910.446 960.933 960.394 950.011 1200.654 450.661 820.603 890.733 710.526 960.832 920.761 960.480 107
LAP-D0.594 950.720 810.692 930.637 1000.456 1050.773 910.391 970.730 730.587 600.445 980.940 850.381 980.288 430.434 1020.453 1030.591 940.649 990.581 700.777 1000.749 1000.610 69
DPC0.592 960.720 810.700 870.602 1050.480 1010.762 960.380 1000.713 790.585 630.437 990.940 850.369 1000.288 430.434 1020.509 990.590 960.639 1040.567 760.772 1010.755 980.592 79
Francis Engelmann, Theodora Kontogianni, Bastian Leibe: Dilated Point Convolutions: On the Receptive Field Size of Point Convolutions on 3D Point Clouds. ICRA 2020
CCRFNet0.589 970.766 600.659 1030.683 820.470 1040.740 1010.387 980.620 970.490 920.476 890.922 1030.355 1030.245 700.511 880.511 980.571 990.643 1020.493 1030.872 770.762 950.600 75
ROSMRF0.580 980.772 550.707 830.681 830.563 870.764 940.362 1020.515 1090.465 1000.465 930.936 900.427 860.207 840.438 1000.577 900.536 1020.675 930.486 1040.723 1070.779 880.524 99
SD-DETR0.576 990.746 660.609 1120.445 1180.517 970.643 1130.366 1010.714 780.456 1020.468 920.870 1150.432 800.264 620.558 770.674 760.586 970.688 880.482 1050.739 1050.733 1030.537 96
SQN_0.1%0.569 1000.676 910.696 900.657 910.497 980.779 890.424 830.548 1050.515 850.376 1060.902 1120.422 880.357 100.379 1070.456 1020.596 930.659 970.544 880.685 1100.665 1140.556 92
TextureNetpermissive0.566 1010.672 930.664 1010.671 870.494 990.719 1030.445 740.678 860.411 1100.396 1040.935 910.356 1020.225 760.412 1040.535 940.565 1000.636 1050.464 1070.794 990.680 1110.568 86
Jingwei Huang, Haotian Zhang, Li Yi, Thomas Funkhouser, Matthias Niessner, Leonidas Guibas: TextureNet: Consistent Local Parametrizations for Learning from High-Resolution Signals on Meshes. CVPR
DVVNet0.562 1020.648 960.700 870.770 600.586 800.687 1070.333 1060.650 910.514 860.475 900.906 1090.359 1010.223 780.340 1090.442 1040.422 1130.668 950.501 1000.708 1080.779 880.534 97
Pointnet++ & Featurepermissive0.557 1030.735 710.661 1020.686 810.491 1000.744 1000.392 950.539 1060.451 1030.375 1070.946 680.376 990.205 860.403 1050.356 1090.553 1010.643 1020.497 1010.824 950.756 970.515 100
GMLPs0.538 1040.495 1140.693 920.647 960.471 1030.793 810.300 1090.477 1100.505 880.358 1080.903 1110.327 1060.081 1150.472 970.529 960.448 1110.710 770.509 970.746 1030.737 1020.554 93
PanopticFusion-label0.529 1050.491 1150.688 960.604 1040.386 1100.632 1140.225 1200.705 820.434 1070.293 1140.815 1180.348 1040.241 710.499 920.669 780.507 1040.649 990.442 1130.796 980.602 1180.561 89
Gaku Narita, Takashi Seno, Tomoya Ishikawa, Yohsuke Kaji: PanopticFusion: Online Volumetric Semantic Mapping at the Level of Stuff and Things. IROS 2019 (to appear)
subcloud_weak0.516 1060.676 910.591 1150.609 1020.442 1060.774 900.335 1050.597 980.422 1090.357 1090.932 970.341 1050.094 1140.298 1110.528 970.473 1090.676 920.495 1020.602 1160.721 1060.349 118
Online SegFusion0.515 1070.607 1040.644 1060.579 1070.434 1070.630 1150.353 1030.628 960.440 1050.410 1020.762 1210.307 1080.167 1020.520 860.403 1070.516 1030.565 1090.447 1110.678 1110.701 1080.514 101
Davide Menini, Suryansh Kumar, Martin R. Oswald, Erik Sandstroem, Cristian Sminchisescu, Luc van Gool: A Real-Time Learning Framework for Joint 3D Reconstruction and Semantic Segmentation. Robotics and Automation Letters Submission
3DMV, FTSDF0.501 1080.558 1090.608 1130.424 1200.478 1020.690 1060.246 1160.586 1000.468 980.450 950.911 1070.394 950.160 1050.438 1000.212 1160.432 1120.541 1140.475 1060.742 1040.727 1040.477 108
PCNN0.498 1090.559 1080.644 1060.560 1090.420 1090.711 1050.229 1180.414 1110.436 1060.352 1100.941 830.324 1070.155 1060.238 1160.387 1080.493 1050.529 1150.509 970.813 970.751 990.504 103
3DMV0.484 1100.484 1160.538 1180.643 980.424 1080.606 1180.310 1070.574 1020.433 1080.378 1050.796 1190.301 1090.214 820.537 840.208 1170.472 1100.507 1180.413 1160.693 1090.602 1180.539 95
Angela Dai, Matthias Niessner: 3DMV: Joint 3D-Multi-View Prediction for 3D Semantic Scene Segmentation. ECCV'18
PointCNN with RGBpermissive0.458 1110.577 1070.611 1110.356 1220.321 1180.715 1040.299 1110.376 1150.328 1180.319 1120.944 770.285 1110.164 1030.216 1190.229 1140.484 1070.545 1130.456 1090.755 1020.709 1070.475 109
Yangyan Li, Rui Bu, Mingchao Sun, Baoquan Chen: PointCNN. NeurIPS 2018
FCPNpermissive0.447 1120.679 900.604 1140.578 1080.380 1110.682 1080.291 1120.106 1220.483 950.258 1200.920 1040.258 1150.025 1190.231 1180.325 1100.480 1080.560 1110.463 1080.725 1060.666 1130.231 122
Dario Rethage, Johanna Wald, Jürgen Sturm, Nassir Navab, Federico Tombari: Fully-Convolutional Point Networks for Large-Scale Point Clouds. ECCV 2018
DGCNN_reproducecopyleft0.446 1130.474 1170.623 1090.463 1160.366 1130.651 1110.310 1070.389 1140.349 1160.330 1110.937 880.271 1130.126 1110.285 1120.224 1150.350 1180.577 1080.445 1120.625 1140.723 1050.394 114
Yue Wang, Yongbin Sun, Ziwei Liu, Sanjay E. Sarma, Michael M. Bronstein, Justin M. Solomon: Dynamic Graph CNN for Learning on Point Clouds. TOG 2019
PNET20.442 1140.548 1110.548 1170.597 1060.363 1140.628 1160.300 1090.292 1170.374 1130.307 1130.881 1140.268 1140.186 950.238 1160.204 1180.407 1140.506 1190.449 1100.667 1120.620 1170.462 112
SurfaceConvPF0.442 1140.505 1130.622 1100.380 1210.342 1160.654 1100.227 1190.397 1130.367 1140.276 1160.924 1010.240 1170.198 910.359 1080.262 1120.366 1150.581 1070.435 1140.640 1130.668 1120.398 113
Hao Pan, Shilin Liu, Yang Liu, Xin Tong: Convolutional Neural Networks on 3D Surfaces Using Parallel Frames.
Tangent Convolutionspermissive0.438 1160.437 1190.646 1050.474 1150.369 1120.645 1120.353 1030.258 1190.282 1210.279 1150.918 1060.298 1100.147 1100.283 1130.294 1110.487 1060.562 1100.427 1150.619 1150.633 1160.352 117
Maxim Tatarchenko, Jaesik Park, Vladlen Koltun, Qian-Yi Zhou: Tangent Convolutions for Dense Prediction in 3D. CVPR 2018
3DWSSS0.425 1170.525 1120.647 1040.522 1100.324 1170.488 1220.077 1230.712 800.353 1150.401 1030.636 1230.281 1120.176 980.340 1090.565 920.175 1220.551 1120.398 1170.370 1230.602 1180.361 116
SPLAT Netcopyleft0.393 1180.472 1180.511 1190.606 1030.311 1190.656 1090.245 1170.405 1120.328 1180.197 1210.927 1000.227 1190.000 1230.001 1240.249 1130.271 1210.510 1160.383 1190.593 1170.699 1090.267 120
Hang Su, Varun Jampani, Deqing Sun, Subhransu Maji, Evangelos Kalogerakis, Ming-Hsuan Yang, Jan Kautz: SPLATNet: Sparse Lattice Networks for Point Cloud Processing. CVPR 2018
ScanNet+FTSDF0.383 1190.297 1210.491 1200.432 1190.358 1150.612 1170.274 1140.116 1210.411 1100.265 1170.904 1100.229 1180.079 1160.250 1140.185 1190.320 1190.510 1160.385 1180.548 1180.597 1210.394 114
PointNet++permissive0.339 1200.584 1060.478 1210.458 1170.256 1210.360 1230.250 1150.247 1200.278 1220.261 1190.677 1220.183 1200.117 1120.212 1200.145 1210.364 1160.346 1230.232 1230.548 1180.523 1220.252 121
Charles R. Qi, Li Yi, Hao Su, Leonidas J. Guibas: PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space.
GrowSP++0.323 1210.114 1230.589 1160.499 1120.147 1230.555 1190.290 1130.336 1160.290 1200.262 1180.865 1170.102 1230.000 1230.037 1220.000 1240.000 1240.462 1200.381 1200.389 1220.664 1150.473 110
SSC-UNetpermissive0.308 1220.353 1200.290 1230.278 1230.166 1220.553 1200.169 1220.286 1180.147 1230.148 1230.908 1080.182 1210.064 1170.023 1230.018 1230.354 1170.363 1210.345 1210.546 1200.685 1100.278 119
ScanNetpermissive0.306 1230.203 1220.366 1220.501 1110.311 1190.524 1210.211 1210.002 1240.342 1170.189 1220.786 1200.145 1220.102 1130.245 1150.152 1200.318 1200.348 1220.300 1220.460 1210.437 1230.182 123
Angela Dai, Angel X. Chang, Manolis Savva, Maciej Halber, Thomas Funkhouser, Matthias Nießner: ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes. CVPR'17
ERROR0.054 1240.000 1240.041 1240.172 1240.030 1240.062 1240.001 1240.035 1230.004 1240.051 1240.143 1240.019 1240.003 1220.041 1210.050 1220.003 1230.054 1240.018 1240.005 1240.264 1240.082 124


This table lists the benchmark results for the 3D semantic instance scenario.




Method | Info | avg ap 25% | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | window
PointRel0.901 11.000 10.978 250.928 30.879 10.962 60.882 50.749 400.947 30.912 20.802 30.753 210.820 21.000 10.984 40.919 60.894 41.000 10.815 17
Relation3D: Enhancing Relation Modeling for Point Cloud Instance Segmentation. CVPR 2025
PointComp0.897 21.000 10.998 60.864 200.869 30.969 30.830 80.783 330.905 150.894 100.791 40.834 10.769 141.000 10.982 50.920 50.868 201.000 10.872 2
OneFormer3Dcopyleft0.896 31.000 11.000 10.913 60.858 70.951 120.786 170.837 200.916 130.908 40.778 90.803 70.750 161.000 10.976 70.926 40.882 80.995 500.849 3
Maxim Kolodiazhnyi, Anna Vorontsova, Anton Konushin, Danila Rukhovich: OneFormer3D: One Transformer for Unified Point Cloud Segmentation.
Competitor-MAFT0.896 31.000 11.000 10.872 170.847 120.967 40.955 10.778 350.901 170.919 10.784 60.812 20.770 131.000 10.949 100.865 370.868 191.000 10.840 6
MG-Former0.887 51.000 10.991 150.837 280.801 270.935 210.887 40.857 120.946 40.891 120.748 200.805 60.739 181.000 10.993 20.809 610.876 151.000 10.842 5
DCD0.885 61.000 10.933 430.856 240.832 160.959 80.930 20.858 110.802 400.859 200.767 100.796 110.709 221.000 10.971 80.871 310.904 21.000 10.874 1
UniPerception0.884 71.000 10.979 220.872 170.869 40.892 300.806 140.890 70.835 310.892 110.755 160.811 30.779 100.955 510.951 90.876 250.914 10.997 420.840 7
KmaxOneFormerNetpermissive0.883 81.000 11.000 10.798 430.848 110.971 10.853 70.903 30.827 340.910 30.748 190.809 50.724 201.000 10.980 60.855 430.844 261.000 10.832 8
InsSSM0.883 81.000 10.996 70.800 420.865 50.960 70.808 130.852 170.940 70.899 90.785 50.810 40.700 241.000 10.912 220.851 460.895 30.997 420.827 10
Lei Yao, Yi Wang, Moyun Liu, Lap-Pui Chau: SGIFormer: Semantic-guided and Geometric-enhanced Interleaving Transformer for 3D Instance Segmentation. TCSVT, 2024
Competitor-SPFormer0.881 101.000 11.000 10.845 260.854 80.962 50.714 250.857 130.904 160.902 70.782 80.789 140.662 301.000 10.988 30.874 280.886 70.997 420.847 4
VDG-Uni3DSeg0.880 111.000 10.990 170.889 100.823 200.952 110.764 190.893 60.941 60.907 50.756 150.781 160.628 481.000 10.918 210.903 90.872 180.999 400.821 14
TST3D0.879 121.000 10.994 100.921 50.807 260.939 180.771 180.887 80.923 110.862 190.722 250.768 180.756 151.000 10.910 330.904 80.836 290.999 400.824 12
Duc Tran Dang Trung, Byeongkeun Kang, Yeejin Lee: MSTA3D: Multi-scale Twin-attention for 3D Instance Segmentation. ACM Multimedia 2024
SIM3D0.878 131.000 10.972 270.863 210.817 240.952 100.821 110.783 310.890 200.902 80.735 230.797 90.799 91.000 10.931 180.893 150.853 241.000 10.792 20
EV3D0.877 141.000 10.996 90.873 150.854 90.950 130.691 290.783 320.926 80.889 150.754 170.794 130.820 21.000 10.912 220.900 110.860 221.000 10.779 23
TD3Dpermissive0.875 151.000 10.976 260.877 130.783 330.970 20.889 30.828 210.945 50.803 260.713 270.720 280.709 211.000 10.936 160.934 30.873 161.000 10.791 21
Maksim Kolodiazhnyi, Anna Vorontsova, Anton Konushin, Danila Rukhovich: Top-Down Beats Bottom-Up in 3D Instance Segmentation. WACV 2024
Spherical Mask(CtoF)0.875 151.000 10.991 160.873 150.850 100.946 150.691 290.752 390.926 80.889 140.759 130.794 120.820 21.000 10.912 220.900 110.878 121.000 10.769 25
SoftGroup++0.874 171.000 10.972 280.947 10.839 150.898 290.556 440.913 20.881 230.756 280.828 20.748 230.821 11.000 10.937 150.937 10.887 61.000 10.821 13
Queryformer0.874 171.000 10.978 240.809 400.876 20.936 200.702 260.716 450.920 120.875 180.766 110.772 170.818 61.000 10.995 10.916 70.892 51.000 10.767 26
Mask3D0.870 191.000 10.985 190.782 500.818 230.938 190.760 200.749 400.923 100.877 170.760 120.785 150.820 21.000 10.912 220.864 390.878 120.983 560.825 11
Jonas Schult, Francis Engelmann, Alexander Hermans, Or Litany, Siyu Tang, Bastian Leibe: Mask3D for 3D Semantic Instance Segmentation. ICRA 2023
ExtMask3D0.867 201.000 11.000 10.756 570.816 250.940 170.795 150.760 380.862 250.888 160.739 210.763 190.774 111.000 10.929 190.878 240.879 101.000 10.819 16
SoftGrouppermissive0.865 211.000 10.969 290.860 220.860 60.913 250.558 410.899 40.911 140.760 270.828 10.736 250.802 80.981 480.919 200.875 260.877 141.000 10.820 15
Thang Vu, Kookhoi Kim, Tung M. Luu, Xuan Thanh Nguyen, Chang D. Yoo: SoftGroup for 3D Instance Segmentation on Point Clouds. CVPR 2022 [Oral]
MAFT0.860 221.000 10.990 170.810 390.829 170.949 140.809 120.688 510.836 300.904 60.751 180.796 100.741 171.000 10.864 430.848 480.837 271.000 10.828 9
IPCA-Inst0.851 231.000 10.968 300.884 120.842 140.862 430.693 280.812 260.888 220.677 400.783 70.698 290.807 71.000 10.911 300.865 380.865 211.000 10.757 29
SPFormerpermissive0.851 231.000 10.994 110.806 410.774 350.942 160.637 330.849 180.859 270.889 130.720 260.730 260.665 291.000 10.911 300.868 360.873 171.000 10.796 19
Sun Jiahao, Qing Chunmei, Tan Junpeng, Xu Xiangmin: Superpoint Transformer for 3D Scene Instance Segmentation. AAAI 2023 [Oral]
ODIN - Inspermissive0.847 251.000 10.951 360.834 330.828 180.875 350.871 60.767 360.821 360.816 230.690 340.800 80.771 121.000 10.912 220.891 160.821 300.886 720.713 36
Ayush Jain, Pushkal Katara, Nikolaos Gkanatsios, Adam W. Harley, Gabriel Sarch, Kriti Aggarwal, Vishrav Chaudhary, Katerina Fragkiadaki: ODIN: A Single Model for 2D and 3D Segmentation. CVPR 2024
Mask3D_evaluation0.843 261.000 10.955 350.847 250.795 290.932 220.750 220.780 340.891 190.818 220.737 220.633 380.703 231.000 10.902 350.870 320.820 310.941 640.805 18
SphereSeg0.835 271.000 10.963 330.891 90.794 300.954 90.822 100.710 460.961 20.721 320.693 330.530 510.653 321.000 10.867 420.857 420.859 230.991 530.771 24
ISBNetpermissive0.835 271.000 10.950 370.731 590.819 210.918 230.790 160.740 420.851 290.831 210.661 360.742 240.650 331.000 10.937 140.814 600.836 281.000 10.765 27
Tuan Duc Ngo, Binh-Son Hua, Khoi Nguyen: ISBNet: a 3D Point Cloud Instance Segmentation Network with Instance-aware Sampling and Box-aware Dynamic Convolution. CVPR 2023
TopoSeg0.832 291.000 10.981 210.933 20.819 220.826 520.524 500.841 190.811 370.681 390.759 140.687 300.727 190.981 480.911 300.883 200.853 251.000 10.756 30
GraphCut0.832 291.000 10.922 520.724 610.798 280.902 280.701 270.856 150.859 260.715 330.706 280.748 220.640 441.000 10.934 170.862 400.880 91.000 10.729 32
PBNetpermissive0.825 311.000 10.963 320.837 300.843 130.865 380.822 90.647 540.878 240.733 300.639 430.683 310.650 331.000 10.853 440.870 330.820 321.000 10.744 31
Weiguang Zhao, Yuyao Yan, Chaolong Yang, Jianan Ye, Xi Yang, Kaizhu Huang: Divide and Conquer: 3D Instance Segmentation With Point-Wise Binarization. ICCV 2023
SSEC0.820 321.000 10.983 200.924 40.826 190.817 550.415 590.899 50.793 420.673 410.731 240.636 360.653 311.000 10.939 130.804 630.878 111.000 10.780 22
DKNet0.815 331.000 10.930 440.844 270.765 390.915 240.534 480.805 280.805 390.807 250.654 370.763 200.650 331.000 10.794 560.881 210.766 361.000 10.758 28
Yizheng Wu, Min Shi, Shuaiyuan Du, Hao Lu, Zhiguo Cao, Weicai Zhong: 3D Instances as 1D Kernels. ECCV 2022
RPGN0.806 341.000 10.992 130.789 450.723 520.891 310.650 320.810 270.832 320.665 430.699 310.658 320.700 241.000 10.881 370.832 520.774 340.997 420.613 53
Shichao Dong, Guosheng Lin, Tzu-Yi Hung: Learning Regional Purity for Instance Segmentation on 3D Point Clouds. ECCV 2022
Box2Mask0.803 351.000 10.962 340.874 140.707 560.887 340.686 310.598 590.961 10.715 340.694 320.469 560.700 241.000 10.912 220.902 100.753 410.997 420.637 47
Julian Chibane, Francis Engelmann, Tuan Anh Tran, Gerard Pons-Moll: Box2Mask: Weakly Supervised 3D Semantic Instance Segmentation Using Bounding Boxes. ECCV 2022
HAISpermissive0.803 351.000 10.994 110.820 350.759 400.855 440.554 450.882 90.827 350.615 490.676 350.638 350.646 421.000 10.912 220.797 660.767 350.994 510.726 33
Shaoyu Chen, Jiemin Fang, Qian Zhang, Wenyu Liu, Xinggang Wang: Hierarchical Aggregation for 3D Instance Segmentation. ICCV 2021
Mask-Group0.792 371.000 10.968 310.812 360.766 380.864 390.460 530.815 250.888 210.598 530.651 400.639 340.600 510.918 540.941 110.896 140.721 481.000 10.723 34
Min Zhong, Xinghao Chen, Xiaokang Chen, Gang Zeng, Yunhe Wang: MaskGroup: Hierarchical Point Grouping and Masking for 3D Instance Segmentation. ICME 2022
CSC-Pretrained0.791 381.000 10.996 70.829 340.767 370.889 330.600 360.819 240.770 470.594 540.620 470.541 480.700 241.000 10.941 110.889 180.763 371.000 10.526 63
SSTNetpermissive0.789 391.000 10.840 660.888 110.717 530.835 480.717 240.684 520.627 620.724 310.652 390.727 270.600 511.000 10.912 220.822 550.757 401.000 10.691 41
Zhihao Liang, Zhihao Li, Songcen Xu, Mingkui Tan, Kui Jia: Instance Segmentation in 3D Scenes using Semantic Superpoint Tree Networks. ICCV 2021
GICN0.788 401.000 10.978 230.867 190.781 340.833 490.527 490.824 220.806 380.549 620.596 500.551 440.700 241.000 10.853 440.935 20.733 451.000 10.651 44
DENet0.786 411.000 10.929 450.736 580.750 460.720 680.755 210.934 10.794 410.590 550.561 560.537 490.650 331.000 10.882 360.804 640.789 331.000 10.719 35
DANCENET0.786 411.000 10.936 400.783 480.737 490.852 460.742 230.647 540.765 490.811 240.624 460.579 410.632 471.000 10.909 340.898 130.696 530.944 600.601 56
DualGroup0.782 431.000 10.927 460.811 370.772 360.853 450.631 350.805 280.773 440.613 500.611 480.610 390.650 330.835 650.881 370.879 230.750 431.000 10.675 42
PointGroup0.778 441.000 10.900 560.798 440.715 540.863 400.493 510.706 470.895 180.569 600.701 290.576 420.639 451.000 10.880 390.851 450.719 490.997 420.709 38
Li Jiang, Hengshuang Zhao, Shaoshuai Shi, Shu Liu, Chi-Wing Fu, Jiaya Jia: PointGroup: Dual-Set Point Grouping for 3D Instance Segmentation. CVPR 2020 [oral]
PE0.776 451.000 10.900 570.860 220.728 510.869 360.400 600.857 140.774 430.568 610.701 300.602 400.646 420.933 530.843 470.890 170.691 570.997 420.709 37
Biao Zhang, Peter Wonka: Point Cloud Instance Segmentation using Probabilistic Embeddings. CVPR 2021
AOIA0.767 461.000 10.937 390.810 380.740 480.906 260.550 460.800 300.706 540.577 590.624 450.544 470.596 560.857 570.879 410.880 220.750 420.992 520.658 43
DD-UNet+Group0.764 471.000 10.897 590.837 290.753 430.830 510.459 550.824 220.699 560.629 470.653 380.438 590.650 331.000 10.880 390.858 410.690 581.000 10.650 45
H. Liu, R. Liu, K. Yang, J. Zhang, K. Peng, R. Stiefelhagen: HIDA: Towards Holistic Indoor Understanding for the Visually Impaired via Semantic Instance Segmentation with a Wearable Solid-State LiDAR Sensor. ICCVW 2021
INS-Conv-instance0.762 481.000 10.923 490.765 530.785 320.905 270.600 360.655 530.646 610.683 380.647 410.530 500.650 331.000 10.824 490.830 530.693 560.944 600.644 46
Dyco3Dcopyleft0.761 491.000 10.935 410.893 80.752 450.863 410.600 360.588 600.742 510.641 450.633 440.546 460.550 580.857 570.789 580.853 440.762 380.987 540.699 39
Tong He, Chunhua Shen, Anton van den Hengel: DyCo3D: Robust Instance Segmentation of 3D Point Clouds through Dynamic Convolution. CVPR 2021
OccuSeg+instance0.742 501.000 10.923 490.785 460.745 470.867 370.557 420.578 630.729 520.670 420.644 420.488 540.577 571.000 10.794 560.830 530.620 661.000 10.550 59
Lei Han, Tian Zheng, Lan Xu, Lu Fang: OccuSeg: Occupancy-aware 3D Instance Segmentation. CVPR 2020
RWSeg0.739 511.000 10.899 580.759 550.753 440.823 530.282 650.691 500.658 590.582 580.594 510.547 450.628 481.000 10.795 550.868 350.728 471.000 10.692 40
3D-MPA0.737 521.000 10.933 420.785 460.794 310.831 500.279 670.588 600.695 570.616 480.559 570.556 430.650 331.000 10.809 530.875 270.696 541.000 10.608 55
Francis Engelmann, Martin Bokeloh, Alireza Fathi, Bastian Leibe, Matthias Nießner: 3D-MPA: Multi Proposal Aggregation for 3D Semantic Instance Segmentation. CVPR 2020
MTML0.731 531.000 10.992 130.779 520.609 650.746 630.308 640.867 100.601 650.607 510.539 600.519 520.550 581.000 10.824 490.869 340.729 461.000 10.616 51
Jean Lahoud, Bernard Ghanem, Marc Pollefeys, Martin R. Oswald: 3D Instance Segmentation via Multi-task Metric Learning. ICCV 2019 [oral]
OSIS0.725 541.000 10.885 620.653 670.657 620.801 560.576 400.695 490.828 330.698 360.534 610.457 580.500 650.857 570.831 480.841 500.627 641.000 10.619 50
SSEN0.724 551.000 10.926 470.781 510.661 600.845 470.596 390.529 660.764 500.653 440.489 670.461 570.500 650.859 560.765 590.872 300.761 391.000 10.577 57
Dongsu Zhang, Junha Chun, Sang Kyun Cha, Young Min Kim: Spatial Semantic Embedding Network: Fast 3D Instance Segmentation with Deep Metric Learning. Arxiv
NeuralBF0.718 561.000 10.945 380.901 70.754 420.817 540.460 530.700 480.772 450.688 370.568 550.000 780.500 650.981 480.606 690.872 290.740 441.000 10.614 52
Weiwei Sun, Daniel Rebain, Renjie Liao, Vladimir Tankovich, Soroosh Yazdani, Kwang Moo Yi, Andrea Tagliasacchi: NeuralBF: Neural Bilateral Filtering for Top-down Instance Segmentation on Point Clouds. WACV 2023
Sparse R-CNN0.714 571.000 10.926 480.694 620.699 580.890 320.636 340.516 670.693 580.743 290.588 520.369 630.601 500.594 710.800 540.886 190.676 590.986 550.546 60
SALoss-ResNet0.695 581.000 10.855 640.579 720.589 670.735 660.484 520.588 600.856 280.634 460.571 540.298 640.500 651.000 10.824 490.818 560.702 520.935 670.545 61
Zhidong Liang, Ming Yang, Hao Li, Chunxiang Wang: 3D Instance Embedding Learning With a Structure-Aware Loss Function for Point Cloud Segmentation. IEEE Robotics and Automation Letters (IROS2020)
PanopticFusion-inst0.693 591.000 10.852 650.655 660.616 640.788 580.334 620.763 370.771 460.457 720.555 580.652 330.518 620.857 570.765 590.732 720.631 620.944 600.577 58
Gaku Narita, Takashi Seno, Tomoya Ishikawa, Yohsuke Kaji: PanopticFusion: Online Volumetric Semantic Mapping at the Level of Stuff and Things. IROS 2019 (to appear)
Occipital-SCS0.688 601.000 10.913 530.730 600.737 500.743 650.442 560.855 160.655 600.546 630.546 590.263 660.508 640.889 550.568 700.771 690.705 510.889 700.625 49
3D-BoNet0.687 611.000 10.887 610.836 310.587 680.643 750.550 460.620 560.724 530.522 670.501 650.243 670.512 631.000 10.751 610.807 620.661 610.909 690.612 54
Bo Yang, Jianan Wang, Ronald Clark, Qingyong Hu, Sen Wang, Andrew Markham, Niki Trigoni: Learning Object Bounding Boxes for 3D Instance Segmentation on Point Clouds. NeurIPS 2019 Spotlight
ClickSeg_Instance0.685 621.000 10.818 680.600 700.715 550.795 570.557 420.533 650.591 670.601 520.519 630.429 610.638 460.938 520.706 640.817 580.624 650.944 600.502 65
PCJC0.684 631.000 10.895 600.757 560.659 610.862 420.189 740.739 430.606 640.712 350.581 530.515 530.650 330.857 570.357 750.785 670.631 630.889 700.635 48
SPG_WSIS0.678 641.000 10.880 630.836 310.701 570.727 670.273 690.607 580.706 550.541 650.515 640.174 700.600 510.857 570.716 630.846 490.711 501.000 10.506 64
One_Thing_One_Clickpermissive0.675 651.000 10.823 670.782 490.621 630.766 600.211 710.736 440.560 690.586 560.522 620.636 370.453 690.641 690.853 440.850 470.694 550.997 420.411 70
Zhengzhe Liu, Xiaojuan Qi, Chi-Wing Fu: One Thing One Click: A Self-Training Approach for Weakly Supervised 3D Semantic Segmentation. CVPR 2021
SegGroup_inspermissive0.637 661.000 10.923 510.593 710.561 690.746 640.143 760.504 680.766 480.485 700.442 680.372 620.530 610.714 660.815 520.775 680.673 601.000 10.431 69
An Tao, Yueqi Duan, Yi Wei, Jiwen Lu, Jie Zhou: SegGroup: Seg-Level Supervision for 3D Instance and Semantic Segmentation. TIP 2022
MASCpermissive0.615 670.711 740.802 690.540 730.757 410.777 590.029 770.577 640.588 680.521 680.600 490.436 600.534 600.697 670.616 680.838 510.526 680.980 570.534 62
Chen Liu, Yasutaka Furukawa: MASC: Multi-scale Affinity with Sparse Convolution for 3D Instance Segmentation.
UNet-backbone0.605 681.000 10.909 540.764 540.603 660.704 690.415 580.301 730.548 700.461 710.394 690.267 650.386 710.857 570.649 670.817 570.504 700.959 580.356 73
3D-SISpermissive0.558 691.000 10.773 700.614 690.503 720.691 710.200 720.412 690.498 730.546 640.311 740.103 740.600 510.857 570.382 720.799 650.445 760.938 660.371 71
Ji Hou, Angela Dai, Matthias Niessner: 3D-SIS: 3D Semantic Instance Segmentation of RGB-D Scans. CVPR 2019
R-PointNet0.544 700.500 770.655 760.661 650.663 590.765 610.432 570.214 760.612 630.584 570.499 660.204 690.286 750.429 740.655 660.650 770.539 670.950 590.499 66
Hier3Dcopyleft0.540 711.000 10.727 710.626 680.467 750.693 700.200 720.412 690.480 740.528 660.318 730.077 770.600 510.688 680.382 720.768 700.472 720.941 640.350 74
Tan: HCFS3D: Hierarchical Coupled Feature Selection Network for 3D Semantic and Instance Segmentation.
Region-18class0.497 720.250 790.902 550.689 630.540 700.747 620.276 680.610 570.268 780.489 690.348 700.000 780.243 780.220 770.663 650.814 590.459 740.928 680.496 67
Sem_Recon_ins0.484 730.764 730.608 780.470 750.521 710.637 760.311 630.218 750.348 770.365 760.223 750.222 680.258 760.629 700.734 620.596 780.509 690.858 740.444 68
tmp0.474 741.000 10.727 710.433 770.481 740.673 730.022 790.380 710.517 720.436 740.338 720.128 720.343 730.429 740.291 770.728 730.473 710.833 750.300 76
SemRegionNet-20cls0.470 751.000 10.727 710.447 760.481 730.678 720.024 780.380 710.518 710.440 730.339 710.128 720.350 720.429 740.212 780.711 740.465 730.833 750.290 77
ASIS0.422 760.333 780.707 740.676 640.401 760.650 740.350 610.177 770.594 660.376 750.202 760.077 760.404 700.571 720.197 790.674 760.447 750.500 780.260 78
3D-BEVIS0.401 770.667 750.687 750.419 780.137 790.587 770.188 750.235 740.359 760.211 780.093 790.080 750.311 740.571 720.382 720.754 710.300 780.874 730.357 72
Cathrin Elich, Francis Engelmann, Jonas Schult, Theodora Kontogianni, Bastian Leibe: 3D-BEVIS: Birds-Eye-View Instance Segmentation.
Sgpn_scannet0.390 780.556 760.636 770.493 740.353 770.539 780.271 700.160 780.450 750.359 770.178 770.146 710.250 770.143 780.347 760.698 750.436 770.667 770.331 75
MaskRCNN 2d->3d Proj0.261 790.903 720.081 790.008 790.233 780.175 790.280 660.106 790.150 790.203 790.175 780.480 550.218 790.143 780.542 710.404 790.153 790.393 790.049 79
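
Each row above fuses the method name with a run of cells, where every cell appears to be a three-decimal score followed by that entry's rank in the column. The following is a minimal sketch of how such a row could be split back into cells, assuming that layout holds (one leading digit and three decimals per score, an integer rank). The names `CELL` and `parse_row` are hypothetical helpers for illustration and are not part of any benchmark tooling.

```python
import re

# One fused cell: a score with three decimals, a space, then an integer rank.
# The rank is matched lazily and must be followed by the next score or the
# end of the row, so "0.749 400.947" splits into (0.749, 40) and 0.947.
CELL = re.compile(r"(\d\.\d{3}) (\d+?)(?=\d\.\d{3}|$)")


def parse_row(row):
    """Split a flattened leaderboard row into (method, [(score, rank), ...])."""
    matches = list(CELL.finditer(row))
    if not matches:
        raise ValueError("no score/rank cells found in row")
    # Everything before the first cell is the method name. A license tag such
    # as "permissive" would stay attached to the name in this sketch.
    method = row[:matches[0].start()]
    cells = [(float(m.group(1)), int(m.group(2))) for m in matches]
    return method, cells


# The PointRel row, copied verbatim from the table above.
row = (
    "PointRel0.901 11.000 10.978 250.928 30.879 10.962 60.882 50.749 40"
    "0.947 30.912 20.802 30.753 210.820 21.000 10.984 40.919 60.894 4"
    "1.000 10.815 17"
)

method, cells = parse_row(row)
avg_score, avg_rank = cells[0]   # first cell is the "avg ap 25%" column
per_class = cells[1:]            # remaining cells are the 18 class columns

# Unweighted mean of the per-class scores, for comparison with the avg column.
mean_ap = sum(score for score, _ in per_class) / len(per_class)
print(method, avg_score, round(mean_ap, 3))
```

For this row the unweighted mean of the 18 per-class scores rounds to the value in the average column, which suggests that the average is a plain per-class mean. That is an observation from the listed numbers, not a statement of the benchmark's evaluation code.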


This table lists the benchmark results for the 2D semantic label scenario.


Method | Info | avg iou | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | floor | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | wall | window
Virtual MVFusion (R)0.745 10.861 10.839 10.881 10.672 20.512 10.422 190.898 10.723 10.714 10.954 20.454 10.509 10.773 10.895 10.756 10.820 10.653 10.935 10.891 10.728 1
Abhijit Kundu, Xiaoqi Yin, Alireza Fathi, David Ross, Brian Brewington, Thomas Funkhouser, Caroline Pantofaru: Virtual Multi-view Fusion for 3D Semantic Segmentation. ECCV 2020
BPNet_2Dcopyleft0.670 20.822 30.795 30.836 20.659 30.481 20.451 150.769 50.656 30.567 40.931 30.395 60.390 60.700 40.534 40.689 110.770 20.574 30.865 110.831 30.675 6
Wenbo Hu, Hengshuang Zhao, Li Jiang, Jiaya Jia and Tien-Tsin Wong: Bidirectional Projection Network for Cross Dimension Scene Understanding. CVPR 2021 (Oral)
MVF-GNN(2D)0.636 30.606 160.794 40.434 170.688 10.337 80.464 140.798 40.632 50.589 30.908 90.420 20.329 140.743 20.594 20.738 20.676 50.527 40.906 20.818 60.715 3
CU-Hybrid-2D Net0.636 30.825 20.820 20.179 250.648 40.463 30.549 20.742 90.676 20.628 20.961 10.420 20.379 70.684 80.381 200.732 30.723 30.599 20.827 180.851 20.634 9
DVEFormer0.626 50.616 120.764 60.690 50.583 110.322 140.540 30.809 30.593 70.502 120.900 140.374 90.433 30.660 90.528 50.665 190.663 60.491 90.871 100.810 90.705 4
CMX0.613 60.681 90.725 130.502 130.634 60.297 190.478 120.830 20.651 40.537 70.924 40.375 70.315 160.686 70.451 150.714 50.543 230.504 60.894 70.823 50.688 5
DMMF_3d0.605 70.651 100.744 110.782 30.637 50.387 40.536 50.732 100.590 80.540 60.856 230.359 120.306 170.596 160.539 30.627 220.706 40.497 80.785 230.757 210.476 24
EMSANet0.600 80.716 40.746 100.395 200.614 90.382 50.523 60.713 130.571 120.503 100.922 70.404 50.397 50.655 100.400 170.626 230.663 60.469 140.900 40.827 40.577 16
Daniel Seichter, Söhnke Fischedick, Mona Köhler, Horst-Michael Gross: EMSANet: Efficient Multi-Task RGB-D Scene Analysis for Indoor Environments. IJCNN 2022
MCA-Net0.595 90.533 220.756 90.746 40.590 100.334 100.506 90.670 170.587 90.500 130.905 110.366 110.352 100.601 150.506 90.669 170.648 100.501 70.839 170.769 170.516 23
RFBNet0.592 100.616 120.758 80.659 60.581 120.330 110.469 130.655 200.543 150.524 80.924 40.355 140.336 120.572 190.479 110.671 150.648 100.480 110.814 210.814 70.614 12
FAN_NV_RVC0.586 110.510 230.764 60.079 280.620 80.330 110.494 100.753 70.573 100.556 50.884 180.405 40.303 180.718 30.452 140.672 140.658 80.509 50.898 50.813 80.727 2
WSGFormer0.585 120.706 50.708 180.434 170.574 140.283 220.538 40.759 60.542 170.482 170.924 40.351 160.333 130.614 120.393 180.692 100.551 220.461 150.874 90.809 100.673 7
DCRedNet0.583 130.682 80.723 140.542 120.510 220.310 160.451 150.668 180.549 140.520 90.920 80.375 70.446 20.528 220.417 160.670 160.577 190.478 120.862 120.806 110.628 11
MIX6D_RVC0.582 140.695 60.687 190.225 230.632 70.328 130.550 10.748 80.623 60.494 160.890 160.350 170.254 250.688 60.454 130.716 40.597 180.489 100.881 80.768 180.575 17
SSMAcopyleft0.577 150.695 60.716 160.439 150.563 160.314 150.444 170.719 110.551 130.503 100.887 170.346 180.348 110.603 140.353 220.709 60.600 160.457 160.901 30.786 130.599 15
Abhinav Valada, Rohit Mohan, Wolfram Burgard: Self-Supervised Model Adaptation for Multimodal Semantic Segmentation. International Journal of Computer Vision, 2019
DMMF0.567 160.623 110.767 50.238 220.571 150.347 60.413 210.719 110.472 220.418 240.895 150.357 130.260 240.696 50.523 80.666 180.642 120.437 200.895 60.793 120.603 14
UNIV_CNP_RVC_UE0.566 170.569 210.686 210.435 160.524 190.294 200.421 200.712 140.543 150.463 190.872 190.320 190.363 90.611 130.477 120.686 120.627 130.443 190.862 120.775 160.639 8
EMSAFormer0.564 180.581 180.736 120.564 110.546 180.219 250.517 70.675 160.486 210.427 230.904 120.352 150.320 150.589 170.528 50.708 70.464 260.413 240.847 160.786 130.611 13
SN_RN152pyrx8_RVCcopyleft0.546 190.572 190.663 230.638 80.518 200.298 180.366 260.633 230.510 190.446 210.864 210.296 220.267 210.542 210.346 230.704 80.575 200.431 210.853 150.766 190.630 10
UDSSEG_RVC0.545 200.610 150.661 240.588 90.556 170.268 230.482 110.642 220.572 110.475 180.836 250.312 200.367 80.630 110.189 250.639 210.495 250.452 170.826 190.756 220.541 19
segfomer with 6d0.542 210.594 170.687 190.146 260.579 130.308 170.515 80.703 150.472 220.498 140.868 200.369 100.282 190.589 170.390 190.701 90.556 210.416 230.860 140.759 200.539 21
FuseNetpermissive0.535 220.570 200.681 220.182 240.512 210.290 210.431 180.659 190.504 200.495 150.903 130.308 210.428 40.523 230.365 210.676 130.621 150.470 130.762 240.779 150.541 19
Caner Hazirbas, Lingni Ma, Csaba Domokos, Daniel Cremers: FuseNet: Incorporating Depth into Semantic Segmentation via Fusion-based CNN Architecture. ACCV 2016
AdapNet++copyleft0.503 230.613 140.722 150.418 190.358 280.337 80.370 250.479 260.443 240.368 260.907 100.207 250.213 270.464 260.525 70.618 240.657 90.450 180.788 220.721 250.408 27
Abhinav Valada, Rohit Mohan, Wolfram Burgard: Self-Supervised Model Adaptation for Multimodal Semantic Segmentation. International Journal of Computer Vision, 2019
3DMV (2d proj)0.498 240.481 260.612 250.579 100.456 240.343 70.384 230.623 240.525 180.381 250.845 240.254 240.264 230.557 200.182 260.581 260.598 170.429 220.760 250.661 270.446 26
Angela Dai, Matthias Niessner: 3DMV: Joint 3D-Multi-View Prediction for 3D Semantic Scene Segmentation. ECCV'18
MSeg1080_RVCpermissive0.485 250.505 240.709 170.092 270.427 250.241 240.411 220.654 210.385 280.457 200.861 220.053 280.279 200.503 240.481 100.645 200.626 140.365 260.748 260.725 240.529 22
John Lambert*, Zhuang Liu*, Ozan Sener, James Hays, Vladlen Koltun: MSeg: A Composite Dataset for Multi-domain Semantic Segmentation. CVPR 2020
ILC-PSPNet0.475 260.490 250.581 260.289 210.507 230.067 280.379 240.610 250.417 260.435 220.822 270.278 230.267 210.503 240.228 240.616 250.533 240.375 250.820 200.729 230.560 18
Enet (reimpl)0.376 270.264 280.452 280.452 140.365 260.181 260.143 280.456 270.409 270.346 270.769 280.164 260.218 260.359 270.123 280.403 280.381 280.313 280.571 270.685 260.472 25
Re-implementation of Adam Paszke, Abhishek Chaurasia, Sangpil Kim, Eugenio Culurciello: ENet: A Deep Neural Network Architecture for Real-Time Semantic Segmentation.
ScanNet (2d proj)permissive0.330 280.293 270.521 270.657 70.361 270.161 270.250 270.004 280.440 250.183 280.836 250.125 270.060 280.319 280.132 270.417 270.412 270.344 270.541 280.427 280.109 28
Angela Dai, Angel X. Chang, Manolis Savva, Maciej Halber, Thomas Funkhouser, Matthias Nießner: ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes. CVPR'17


This table lists the benchmark results for the 2D semantic instance scenario.




Method | Info | avg ap | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | window
EMSANet (Instance) |  | 0.241 (1) | 0.401 (1) | 0.439 (1) | 0.085 (1) | 0.242 (1) | 0.220 (1) | 0.081 (1) | 0.289 (2) | 0.117 (2) | 0.121 (1) | 0.182 (1) | 0.126 (1) | 0.346 (1) | 0.181 (2) | 0.181 (2) | 0.358 (1) | 0.156 (1) | 0.675 (2) | 0.131 (1)
Daniel Seichter, Söhnke Fischedick, Mona Köhler, Horst-Michael Gross: EMSANet: Efficient Multi-Task RGB-D Scene Analysis for Indoor Environments. IJCNN 2022
UniDet_RVC |  | 0.205 (2) | 0.381 (2) | 0.323 (3) | 0.037 (3) | 0.226 (3) | 0.177 (3) | 0.063 (2) | 0.277 (3) | 0.120 (1) | 0.067 (3) | 0.131 (3) | 0.074 (3) | 0.317 (2) | 0.080 (3) | 0.235 (1) | 0.289 (3) | 0.141 (3) | 0.678 (1) | 0.080 (3)
FKNet |  | 0.204 (3) | 0.334 (3) | 0.358 (2) | 0.038 (2) | 0.234 (2) | 0.184 (2) | 0.025 (3) | 0.318 (1) | 0.042 (4) | 0.088 (2) | 0.141 (2) | 0.053 (4) | 0.300 (3) | 0.207 (1) | 0.171 (3) | 0.292 (2) | 0.149 (2) | 0.636 (3) | 0.109 (2)
MaskRCNN_ScanNet | permissive | 0.119 (4) | 0.129 (4) | 0.212 (4) | 0.002 (4) | 0.112 (4) | 0.148 (4) | 0.014 (4) | 0.205 (4) | 0.044 (3) | 0.066 (4) | 0.078 (4) | 0.095 (2) | 0.142 (4) | 0.030 (4) | 0.128 (4) | 0.139 (4) | 0.080 (4) | 0.459 (4) | 0.057 (4)
Re-implementation of Kaiming He, Georgia Gkioxari, Piotr Dollár, Ross Girshick: Mask R-CNN. ICCV'17
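
The avg ap value appears to be the unweighted mean of the 18 per-class AP scores. The following is a minimal sketch that checks this against the EMSANet (Instance) row; the equal-weight averaging is an inference from the published numbers rather than a documented formula, and the names `emsanet_ap` and `mean_ap` are illustrative only.

```python
# Per-class AP scores copied from the EMSANet (Instance) row of the table.
emsanet_ap = {
    "bathtub": 0.401, "bed": 0.439, "bookshelf": 0.085, "cabinet": 0.242,
    "chair": 0.220, "counter": 0.081, "curtain": 0.289, "desk": 0.117,
    "door": 0.121, "otherfurniture": 0.182, "picture": 0.126,
    "refrigerator": 0.346, "shower curtain": 0.181, "sink": 0.181,
    "sofa": 0.358, "table": 0.156, "toilet": 0.675, "window": 0.131,
}
reported_avg_ap = 0.241  # avg ap listed for the same row


def mean_ap(per_class):
    """Unweighted mean over classes (assumed, not confirmed by the benchmark)."""
    return sum(per_class.values()) / len(per_class)


avg = mean_ap(emsanet_ap)
# The recomputed mean should agree with the listed value to three decimals.
print(f"recomputed: {avg:.3f}, reported: {reported_avg_ap:.3f}")
```

Under this assumption every class contributes equally, so a method that scores poorly on a single class (for example bookshelf) is penalized as much as one that scores poorly on a frequent class.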


This table lists the benchmark results for the scene type classification scenario.




Method | Info | avg recall | apartment | bathroom | bedroom / hotel | bookstore / library | conference room | copy/mail room | hallway | kitchen | laundry room | living room / lounge | misc | office | storage / basement / garage
LAST-PCL-type |  | 0.780 (1) | 0.250 (3) | 1.000 (1) | 1.000 (1) | 1.000 (1) | 1.000 (1) | 1.000 (1) | 0.500 (2) | 1.000 (1) | 0.500 (2) | 0.889 (1) | 0.000 (2) | 1.000 (1) | 1.000 (1)
Yanmin Wu, Qiankun Gao, Renrui Zhang, and Jian Zhang: Language-Assisted 3D Scene Understanding. arxiv23.12
multi-task | permissive | 0.700 (2) | 0.500 (1) | 1.000 (1) | 0.882 (3) | 0.500 (3) | 1.000 (1) | 1.000 (1) | 0.500 (2) | 1.000 (1) | 1.000 (1) | 0.778 (2) | 0.000 (2) | 0.938 (2) | 0.000 (3)
Shengyu Huang, Mikhail Usvyatsov, Konrad Schindler: Indoor Scene Recognition in 3D. IROS 2020
3DASPP-SCE |  | 0.691 (3) | 0.500 (1) | 0.938 (3) | 0.824 (4) | 1.000 (1) | 1.000 (1) | 0.500 (3) | 1.000 (1) | 0.857 (3) | 0.500 (2) | 0.556 (4) | 0.000 (2) | 0.812 (3) | 0.500 (2)
SE-ResNeXt-SSMA |  | 0.498 (4) | 0.000 (5) | 0.812 (4) | 0.941 (2) | 0.500 (3) | 0.500 (4) | 0.500 (3) | 0.500 (2) | 0.429 (5) | 0.500 (2) | 0.667 (3) | 0.500 (1) | 0.625 (4) | 0.000 (3)
Abhinav Valada, Rohit Mohan, Wolfram Burgard: Self-Supervised Model Adaptation for Multimodal Semantic Segmentation. arXiv
resnet50_scannet |  | 0.353 (5) | 0.250 (3) | 0.812 (4) | 0.529 (5) | 0.500 (3) | 0.500 (4) | 0.000 (5) | 0.500 (2) | 0.571 (4) | 0.000 (5) | 0.556 (4) | 0.000 (2) | 0.375 (5) | 0.000 (3)
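
The small integer printed after each score is the method's rank within that column. The ranks listed in the scene type table are consistent with standard competition ranking, where tied scores share the best rank and the following rank is skipped. The sketch below reproduces the ranks of the apartment column under that assumption; `competition_rank` is a hypothetical helper, not part of any benchmark tooling.

```python
def competition_rank(scores):
    """Rank higher scores first; ties share the best rank (1, 1, 3, 3, 5)."""
    return {
        name: 1 + sum(other > score for other in scores.values())
        for name, score in scores.items()
    }


# Recall on the "apartment" scene type, copied from the table.
apartment_recall = {
    "LAST-PCL-type": 0.250,
    "multi-task": 0.500,
    "3DASPP-SCE": 0.500,
    "SE-ResNeXt-SSMA": 0.000,
    "resnet50_scannet": 0.250,
}

ranks = competition_rank(apartment_recall)
for method, rank in ranks.items():
    print(method, apartment_recall[method], rank)
```

Because ties share a rank, several methods can hold rank 1 in a column, as multi-task and 3DASPP-SCE do for apartment.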