Presenting the ScanNet200 Benchmark

We present the ScanNet200 benchmark, which studies an order of magnitude more class categories than the previous ScanNet benchmark. The two benchmarks share the same scene geometry, but the surface annotations are parsed into a larger vocabulary, which gives a more realistic setting for in-the-wild 3D understanding methods.

The ScanNet200 benchmark includes both finer-grained categories and a large number of previously unaddressed classes. This makes for a much more challenging setting: it covers the diversity of semantic classes naturally observed in the raw ScanNet RGB-D data, and the data also reflects naturally encountered class imbalances. The difference in category frequencies between ScanNet and ScanNet200 can be seen in the Figure above.

ScanNet200 Benchmark

This table lists the benchmark results for the ScanNet200 3D semantic label scenario.




Method | Info | avg iou | head iou | common iou | tail iou | alarm clock | armchair | backpack | bag | ball | bar | basket | bathroom cabinet | bathroom counter | bathroom stall | bathroom stall door | bathroom vanity | bathtub | bed | bench | bicycle | bin | blackboard | blanket | blinds | board | book | bookshelf | bottle | bowl | box | broom | bucket | bulletin board | cabinet | calendar | candle | cart | case of water bottles | cd case | ceiling | ceiling light | chair | clock | closet | closet door | closet rod | closet wall | clothes | clothes dryer | coat rack | coffee kettle | coffee maker | coffee table | column | computer tower | container | copier | couch | counter | crate | cup | curtain | cushion | decoration | desk | dining table | dish rack | dishwasher | divider | door | doorframe | dresser | dumbbell | dustpan | end table | fan | file cabinet | fire alarm | fire extinguisher | fireplace | floor | folded chair | furniture | guitar | guitar case | hair dryer | handicap bar | hat | headphones | ironing board | jacket | keyboard | keyboard piano | kitchen cabinet | kitchen counter | ladder | lamp | laptop | laundry basket | laundry detergent | laundry hamper | ledge | light | light switch | luggage | machine | mailbox | mat | mattress | microwave | mini fridge | mirror | monitor | mouse | music stand | nightstand | object | office chair | ottoman | oven | paper | paper bag | paper cutter | paper towel dispenser | paper towel roll | person | piano | picture | pillar | pillow | pipe | plant | plate | plunger | poster | potted plant | power outlet | power strip | printer | projector | projector screen | purse | rack | radiator | rail | range hood | recycling bin | refrigerator | scale | seat | shelf | shoe | shower | shower curtain | shower curtain rod | shower door | shower floor | shower head | shower wall | sign | sink | soap dish | soap dispenser | sofa chair | speaker | stair rail | stairs | stand | stool | storage bin | storage container | storage organizer | stove | structure | stuffed animal | suitcase | table | telephone | tissue box | toaster | toaster oven | toilet | toilet paper | toilet paper dispenser | toilet paper holder | toilet seat cover dispenser | towel | trash bin | trash can | tray | tube | tv | tv stand | vacuum cleaner | vent | wall | wardrobe | washing machine | water bottle | water cooler | water pitcher | whiteboard | window | windowsill
IMFSegNet0.334 90.532 130.251 110.179 70.486 90.041 160.139 130.003 10.283 40.000 10.274 150.191 150.457 140.704 140.795 70.197 90.830 60.000 30.710 90.055 160.064 40.518 60.305 100.458 170.216 120.027 50.284 130.000 10.000 30.044 120.406 100.561 70.000 10.080 120.000 30.873 90.021 150.683 80.000 70.076 90.494 100.363 90.648 160.000 10.000 20.425 90.649 40.000 100.668 120.908 70.740 110.010 140.206 80.862 100.000 10.000 110.560 90.000 70.359 130.237 110.631 120.408 110.411 40.322 150.246 40.439 100.599 130.047 40.213 70.940 100.139 110.000 10.369 50.124 100.188 120.495 110.624 110.626 80.320 140.595 40.495 80.496 100.000 40.000 10.340 120.014 60.032 70.135 50.000 40.903 80.277 60.612 80.196 70.344 120.848 130.260 60.000 10.574 130.073 160.062 40.000 40.000 10.091 60.839 30.776 30.123 120.392 90.756 120.274 50.518 120.029 160.842 40.000 60.357 130.000 10.035 70.000 30.444 120.793 20.245 50.000 10.512 160.512 150.159 150.713 130.000 100.000 10.336 130.484 120.569 20.852 90.615 60.120 120.068 100.228 80.000 10.733 100.773 20.190 40.000 100.608 60.792 40.000 10.597 70.000 140.025 20.000 10.573 170.000 20.000 10.508 110.555 80.363 100.139 120.610 20.947 80.305 70.594 90.527 90.009 170.633 130.000 10.060 30.820 50.604 150.799 90.000 10.799 110.034 140.784 130.000 10.618 60.424 20.134 160.646 130.214 14
OA-CNN-L_ScanNet2000.333 110.558 50.269 90.124 130.448 140.080 90.272 50.000 30.000 70.000 10.342 80.515 40.524 70.713 130.789 90.158 120.384 120.000 30.806 60.125 70.000 90.496 80.332 70.498 140.227 80.024 60.474 30.000 10.003 20.071 90.487 30.000 110.000 10.110 80.000 30.876 70.013 170.703 30.000 70.076 90.473 120.355 110.906 60.000 10.000 20.476 60.706 10.000 100.672 100.835 130.748 90.015 130.223 70.860 110.000 10.000 110.572 70.000 70.509 70.313 70.662 40.398 130.396 80.411 130.276 20.527 40.711 50.000 70.076 130.946 60.166 60.000 10.022 100.160 70.183 130.493 130.699 90.637 60.403 60.330 120.406 130.526 60.024 20.000 10.392 110.000 110.016 160.000 120.196 30.915 50.112 120.557 100.197 60.352 100.877 30.000 120.000 10.592 120.103 110.000 140.067 10.000 10.089 70.735 70.625 110.130 90.568 60.836 70.271 80.534 90.043 130.799 110.001 50.445 50.000 10.000 80.024 20.661 40.000 50.262 30.000 10.591 80.517 130.373 80.788 70.021 80.000 10.455 40.517 90.320 80.823 120.200 160.001 170.150 50.100 120.000 10.736 90.668 100.103 140.052 60.662 40.720 80.000 10.602 60.112 70.002 60.000 10.637 90.000 20.000 10.621 100.569 50.398 90.412 50.234 120.949 60.363 50.492 140.495 110.251 40.665 90.000 10.001 110.805 70.833 60.794 110.000 10.821 50.314 50.843 110.000 10.560 100.245 70.262 60.713 40.370 11
CSC-Pretrainpermissive0.249 170.455 170.171 160.079 170.418 150.059 140.186 100.000 30.000 70.000 10.335 100.250 130.316 160.766 70.697 170.142 130.170 140.003 20.553 140.112 90.097 10.201 160.186 140.476 150.081 160.000 90.216 170.000 10.000 30.001 170.314 170.000 110.000 10.055 150.000 30.832 160.094 30.659 150.002 50.076 90.310 160.293 170.664 140.000 10.000 20.175 170.634 60.130 20.552 170.686 170.700 170.076 70.110 150.770 170.000 10.000 110.430 170.000 70.319 150.166 150.542 170.327 160.205 160.332 140.052 160.375 130.444 170.000 70.012 170.930 170.203 30.000 10.000 120.046 120.175 140.413 160.592 140.471 160.299 150.152 160.340 160.247 170.000 40.000 10.225 150.058 30.037 40.000 120.207 20.862 150.014 140.548 130.033 160.233 160.816 160.000 120.000 10.542 150.123 50.121 10.019 20.000 10.000 110.463 160.454 170.045 170.128 170.557 150.235 140.441 160.063 110.484 170.000 60.308 170.000 10.000 80.000 30.318 170.000 50.000 90.000 10.545 140.543 120.164 140.734 90.000 100.000 10.215 170.371 160.198 140.743 140.205 150.062 150.000 110.079 140.000 10.683 160.547 160.142 90.000 100.441 110.579 150.000 10.464 140.098 90.041 10.000 10.590 140.000 20.000 10.373 130.494 140.174 150.105 160.001 170.895 160.222 160.537 120.307 160.180 50.625 140.000 10.000 120.591 170.609 140.398 150.000 10.766 170.014 160.638 170.000 10.377 130.004 130.206 130.609 170.465 5
Ji Hou, Benjamin Graham, Matthias Nießner, Saining Xie: Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts. CVPR 2021
LGroundpermissive0.272 150.485 150.184 150.106 150.476 110.077 100.218 80.000 30.000 70.000 10.547 20.295 110.540 50.746 100.745 150.058 160.112 160.005 10.658 110.077 150.000 90.322 140.178 160.512 110.190 130.199 20.277 150.000 10.000 30.173 70.399 120.000 110.000 10.039 160.000 30.858 140.085 70.676 110.002 50.103 60.498 80.323 140.703 120.000 10.000 20.296 150.549 120.216 10.702 60.768 140.718 140.028 100.092 160.786 160.000 10.000 110.453 160.022 50.251 170.252 90.572 150.348 140.321 110.514 70.063 150.279 160.552 150.000 70.019 160.932 150.132 150.000 10.000 120.000 150.156 170.457 150.623 120.518 140.265 160.358 110.381 150.395 140.000 40.000 10.127 170.012 80.051 10.000 120.000 40.886 130.014 140.437 170.179 80.244 150.826 150.000 120.000 10.599 100.136 10.085 30.000 40.000 10.000 110.565 130.612 130.143 50.207 150.566 140.232 150.446 150.127 40.708 150.000 60.384 90.000 10.000 80.000 30.402 140.000 50.059 70.000 10.525 150.566 110.229 120.659 150.000 100.000 10.265 150.446 140.147 160.720 170.597 80.066 140.000 110.187 90.000 10.726 130.467 170.134 120.000 100.413 150.629 120.000 10.363 160.055 100.022 30.000 10.626 110.000 20.000 10.323 150.479 170.154 160.117 150.028 160.901 150.243 150.415 160.295 170.143 60.610 160.000 10.000 120.777 120.397 170.324 160.000 10.778 150.179 80.702 160.000 10.274 160.404 40.233 100.622 150.398 7
David Rozenberszki, Or Litany, Angela Dai: Language-Grounded Indoor 3D Semantic Segmentation in the Wild. arXiv
AWCS0.305 140.508 140.225 140.142 110.463 130.063 130.195 90.000 30.000 70.000 10.467 30.551 30.504 80.773 60.764 140.142 130.029 170.000 30.626 130.100 110.000 90.360 130.179 150.507 130.137 150.006 80.300 120.000 10.000 30.172 80.364 150.512 90.000 10.056 140.000 30.865 130.093 40.634 170.000 70.071 130.396 140.296 160.876 90.000 10.000 20.373 130.436 160.063 90.749 20.877 100.721 120.131 30.124 140.804 150.000 10.000 110.515 120.010 60.452 100.252 90.578 140.417 80.179 170.484 100.171 70.337 140.606 120.000 70.115 100.937 140.142 90.000 10.008 110.000 150.157 160.484 140.402 170.501 150.339 90.553 70.529 30.478 120.000 40.000 10.404 100.001 100.022 130.077 90.000 40.894 120.219 70.628 70.093 150.305 140.886 10.233 90.000 10.603 90.112 60.023 90.000 40.000 10.000 110.741 60.664 80.097 150.253 140.782 100.264 110.523 110.154 20.707 160.000 60.411 80.000 10.000 80.000 30.332 160.000 50.000 90.000 10.602 70.595 100.185 130.656 160.159 60.000 10.355 110.424 150.154 150.729 150.516 100.220 100.620 30.084 130.000 10.707 140.651 130.173 50.014 90.381 170.582 140.000 10.619 30.049 120.000 70.000 10.702 40.000 20.000 10.302 160.489 150.317 130.334 70.392 70.922 140.254 130.533 130.394 130.129 140.613 150.000 10.000 120.820 50.649 110.749 130.000 10.782 140.282 60.863 60.000 10.288 150.006 120.220 110.633 140.542 3
: Long-Tailed 3D Semantic Segmentation with Adaptive Weight Constraint and Sampling. ICRA 2024
CeCo0.340 70.551 90.247 130.181 60.475 120.057 150.142 120.000 30.000 70.000 10.387 60.463 60.499 90.924 20.774 110.213 60.257 130.000 30.546 150.100 110.006 80.615 20.177 170.534 70.246 60.000 90.400 50.000 10.338 10.006 160.484 50.609 50.000 10.083 110.000 30.873 90.089 50.661 140.000 70.048 150.560 40.408 60.892 80.000 10.000 20.586 10.616 80.000 100.692 80.900 80.721 120.162 10.228 60.860 110.000 10.000 110.575 50.083 30.550 40.347 40.624 130.410 100.360 90.740 30.109 130.321 150.660 80.000 70.121 90.939 130.143 80.000 10.400 20.003 130.190 110.564 60.652 100.615 110.421 50.304 130.579 10.547 50.000 40.000 10.296 140.000 110.030 90.096 70.000 40.916 40.037 130.551 120.171 90.376 70.865 70.286 50.000 10.633 50.102 120.027 80.011 30.000 10.000 110.474 140.742 50.133 70.311 130.824 80.242 130.503 140.068 90.828 90.000 60.429 70.000 10.063 50.000 30.781 20.000 50.000 90.000 10.665 20.633 60.450 60.818 20.000 100.000 10.429 50.532 70.226 130.825 110.510 110.377 50.709 20.079 140.000 10.753 50.683 80.102 150.063 50.401 160.620 130.000 10.619 30.000 140.000 70.000 10.595 130.000 20.000 10.345 140.564 60.411 80.603 10.384 80.945 90.266 110.643 50.367 140.304 10.663 100.000 10.010 70.726 150.767 70.898 30.000 10.784 130.435 10.861 70.000 10.447 110.000 150.257 70.656 110.377 10
Zhisheng Zhong, Jiequan Cui, Yibo Yang, Xiaoyang Wu, Xiaojuan Qi, Xiangyu Zhang, Jiaya Jia: Understanding Imbalanced Semantic Segmentation Through Neural Collapse. CVPR 2023
OctFormer ScanNet200permissive0.326 130.539 100.265 100.131 120.499 60.110 40.522 30.000 30.000 70.000 10.318 110.427 70.455 150.743 110.765 130.175 110.842 40.000 30.828 50.204 40.033 60.429 110.335 60.601 20.312 30.000 90.357 100.000 10.000 30.047 110.423 90.000 110.000 10.105 90.000 30.873 90.079 90.670 120.000 70.117 50.471 130.432 30.829 110.000 10.000 20.584 20.417 170.089 60.684 90.837 120.705 160.021 120.178 110.892 60.000 10.028 80.505 130.000 70.457 90.200 140.662 40.412 90.244 150.496 80.000 170.451 80.626 90.000 70.102 110.943 90.138 130.000 10.000 120.149 80.291 30.534 90.722 70.632 70.331 100.253 140.453 110.487 110.000 40.000 10.479 60.000 110.022 130.000 120.000 40.900 100.128 110.684 30.164 100.413 40.854 100.000 120.000 10.512 160.074 150.003 110.000 40.000 10.000 110.469 150.613 120.132 80.529 70.871 30.227 160.582 70.026 170.787 120.000 60.339 150.000 10.000 80.000 30.626 70.000 50.029 80.000 10.587 90.612 80.411 70.724 100.000 100.000 10.407 60.552 50.513 30.849 100.655 40.408 40.000 110.296 20.000 10.686 150.645 140.145 80.022 80.414 140.633 110.000 10.637 20.224 30.000 70.000 10.650 80.000 20.000 10.622 90.535 120.343 120.483 30.230 130.943 100.289 100.618 70.596 50.140 80.679 80.000 10.022 60.783 110.620 120.906 10.000 10.806 80.137 100.865 50.000 10.378 120.000 150.168 150.680 80.227 13
Peng-Shuai Wang: OctFormer: Octree-based Transformers for 3D Point Clouds. SIGGRAPH 2023
PPT-SpUNet-F.T.0.332 120.556 60.270 70.123 140.519 40.091 70.349 40.000 30.000 70.000 10.339 90.383 100.498 100.833 40.807 40.241 40.584 90.000 30.755 70.124 80.000 90.608 30.330 80.530 90.314 20.000 90.374 80.000 10.000 30.197 50.459 70.000 110.000 10.117 60.000 30.876 70.095 20.682 90.000 70.086 80.518 70.433 20.930 40.000 10.000 20.563 30.542 140.077 70.715 40.858 110.756 50.008 160.171 120.874 80.000 10.039 70.550 110.000 70.545 50.256 80.657 80.453 40.351 100.449 110.213 60.392 120.611 110.000 70.037 150.946 60.138 130.000 10.000 120.063 110.308 20.537 80.796 50.673 40.323 110.392 100.400 140.509 70.000 40.000 10.649 10.000 110.023 120.000 120.000 40.914 60.002 160.506 160.163 110.359 80.872 50.000 120.000 10.623 70.112 60.001 120.000 40.000 10.021 90.753 50.565 150.150 40.579 40.806 90.267 90.616 40.042 140.783 130.000 60.374 110.000 10.000 80.000 30.620 80.000 50.000 90.000 10.572 130.634 50.350 90.792 50.000 100.000 10.376 90.535 60.378 60.855 70.672 30.074 130.000 110.185 100.000 10.727 120.660 120.076 170.000 100.432 120.646 100.000 10.594 80.006 130.000 70.000 10.658 70.000 20.000 10.661 40.549 100.300 140.291 80.045 140.942 110.304 80.600 80.572 70.135 120.695 50.000 10.008 90.793 90.942 20.899 20.000 10.816 60.181 70.897 20.000 10.679 40.223 80.264 50.691 50.345 12
Xiaoyang Wu, Zhuotao Tian, Xin Wen, Bohao Peng, Xihui Liu, Kaicheng Yu, Hengshuang Zhao: Towards Large-scale 3D Representation Learning with Multi-dataset Point Prompt Training. CVPR 2024
L3DETR-ScanNet_2000.336 80.533 110.279 60.155 100.508 50.073 110.101 170.000 30.058 60.000 10.294 140.233 140.548 40.927 10.788 100.264 20.463 110.000 30.638 120.098 130.014 70.411 120.226 130.525 100.225 90.010 70.397 60.000 10.000 30.192 60.380 140.598 60.000 10.117 60.000 30.883 60.082 80.689 40.000 70.032 170.549 60.417 40.910 50.000 10.000 20.448 80.613 90.000 100.697 70.960 30.759 40.158 20.293 30.883 70.000 10.312 30.583 40.079 40.422 110.068 170.660 70.418 70.298 120.430 120.114 110.526 50.776 30.051 30.679 30.946 60.152 70.000 10.183 80.000 150.211 80.511 100.409 160.565 120.355 80.448 80.512 50.557 30.000 40.000 10.420 90.000 110.007 170.104 60.000 40.125 170.330 30.514 150.146 120.321 130.860 80.174 110.000 10.629 60.075 140.000 140.000 40.000 10.002 100.671 80.712 70.141 60.339 120.856 40.261 120.529 100.067 100.835 60.000 60.369 120.000 10.259 20.000 30.629 60.000 50.487 10.000 10.579 110.646 40.107 170.720 110.122 70.000 10.333 140.505 100.303 90.908 30.503 130.565 20.074 80.324 10.000 10.740 80.661 110.109 130.000 100.427 130.563 170.000 10.579 110.108 80.000 70.000 10.664 60.000 20.000 10.641 70.539 110.416 70.515 20.256 110.940 120.312 60.209 170.620 30.138 110.636 110.000 10.000 120.775 130.861 50.765 120.000 10.801 90.119 110.860 80.000 10.687 20.001 140.192 140.679 90.699 1
Yanmin Wu, Qiankun Gao, Renrui Zhang, Jian Zhang: Language-Assisted 3D Scene Understanding. arXiv 23.12
GSTran0.334 100.533 120.250 120.179 80.487 80.041 160.139 130.003 10.273 50.000 10.273 160.189 160.465 120.704 140.794 80.198 80.831 50.000 30.712 80.055 160.063 50.518 60.306 90.459 160.217 100.028 40.282 140.000 10.000 30.044 120.405 110.558 80.000 10.080 120.000 30.873 90.020 160.684 70.000 70.075 120.496 90.363 90.651 150.000 10.000 20.425 90.648 50.000 100.669 110.914 60.741 100.009 150.200 90.864 90.000 10.000 110.560 90.000 70.357 140.233 120.633 110.408 110.411 40.320 160.242 50.440 90.598 140.047 40.205 80.940 100.139 110.000 10.372 40.138 90.191 100.495 110.618 130.624 90.321 120.595 40.496 70.499 80.000 40.000 10.340 120.014 60.032 70.136 40.000 40.903 80.279 50.601 90.198 50.345 110.849 110.260 60.000 10.573 140.072 170.060 50.000 40.000 10.089 70.838 40.775 40.125 110.381 110.752 130.274 50.517 130.032 150.841 50.000 60.354 140.000 10.047 60.000 30.439 130.787 30.252 40.000 10.512 160.507 160.158 160.717 120.000 100.000 10.337 120.483 130.570 10.853 80.614 70.121 110.070 90.229 70.000 10.732 110.773 20.193 30.000 100.606 70.791 50.000 10.593 90.000 140.010 50.000 10.574 160.000 20.000 10.507 120.554 90.361 110.136 130.608 30.948 70.304 80.593 100.533 80.011 160.634 120.000 10.060 30.821 40.613 130.797 100.000 10.799 110.036 130.782 140.000 10.609 70.423 30.133 170.647 120.213 15
PTv3 ScanNet2000.393 30.592 30.330 20.216 30.520 30.109 50.108 160.000 30.337 10.000 10.310 120.394 90.494 110.753 90.848 20.256 30.717 80.000 30.842 40.192 50.065 30.449 100.346 40.546 60.190 130.000 90.384 70.000 10.000 30.218 40.505 20.791 30.000 10.136 40.000 30.903 20.073 120.687 60.000 70.168 20.551 50.387 70.941 30.000 10.000 20.397 120.654 30.000 100.714 50.759 150.752 70.118 40.264 40.926 30.000 10.048 60.575 50.000 70.597 20.366 20.755 10.469 20.474 30.798 20.140 100.617 30.692 70.000 70.592 40.971 20.188 40.000 10.133 90.593 20.349 10.650 30.717 80.699 30.455 20.790 20.523 40.636 10.301 10.000 10.622 20.000 110.017 150.259 30.000 40.921 30.337 10.733 20.210 40.514 20.860 80.407 10.000 10.688 20.109 80.000 140.000 40.000 10.151 50.671 80.782 20.115 130.641 20.903 20.349 10.616 40.088 70.832 80.000 60.480 20.000 10.428 10.000 30.497 100.000 50.000 90.000 10.662 30.690 20.612 10.828 10.575 10.000 10.404 70.644 20.325 70.887 40.728 10.009 160.134 70.026 170.000 10.761 30.731 40.172 60.077 40.528 80.727 70.000 10.603 50.220 50.022 30.000 10.740 10.000 20.000 10.661 40.586 20.566 40.436 40.531 50.978 30.457 20.708 30.583 60.141 70.748 30.000 10.026 50.822 30.871 40.879 50.000 10.851 20.405 20.914 10.000 10.682 30.000 150.281 40.738 30.463 6
Xiaoyang Wu, Li Jiang, Peng-Shuai Wang, Zhijian Liu, Xihui Liu, Yu Qiao, Wanli Ouyang, Tong He, Hengshuang Zhao: Point Transformer V3: Simpler, Faster, Stronger. CVPR 2024 (Oral)
PonderV2 ScanNet2000.346 60.552 80.270 80.175 90.497 70.070 120.239 70.000 30.000 70.000 10.232 170.412 80.584 20.842 30.804 50.212 70.540 100.000 30.433 160.106 100.000 90.590 50.290 120.548 50.243 70.000 90.356 110.000 10.000 30.062 100.398 130.441 100.000 10.104 100.000 30.888 50.076 110.682 90.030 30.094 70.491 110.351 120.869 100.000 10.063 10.403 110.700 20.000 100.660 130.881 90.761 30.050 80.186 100.852 130.000 10.007 90.570 80.100 20.565 30.326 60.641 100.431 60.290 140.621 60.259 30.408 110.622 100.125 20.082 120.950 50.179 50.000 10.263 60.424 50.193 90.558 70.880 40.545 130.375 70.727 30.445 120.499 80.000 40.000 10.475 70.002 90.034 60.083 80.000 40.924 20.290 40.636 60.115 140.400 50.874 40.186 100.000 10.611 80.128 30.113 20.000 40.000 10.000 110.584 120.636 100.103 140.385 100.843 60.283 40.603 60.080 80.825 100.000 60.377 100.000 10.000 80.000 30.457 110.000 50.000 90.000 10.574 120.608 90.481 40.792 50.394 50.000 10.357 100.503 110.261 100.817 130.504 120.304 70.472 40.115 110.000 10.750 70.677 90.202 20.000 100.509 90.729 60.000 10.519 120.000 140.000 70.000 10.620 120.000 20.000 10.660 60.560 70.486 60.384 60.346 100.952 50.247 140.667 40.436 120.269 30.691 60.000 10.010 70.787 100.889 30.880 40.000 10.810 70.336 40.860 80.000 10.606 80.009 110.248 90.681 70.392 9
Haoyi Zhu, Honghui Yang, Xiaoyang Wu, Di Huang, Sha Zhang, Xianglong He, Tong He, Hengshuang Zhao, Chunhua Shen, Yu Qiao, Wanli Ouyang: PonderV2: Pave the Way for 3D Foundation Model with A Universal Pre-training Paradigm.
ODIN - Sem200permissive0.368 40.562 40.297 40.207 40.380 170.196 10.828 20.000 30.321 20.000 10.400 50.775 10.460 130.501 170.769 120.065 150.870 30.000 30.913 10.213 30.000 90.000 170.389 20.554 40.312 30.000 90.591 10.000 10.000 30.491 10.487 30.894 20.000 10.378 20.303 10.796 170.088 60.669 130.081 10.216 10.256 170.334 130.898 70.000 10.000 20.370 140.599 100.000 100.581 160.988 20.749 80.090 60.242 50.921 40.000 10.202 50.609 20.000 70.655 10.214 130.654 90.346 150.408 70.485 90.169 80.631 20.704 60.000 70.814 10.940 100.127 160.000 10.000 120.462 40.227 60.641 40.885 30.657 50.434 30.000 170.550 20.393 150.000 40.000 10.590 40.000 110.048 20.077 90.000 40.784 160.131 100.557 100.316 20.359 80.833 140.373 20.000 10.661 40.108 90.001 120.000 40.000 10.301 30.612 110.565 150.129 100.482 80.468 160.274 50.561 80.376 10.912 20.181 10.440 60.000 10.166 40.000 30.641 50.000 50.426 20.000 10.642 50.626 70.259 110.787 80.429 40.000 10.589 10.523 80.246 110.857 60.000 170.228 90.000 110.265 40.000 10.752 60.832 10.090 160.157 10.791 10.578 160.000 10.373 150.539 10.000 70.000 10.685 50.000 20.000 10.632 80.575 30.663 10.152 110.358 90.926 130.397 30.454 150.610 40.119 150.685 70.000 10.000 120.803 80.740 90.441 140.000 10.800 100.000 170.871 30.000 10.220 170.487 10.862 10.682 60.054 17
Ayush Jain, Pushkal Katara, Nikolaos Gkanatsios, Adam W. Harley, Gabriel Sarch, Kriti Aggarwal, Vishrav Chaudhary, Katerina Fragkiadaki: ODIN: A Single Model for 2D and 3D Segmentation. CVPR 2024
ALS-MinkowskiNetcopyleft0.414 20.610 20.322 30.271 20.542 20.153 30.159 110.000 30.000 70.000 10.404 40.503 50.532 60.672 160.804 50.285 10.888 20.000 30.900 20.226 20.087 20.598 40.342 50.671 10.217 100.087 30.449 40.000 10.000 30.253 30.477 61.000 10.000 10.118 50.000 30.905 10.071 130.710 20.076 20.047 160.665 10.376 80.981 10.000 10.000 20.466 70.632 70.113 40.769 10.956 40.795 20.031 90.314 10.936 10.000 10.390 20.601 30.000 70.458 80.366 20.719 30.440 50.564 10.699 40.314 10.464 70.784 20.200 10.283 60.973 10.142 90.000 10.250 70.285 60.220 70.718 10.752 60.723 20.460 10.248 150.475 100.463 130.000 40.000 10.446 80.021 50.025 110.285 10.000 40.972 10.149 80.769 10.230 30.535 10.879 20.252 80.000 10.693 10.129 20.000 140.000 40.000 10.447 10.958 10.662 90.159 20.598 30.780 110.344 20.646 30.106 60.893 30.135 30.455 30.000 10.194 30.259 10.726 30.475 40.000 90.000 10.741 10.865 10.571 20.817 30.445 30.000 10.506 20.630 30.230 120.916 20.728 10.635 11.000 10.252 60.000 10.804 20.697 70.137 110.043 70.717 20.807 30.000 10.510 130.245 20.000 70.000 10.709 30.000 20.000 10.703 20.572 40.646 20.223 100.531 50.984 10.397 30.813 10.798 10.135 120.800 10.000 10.097 20.832 20.752 80.842 70.000 10.852 10.149 90.846 100.000 10.666 50.359 50.252 80.777 10.690 2
Guangda Ji, Silvan Weder, Francis Engelmann, Marc Pollefeys, Hermann Blum: ARKit LabelMaker: A New Scale for Indoor 3D Scene Understanding. CVPR 2025
BFANet ScanNet200permissive0.360 50.553 70.293 50.193 50.483 100.096 60.266 60.000 30.000 70.000 10.298 130.255 120.661 10.810 50.810 30.194 100.785 70.000 30.000 170.161 60.000 90.494 90.382 30.574 30.258 50.000 90.372 90.000 10.000 30.043 140.436 80.000 110.000 10.239 30.000 30.901 30.105 10.689 40.025 40.128 40.614 20.436 10.493 170.000 10.000 20.526 40.546 130.109 50.651 140.953 50.753 60.101 50.143 130.897 50.000 10.431 10.469 150.000 70.522 60.337 50.661 60.459 30.409 60.666 50.102 140.508 60.757 40.000 70.060 140.970 30.497 10.000 10.376 30.511 30.262 40.688 20.921 20.617 100.321 120.590 60.491 90.556 40.000 40.000 10.481 50.093 10.043 30.284 20.000 40.875 140.135 90.669 40.124 130.394 60.849 110.298 40.000 10.476 170.088 130.042 70.000 40.000 10.254 40.653 100.741 60.215 10.573 50.852 50.266 100.654 20.056 120.835 60.000 60.492 10.000 10.000 80.000 30.612 90.000 50.000 90.000 10.616 60.469 170.460 50.698 140.516 20.000 10.378 80.563 40.476 40.863 50.574 90.330 60.000 110.282 30.000 10.760 40.710 50.233 10.000 100.641 50.814 20.000 10.585 100.053 110.000 70.000 10.629 100.000 20.000 10.678 30.528 130.534 50.129 140.596 40.973 40.264 120.772 20.526 100.139 90.707 40.000 10.000 120.764 140.591 160.848 60.000 10.827 40.338 30.806 120.000 10.568 90.151 100.358 20.659 100.510 4
Weiguang Zhao, Rui Zhang, Qiufeng Wang, Guangliang Cheng, Kaizhu Huang: BFANet: Revisiting 3D Semantic Segmentation with Boundary Feature Analysis. CVPR 2025
DITR0.449 10.629 10.392 10.289 10.650 10.168 20.862 10.000 30.313 30.000 10.580 10.568 20.564 30.766 70.867 10.238 50.949 10.000 30.866 30.300 10.000 90.664 10.482 10.508 120.317 10.420 10.551 20.000 10.000 30.486 20.519 10.662 40.000 10.385 10.000 30.901 30.079 90.727 10.000 70.160 30.606 30.417 40.967 20.000 10.000 20.498 50.596 110.130 20.728 30.998 10.805 10.000 170.314 10.934 20.000 10.278 40.636 10.000 70.403 120.367 10.741 20.484 10.500 21.000 10.113 120.828 10.815 10.000 70.733 20.969 40.374 20.000 10.579 11.000 10.230 50.617 50.983 10.729 10.423 40.855 10.508 60.622 20.018 30.000 10.591 30.034 40.028 100.066 110.869 10.904 70.334 20.651 50.716 10.514 20.871 60.315 30.000 10.664 30.128 30.014 100.000 40.000 10.392 20.851 20.817 10.153 30.823 10.991 10.318 30.680 10.134 30.913 10.157 20.448 40.000 10.000 80.000 30.826 10.978 10.091 60.000 10.660 40.647 30.571 20.804 40.001 90.000 10.480 30.700 10.421 50.947 10.433 140.411 30.148 60.262 50.000 10.849 10.709 60.138 100.150 20.714 30.889 10.000 10.698 10.222 40.000 70.000 10.720 20.000 20.000 10.805 10.600 10.642 30.268 90.904 10.982 20.477 10.632 60.718 20.139 90.776 20.000 10.178 10.886 10.962 10.839 80.000 10.851 20.043 120.869 40.000 10.710 10.315 60.348 30.753 20.397 8
Karim Abou Zeid, Kadir Yilmaz, Daan de Geus, Alexander Hermans, David Adrian, Timm Linder, Bastian Leibe: DINO in the Room: Leveraging 2D Foundation Models for 3D Segmentation.
Minkowski 34Dpermissive0.253 160.463 160.154 170.102 160.381 160.084 80.134 150.000 30.000 70.000 10.386 70.141 170.279 170.737 120.703 160.014 170.164 150.000 30.663 100.092 140.000 90.224 150.291 110.531 80.056 170.000 90.242 160.000 10.000 30.013 150.331 160.000 110.000 10.035 170.001 20.858 140.059 140.650 160.000 70.056 140.353 150.299 150.670 130.000 10.000 20.284 160.484 150.071 80.594 150.720 160.710 150.027 110.068 170.813 140.000 10.005 100.492 140.164 10.274 160.111 160.571 160.307 170.293 130.307 170.150 90.163 170.531 160.002 60.545 50.932 150.093 170.000 10.000 120.002 140.159 150.368 170.581 150.440 170.228 170.406 90.282 170.294 160.000 40.000 10.189 160.060 20.036 50.000 120.000 40.897 110.000 170.525 140.025 170.205 170.771 170.000 120.000 10.593 110.108 90.044 60.000 40.000 10.000 110.282 170.589 140.094 160.169 160.466 170.227 160.419 170.125 50.757 140.002 40.334 160.000 10.000 80.000 30.357 150.000 50.000 90.000 10.582 100.513 140.337 100.612 170.000 100.000 10.250 160.352 170.136 170.724 160.655 40.280 80.000 110.046 160.000 10.606 170.559 150.159 70.102 30.445 100.655 90.000 10.310 170.117 60.000 70.000 10.581 150.026 10.000 10.265 170.483 160.084 170.097 170.044 150.865 170.142 170.588 110.351 150.272 20.596 170.000 10.003 100.622 160.720 100.096 170.000 10.771 160.016 150.772 150.000 10.302 140.194 90.214 120.621 160.197 16
C. Choy, J. Gwak, S. Savarese: 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks. CVPR 2019
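
Each result row above runs the method name, an optional licence tag, and a long sequence of score/rank pairs together without separators (for example, `0.334 90.532 13` is a score of 0.334 at rank 9 followed by a score of 0.532 at rank 13). The sketch below is one way to split such a row back into pairs. It assumes every score is printed as one digit, a dot, and three decimals, and that ranks have at most two digits; `parse_row` and `PAIR` are hypothetical names introduced for illustration, not part of any benchmark tooling.

```python
import re

# A score (d.ddd), a space, then a 1-2 digit rank. The rank is matched lazily
# and the lookahead requires the next score (or end of line) to follow, so the
# rank never swallows the leading digit of the next score.
PAIR = re.compile(r"(\d\.\d{3}) (\d{1,2}?)(?=\d\.\d{3} |$)")


def parse_row(row):
    """Split a fused leaderboard row into (label, [(score, rank), ...]).

    The label is everything before the first score, so it still contains the
    method name and any fused licence tag such as "permissive".
    """
    first = PAIR.search(row)
    if first is None:
        raise ValueError("no score/rank pairs found in row")
    label = row[:first.start()]
    pairs = [(float(s), int(r)) for s, r in PAIR.findall(row[first.start():])]
    return label, pairs


label, pairs = parse_row("IMFSegNet0.334 90.532 130.251 110.179 7")
print(label, pairs)
```

Zipping the parsed pairs with the column names from the table header (after the Method and Info columns) would then give a per-class lookup; that step is omitted here because it depends on how the header was extracted.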


This table lists the benchmark results for the ScanNet200 3D semantic instance scenario.




Method | Info | avg | alarm clock | armchair | backpack | bag | ball | bar | basket | bathroom cabinet | bathroom counter | bathroom stall | bathroom stall door | bathroom vanity | bathtub | bed | bench | bicycle | bin | blackboard | blanket | blinds | board | book | bookshelf | bottle | bowl | box | broom | bucket | bulletin board | cabinet | calendar | candle | cart | case of water bottles | cd case | ceiling | ceiling light | chair | clock | closet | closet door | closet rod | closet wall | clothes | clothes dryer | coat rack | coffee kettle | coffee maker | coffee table | column | computer tower | container | copier | couch | counter | crate | cup | curtain | cushion | decoration | desk | dining table | dish rack | dishwasher | divider | door | doorframe | dresser | dumbbell | dustpan | end table | fan | file cabinet | fire alarm | fire extinguisher | fireplace | folded chair | furniture | guitar | guitar case | hair dryer | handicap bar | hat | headphones | ironing board | jacket | keyboard | keyboard piano | kitchen cabinet | kitchen counter | ladder | lamp | laptop | laundry basket | laundry detergent | laundry hamper | ledge | light | light switch | luggage | machine | mailbox | mat | mattress | microwave | mini fridge | mirror | monitor | mouse | music stand | nightstand | object | office chair | ottoman | oven | paper | paper bag | paper cutter | paper towel dispenser | paper towel roll | person | piano | picture | pillar | pillow | pipe | plant | plate | plunger | poster | potted plant | power outlet | power strip | printer | projector | projector screen | purse | rack | radiator | rail | range hood | recycling bin | refrigerator | scale | seat | shelf | shoe | shower | shower curtain | shower curtain rod | shower door | shower floor | shower head | shower wall | sign | sink | soap dish | soap dispenser | sofa chair | speaker | stair rail | stairs | stand | stool | storage bin | storage container | storage organizer | stove | structure | stuffed animal | suitcase | table | telephone | tissue box | toaster | toaster oven | toilet | toilet paper | toilet paper dispenser | toilet paper holder | toilet seat cover dispenser | towel | trash bin | trash can | tray | tube | tv | tv stand | vacuum cleaner | vent | wardrobe | washing machine | water bottle | water cooler | water pitcher | whiteboard | window | windowsill
Mask3D Scannet2000.388 10.542 10.357 20.237 20.610 10.091 20.125 60.000 10.000 20.000 10.065 30.668 10.451 11.000 10.955 10.640 10.500 20.039 10.125 30.063 30.409 10.311 20.291 10.609 40.266 20.000 10.163 20.000 10.008 10.044 30.496 21.000 10.000 10.018 30.000 20.756 10.573 10.808 20.000 20.010 20.042 40.130 40.552 20.042 10.000 11.000 10.725 40.750 10.883 11.000 10.832 40.024 30.107 20.614 30.226 10.250 10.628 20.792 10.677 30.400 10.741 20.278 20.511 20.077 60.111 20.313 30.715 20.302 10.017 40.200 20.000 10.188 10.000 20.178 30.736 21.000 10.615 10.514 10.409 20.380 60.600 10.000 10.000 10.400 20.013 20.254 10.381 10.000 10.123 50.400 10.839 20.258 20.463 10.926 10.265 20.000 10.857 20.099 10.021 20.500 10.027 10.028 21.000 10.502 60.016 20.076 50.500 10.612 10.578 10.005 30.597 30.194 20.497 10.000 10.500 10.000 20.323 50.000 11.000 10.000 10.748 10.708 20.050 50.890 21.000 10.008 20.151 40.301 21.000 11.000 10.792 30.945 11.000 10.511 10.004 20.753 10.776 30.287 20.020 20.003 50.974 30.033 10.412 60.000 20.000 20.000 20.667 20.000 10.000 10.491 20.676 20.352 20.335 10.060 30.822 60.527 31.000 10.517 20.606 10.853 20.000 10.004 10.806 11.000 10.727 10.000 10.042 20.739 20.000 10.399 30.391 10.504 20.591 10.571 1
Jonas Schult, Francis Engelmann, Alexander Hermans, Or Litany, Siyu Tang, Bastian Leibe: Mask3D for 3D Semantic Instance Segmentation. ICRA 2023
TD3D Scannet200permissive0.320 30.501 30.264 30.164 30.506 30.062 30.500 10.000 10.000 20.000 10.208 10.431 30.252 41.000 10.733 40.587 20.000 30.008 20.000 40.106 10.000 20.356 10.123 50.686 10.101 30.000 10.152 30.000 10.000 20.226 20.280 40.000 30.000 10.250 20.000 20.619 20.061 40.841 10.000 20.000 30.167 20.194 20.333 30.000 20.000 10.667 20.820 10.250 30.790 41.000 10.879 20.077 10.094 40.708 10.217 20.049 30.634 10.792 10.331 50.033 60.716 30.159 30.396 30.331 50.099 30.415 10.842 10.000 20.458 20.542 10.000 10.101 20.000 20.218 20.513 30.500 30.458 30.104 30.516 10.456 10.268 50.000 10.000 10.400 20.022 10.233 20.143 30.000 10.677 10.400 10.504 60.095 40.083 60.890 20.061 30.000 10.906 10.076 20.231 10.125 30.000 20.003 30.792 40.881 10.000 30.098 40.125 50.498 50.459 30.063 10.715 20.000 30.241 40.000 10.396 20.063 10.605 20.000 10.000 30.000 10.448 60.629 40.202 30.967 10.250 30.038 10.192 20.185 30.083 41.000 11.000 10.857 20.000 20.470 20.012 10.565 40.798 20.621 10.111 10.500 11.000 10.017 20.509 20.000 20.008 11.000 10.525 30.000 10.000 10.332 40.679 10.264 30.333 20.267 11.000 10.549 20.299 60.387 30.328 30.744 50.000 10.000 20.435 61.000 10.283 50.000 10.196 10.817 10.000 10.472 10.222 40.123 50.560 20.156 3
Maksim Kolodiazhnyi, Anna Vorontsova, Anton Konushin, Danila Rukhovich: Top-Down Beats Bottom-Up in 3D Instance Segmentation. WACV 2024
ODIN - Ins200permissive0.381 20.507 20.375 10.237 10.484 40.108 10.500 10.000 10.125 10.000 10.058 40.647 20.385 20.667 40.853 20.542 31.000 10.000 31.000 10.093 20.000 20.028 50.274 20.682 20.550 10.000 10.269 10.000 10.000 20.714 10.566 11.000 10.000 10.500 10.125 10.585 30.066 30.653 60.083 10.049 10.264 10.227 10.667 10.000 20.000 10.278 60.723 50.250 30.786 51.000 10.744 60.039 20.209 10.494 50.000 30.250 10.446 30.500 30.750 10.200 30.780 10.333 10.602 10.469 30.163 10.406 20.530 40.000 20.668 10.200 20.000 10.000 30.500 10.313 10.769 11.000 10.511 20.196 20.286 30.393 50.337 20.000 10.000 10.600 10.000 30.174 30.226 20.000 10.579 20.200 30.887 10.750 10.428 20.782 30.438 10.000 10.795 30.063 30.003 30.500 10.000 20.333 11.000 10.742 20.083 10.585 10.417 40.448 60.496 20.055 20.734 10.472 10.174 50.000 10.250 30.000 20.688 10.000 11.000 10.000 10.631 30.667 30.275 10.694 61.000 10.000 30.328 10.422 10.000 51.000 10.500 40.638 30.000 20.391 30.000 30.582 30.800 10.208 50.000 30.246 20.667 50.000 30.638 10.167 10.000 20.000 20.778 10.000 10.000 10.563 10.614 30.841 10.333 20.250 20.938 50.569 10.500 40.695 10.264 40.863 10.000 10.000 20.550 51.000 10.668 20.000 10.000 30.667 30.000 10.333 40.333 20.665 10.434 30.264 2
Minkowski 34D Inst.permissive0.203 60.369 50.134 60.078 60.479 50.003 50.500 10.000 10.000 20.000 10.100 20.371 40.300 30.667 40.746 30.400 40.000 30.000 30.000 40.031 40.000 20.074 40.165 40.413 60.000 50.000 10.070 50.000 10.000 20.000 40.221 60.000 30.000 10.000 40.000 20.372 60.070 20.706 40.000 20.000 30.000 60.123 50.033 60.000 20.000 10.422 50.732 30.000 50.778 61.000 10.845 30.000 40.090 50.636 20.000 30.000 40.158 50.000 40.250 60.050 50.693 40.123 50.051 60.385 40.009 50.118 60.406 60.000 20.000 50.200 20.000 10.000 30.000 20.133 50.307 60.500 30.251 50.000 50.281 40.402 40.317 30.000 10.000 10.000 40.000 30.060 50.000 40.000 10.396 30.200 30.669 30.021 50.218 50.720 60.000 40.000 10.696 40.025 50.000 40.000 40.000 20.000 40.125 60.596 30.000 30.191 20.500 10.595 20.369 50.000 40.500 50.000 30.143 60.000 10.000 40.000 20.226 60.000 10.000 30.000 10.701 20.511 50.000 60.851 40.000 40.000 30.150 50.052 60.100 30.981 40.500 40.286 40.000 20.000 60.000 30.545 50.522 60.250 30.000 30.000 60.522 60.000 30.500 30.000 20.000 20.000 20.282 60.000 10.000 10.178 60.382 50.018 60.056 50.000 40.997 30.107 60.677 20.313 50.000 50.726 60.000 10.000 20.583 40.903 50.200 60.000 10.000 30.333 50.000 10.442 20.083 50.109 60.387 50.000 6
C. Choy, J. Gwak, S. Savarese: 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks. CVPR 2019
CSC-Pretrain Inst.permissive0.209 50.361 60.157 50.085 50.506 20.007 40.500 10.000 10.000 20.000 10.000 60.093 60.221 50.667 40.524 60.400 40.000 30.000 30.000 40.004 50.000 20.000 60.109 60.589 50.000 50.000 10.059 60.000 10.000 20.000 40.322 30.000 30.000 10.000 40.000 20.405 40.055 50.700 50.000 20.000 30.028 50.091 60.083 40.000 20.000 10.667 20.768 20.000 50.807 31.000 10.776 50.000 40.000 60.340 60.000 30.000 40.103 60.000 40.750 10.200 30.634 60.053 60.246 40.677 20.006 60.198 40.432 50.000 20.000 50.050 50.000 10.000 30.000 20.111 60.356 50.500 30.188 60.000 50.220 50.448 20.050 60.000 10.000 10.000 40.000 30.032 60.000 40.000 10.396 30.000 50.573 50.000 60.228 40.747 50.000 40.000 10.573 60.021 60.000 40.000 40.000 20.000 40.500 50.573 40.000 30.000 60.125 50.592 30.364 60.000 40.450 60.000 30.364 20.000 10.000 40.000 20.340 40.000 10.000 30.000 10.610 40.833 10.221 20.702 50.000 40.000 30.135 60.094 50.125 20.571 50.500 40.143 60.000 20.125 40.000 30.618 20.667 50.115 60.000 30.125 31.000 10.000 30.500 30.000 20.000 20.000 20.502 50.000 10.000 10.312 50.248 60.050 50.000 60.000 40.997 30.420 40.500 40.149 60.451 20.748 30.000 10.000 20.636 30.667 60.600 30.000 10.000 30.278 60.000 10.333 40.000 60.294 30.381 60.110 4
Ji Hou, Benjamin Graham, Matthias Nießner, Saining Xie: Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts. CVPR 2021
LGround Inst.permissive0.246 40.413 40.170 40.130 40.455 60.003 60.500 10.000 10.000 20.000 10.017 50.333 50.111 61.000 10.681 50.400 40.000 30.000 31.000 10.003 60.000 20.167 30.190 30.637 30.067 40.000 10.081 40.000 10.000 20.000 40.264 50.000 30.000 10.000 40.000 20.387 50.031 60.754 30.000 20.000 30.151 30.135 30.056 50.000 20.000 10.582 40.589 60.500 20.815 21.000 10.903 10.000 40.097 30.588 40.000 30.000 40.234 40.000 40.500 40.400 10.682 50.156 40.159 50.750 10.046 40.125 50.660 30.000 20.200 30.000 60.000 10.000 30.000 20.164 40.402 40.500 30.373 40.025 40.143 60.426 30.317 30.000 10.000 10.000 40.000 30.063 40.000 40.000 10.000 60.000 50.575 40.250 30.241 30.772 40.000 40.000 10.653 50.034 40.000 40.000 40.000 20.000 41.000 10.561 50.000 30.100 30.500 10.541 40.452 40.000 40.581 40.000 30.364 20.000 10.000 40.000 20.571 30.000 10.000 30.000 10.568 50.511 50.167 40.857 30.000 40.000 30.164 30.112 40.000 50.530 61.000 10.286 40.000 20.125 40.000 30.464 60.706 40.208 40.000 30.125 30.744 40.000 30.500 30.000 20.000 20.000 20.511 40.000 10.000 10.344 30.541 40.068 40.333 20.000 41.000 10.196 50.533 30.318 40.000 50.748 40.000 10.000 20.690 21.000 10.400 40.000 10.000 30.667 30.000 10.333 40.333 20.270 40.399 40.083 5
David Rozenberszki, Or Litany, Angela Dai: Language-Grounded Indoor 3D Semantic Segmentation in the Wild.
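Each method row on this page is a single run of text: the method name, an optional tag such as `permissive` or `copyleft`, and then score/rank pairs with no separator between a rank and the next score. The following is a minimal sketch of how such a row could be split back into columns. It assumes that every score has exactly three decimals and a single leading digit, and that the two tags above are the only ones that occur. `parse_row` and `LICENSE_TAGS` are hypothetical names used for illustration and are not part of any ScanNet tooling.

```python
import re

# A score looks like 0.798 or 1.000, followed by a space and an integer rank.
# The rank runs straight into the next score, so it is matched lazily and must
# be followed either by another score or by the end of the row.
PAIR = re.compile(r"([01]\.\d{3}) (\d+?)(?=[01]\.\d{3} |$)")

# Assumption: these are the only tags fused onto method names.
LICENSE_TAGS = ("permissive", "copyleft")


def parse_row(row):
    """Split one flattened row into (method name, tag, [(score, rank), ...])."""
    row = row.strip()
    first = PAIR.search(row)
    if first is None:
        raise ValueError("no score/rank pair found in row")
    name = row[:first.start()]
    tag = None
    for candidate in LICENSE_TAGS:
        if name.endswith(candidate):
            name, tag = name[:-len(candidate)], candidate
            break
    scores = [(float(v), int(r)) for v, r in PAIR.findall(row, first.start())]
    return name.strip(), tag, scores


# Shortened sample rows: only the first few columns are kept.
sample = "PTv3-PPT-ALCcopyleft0.798 10.911 110.812 220.854 8"
name, tag, scores = parse_row(sample)
print(name, tag, scores)
```

Because a score always carries its own leading digit, a method name that ends in a digit (for example `OA-CNN-L_ScanNet20`) is still separated correctly from the score that follows it.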


ScanNet Benchmark

This table lists the benchmark results for the 3D semantic label scenario. In each row, every IoU score is followed by the method's rank in that column.
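In the rows of this table, the avg iou entry appears to be the unweighted mean of the 20 per-class IoU values. The sketch below checks this for the PTv3-PPT-ALC row, with the per-class values copied from that row and the ranks left out. This is an observation drawn from the published numbers, not a statement of the official evaluation protocol, and `mean_iou` is a hypothetical helper name.

```python
# Per-class IoU values copied from the PTv3-PPT-ALC row (ranks omitted).
CLASS_IOU = {
    "bathtub": 0.911, "bed": 0.812, "bookshelf": 0.854, "cabinet": 0.770,
    "chair": 0.856, "counter": 0.555, "curtain": 0.943, "desk": 0.660,
    "door": 0.735, "floor": 0.979, "otherfurniture": 0.606, "picture": 0.492,
    "refrigerator": 0.792, "shower curtain": 0.934, "sink": 0.841,
    "sofa": 0.819, "table": 0.716, "toilet": 0.947, "wall": 0.906,
    "window": 0.822,
}
REPORTED_AVG_IOU = 0.798  # avg iou column of the same row


def mean_iou(per_class):
    """Unweighted mean over classes; every class counts equally."""
    return sum(per_class.values()) / len(per_class)


computed = mean_iou(CLASS_IOU)
# The inputs are rounded to three decimals, so allow a small tolerance.
matches = abs(computed - REPORTED_AVG_IOU) < 0.001
print(f"computed {computed:.4f}, reported {REPORTED_AVG_IOU:.3f}, match: {matches}")
```

An unweighted mean gives rare classes such as picture the same influence on the overall score as frequent ones such as wall or floor.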


Method Info | avg iou | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | floor | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | wall | window
PTv3-PPT-ALCcopyleft0.798 10.911 110.812 220.854 80.770 120.856 150.555 170.943 10.660 260.735 20.979 10.606 70.492 10.792 40.934 40.841 20.819 60.716 90.947 100.906 10.822 1
Guangda Ji, Silvan Weder, Francis Engelmann, Marc Pollefeys, Hermann Blum: ARKit LabelMaker: A New Scale for Indoor 3D Scene Understanding. CVPR 2025
DITR ScanNet0.797 20.727 760.869 10.882 10.785 60.868 70.578 50.943 10.744 10.727 30.979 10.627 20.364 90.824 10.949 20.779 150.844 10.757 10.982 10.905 20.802 3
Karim Abou Zeid, Kadir Yilmaz, Daan de Geus, Alexander Hermans, David Adrian, Timm Linder, Bastian Leibe: DINO in the Room: Leveraging 2D Foundation Models for 3D Segmentation.
PTv3 ScanNet0.794 30.941 30.813 210.851 110.782 70.890 20.597 10.916 60.696 110.713 50.979 10.635 10.384 30.793 30.907 100.821 50.790 360.696 140.967 40.903 30.805 2
Xiaoyang Wu, Li Jiang, Peng-Shuai Wang, Zhijian Liu, Xihui Liu, Yu Qiao, Wanli Ouyang, Tong He, Hengshuang Zhao: Point Transformer V3: Simpler, Faster, Stronger. CVPR 2024 (Oral)
PonderV20.785 40.978 10.800 300.833 290.788 40.853 200.545 210.910 90.713 30.705 60.979 10.596 90.390 20.769 150.832 450.821 50.792 350.730 20.975 20.897 60.785 7
Haoyi Zhu, Honghui Yang, Xiaoyang Wu, Di Huang, Sha Zhang, Xianglong He, Tong He, Hengshuang Zhao, Chunhua Shen, Yu Qiao, Wanli Ouyang: PonderV2: Pave the Way for 3D Foundation Model with A Universal Pre-training Paradigm.
Mix3Dpermissive0.781 50.964 20.855 20.843 200.781 80.858 130.575 80.831 390.685 170.714 40.979 10.594 100.310 300.801 20.892 190.841 20.819 60.723 60.940 150.887 80.725 28
Alexey Nekrasov, Jonas Schult, Or Litany, Bastian Leibe, Francis Engelmann: Mix3D: Out-of-Context Data Augmentation for 3D Scenes. 3DV 2021 (Oral)
Swin3Dpermissive0.779 60.861 230.818 160.836 260.790 30.875 40.576 70.905 100.704 70.739 10.969 120.611 30.349 120.756 250.958 10.702 510.805 190.708 100.916 390.898 50.801 4
TTT-KD0.773 70.646 970.818 160.809 410.774 100.878 30.581 30.943 10.687 150.704 70.978 60.607 60.336 190.775 110.912 80.838 40.823 40.694 150.967 40.899 40.794 6
Lisa Weijler, Muhammad Jehanzeb Mirza, Leon Sick, Can Ekkazan, Pedro Hermosilla: TTT-KD: Test-Time Training for 3D Semantic Segmentation through Knowledge Distillation from Foundation Models.
ResLFE_HDS0.772 80.939 40.824 70.854 80.771 110.840 350.564 130.900 120.686 160.677 140.961 180.537 360.348 130.769 150.903 120.785 130.815 90.676 260.939 160.880 130.772 11
PPT-SpUNet-Joint0.766 90.932 50.794 360.829 310.751 260.854 180.540 250.903 110.630 390.672 170.963 160.565 260.357 100.788 50.900 140.737 310.802 200.685 200.950 80.887 80.780 8
Xiaoyang Wu, Zhuotao Tian, Xin Wen, Bohao Peng, Xihui Liu, Kaicheng Yu, Hengshuang Zhao: Towards Large-scale 3D Representation Learning with Multi-dataset Point Prompt Training. CVPR 2024
OctFormerpermissive0.766 90.925 70.808 260.849 130.786 50.846 300.566 120.876 190.690 130.674 160.960 190.576 220.226 730.753 270.904 110.777 160.815 90.722 70.923 310.877 160.776 10
Peng-Shuai Wang: OctFormer: Octree-based Transformers for 3D Point Clouds. SIGGRAPH 2023
CU-Hybrid Net0.764 110.924 80.819 140.840 230.757 210.853 200.580 40.848 310.709 50.643 270.958 230.587 160.295 380.753 270.884 230.758 230.815 90.725 50.927 270.867 270.743 19
OccuSeg+Semantic0.764 110.758 610.796 340.839 240.746 300.907 10.562 140.850 300.680 190.672 170.978 60.610 40.335 210.777 90.819 490.847 10.830 30.691 170.972 30.885 100.727 26
O-CNNpermissive0.762 130.924 80.823 80.844 190.770 120.852 220.577 60.847 330.711 40.640 310.958 230.592 110.217 790.762 200.888 200.758 230.813 130.726 40.932 250.868 260.744 18
Peng-Shuai Wang, Yang Liu, Yu-Xiao Guo, Chun-Yu Sun, Xin Tong: O-CNN: Octree-based Convolutional Neural Networks for 3D Shape Analysis. SIGGRAPH 2017
DiffSegNet0.758 140.725 780.789 410.843 200.762 170.856 150.562 140.920 40.657 290.658 210.958 230.589 140.337 180.782 60.879 240.787 110.779 410.678 220.926 290.880 130.799 5
DTC0.757 150.843 290.820 120.847 160.791 20.862 110.511 380.870 220.707 60.652 230.954 400.604 80.279 490.760 210.942 30.734 320.766 500.701 130.884 610.874 220.736 20
OA-CNN-L_ScanNet200.756 160.783 470.826 60.858 60.776 90.837 390.548 200.896 150.649 310.675 150.962 170.586 170.335 210.771 140.802 540.770 190.787 380.691 170.936 200.880 130.761 13
ConDaFormer0.755 170.927 60.822 100.836 260.801 10.849 250.516 350.864 270.651 300.680 130.958 230.584 190.282 460.759 230.855 350.728 340.802 200.678 220.880 660.873 230.756 16
Lunhao Duan, Shanshan Zhao, Nan Xue, Mingming Gong, Guisong Xia, Dacheng Tao: ConDaFormer: Disassembled Transformer with Local Structure Enhancement for 3D Point Cloud Understanding. NeurIPS 2023
LSK3DNetpermissive0.755 170.899 160.823 80.843 200.764 160.838 380.584 20.845 340.717 20.638 330.956 300.580 210.229 720.640 490.900 140.750 260.813 130.729 30.920 350.872 240.757 14
Tuo Feng, Wenguan Wang, Fan Ma, Yi Yang: LSK3DNet: Towards Effective and Efficient 3D Perception with Large Sparse Kernels. CVPR 2024
PNE0.755 170.786 450.835 50.834 280.758 190.849 250.570 100.836 380.648 320.668 190.978 60.581 200.367 70.683 400.856 330.804 80.801 240.678 220.961 60.889 70.716 35
P. Hermosilla: Point Neighborhood Embeddings.
PointTransformerV20.752 200.742 680.809 250.872 20.758 190.860 120.552 180.891 170.610 460.687 80.960 190.559 300.304 330.766 180.926 60.767 200.797 280.644 380.942 130.876 190.722 31
Xiaoyang Wu, Yixing Lao, Li Jiang, Xihui Liu, Hengshuang Zhao: Point Transformer V2: Grouped Vector Attention and Partition-based Pooling. NeurIPS 2022
DMF-Net0.752 200.906 140.793 380.802 470.689 460.825 520.556 160.867 230.681 180.602 500.960 190.555 320.365 80.779 80.859 300.747 270.795 320.717 80.917 380.856 350.764 12
C.Yang, Y.Yan, W.Zhao, J.Ye, X.Yang, A.Hussain, B.Dong, K.Huang: Towards Deeper and Better Multi-view Feature Fusion for 3D Semantic Segmentation. ICONIP 2023
PointConvFormer0.749 220.793 430.790 390.807 430.750 280.856 150.524 310.881 180.588 580.642 300.977 100.591 120.274 520.781 70.929 50.804 80.796 290.642 390.947 100.885 100.715 36
Wenxuan Wu, Qi Shan, Li Fuxin: PointConvFormer: Revenge of the Point-based Convolution.
BPNetcopyleft0.749 220.909 120.818 160.811 390.752 240.839 370.485 530.842 350.673 210.644 260.957 280.528 420.305 320.773 120.859 300.788 100.818 80.693 160.916 390.856 350.723 30
Wenbo Hu, Hengshuang Zhao, Li Jiang, Jiaya Jia, Tien-Tsin Wong: Bidirectional Projection Network for Cross Dimension Scene Understanding. CVPR 2021 (Oral)
MSP0.748 240.623 1000.804 280.859 50.745 310.824 540.501 420.912 80.690 130.685 100.956 300.567 250.320 270.768 170.918 70.720 390.802 200.676 260.921 330.881 120.779 9
StratifiedFormerpermissive0.747 250.901 150.803 290.845 180.757 210.846 300.512 370.825 420.696 110.645 250.956 300.576 220.262 630.744 330.861 290.742 290.770 480.705 110.899 510.860 320.734 21
Xin Lai*, Jianhui Liu*, Li Jiang, Liwei Wang, Hengshuang Zhao, Shu Liu, Xiaojuan Qi, Jiaya Jia: Stratified Transformer for 3D Point Cloud Segmentation. CVPR 2022
Virtual MVFusion0.746 260.771 550.819 140.848 150.702 430.865 100.397 910.899 130.699 90.664 200.948 620.588 150.330 230.746 320.851 390.764 210.796 290.704 120.935 210.866 280.728 24
Abhijit Kundu, Xiaoqi Yin, Alireza Fathi, David Ross, Brian Brewington, Thomas Funkhouser, Caroline Pantofaru: Virtual Multi-view Fusion for 3D Semantic Segmentation. ECCV 2020
VMNetpermissive0.746 260.870 210.838 30.858 60.729 360.850 240.501 420.874 200.587 590.658 210.956 300.564 270.299 350.765 190.900 140.716 420.812 150.631 440.939 160.858 330.709 37
Zeyu HU, Xuyang Bai, Jiaxiang Shang, Runze Zhang, Jiayu Dong, Xin Wang, Guangyuan Sun, Hongbo Fu, Chiew-Lan Tai: VMNet: Voxel-Mesh Network for Geodesic-Aware 3D Semantic Segmentation. ICCV 2021 (Oral)
DiffSeg3D20.745 280.725 780.814 200.837 250.751 260.831 460.514 360.896 150.674 200.684 110.960 190.564 270.303 340.773 120.820 480.713 450.798 270.690 190.923 310.875 200.757 14
ODINpermissive0.744 290.658 930.752 640.870 30.714 400.843 330.569 110.919 50.703 80.622 400.949 590.591 120.343 150.736 340.784 560.816 70.838 20.672 310.918 370.854 390.725 28
Ayush Jain, Pushkal Katara, Nikolaos Gkanatsios, Adam W. Harley, Gabriel Sarch, Kriti Aggarwal, Vishrav Chaudhary, Katerina Fragkiadaki: ODIN: A Single Model for 2D and 3D Segmentation. CVPR 2024
Retro-FPN0.744 290.842 300.800 300.767 610.740 320.836 410.541 230.914 70.672 220.626 370.958 230.552 330.272 540.777 90.886 220.696 520.801 240.674 290.941 140.858 330.717 33
Peng Xiang*, Xin Wen*, Yu-Shen Liu, Hui Zhang, Yi Fang, Zhizhong Han: Retrospective Feature Pyramid Network for Point Cloud Semantic Segmentation. ICCV 2023
EQ-Net0.743 310.620 1010.799 330.849 130.730 350.822 560.493 500.897 140.664 230.681 120.955 340.562 290.378 40.760 210.903 120.738 300.801 240.673 300.907 430.877 160.745 17
Zetong Yang*, Li Jiang*, Yanan Sun, Bernt Schiele, Jiaya Jia: A Unified Query-based Paradigm for Point Cloud Understanding. CVPR 2022
SAT0.742 320.860 240.765 550.819 340.769 140.848 270.533 270.829 400.663 240.631 360.955 340.586 170.274 520.753 270.896 170.729 330.760 560.666 330.921 330.855 370.733 22
LRPNet0.742 320.816 380.806 270.807 430.752 240.828 500.575 80.839 370.699 90.637 340.954 400.520 460.320 270.755 260.834 430.760 220.772 450.676 260.915 410.862 300.717 33
LargeKernel3D0.739 340.909 120.820 120.806 450.740 320.852 220.545 210.826 410.594 570.643 270.955 340.541 350.263 620.723 380.858 320.775 180.767 490.678 220.933 230.848 430.694 42
Yukang Chen*, Jianhui Liu*, Xiangyu Zhang, Xiaojuan Qi, Jiaya Jia: LargeKernel3D: Scaling up Kernels in 3D Sparse CNNs. CVPR 2023
RPN0.736 350.776 510.790 390.851 110.754 230.854 180.491 520.866 250.596 560.686 90.955 340.536 370.342 160.624 560.869 260.787 110.802 200.628 450.927 270.875 200.704 39
MinkowskiNetpermissive0.736 350.859 250.818 160.832 300.709 410.840 350.521 330.853 290.660 260.643 270.951 510.544 340.286 440.731 360.893 180.675 610.772 450.683 210.874 730.852 410.727 26
C. Choy, J. Gwak, S. Savarese: 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks. CVPR 2019
IPCA0.731 370.890 170.837 40.864 40.726 370.873 50.530 300.824 430.489 930.647 240.978 60.609 50.336 190.624 560.733 640.758 230.776 430.570 710.949 90.877 160.728 24
online3d0.727 380.715 830.777 480.854 80.748 290.858 130.497 470.872 210.572 660.639 320.957 280.523 430.297 370.750 300.803 530.744 280.810 160.587 670.938 180.871 250.719 32
PointTransformer++0.725 390.727 760.811 240.819 340.765 150.841 340.502 410.814 480.621 420.623 390.955 340.556 310.284 450.620 580.866 270.781 140.757 600.648 360.932 250.862 300.709 37
SparseConvNet0.725 390.647 960.821 110.846 170.721 380.869 60.533 270.754 640.603 520.614 420.955 340.572 240.325 250.710 390.870 250.724 370.823 40.628 450.934 220.865 290.683 45
MatchingNet0.724 410.812 400.812 220.810 400.735 340.834 430.495 490.860 280.572 660.602 500.954 400.512 480.280 480.757 240.845 410.725 360.780 400.606 550.937 190.851 420.700 41
INS-Conv-semantic0.717 420.751 640.759 580.812 380.704 420.868 70.537 260.842 350.609 480.608 460.953 440.534 390.293 390.616 590.864 280.719 410.793 330.640 400.933 230.845 470.663 51
PointMetaBase0.714 430.835 310.785 430.821 320.684 480.846 300.531 290.865 260.614 430.596 540.953 440.500 510.246 680.674 410.888 200.692 530.764 520.624 470.849 880.844 480.675 47
contrastBoundarypermissive0.705 440.769 580.775 490.809 410.687 470.820 590.439 790.812 490.661 250.591 560.945 700.515 470.171 980.633 530.856 330.720 390.796 290.668 320.889 580.847 440.689 43
Liyao Tang, Yibing Zhan, Zhe Chen, Baosheng Yu, Dacheng Tao: Contrastive Boundary Learning for Point Cloud Segmentation. CVPR 2022
ClickSeg_Semantic0.703 450.774 530.800 300.793 520.760 180.847 290.471 570.802 520.463 1000.634 350.968 140.491 540.271 560.726 370.910 90.706 470.815 90.551 830.878 670.833 490.570 83
RFCR0.702 460.889 180.745 700.813 370.672 510.818 630.493 500.815 470.623 400.610 440.947 640.470 630.249 670.594 630.848 400.705 480.779 410.646 370.892 560.823 550.611 66
Jingyu Gong, Jiachen Xu, Xin Tan, Haichuan Song, Yanyun Qu, Yuan Xie, Lizhuang Ma: Omni-Supervised Point Cloud Segmentation via Gradual Receptive Field Component Reasoning. CVPR 2021
One Thing One Click0.701 470.825 350.796 340.723 680.716 390.832 450.433 810.816 450.634 370.609 450.969 120.418 890.344 140.559 750.833 440.715 430.808 180.560 770.902 480.847 440.680 46
JSENetpermissive0.699 480.881 200.762 560.821 320.667 520.800 760.522 320.792 550.613 440.607 470.935 900.492 530.205 850.576 680.853 370.691 550.758 580.652 350.872 760.828 520.649 55
Zeyu HU, Mingmin Zhen, Xuyang BAI, Hongbo Fu, Chiew-lan Tai: JSENet: Joint Semantic Segmentation and Edge Detection Network for 3D Point Clouds. ECCV 2020
One-Thing-One-Click0.693 490.743 670.794 360.655 910.684 480.822 560.497 470.719 740.622 410.617 410.977 100.447 760.339 170.750 300.664 810.703 500.790 360.596 600.946 120.855 370.647 56
Zhengzhe Liu, Xiaojuan Qi, Chi-Wing Fu: One Thing One Click: A Self-Training Approach for Weakly Supervised 3D Semantic Segmentation. CVPR 2021
PicassoNet-IIpermissive0.692 500.732 720.772 500.786 530.677 500.866 90.517 340.848 310.509 860.626 370.952 490.536 370.225 750.545 810.704 710.689 580.810 160.564 760.903 470.854 390.729 23
Huan Lei, Naveed Akhtar, Mubarak Shah, and Ajmal Mian: Geometric feature learning for 3D meshes.
Feature_GeometricNetpermissive0.690 510.884 190.754 620.795 500.647 590.818 630.422 830.802 520.612 450.604 480.945 700.462 660.189 930.563 740.853 370.726 350.765 510.632 430.904 450.821 580.606 70
Kangcheng Liu, Ben M. Chen: https://arxiv.org/abs/2012.09439. arXiv Preprint
FusionNet0.688 520.704 850.741 740.754 650.656 540.829 480.501 420.741 690.609 480.548 640.950 550.522 450.371 50.633 530.756 590.715 430.771 470.623 480.861 840.814 610.658 52
Feihu Zhang, Jin Fang, Benjamin Wah, Philip Torr: Deep FusionNet for Point Cloud Semantic Segmentation. ECCV 2020
Feature-Geometry Netpermissive0.685 530.866 220.748 670.819 340.645 610.794 790.450 690.802 520.587 590.604 480.945 700.464 650.201 880.554 770.840 420.723 380.732 710.602 580.907 430.822 570.603 73
VACNN++0.684 540.728 750.757 610.776 580.690 440.804 740.464 620.816 450.577 650.587 570.945 700.508 500.276 510.671 420.710 690.663 660.750 640.589 650.881 640.832 510.653 54
KP-FCNN0.684 540.847 280.758 600.784 550.647 590.814 660.473 560.772 580.605 500.594 550.935 900.450 740.181 960.587 640.805 520.690 560.785 390.614 510.882 630.819 590.632 62
H. Thomas, C. Qi, J. Deschaud, B. Marcotegui, F. Goulette, L. Guibas: KPConv: Flexible and Deformable Convolution for Point Clouds. ICCV 2019
DGNet0.684 540.712 840.784 440.782 570.658 530.835 420.499 460.823 440.641 340.597 530.950 550.487 560.281 470.575 690.619 850.647 740.764 520.620 500.871 790.846 460.688 44
PointContrast_LA_SEM0.683 570.757 620.784 440.786 530.639 630.824 540.408 860.775 570.604 510.541 660.934 940.532 400.269 580.552 780.777 570.645 770.793 330.640 400.913 420.824 540.671 48
Superpoint Network0.683 570.851 270.728 780.800 490.653 560.806 720.468 590.804 500.572 660.602 500.946 670.453 730.239 710.519 860.822 460.689 580.762 550.595 620.895 540.827 530.630 63
VI-PointConv0.676 590.770 570.754 620.783 560.621 670.814 660.552 180.758 620.571 690.557 620.954 400.529 410.268 600.530 840.682 750.675 610.719 740.603 570.888 590.833 490.665 50
Xingyi Li, Wenxuan Wu, Xiaoli Z. Fern, Li Fuxin: The Devils in the Point Clouds: Studying the Robustness of Point Cloud Convolutions.
ROSMRF3D0.673 600.789 440.748 670.763 630.635 650.814 660.407 880.747 660.581 630.573 590.950 550.484 570.271 560.607 600.754 600.649 710.774 440.596 600.883 620.823 550.606 70
SALANet0.670 610.816 380.770 530.768 600.652 570.807 710.451 660.747 660.659 280.545 650.924 1000.473 620.149 1080.571 710.811 510.635 810.746 650.623 480.892 560.794 750.570 83
O3DSeg0.668 620.822 360.771 520.496 1120.651 580.833 440.541 230.761 610.555 750.611 430.966 150.489 550.370 60.388 1050.580 880.776 170.751 620.570 710.956 70.817 600.646 57
PointConvpermissive0.666 630.781 480.759 580.699 760.644 620.822 560.475 550.779 560.564 720.504 830.953 440.428 830.203 870.586 660.754 600.661 670.753 610.588 660.902 480.813 630.642 58
Wenxuan Wu, Zhongang Qi, Li Fuxin: PointConv: Deep Convolutional Networks on 3D Point Clouds. CVPR 2019
PointASNLpermissive0.666 630.703 860.781 460.751 670.655 550.830 470.471 570.769 590.474 960.537 680.951 510.475 610.279 490.635 510.698 740.675 610.751 620.553 820.816 950.806 650.703 40
Xu Yan, Chaoda Zheng, Zhen Li, Sheng Wang, Shuguang Cui: PointASNL: Robust Point Clouds Processing using Nonlocal Neural Networks with Adaptive Sampling. CVPR 2020
PPCNN++permissive0.663 650.746 650.708 810.722 690.638 640.820 590.451 660.566 1020.599 540.541 660.950 550.510 490.313 290.648 470.819 490.616 860.682 890.590 640.869 800.810 640.656 53
Pyunghwan Ahn, Juyoung Yang, Eojindl Yi, Chanho Lee, Junmo Kim: Projection-based Point Convolution for Efficient Point Cloud Segmentation. IEEE Access
MVF-GNN0.658 660.558 1080.751 650.655 910.690 440.722 1010.453 650.867 230.579 640.576 580.893 1120.523 430.293 390.733 350.571 900.692 530.659 960.606 550.875 700.804 670.668 49
DCM-Net0.658 660.778 490.702 840.806 450.619 680.813 690.468 590.693 820.494 890.524 740.941 820.449 750.298 360.510 880.821 470.675 610.727 730.568 740.826 930.803 680.637 60
Jonas Schult*, Francis Engelmann*, Theodora Kontogianni, Bastian Leibe: DualConvMesh-Net: Joint Geodesic and Euclidean Convolutions on 3D Meshes. CVPR 2020 (Oral)
HPGCNN0.656 680.698 880.743 720.650 930.564 850.820 590.505 400.758 620.631 380.479 870.945 700.480 590.226 730.572 700.774 580.690 560.735 690.614 510.853 870.776 900.597 76
Jisheng Dang, Qingyong Hu, Yulan Guo, Jun Yang: HPGCNN.
SAFNet-segpermissive0.654 690.752 630.734 760.664 890.583 800.815 650.399 900.754 640.639 350.535 700.942 800.470 630.309 310.665 430.539 920.650 700.708 790.635 420.857 860.793 770.642 58
Linqing Zhao, Jiwen Lu, Jie Zhou: Similarity-Aware Fusion Network for 3D Semantic Segmentation. IROS 2021
RandLA-Netpermissive0.645 700.778 490.731 770.699 760.577 810.829 480.446 710.736 700.477 950.523 760.945 700.454 700.269 580.484 950.749 630.618 840.738 670.599 590.827 920.792 800.621 65
PointConv-SFPN0.641 710.776 510.703 830.721 700.557 880.826 510.451 660.672 870.563 730.483 860.943 790.425 860.162 1030.644 480.726 650.659 680.709 780.572 700.875 700.786 850.559 89
MVPNetpermissive0.641 710.831 320.715 790.671 860.590 760.781 850.394 920.679 840.642 330.553 630.937 870.462 660.256 640.649 460.406 1050.626 820.691 860.666 330.877 680.792 800.608 69
Maximilian Jaritz, Jiayuan Gu, Hao Su: Multi-view PointNet for 3D Scene Understanding. GMDL Workshop, ICCV 2019
PointMRNet0.640 730.717 820.701 850.692 790.576 820.801 750.467 610.716 750.563 730.459 930.953 440.429 820.169 1000.581 670.854 360.605 870.710 760.550 840.894 550.793 770.575 81
FPConvpermissive0.639 740.785 460.760 570.713 740.603 710.798 770.392 940.534 1070.603 520.524 740.948 620.457 680.250 660.538 820.723 670.598 910.696 840.614 510.872 760.799 700.567 86
Yiqun Lin, Zizheng Yan, Haibin Huang, Dong Du, Ligang Liu, Shuguang Cui, Xiaoguang Han: FPConv: Learning Local Flattening for Point Convolution. CVPR 2020
PD-Net0.638 750.797 420.769 540.641 980.590 760.820 590.461 630.537 1060.637 360.536 690.947 640.388 960.206 840.656 440.668 790.647 740.732 710.585 680.868 810.793 770.473 109
PointSPNet0.637 760.734 710.692 920.714 730.576 820.797 780.446 710.743 680.598 550.437 980.942 800.403 920.150 1070.626 550.800 550.649 710.697 830.557 800.846 890.777 890.563 87
SConv0.636 770.830 330.697 880.752 660.572 840.780 870.445 730.716 750.529 790.530 710.951 510.446 770.170 990.507 900.666 800.636 800.682 890.541 900.886 600.799 700.594 77
Supervoxel-CNN0.635 780.656 940.711 800.719 710.613 690.757 960.444 760.765 600.534 780.566 600.928 980.478 600.272 540.636 500.531 940.664 650.645 1000.508 980.864 830.792 800.611 66
joint point-basedpermissive0.634 790.614 1020.778 470.667 880.633 660.825 520.420 840.804 500.467 980.561 610.951 510.494 520.291 410.566 720.458 1000.579 970.764 520.559 790.838 900.814 610.598 75
Hung-Yueh Chiang, Yen-Liang Lin, Yueh-Cheng Liu, Winston H. Hsu: A Unified Point-Based Framework for 3D Segmentation. 3DV 2019
PointMTL0.632 800.731 730.688 950.675 830.591 750.784 840.444 760.565 1030.610 460.492 840.949 590.456 690.254 650.587 640.706 700.599 900.665 950.612 540.868 810.791 830.579 80
PointNet2-SFPN0.631 810.771 550.692 920.672 840.524 940.837 390.440 780.706 800.538 770.446 950.944 760.421 880.219 780.552 780.751 620.591 930.737 680.543 890.901 500.768 920.557 90
APCF-Net0.631 810.742 680.687 970.672 840.557 880.792 820.408 860.665 890.545 760.508 800.952 490.428 830.186 940.634 520.702 720.620 830.706 800.555 810.873 740.798 720.581 79
Haojia Lin: Adaptive Pyramid Context Fusion for Point Cloud Perception. GRSL
3DSM_DMMF0.631 810.626 990.745 700.801 480.607 700.751 970.506 390.729 730.565 710.491 850.866 1150.434 780.197 910.595 620.630 840.709 460.705 810.560 770.875 700.740 1000.491 104
FusionAwareConv0.630 840.604 1040.741 740.766 620.590 760.747 980.501 420.734 710.503 880.527 720.919 1040.454 700.323 260.550 800.420 1040.678 600.688 870.544 870.896 530.795 740.627 64
Jiazhao Zhang, Chenyang Zhu, Lintao Zheng, Kai Xu: Fusion-Aware Point Convolution for Online Semantic 3D Scene Segmentation. CVPR 2020
DenSeR0.628 850.800 410.625 1070.719 710.545 910.806 720.445 730.597 970.448 1030.519 780.938 860.481 580.328 240.489 940.499 990.657 690.759 570.592 630.881 640.797 730.634 61
SegGroup_sempermissive0.627 860.818 370.747 690.701 750.602 720.764 930.385 980.629 940.490 910.508 800.931 970.409 910.201 880.564 730.725 660.618 840.692 850.539 910.873 740.794 750.548 93
An Tao, Yueqi Duan, Yi Wei, Jiwen Lu, Jie Zhou: SegGroup: Seg-Level Supervision for 3D Instance and Semantic Segmentation. TIP 2022
SIConv0.625 870.830 330.694 900.757 640.563 860.772 910.448 700.647 920.520 820.509 790.949 590.431 810.191 920.496 920.614 860.647 740.672 930.535 940.876 690.783 860.571 82
dtc_net0.625 870.703 860.751 650.794 510.535 920.848 270.480 540.676 860.528 800.469 900.944 760.454 700.004 1200.464 970.636 830.704 490.758 580.548 860.924 300.787 840.492 103
Weakly-Openseg v30.625 870.924 80.787 420.620 1000.555 900.811 700.393 930.666 880.382 1110.520 770.953 440.250 1150.208 820.604 610.670 770.644 780.742 660.538 920.919 360.803 680.513 101
HPEIN0.618 900.729 740.668 980.647 950.597 740.766 920.414 850.680 830.520 820.525 730.946 670.432 790.215 800.493 930.599 870.638 790.617 1050.570 710.897 520.806 650.605 72
Li Jiang, Hengshuang Zhao, Shu Liu, Xiaoyong Shen, Chi-Wing Fu, Jiaya Jia: Hierarchical Point-Edge Interaction Network for Point Cloud Semantic Segmentation. ICCV 2019
SPH3D-GCNpermissive0.610 910.858 260.772 500.489 1130.532 930.792 820.404 890.643 930.570 700.507 820.935 900.414 900.046 1170.510 880.702 720.602 890.705 810.549 850.859 850.773 910.534 96
Huan Lei, Naveed Akhtar, and Ajmal Mian: Spherical Kernel for Efficient Graph Convolution on 3D Point Clouds. TPAMI 2020
AttAN0.609 920.760 600.667 990.649 940.521 950.793 800.457 640.648 910.528 800.434 1000.947 640.401 930.153 1060.454 980.721 680.648 730.717 750.536 930.904 450.765 930.485 105
Gege Zhang, Qinghua Ma, Licheng Jiao, Fang Liu and Qigong Sun: AttAN: Attention Adversarial Networks for 3D Point Cloud Semantic Segmentation. IJCAI 2020
wsss-transformer0.600 930.634 980.743 720.697 780.601 730.781 850.437 800.585 1000.493 900.446 950.933 950.394 940.011 1190.654 450.661 820.603 880.733 700.526 950.832 910.761 950.480 106
LAP-D0.594 940.720 800.692 920.637 990.456 1040.773 900.391 960.730 720.587 590.445 970.940 840.381 970.288 420.434 1010.453 1020.591 930.649 980.581 690.777 990.749 990.610 68
DPC0.592 950.720 800.700 860.602 1040.480 1000.762 950.380 990.713 780.585 620.437 980.940 840.369 990.288 420.434 1010.509 980.590 950.639 1030.567 750.772 1000.755 970.592 78
Francis Engelmann, Theodora Kontogianni, Bastian Leibe: Dilated Point Convolutions: On the Receptive Field Size of Point Convolutions on 3D Point Clouds. ICRA 2020
CCRFNet0.589 960.766 590.659 1020.683 810.470 1030.740 1000.387 970.620 960.490 910.476 880.922 1020.355 1020.245 690.511 870.511 970.571 980.643 1010.493 1020.872 760.762 940.600 74
ROSMRF0.580 970.772 540.707 820.681 820.563 860.764 930.362 1010.515 1080.465 990.465 920.936 890.427 850.207 830.438 990.577 890.536 1010.675 920.486 1030.723 1060.779 870.524 98
SD-DETR0.576 980.746 650.609 1110.445 1170.517 960.643 1120.366 1000.714 770.456 1010.468 910.870 1140.432 790.264 610.558 760.674 760.586 960.688 870.482 1040.739 1040.733 1020.537 95
SQN_0.1%0.569 990.676 900.696 890.657 900.497 970.779 880.424 820.548 1040.515 840.376 1050.902 1110.422 870.357 100.379 1060.456 1010.596 920.659 960.544 870.685 1090.665 1130.556 91
TextureNetpermissive0.566 1000.672 920.664 1000.671 860.494 980.719 1020.445 730.678 850.411 1090.396 1030.935 900.356 1010.225 750.412 1030.535 930.565 990.636 1040.464 1060.794 980.680 1100.568 85
Jingwei Huang, Haotian Zhang, Li Yi, Thomas Funkhouser, Matthias Niessner, Leonidas Guibas: TextureNet: Consistent Local Parametrizations for Learning from High-Resolution Signals on Meshes. CVPR
DVVNet0.562 1010.648 950.700 860.770 590.586 790.687 1060.333 1050.650 900.514 850.475 890.906 1080.359 1000.223 770.340 1080.442 1030.422 1120.668 940.501 990.708 1070.779 870.534 96
Pointnet++ & Featurepermissive0.557 1020.735 700.661 1010.686 800.491 990.744 990.392 940.539 1050.451 1020.375 1060.946 670.376 980.205 850.403 1040.356 1080.553 1000.643 1010.497 1000.824 940.756 960.515 99
GMLPs0.538 1030.495 1130.693 910.647 950.471 1020.793 800.300 1080.477 1090.505 870.358 1070.903 1100.327 1050.081 1140.472 960.529 950.448 1100.710 760.509 960.746 1020.737 1010.554 92
PanopticFusion-label0.529 1040.491 1140.688 950.604 1030.386 1090.632 1130.225 1190.705 810.434 1060.293 1130.815 1170.348 1030.241 700.499 910.669 780.507 1030.649 980.442 1120.796 970.602 1170.561 88
Gaku Narita, Takashi Seno, Tomoya Ishikawa, Yohsuke Kaji: PanopticFusion: Online Volumetric Semantic Mapping at the Level of Stuff and Things. IROS 2019 (to appear)
subcloud_weak0.516 1050.676 900.591 1140.609 1010.442 1050.774 890.335 1040.597 970.422 1080.357 1080.932 960.341 1040.094 1130.298 1100.528 960.473 1080.676 910.495 1010.602 1150.721 1050.349 117
Online SegFusion0.515 1060.607 1030.644 1050.579 1060.434 1060.630 1140.353 1020.628 950.440 1040.410 1010.762 1200.307 1070.167 1010.520 850.403 1060.516 1020.565 1080.447 1100.678 1100.701 1070.514 100
Davide Menini, Suryansh Kumar, Martin R. Oswald, Erik Sandstroem, Cristian Sminchisescu, Luc van Gool: A Real-Time Learning Framework for Joint 3D Reconstruction and Semantic Segmentation. Robotics and Automation Letters Submission
3DMV, FTSDF0.501 1070.558 1080.608 1120.424 1190.478 1010.690 1050.246 1150.586 990.468 970.450 940.911 1060.394 940.160 1040.438 990.212 1150.432 1110.541 1130.475 1050.742 1030.727 1030.477 107
PCNN0.498 1080.559 1070.644 1050.560 1080.420 1080.711 1040.229 1170.414 1100.436 1050.352 1090.941 820.324 1060.155 1050.238 1150.387 1070.493 1040.529 1140.509 960.813 960.751 980.504 102
3DMV0.484 1090.484 1150.538 1170.643 970.424 1070.606 1170.310 1060.574 1010.433 1070.378 1040.796 1180.301 1080.214 810.537 830.208 1160.472 1090.507 1170.413 1150.693 1080.602 1170.539 94
Angela Dai, Matthias Niessner: 3DMV: Joint 3D-Multi-View Prediction for 3D Semantic Scene Segmentation. ECCV'18
PointCNN with RGBpermissive0.458 1100.577 1060.611 1100.356 1210.321 1170.715 1030.299 1100.376 1140.328 1170.319 1110.944 760.285 1100.164 1020.216 1180.229 1130.484 1060.545 1120.456 1080.755 1010.709 1060.475 108
Yangyan Li, Rui Bu, Mingchao Sun, Baoquan Chen: PointCNN. NeurIPS 2018
FCPNpermissive0.447 1110.679 890.604 1130.578 1070.380 1100.682 1070.291 1110.106 1210.483 940.258 1190.920 1030.258 1140.025 1180.231 1170.325 1090.480 1070.560 1100.463 1070.725 1050.666 1120.231 121
Dario Rethage, Johanna Wald, Jürgen Sturm, Nassir Navab, Federico Tombari: Fully-Convolutional Point Networks for Large-Scale Point Clouds. ECCV 2018
DGCNN_reproducecopyleft0.446 1120.474 1160.623 1080.463 1150.366 1120.651 1100.310 1060.389 1130.349 1150.330 1100.937 870.271 1120.126 1100.285 1110.224 1140.350 1170.577 1070.445 1110.625 1130.723 1040.394 113
Yue Wang, Yongbin Sun, Ziwei Liu, Sanjay E. Sarma, Michael M. Bronstein, Justin M. Solomon: Dynamic Graph CNN for Learning on Point Clouds. TOG 2019
PNET20.442 1130.548 1100.548 1160.597 1050.363 1130.628 1150.300 1080.292 1160.374 1120.307 1120.881 1130.268 1130.186 940.238 1150.204 1170.407 1130.506 1180.449 1090.667 1110.620 1160.462 111
SurfaceConvPF0.442 1130.505 1120.622 1090.380 1200.342 1150.654 1090.227 1180.397 1120.367 1130.276 1150.924 1000.240 1160.198 900.359 1070.262 1110.366 1140.581 1060.435 1130.640 1120.668 1110.398 112
Hao Pan, Shilin Liu, Yang Liu, Xin Tong: Convolutional Neural Networks on 3D Surfaces Using Parallel Frames.
Tangent Convolutionspermissive0.438 1150.437 1180.646 1040.474 1140.369 1110.645 1110.353 1020.258 1180.282 1200.279 1140.918 1050.298 1090.147 1090.283 1120.294 1100.487 1050.562 1090.427 1140.619 1140.633 1150.352 116
Maxim Tatarchenko, Jaesik Park, Vladlen Koltun, Qian-Yi Zhou: Tangent convolutions for dense prediction in 3d. CVPR 2018
3DWSSS0.425 1160.525 1110.647 1030.522 1090.324 1160.488 1210.077 1220.712 790.353 1140.401 1020.636 1220.281 1110.176 970.340 1080.565 910.175 1210.551 1110.398 1160.370 1220.602 1170.361 115
SPLAT Netcopyleft0.393 1170.472 1170.511 1180.606 1020.311 1180.656 1080.245 1160.405 1110.328 1170.197 1200.927 990.227 1180.000 1220.001 1230.249 1120.271 1200.510 1150.383 1180.593 1160.699 1080.267 119
Hang Su, Varun Jampani, Deqing Sun, Subhransu Maji, Evangelos Kalogerakis, Ming-Hsuan Yang, Jan Kautz: SPLATNet: Sparse Lattice Networks for Point Cloud Processing. CVPR 2018
ScanNet+FTSDF0.383 1180.297 1200.491 1190.432 1180.358 1140.612 1160.274 1130.116 1200.411 1090.265 1160.904 1090.229 1170.079 1150.250 1130.185 1180.320 1180.510 1150.385 1170.548 1170.597 1200.394 113
PointNet++permissive0.339 1190.584 1050.478 1200.458 1160.256 1200.360 1220.250 1140.247 1190.278 1210.261 1180.677 1210.183 1190.117 1110.212 1190.145 1200.364 1150.346 1220.232 1220.548 1170.523 1210.252 120
Charles R. Qi, Li Yi, Hao Su, Leonidas J. Guibas: PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space.
GrowSP++0.323 1200.114 1220.589 1150.499 1110.147 1220.555 1180.290 1120.336 1150.290 1190.262 1170.865 1160.102 1220.000 1220.037 1210.000 1230.000 1230.462 1190.381 1190.389 1210.664 1140.473 109
SSC-UNetpermissive0.308 1210.353 1190.290 1220.278 1220.166 1210.553 1190.169 1210.286 1170.147 1220.148 1220.908 1070.182 1200.064 1160.023 1220.018 1220.354 1160.363 1200.345 1200.546 1190.685 1090.278 118
ScanNetpermissive0.306 1220.203 1210.366 1210.501 1100.311 1180.524 1200.211 1200.002 1230.342 1160.189 1210.786 1190.145 1210.102 1120.245 1140.152 1190.318 1190.348 1210.300 1210.460 1200.437 1220.182 122
Angela Dai, Angel X. Chang, Manolis Savva, Maciej Halber, Thomas Funkhouser, Matthias Nießner: ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes. CVPR'17
ERROR0.054 1230.000 1230.041 1230.172 1230.030 1230.062 1230.001 1230.035 1220.004 1230.051 1230.143 1230.019 1230.003 1210.041 1200.050 1210.003 1220.054 1230.018 1230.005 1230.264 1230.082 123


This table lists the benchmark results for the 3D semantic instance scenario.




Method | Info | avg ap 50% | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | window
Each row below gives the method name (with its license tag, if any) followed by one score and rank per column, in the column order above.
Competitor-MAFT0.816 11.000 10.983 40.872 110.718 60.941 20.588 50.652 410.819 30.776 30.720 60.780 60.769 121.000 10.797 110.813 310.798 91.000 10.659 5
PointRel0.816 11.000 10.971 90.908 60.743 20.923 90.573 90.714 220.695 200.734 110.747 20.725 130.809 11.000 10.814 90.899 50.820 41.000 10.610 19
: Relation3D: Enhancing Relation Modeling for Point Cloud Instance Segmentation. CVPR 2025
Spherical Mask(CtoF)0.812 31.000 10.973 80.852 160.718 70.917 110.574 70.677 310.748 130.729 150.715 90.795 30.809 11.000 10.831 40.854 110.787 131.000 10.638 8
PointComp0.811 40.850 600.969 100.864 140.739 30.946 10.539 150.671 340.835 10.700 190.742 30.817 10.766 131.000 10.755 220.909 10.808 71.000 10.687 2
EV3D0.811 41.000 10.968 110.852 160.717 80.921 100.574 80.677 310.748 130.730 140.703 150.795 30.809 11.000 10.831 40.854 110.778 171.000 10.638 9
VDG-Uni3DSeg0.804 61.000 10.990 10.886 90.688 210.912 130.602 20.703 260.786 80.771 40.708 120.700 180.669 270.981 410.789 170.903 20.772 201.000 10.609 20
SIM3D0.803 71.000 10.967 120.863 150.692 200.924 80.552 130.732 210.667 250.732 130.662 190.796 20.789 91.000 10.803 100.864 80.766 231.000 10.643 7
OneFormer3Dcopyleft0.801 81.000 10.973 70.909 50.698 160.928 60.582 60.668 370.685 210.780 20.687 170.698 220.702 161.000 10.794 130.900 40.784 150.986 550.635 10
Maxim Kolodiazhnyi, Anna Vorontsova, Anton Konushin, Danila Rukhovich: OneFormer3D: One Transformer for Unified Point Cloud Segmentation.
Competitor-SPFormer0.800 91.000 10.986 30.845 180.705 140.915 120.532 160.733 200.757 120.733 120.708 110.698 210.648 390.981 410.890 10.830 210.796 100.997 420.644 6
UniPerception0.800 91.000 10.930 140.872 110.727 50.862 270.454 220.764 130.820 20.746 80.706 130.750 80.772 100.926 490.764 200.818 290.826 20.997 420.660 4
InsSSM0.799 111.000 10.915 160.710 440.729 40.925 70.664 10.670 350.770 90.766 50.739 40.737 90.700 171.000 10.792 140.829 230.815 50.997 420.625 12
Lei Yao, Yi Wang, Moyun Liu, Lap-Pui Chau: SGIFormer: Semantic-guided and Geometric-enhanced Interleaving Transformer for 3D Instance Segmentation. TCSVT, 2024
DCD0.798 121.000 10.878 230.792 300.693 190.936 30.596 30.685 300.663 270.736 90.717 70.788 50.693 221.000 10.825 70.840 170.837 11.000 10.689 1
TST3D0.795 131.000 10.929 150.918 40.709 110.884 220.596 40.704 250.769 100.734 100.644 240.699 200.751 141.000 10.794 120.876 70.757 260.997 420.550 36
Duc Tran Dang Trung, Byeongkeun Kang, Yeejin Lee: MSTA3D: Multi-scale Twin-attention for 3D Instance Segmentation. ACM Multimedia 2024
MG-Former0.791 141.000 10.980 60.837 210.626 290.897 150.543 140.759 150.800 70.766 60.659 200.769 70.697 201.000 10.791 150.707 520.791 121.000 10.610 18
ExtMask3D0.789 151.000 10.988 20.756 370.706 130.912 140.429 230.647 430.806 60.755 70.673 180.689 230.772 111.000 10.789 160.852 130.811 61.000 10.617 15
Queryformer0.787 161.000 10.933 130.601 540.754 10.886 200.558 120.661 390.767 110.665 220.716 80.639 290.808 51.000 10.844 30.897 60.804 81.000 10.624 13
MAFT0.786 171.000 10.894 210.807 250.694 180.893 180.486 180.674 330.740 150.786 10.704 140.727 120.739 151.000 10.707 280.849 150.756 271.000 10.685 3
KmaxOneFormerNetpermissive0.783 180.903 580.981 50.794 290.706 120.931 50.561 110.701 270.706 180.727 160.697 160.731 110.689 241.000 10.856 20.750 430.761 251.000 10.599 24
Mask3D0.780 191.000 10.786 470.716 420.696 170.885 210.500 170.714 220.810 50.672 210.715 90.679 240.809 11.000 10.831 40.833 200.787 131.000 10.602 22
Jonas Schult, Francis Engelmann, Alexander Hermans, Or Litany, Siyu Tang, Bastian Leibe: Mask3D for 3D Semantic Instance Segmentation. ICRA 2023
SPFormerpermissive0.770 200.903 580.903 180.806 260.609 360.886 190.568 100.815 60.705 190.711 170.655 210.652 280.685 251.000 10.789 180.809 320.776 191.000 10.583 28
Sun Jiahao, Qing Chunmei, Tan Junpeng, Xu Xiangmin: Superpoint Transformer for 3D Scene Instance Segmentation. AAAI 2023 [Oral]
SoftGroup++0.769 211.000 10.803 400.937 10.684 220.865 240.213 390.870 20.664 260.571 290.758 10.702 170.807 61.000 10.653 350.902 30.792 111.000 10.626 11
SoftGrouppermissive0.761 221.000 10.808 360.845 180.716 90.862 260.243 360.824 40.655 290.620 230.734 50.699 190.791 80.981 410.716 250.844 160.769 211.000 10.594 26
Thang Vu, Kookhoi Kim, Tung M. Luu, Xuan Thanh Nguyen, Chang D. Yoo: SoftGroup for 3D Instance Segmentation on Point Clouds. CVPR 2022 [Oral]
ISBNetpermissive0.757 231.000 10.904 170.731 400.678 230.895 160.458 200.644 450.670 240.710 180.620 290.732 100.650 291.000 10.756 210.778 350.779 161.000 10.614 16
Tuan Duc Ngo, Binh-Son Hua, Khoi Nguyen: ISBNet: a 3D Point Cloud Instance Segmentation Network with Instance-aware Sampling and Box-aware Dynamic Convolution. CVPR 2023
TD3Dpermissive0.751 241.000 10.774 480.867 130.621 310.934 40.404 240.706 240.812 40.605 260.633 270.626 300.690 231.000 10.640 370.820 260.777 181.000 10.612 17
Maksim Kolodiazhnyi, Anna Vorontsova, Anton Konushin, Danila Rukhovich: Top-Down Beats Bottom-Up in 3D Instance Segmentation. WACV 2024
PBNetpermissive0.747 251.000 10.818 320.837 220.713 100.844 290.457 210.647 430.711 170.614 240.617 310.657 270.650 291.000 10.692 290.822 250.765 241.000 10.595 25
Weiguang Zhao, Yuyao Yan, Chaolong Yang, Jianan Ye, Xi Yang, Kaizhu Huang: Divide and Conquer: 3D Instance Segmentation With Point-Wise Binarization. ICCV 2023
GraphCut0.732 261.000 10.788 450.724 410.642 280.859 280.248 350.787 110.618 320.596 270.653 230.722 150.583 511.000 10.766 190.861 90.825 31.000 10.504 42
IPCA-Inst0.731 271.000 10.788 460.884 100.698 150.788 450.252 340.760 140.646 300.511 370.637 260.665 260.804 71.000 10.644 360.778 360.747 291.000 10.561 32
TopoSeg0.725 281.000 10.806 390.933 20.668 250.758 500.272 330.734 190.630 310.549 330.654 220.606 310.697 210.966 460.612 410.839 180.754 281.000 10.573 29
DKNet0.718 291.000 10.814 330.782 310.619 330.872 230.224 370.751 170.569 360.677 200.585 360.724 140.633 410.981 410.515 510.819 270.736 301.000 10.617 14
Yizheng Wu, Min Shi, Shuaiyuan Du, Hao Lu, Zhiguo Cao, Weicai Zhong: 3D Instances as 1D Kernels. ECCV 2022
SSEC0.707 301.000 10.850 250.924 30.648 260.747 530.162 410.862 30.572 350.520 350.624 280.549 340.649 381.000 10.560 460.706 530.768 221.000 10.591 27
HAISpermissive0.699 311.000 10.849 260.820 230.675 240.808 390.279 310.757 160.465 420.517 360.596 330.559 330.600 451.000 10.654 340.767 380.676 340.994 510.560 33
Shaoyu Chen, Jiemin Fang, Qian Zhang, Wenyu Liu, Xinggang Wang: Hierarchical Aggregation for 3D Instance Segmentation. ICCV 2021
SSTNetpermissive0.698 321.000 10.697 640.888 80.556 430.803 400.387 250.626 470.417 470.556 320.585 370.702 160.600 451.000 10.824 80.720 510.692 321.000 10.509 41
Zhihao Liang, Zhihao Li, Songcen Xu, Mingkui Tan, Kui Jia: Instance Segmentation in 3D Scenes using Semantic Superpoint Tree Networks. ICCV 2021
DualGroup0.694 331.000 10.799 420.811 240.622 300.817 340.376 260.805 90.590 340.487 410.568 400.525 380.650 290.835 590.600 420.829 220.655 371.000 10.526 38
ODIN - Inspermissive0.693 341.000 10.880 220.647 490.620 320.779 470.336 280.501 620.681 220.577 280.595 340.679 250.683 261.000 10.709 270.816 300.637 410.770 710.557 34
Ayush Jain, Pushkal Katara, Nikolaos Gkanatsios, Adam W. Harley, Gabriel Sarch, Kriti Aggarwal, Vishrav Chaudhary, Katerina Fragkiadaki: ODIN: A Single Model for 2D and 3D Segmentation. CVPR 2024
DANCENET0.680 351.000 10.807 370.733 390.600 370.768 490.375 270.543 550.538 370.610 250.599 320.498 390.632 430.981 410.739 240.856 100.633 440.882 660.454 51
SphereSeg0.680 351.000 10.856 240.744 380.618 340.893 170.151 420.651 420.713 160.537 340.579 390.430 480.651 281.000 10.389 620.744 460.697 310.991 530.601 23
Box2Mask0.677 371.000 10.847 270.771 330.509 520.816 350.277 320.558 540.482 390.562 310.640 250.448 440.700 171.000 10.666 300.852 140.578 510.997 420.488 46
Julian Chibane, Francis Engelmann, Tuan Anh Tran, Gerard Pons-Moll: Box2Mask: Weakly Supervised 3D Semantic Instance Segmentation Using Bounding Boxes. ECCV 2022
OccuSeg+instance0.672 381.000 10.758 560.682 460.576 410.842 300.477 190.504 610.524 380.567 300.585 380.451 430.557 531.000 10.751 230.797 330.563 541.000 10.467 50
Lei Han, Tian Zheng, Lan Xu, Lu Fang: OccuSeg: Occupancy-aware 3D Instance Segmentation. CVPR 2020
Mask-Group0.664 391.000 10.822 310.764 360.616 350.815 360.139 460.694 290.597 330.459 450.566 410.599 320.600 450.516 690.715 260.819 280.635 421.000 10.603 21
Min Zhong, Xinghao Chen, Xiaokang Chen, Gang Zeng, Yunhe Wang: MaskGroup: Hierarchical Point Grouping and Masking for 3D Instance Segmentation. ICME 2022
INS-Conv-instance0.657 401.000 10.760 540.667 480.581 390.863 250.323 290.655 400.477 400.473 430.549 430.432 470.650 291.000 10.655 330.738 470.585 500.944 580.472 49
CSC-Pretrained0.648 411.000 10.810 340.768 340.523 500.813 370.143 450.819 50.389 500.422 540.511 470.443 450.650 291.000 10.624 390.732 480.634 431.000 10.375 58
PE0.645 421.000 10.773 500.798 280.538 450.786 460.088 540.799 100.350 540.435 520.547 440.545 350.646 400.933 480.562 450.761 410.556 590.997 420.501 44
Biao Zhang, Peter Wonka: Point Cloud Instance Segmentation using Probabilistic Embeddings. CVPR 2021
RPGN0.643 431.000 10.758 550.582 600.539 440.826 330.046 590.765 120.372 520.436 510.588 350.539 370.650 291.000 10.577 430.750 440.653 390.997 420.495 45
Shichao Dong, Guosheng Lin, Tzu-Yi Hung: Learning Regional Purity for Instance Segmentation on 3D Point Clouds. ECCV 2022
Dyco3Dcopyleft0.641 441.000 10.841 280.893 70.531 470.802 410.115 510.588 520.448 440.438 490.537 460.430 490.550 540.857 510.534 490.764 400.657 360.987 540.568 30
Tong He, Chunhua Shen, Anton van den Hengel: DyCo3D: Robust Instance Segmentation of 3D Point Clouds through Dynamic Convolution. CVPR 2021
GICN0.638 451.000 10.895 200.800 270.480 560.676 580.144 440.737 180.354 530.447 460.400 600.365 550.700 171.000 10.569 440.836 190.599 461.000 10.473 48
PointGroup0.636 461.000 10.765 510.624 510.505 540.797 420.116 500.696 280.384 510.441 470.559 420.476 410.596 481.000 10.666 300.756 420.556 580.997 420.513 40
Li Jiang, Hengshuang Zhao, Shaoshuai Shi, Shu Liu, Chi-Wing Fu, Jiaya Jia: PointGroup: Dual-Set Point Grouping for 3D Instance Segmentation. CVPR 2020 [oral]
DD-UNet+Group0.635 470.667 620.797 440.714 430.562 420.774 480.146 430.810 80.429 460.476 420.546 450.399 510.633 411.000 10.632 380.722 500.609 451.000 10.514 39
H. Liu, R. Liu, K. Yang, J. Zhang, K. Peng, R. Stiefelhagen: HIDA: Towards Holistic Indoor Understanding for the Visually Impaired via Semantic Instance Segmentation with a Wearable Solid-State LiDAR Sensor. ICCVW 2021
Mask3D_evaluation0.631 481.000 10.829 300.606 530.646 270.836 310.068 550.511 590.462 430.507 380.619 300.389 530.610 441.000 10.432 570.828 240.673 350.788 700.552 35
DENet0.629 491.000 10.797 430.608 520.589 380.627 620.219 380.882 10.310 560.402 590.383 620.396 520.650 291.000 10.663 320.543 700.691 331.000 10.568 31
3D-MPA0.611 501.000 10.833 290.765 350.526 490.756 510.136 480.588 520.470 410.438 500.432 560.358 570.650 290.857 510.429 580.765 390.557 571.000 10.430 53
Francis Engelmann, Martin Bokeloh, Alireza Fathi, Bastian Leibe, Matthias Nießner: 3D-MPA: Multi Proposal Aggregation for 3D Semantic Instance Segmentation. CVPR 2020
OSIS0.605 511.000 10.801 410.599 550.535 460.728 550.286 300.436 660.679 230.491 390.433 540.256 590.404 660.857 510.620 400.724 490.510 641.000 10.539 37
AOIA0.601 521.000 10.761 530.687 450.485 550.828 320.008 660.663 380.405 490.405 580.425 570.490 400.596 480.714 620.553 480.779 340.597 470.992 520.424 55
PCJC0.578 531.000 10.810 350.583 590.449 590.813 380.042 600.603 500.341 550.490 400.465 510.410 500.650 290.835 590.264 680.694 570.561 550.889 630.504 43
SSEN0.575 541.000 10.761 520.473 620.477 570.795 430.066 560.529 570.658 280.460 440.461 520.380 540.331 680.859 500.401 610.692 590.653 381.000 10.348 60
Dongsu Zhang, Junha Chun, Sang Kyun Cha, Young Min Kim: Spatial Semantic Embedding Network: Fast 3D Instance Segmentation with Deep Metric Learning. arXiv
RWSeg0.567 550.528 720.708 630.626 500.580 400.745 540.063 570.627 460.240 600.400 600.497 480.464 420.515 551.000 10.475 530.745 450.571 521.000 10.429 54
NeuralBF0.555 560.667 620.896 190.843 200.517 510.751 520.029 610.519 580.414 480.439 480.465 500.000 780.484 570.857 510.287 660.693 580.651 401.000 10.485 47
Weiwei Sun, Daniel Rebain, Renjie Liao, Vladimir Tankovich, Soroosh Yazdani, Kwang Moo Yi, Andrea Tagliasacchi: NeuralBF: Neural Bilateral Filtering for Top-down Instance Segmentation on Point Clouds. WACV 2023
MTML0.549 571.000 10.807 380.588 580.327 640.647 600.004 680.815 70.180 630.418 550.364 640.182 620.445 601.000 10.442 560.688 600.571 531.000 10.396 56
Jean Lahoud, Bernard Ghanem, Marc Pollefeys, Martin R. Oswald: 3D Instance Segmentation via Multi-task Metric Learning. ICCV 2019 [oral]
ClickSeg_Instance0.539 581.000 10.621 670.300 650.530 480.698 560.127 490.533 560.222 610.430 530.400 590.365 550.574 520.938 470.472 540.659 620.543 600.944 580.347 61
One_Thing_One_Clickpermissive0.529 590.667 620.718 590.777 320.399 600.683 570.000 710.669 360.138 660.391 610.374 630.539 360.360 670.641 660.556 470.774 370.593 480.997 420.251 66
Zhengzhe Liu, Xiaojuan Qi, Chi-Wing Fu: One Thing One Click: A Self-Training Approach for Weakly Supervised 3D Semantic Segmentation. CVPR 2021
Sparse R-CNN0.515 601.000 10.538 720.282 660.468 580.790 440.173 400.345 680.429 450.413 570.484 490.176 630.595 500.591 670.522 500.668 610.476 650.986 560.327 62
Occipital-SCS0.512 611.000 10.716 600.509 610.506 530.611 630.092 530.602 510.177 640.346 640.383 610.165 640.442 610.850 580.386 630.618 660.543 610.889 630.389 57
3D-BoNet0.488 621.000 10.672 660.590 570.301 660.484 730.098 520.620 480.306 570.341 650.259 680.125 660.434 630.796 610.402 600.499 720.513 630.909 620.439 52
Bo Yang, Jianan Wang, Ronald Clark, Qingyong Hu, Sen Wang, Andrew Markham, Niki Trigoni: Learning Object Bounding Boxes for 3D Instance Segmentation on Point Clouds. NeurIPS 2019 Spotlight
PanopticFusion-inst0.478 630.667 620.712 620.595 560.259 690.550 690.000 710.613 490.175 650.250 700.434 530.437 460.411 650.857 510.485 520.591 690.267 750.944 580.359 59
Gaku Narita, Takashi Seno, Tomoya Ishikawa, Yohsuke Kaji: PanopticFusion: Online Volumetric Semantic Mapping at the Level of Stuff and Things. IROS 2019 (to appear)
SPG_WSIS0.470 640.667 620.685 650.677 470.372 620.562 670.000 710.482 630.244 590.316 670.298 650.052 730.442 620.857 510.267 670.702 540.559 561.000 10.287 64
SALoss-ResNet0.459 651.000 10.737 580.159 760.259 680.587 650.138 470.475 640.217 620.416 560.408 580.128 650.315 690.714 620.411 590.536 710.590 490.873 670.304 63
Zhidong Liang, Ming Yang, Hao Li, Chunxiang Wang: 3D Instance Embedding Learning With a Structure-Aware Loss Function for Point Cloud Segmentation. IEEE Robotics and Automation Letters (IROS2020)
MASCpermissive0.447 660.528 720.555 700.381 630.382 610.633 610.002 690.509 600.260 580.361 630.432 550.327 580.451 590.571 680.367 640.639 640.386 660.980 570.276 65
Chen Liu, Yasutaka Furukawa: MASC: Multi-scale Affinity with Sparse Convolution for 3D Instance Segmentation.
SegGroup_inspermissive0.445 670.667 620.773 490.185 730.317 650.656 590.000 710.407 670.134 670.381 620.267 670.217 610.476 580.714 620.452 550.629 650.514 621.000 10.222 69
An Tao, Yueqi Duan, Yi Wei, Jiwen Lu, Jie Zhou: SegGroup: Seg-Level Supervision for 3D Instance and Semantic Segmentation. TIP 2022
3D-SISpermissive0.382 681.000 10.432 750.245 680.190 700.577 660.013 650.263 700.033 730.320 660.240 690.075 690.422 640.857 510.117 730.699 550.271 740.883 650.235 68
Ji Hou, Angela Dai, Matthias Niessner: 3D-SIS: 3D Semantic Instance Segmentation of RGB-D Scans. CVPR 2019
Hier3Dcopyleft0.323 690.667 620.542 710.264 670.157 730.550 680.000 710.205 730.009 750.270 690.218 700.075 690.500 560.688 650.007 790.698 560.301 710.459 760.200 70
Tan: HCFS3D: Hierarchical Coupled Feature Selection Network for 3D Semantic and Instance Segmentation.
UNet-backbone0.319 700.667 620.715 610.233 690.189 710.479 740.008 660.218 710.067 720.201 720.173 710.107 670.123 740.438 700.150 700.615 670.355 670.916 610.093 78
R-PointNet0.306 710.500 740.405 760.311 640.348 630.589 640.054 580.068 760.126 680.283 680.290 660.028 740.219 720.214 730.331 650.396 760.275 720.821 690.245 67
Region-18class0.284 720.250 780.751 570.228 710.270 670.521 700.000 710.468 650.008 770.205 710.127 720.000 780.068 760.070 770.262 690.652 630.323 690.740 720.173 71
SemRegionNet-20cls0.250 730.333 750.613 680.229 700.163 720.493 710.000 710.304 690.107 690.147 750.100 740.052 720.231 700.119 750.039 750.445 740.325 680.654 730.141 74
tmp0.248 740.667 620.437 740.188 720.153 740.491 720.000 710.208 720.094 710.153 740.099 750.057 710.217 730.119 750.039 750.466 730.302 700.640 740.140 75
3D-BEVIS0.248 740.667 620.566 690.076 770.035 790.394 770.027 630.035 780.098 700.099 770.030 780.025 750.098 750.375 720.126 720.604 680.181 770.854 680.171 72
Cathrin Elich, Francis Engelmann, Jonas Schult, Theodora Kontogianni, Bastian Leibe: 3D-BEVIS: Birds-Eye-View Instance Segmentation.
Sem_Recon_ins0.227 760.764 610.486 730.069 780.098 760.426 760.017 640.067 770.015 740.172 730.100 730.096 680.054 780.183 740.135 710.366 770.260 760.614 750.168 73
ASIS0.199 770.333 750.253 780.167 750.140 750.438 750.000 710.177 740.008 760.121 760.069 760.004 770.231 710.429 710.036 770.445 750.273 730.333 780.119 77
Sgpn_scannet0.143 780.208 790.390 770.169 740.065 770.275 780.029 620.069 750.000 780.087 780.043 770.014 760.027 790.000 780.112 740.351 780.168 780.438 770.138 76
MaskRCNN 2d->3d Proj0.058 790.333 750.002 790.000 790.053 780.002 790.002 700.021 790.000 780.045 790.024 790.238 600.065 770.000 780.014 780.107 790.020 790.110 790.006 79
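
The rows above run the method name, license tag, scores and ranks together without separators. As a minimal sketch of how such a row could be split back into fields, the Python below assumes that every score has the form d.ddd and is followed by a space and an integer rank, and that "permissive" and "copyleft" are the only license tags fused onto method names. The names `parse_row`, `PAIR` and `LICENSE_TAGS` are hypothetical helpers for illustration, not part of any ScanNet tooling.

```python
import re

# Assumption: a score looks like "0.816" or "1.000" and is followed by a space
# and an integer rank; the rank ends where the next score begins (or at end of row).
PAIR = re.compile(r"([01]\.\d{3}) (\d+?)(?=[01]\.\d{3} |$)")

# Assumption: these are the only license tags fused onto method names.
LICENSE_TAGS = ("permissive", "copyleft")


def parse_row(line):
    """Split one flattened row into (method, license_tag, [(score, rank), ...])."""
    line = line.strip()
    first = PAIR.search(line)
    if first is None:
        raise ValueError("no score/rank pairs found in row")
    # Everything before the first score is the method name (plus optional tag).
    method = line[:first.start()].strip()
    license_tag = None
    for tag in LICENSE_TAGS:
        if method.endswith(tag):
            method, license_tag = method[: -len(tag)].strip(), tag
            break
    # Collect all (score, rank) pairs starting at the first score.
    pairs = [(float(s), int(r)) for s, r in PAIR.findall(line, first.start())]
    return method, license_tag, pairs


if __name__ == "__main__":
    # First three score/rank pairs of the SPFormer row, copied from the table above.
    print(parse_row("SPFormerpermissive0.770 200.903 580.903 18"))
```

Because a score always has exactly one digit before the decimal point, the rank is unambiguous: it is every digit after the space except the last digit before the next decimal point. A method whose name itself ends in "permissive" or "copyleft" would be mis-split by this sketch.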


This table lists the benchmark results for the 2D semantic label scenario.


Method | Info | avg iou | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | floor | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | wall | window
Each row below gives the method name (with its license tag, if any) followed by one score and rank per column, in the column order above.
Virtual MVFusion (R)0.745 10.861 10.839 10.881 10.672 20.512 10.422 190.898 10.723 10.714 10.954 20.454 10.509 10.773 10.895 10.756 10.820 10.653 10.935 10.891 10.728 1
Abhijit Kundu, Xiaoqi Yin, Alireza Fathi, David Ross, Brian Brewington, Thomas Funkhouser, Caroline Pantofaru: Virtual Multi-view Fusion for 3D Semantic Segmentation. ECCV 2020
BPNet_2Dcopyleft0.670 20.822 30.795 30.836 20.659 30.481 20.451 150.769 50.656 30.567 40.931 30.395 60.390 60.700 40.534 40.689 110.770 20.574 30.865 110.831 30.675 6
Wenbo Hu, Hengshuang Zhao, Li Jiang, Jiaya Jia and Tien-Tsin Wong: Bidirectional Projection Network for Cross Dimension Scene Understanding. CVPR 2021 (Oral)
MVF-GNN(2D)0.636 30.606 160.794 40.434 170.688 10.337 80.464 140.798 40.632 50.589 30.908 90.420 20.329 140.743 20.594 20.738 20.676 50.527 40.906 20.818 60.715 3
CU-Hybrid-2D Net0.636 30.825 20.820 20.179 250.648 40.463 30.549 20.742 90.676 20.628 20.961 10.420 20.379 70.684 80.381 200.732 30.723 30.599 20.827 180.851 20.634 9
DVEFormer0.626 50.616 120.764 60.690 50.583 110.322 140.540 30.809 30.593 70.502 120.900 140.374 90.433 30.660 90.528 50.665 190.663 60.491 90.871 100.810 90.705 4
CMX0.613 60.681 90.725 130.502 130.634 60.297 190.478 120.830 20.651 40.537 70.924 40.375 70.315 160.686 70.451 150.714 50.543 230.504 60.894 70.823 50.688 5
DMMF_3d0.605 70.651 100.744 110.782 30.637 50.387 40.536 50.732 100.590 80.540 60.856 230.359 120.306 170.596 160.539 30.627 220.706 40.497 80.785 230.757 210.476 24
EMSANet0.600 80.716 40.746 100.395 200.614 90.382 50.523 60.713 130.571 120.503 100.922 70.404 50.397 50.655 100.400 170.626 230.663 60.469 140.900 40.827 40.577 16
Seichter, Daniel and Fischedick, Söhnke and Köhler, Mona and Gross, Horst-Michael: EMSANet: Efficient Multi-Task RGB-D Scene Analysis for Indoor Environments. IJCNN 2022
MCA-Net0.595 90.533 220.756 90.746 40.590 100.334 100.506 90.670 170.587 90.500 130.905 110.366 110.352 100.601 150.506 90.669 170.648 100.501 70.839 170.769 170.516 23
RFBNet0.592 100.616 120.758 80.659 60.581 120.330 110.469 130.655 200.543 150.524 80.924 40.355 140.336 120.572 190.479 110.671 150.648 100.480 110.814 210.814 70.614 12
FAN_NV_RVC0.586 110.510 230.764 60.079 280.620 80.330 110.494 100.753 70.573 100.556 50.884 180.405 40.303 180.718 30.452 140.672 140.658 80.509 50.898 50.813 80.727 2
WSGFormer0.585 120.706 50.708 180.434 170.574 140.283 220.538 40.759 60.542 170.482 170.924 40.351 160.333 130.614 120.393 180.692 100.551 220.461 150.874 90.809 100.673 7
DCRedNet0.583 130.682 80.723 140.542 120.510 220.310 160.451 150.668 180.549 140.520 90.920 80.375 70.446 20.528 220.417 160.670 160.577 190.478 120.862 120.806 110.628 11
MIX6D_RVC0.582 140.695 60.687 190.225 230.632 70.328 130.550 10.748 80.623 60.494 160.890 160.350 170.254 250.688 60.454 130.716 40.597 180.489 100.881 80.768 180.575 17
SSMAcopyleft0.577 150.695 60.716 160.439 150.563 160.314 150.444 170.719 110.551 130.503 100.887 170.346 180.348 110.603 140.353 220.709 60.600 160.457 160.901 30.786 130.599 15
Abhinav Valada, Rohit Mohan, Wolfram Burgard: Self-Supervised Model Adaptation for Multimodal Semantic Segmentation. International Journal of Computer Vision, 2019
DMMF0.567 160.623 110.767 50.238 220.571 150.347 60.413 210.719 110.472 220.418 240.895 150.357 130.260 240.696 50.523 80.666 180.642 120.437 200.895 60.793 120.603 14
UNIV_CNP_RVC_UE0.566 170.569 210.686 210.435 160.524 190.294 200.421 200.712 140.543 150.463 190.872 190.320 190.363 90.611 130.477 120.686 120.627 130.443 190.862 120.775 160.639 8
EMSAFormer0.564 180.581 180.736 120.564 110.546 180.219 250.517 70.675 160.486 210.427 230.904 120.352 150.320 150.589 170.528 50.708 70.464 260.413 240.847 160.786 130.611 13
SN_RN152pyrx8_RVCcopyleft0.546 190.572 190.663 230.638 80.518 200.298 180.366 260.633 230.510 190.446 210.864 210.296 220.267 210.542 210.346 230.704 80.575 200.431 210.853 150.766 190.630 10
UDSSEG_RVC0.545 200.610 150.661 240.588 90.556 170.268 230.482 110.642 220.572 110.475 180.836 250.312 200.367 80.630 110.189 250.639 210.495 250.452 170.826 190.756 220.541 19
segfomer with 6d0.542 210.594 170.687 190.146 260.579 130.308 170.515 80.703 150.472 220.498 140.868 200.369 100.282 190.589 170.390 190.701 90.556 210.416 230.860 140.759 200.539 21
FuseNetpermissive0.535 220.570 200.681 220.182 240.512 210.290 210.431 180.659 190.504 200.495 150.903 130.308 210.428 40.523 230.365 210.676 130.621 150.470 130.762 240.779 150.541 19
Caner Hazirbas, Lingni Ma, Csaba Domokos, Daniel Cremers: FuseNet: Incorporating Depth into Semantic Segmentation via Fusion-based CNN Architecture. ACCV 2016
AdapNet++copyleft0.503 230.613 140.722 150.418 190.358 280.337 80.370 250.479 260.443 240.368 260.907 100.207 250.213 270.464 260.525 70.618 240.657 90.450 180.788 220.721 250.408 27
Abhinav Valada, Rohit Mohan, Wolfram Burgard: Self-Supervised Model Adaptation for Multimodal Semantic Segmentation. International Journal of Computer Vision, 2019
3DMV (2d proj)0.498 240.481 260.612 250.579 100.456 240.343 70.384 230.623 240.525 180.381 250.845 240.254 240.264 230.557 200.182 260.581 260.598 170.429 220.760 250.661 270.446 26
Angela Dai, Matthias Niessner: 3DMV: Joint 3D-Multi-View Prediction for 3D Semantic Scene Segmentation. ECCV'18
MSeg1080_RVCpermissive0.485 250.505 240.709 170.092 270.427 250.241 240.411 220.654 210.385 280.457 200.861 220.053 280.279 200.503 240.481 100.645 200.626 140.365 260.748 260.725 240.529 22
John Lambert*, Zhuang Liu*, Ozan Sener, James Hays, Vladlen Koltun: MSeg: A Composite Dataset for Multi-domain Semantic Segmentation. CVPR 2020
ILC-PSPNet0.475 260.490 250.581 260.289 210.507 230.067 280.379 240.610 250.417 260.435 220.822 270.278 230.267 210.503 240.228 240.616 250.533 240.375 250.820 200.729 230.560 18
Enet (reimpl)0.376 270.264 280.452 280.452 140.365 260.181 260.143 280.456 270.409 270.346 270.769 280.164 260.218 260.359 270.123 280.403 280.381 280.313 280.571 270.685 260.472 25
Re-implementation of Adam Paszke, Abhishek Chaurasia, Sangpil Kim, Eugenio Culurciello: ENet: A Deep Neural Network Architecture for Real-Time Semantic Segmentation.
ScanNet (2d proj)permissive0.330 280.293 270.521 270.657 70.361 270.161 270.250 270.004 280.440 250.183 280.836 250.125 270.060 280.319 280.132 270.417 270.412 270.344 270.541 280.427 280.109 28
Angela Dai, Angel X. Chang, Manolis Savva, Maciej Halber, Thomas Funkhouser, Matthias Nießner: ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes. CVPR'17


This table lists the benchmark results for the 2D semantic instance scenario.




Each cell is score (rank).

| Method | Info | avg ap | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | window |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| EMSANet (Instance) | | 0.241 (1) | 0.401 (1) | 0.439 (1) | 0.085 (1) | 0.242 (1) | 0.220 (1) | 0.081 (1) | 0.289 (2) | 0.117 (2) | 0.121 (1) | 0.182 (1) | 0.126 (1) | 0.346 (1) | 0.181 (2) | 0.181 (2) | 0.358 (1) | 0.156 (1) | 0.675 (2) | 0.131 (1) |
| UniDet_RVC | | 0.205 (2) | 0.381 (2) | 0.323 (3) | 0.037 (3) | 0.226 (3) | 0.177 (3) | 0.063 (2) | 0.277 (3) | 0.120 (1) | 0.067 (3) | 0.131 (3) | 0.074 (3) | 0.317 (2) | 0.080 (3) | 0.235 (1) | 0.289 (3) | 0.141 (3) | 0.678 (1) | 0.080 (3) |
| FKNet | | 0.204 (3) | 0.334 (3) | 0.358 (2) | 0.038 (2) | 0.234 (2) | 0.184 (2) | 0.025 (3) | 0.318 (1) | 0.042 (4) | 0.088 (2) | 0.141 (2) | 0.053 (4) | 0.300 (3) | 0.207 (1) | 0.171 (3) | 0.292 (2) | 0.149 (2) | 0.636 (3) | 0.109 (2) |
| MaskRCNN_ScanNet | permissive | 0.119 (4) | 0.129 (4) | 0.212 (4) | 0.002 (4) | 0.112 (4) | 0.148 (4) | 0.014 (4) | 0.205 (4) | 0.044 (3) | 0.066 (4) | 0.078 (4) | 0.095 (2) | 0.142 (4) | 0.030 (4) | 0.128 (4) | 0.139 (4) | 0.080 (4) | 0.459 (4) | 0.057 (4) |

EMSANet (Instance): Seichter, Daniel and Fischedick, Söhnke and Köhler, Mona and Gross, Horst-Michael: EMSANet: Efficient Multi-Task RGB-D Scene Analysis for Indoor Environments. IJCNN 2022
MaskRCNN_ScanNet: Re-implementation of Kaiming He, Georgia Gkioxari, Piotr Dollár, Ross Girshick: Mask R-CNN. ICCV'17
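
The page does not state how the "avg ap" column is derived. For the rows checked here it appears to match the unweighted mean of the 18 per-class values, rounded to three decimals. The Python below is a small sketch of that check using values copied from the 2D semantic instance table above; `mean_score` is a hypothetical helper name, and the unweighted-mean rule is an assumption inferred from the numbers, not a documented definition.

```python
def mean_score(per_class_scores):
    """Unweighted mean of per-class scores, rounded to three decimals."""
    return round(sum(per_class_scores) / len(per_class_scores), 3)


# Per-class AP values copied from the 2D semantic instance table (18 classes each).
emsanet = [0.401, 0.439, 0.085, 0.242, 0.220, 0.081, 0.289, 0.117, 0.121,
           0.182, 0.126, 0.346, 0.181, 0.181, 0.358, 0.156, 0.675, 0.131]
unidet = [0.381, 0.323, 0.037, 0.226, 0.177, 0.063, 0.277, 0.120, 0.067,
          0.131, 0.074, 0.317, 0.080, 0.235, 0.289, 0.141, 0.678, 0.080]

# Reported "avg ap" values from the same table, for comparison.
reported = {"EMSANet (Instance)": 0.241, "UniDet_RVC": 0.205}

if __name__ == "__main__":
    for name, scores in [("EMSANet (Instance)", emsanet), ("UniDet_RVC", unidet)]:
        print(name, "reported:", reported[name], "recomputed:", mean_score(scores))
```

Since the published per-class values are themselves rounded, a recomputed mean can differ from the reported average in the last digit for other rows.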


This table lists the benchmark results for the scene type classification scenario.




Each cell is score (rank).

| Method | Info | avg iou | apartment | bathroom | bedroom / hotel | bookstore / library | conference room | copy/mail room | hallway | kitchen | laundry room | living room / lounge | misc | office | storage / basement / garage |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| LAST-PCL-type | | 0.738 (1) | 0.250 (3) | 1.000 (1) | 0.895 (1) | 1.000 (1) | 1.000 (1) | 1.000 (1) | 0.500 (1) | 1.000 (1) | 0.500 (2) | 0.842 (1) | 0.000 (2) | 0.941 (1) | 0.667 (1) |
| multi-task | permissive | 0.646 (2) | 0.500 (1) | 1.000 (1) | 0.789 (2) | 0.333 (3) | 0.667 (3) | 1.000 (1) | 0.500 (1) | 1.000 (1) | 1.000 (1) | 0.778 (2) | 0.000 (2) | 0.833 (2) | 0.000 (3) |
| 3DASPP-SCE | | 0.556 (3) | 0.500 (1) | 0.938 (3) | 0.778 (3) | 0.667 (2) | 1.000 (1) | 0.250 (3) | 0.500 (1) | 0.750 (3) | 0.333 (3) | 0.500 (4) | 0.000 (2) | 0.812 (3) | 0.200 (2) |
| SE-ResNeXt-SSMA | | 0.355 (4) | 0.000 (5) | 0.684 (4) | 0.696 (4) | 0.200 (5) | 0.500 (4) | 0.200 (4) | 0.500 (1) | 0.429 (4) | 0.200 (4) | 0.545 (3) | 0.111 (1) | 0.556 (4) | 0.000 (3) |
| resnet50_scannet | | 0.231 (5) | 0.200 (4) | 0.481 (5) | 0.346 (5) | 0.250 (4) | 0.250 (5) | 0.000 (5) | 0.500 (1) | 0.333 (5) | 0.000 (5) | 0.357 (5) | 0.000 (2) | 0.286 (5) | 0.000 (3) |

LAST-PCL-type: Yanmin Wu, Qiankun Gao, Renrui Zhang, and Jian Zhang: Language-Assisted 3D Scene Understanding. arxiv23.12
multi-task: Shengyu Huang, Mikhail Usvyatsov, Konrad Schindler: Indoor Scene Recognition in 3D. IROS 2020
SE-ResNeXt-SSMA: Abhinav Valada, Rohit Mohan, Wolfram Burgard: Self-Supervised Model Adaptation for Multimodal Semantic Segmentation. arXiv