Presenting the ScanNet200 Benchmark

We present the ScanNet200 benchmark, which studies an order of magnitude more class categories than the previous version of ScanNet. The two benchmarks share the same scene geometry, but the parsing of the surface annotations allows for a larger vocabulary and a more realistic setting for in-the-wild 3D understanding methods.

The ScanNet200 benchmark includes both finer-grained categories and a large number of previously unaddressed classes. This makes for a much more challenging setting that reflects the diversity of semantic classes naturally observed in the raw ScanNet RGB-D data, including the class imbalances that occur naturally. The difference in category frequencies between ScanNet and ScanNet200 can be seen in the figure above.

ScanNet200 Benchmark

This table lists the benchmark results for the ScanNet200 3D semantic label scenario.




Method | Info | avg iou | head iou | common iou | tail iou | alarm clock | armchair | backpack | bag | ball | bar | basket | bathroom cabinet | bathroom counter | bathroom stall | bathroom stall door | bathroom vanity | bathtub | bed | bench | bicycle | bin | blackboard | blanket | blinds | board | book | bookshelf | bottle | bowl | box | broom | bucket | bulletin board | cabinet | calendar | candle | cart | case of water bottles | cd case | ceiling | ceiling light | chair | clock | closet | closet door | closet rod | closet wall | clothes | clothes dryer | coat rack | coffee kettle | coffee maker | coffee table | column | computer tower | container | copier | couch | counter | crate | cup | curtain | cushion | decoration | desk | dining table | dish rack | dishwasher | divider | door | doorframe | dresser | dumbbell | dustpan | end table | fan | file cabinet | fire alarm | fire extinguisher | fireplace | floor | folded chair | furniture | guitar | guitar case | hair dryer | handicap bar | hat | headphones | ironing board | jacket | keyboard | keyboard piano | kitchen cabinet | kitchen counter | ladder | lamp | laptop | laundry basket | laundry detergent | laundry hamper | ledge | light | light switch | luggage | machine | mailbox | mat | mattress | microwave | mini fridge | mirror | monitor | mouse | music stand | nightstand | object | office chair | ottoman | oven | paper | paper bag | paper cutter | paper towel dispenser | paper towel roll | person | piano | picture | pillar | pillow | pipe | plant | plate | plunger | poster | potted plant | power outlet | power strip | printer | projector | projector screen | purse | rack | radiator | rail | range hood | recycling bin | refrigerator | scale | seat | shelf | shoe | shower | shower curtain | shower curtain rod | shower door | shower floor | shower head | shower wall | sign | sink | soap dish | soap dispenser | sofa chair | speaker | stair rail | stairs | stand | stool | storage bin | storage container | storage organizer | stove | structure | stuffed animal | suitcase | table | telephone | tissue box | toaster | toaster oven | toilet | toilet paper | toilet paper dispenser | toilet paper holder | toilet seat cover dispenser | towel | trash bin | trash can | tray | tube | tv | tv stand | vacuum cleaner | vent | wall | wardrobe | washing machine | water bottle | water cooler | water pitcher | whiteboard | window | windowsill
PPT-SpUNet-F.T.0.332 50.556 30.270 20.123 70.519 20.091 30.349 20.000 10.000 20.000 10.339 60.383 70.498 60.833 30.807 20.241 20.584 30.000 30.755 40.124 40.000 50.608 20.330 40.530 60.314 10.000 40.374 40.000 10.000 30.197 20.459 40.000 50.000 10.117 20.000 20.876 30.095 10.682 30.000 40.086 50.518 30.433 10.930 20.000 10.000 20.563 30.542 70.077 40.715 20.858 40.756 20.008 100.171 60.874 30.000 10.039 20.550 50.000 60.545 40.256 50.657 40.453 20.351 40.449 70.213 30.392 50.611 60.000 30.037 80.946 30.138 70.000 10.000 60.063 50.308 20.537 40.796 20.673 20.323 70.392 50.400 70.509 40.000 30.000 10.649 10.000 60.023 60.000 50.000 30.914 50.002 90.506 90.163 60.359 50.872 40.000 50.000 10.623 30.112 40.001 80.000 40.000 10.021 30.753 10.565 90.150 10.579 20.806 60.267 40.616 10.042 90.783 60.000 30.374 70.000 10.000 30.000 20.620 40.000 10.000 40.000 10.572 80.634 20.350 60.792 30.000 50.000 10.376 50.535 30.378 20.855 20.672 20.074 60.000 60.185 30.000 10.727 50.660 50.076 100.000 70.432 60.646 50.000 10.594 60.006 80.000 50.000 10.658 30.000 20.000 10.661 10.549 50.300 70.291 70.045 70.942 60.304 30.600 50.572 30.135 90.695 20.000 10.008 50.793 40.942 10.899 20.000 10.816 30.181 60.897 20.000 10.679 20.223 30.264 20.691 30.345 8
Xiaoyang Wu, Zhuotao Tian, Xin Wen, Bohao Peng, Xihui Liu, Kaicheng Yu, Hengshuang Zhao: Towards Large-scale 3D Representation Learning with Multi-dataset Point Prompt Training.
OctFormer ScanNet200permissive0.326 60.539 60.265 50.131 50.499 30.110 10.522 10.000 10.000 20.000 10.318 80.427 40.455 80.743 80.765 60.175 50.842 10.000 30.828 20.204 10.033 30.429 60.335 20.601 10.312 20.000 40.357 50.000 10.000 30.047 70.423 50.000 50.000 10.105 40.000 20.873 50.079 60.670 60.000 40.117 20.471 70.432 20.829 70.000 10.000 20.584 20.417 100.089 30.684 60.837 50.705 90.021 80.178 50.892 20.000 10.028 30.505 70.000 60.457 60.200 80.662 20.412 50.244 80.496 50.000 100.451 30.626 40.000 30.102 50.943 50.138 70.000 10.000 60.149 40.291 30.534 50.722 30.632 40.331 60.253 90.453 40.487 60.000 30.000 10.479 30.000 60.022 70.000 50.000 30.900 60.128 40.684 20.164 50.413 20.854 70.000 50.000 10.512 100.074 100.003 70.000 40.000 10.000 40.469 80.613 60.132 40.529 40.871 20.227 90.582 40.026 100.787 50.000 30.339 80.000 10.000 30.000 20.626 30.000 10.029 30.000 10.587 50.612 40.411 40.724 70.000 50.000 10.407 30.552 20.513 10.849 30.655 30.408 10.000 60.296 10.000 10.686 80.645 70.145 50.022 50.414 70.633 60.000 10.637 10.224 10.000 50.000 10.650 40.000 20.000 10.622 40.535 60.343 50.483 20.230 60.943 50.289 40.618 40.596 10.140 80.679 40.000 10.022 20.783 60.620 80.906 10.000 10.806 50.137 80.865 30.000 10.378 60.000 80.168 100.680 50.227 9
Peng-Shuai Wang: OctFormer: Octree-based Transformers for 3D Point Clouds. SIGGRAPH 2023
CeCo0.340 30.551 50.247 60.181 20.475 60.057 100.142 80.000 10.000 20.000 10.387 30.463 30.499 50.924 10.774 50.213 30.257 60.000 30.546 90.100 70.006 40.615 10.177 100.534 40.246 30.000 40.400 20.000 10.338 10.006 90.484 30.609 20.000 10.083 60.000 20.873 50.089 40.661 70.000 40.048 100.560 10.408 30.892 40.000 10.000 20.586 10.616 50.000 70.692 50.900 10.721 50.162 10.228 20.860 40.000 10.000 60.575 10.083 30.550 30.347 20.624 60.410 60.360 30.740 20.109 70.321 80.660 30.000 30.121 30.939 60.143 50.000 10.400 10.003 70.190 50.564 20.652 60.615 50.421 20.304 80.579 10.547 20.000 30.000 10.296 70.000 60.030 50.096 20.000 30.916 30.037 60.551 60.171 40.376 40.865 50.286 20.000 10.633 20.102 90.027 50.011 30.000 10.000 40.474 70.742 20.133 30.311 60.824 50.242 60.503 70.068 60.828 20.000 30.429 30.000 10.063 20.000 20.781 10.000 10.000 40.000 10.665 10.633 30.450 30.818 20.000 50.000 10.429 20.532 40.226 60.825 40.510 70.377 20.709 10.079 70.000 10.753 20.683 20.102 90.063 30.401 90.620 80.000 10.619 20.000 90.000 50.000 10.595 80.000 20.000 10.345 70.564 30.411 30.603 10.384 30.945 40.266 50.643 30.367 70.304 10.663 60.000 10.010 30.726 80.767 50.898 30.000 10.784 60.435 10.861 50.000 10.447 50.000 80.257 40.656 60.377 6
Zhisheng Zhong, Jiequan Cui, Yibo Yang, Xiaoyang Wu, Xiaojuan Qi, Xiangyu Zhang, Jiaya Jia: Understanding Imbalanced Semantic Segmentation Through Neural Collapse. CVPR 2023
PonderV2 ScanNet2000.346 20.552 40.270 30.175 30.497 40.070 70.239 40.000 10.000 20.000 10.232 100.412 50.584 10.842 20.804 30.212 40.540 40.000 30.433 100.106 60.000 50.590 30.290 60.548 20.243 40.000 40.356 60.000 10.000 30.062 60.398 70.441 40.000 10.104 50.000 20.888 20.076 70.682 30.030 10.094 40.491 50.351 60.869 60.000 10.063 10.403 50.700 20.000 70.660 80.881 20.761 10.050 50.186 40.852 60.000 10.007 40.570 40.100 20.565 20.326 30.641 50.431 30.290 70.621 30.259 20.408 40.622 50.125 10.082 60.950 20.179 30.000 10.263 20.424 20.193 40.558 30.880 10.545 60.375 40.727 20.445 50.499 50.000 30.000 10.475 40.002 40.034 40.083 30.000 30.924 10.290 20.636 30.115 70.400 30.874 30.186 40.000 10.611 40.128 20.113 20.000 40.000 10.000 40.584 50.636 40.103 70.385 50.843 30.283 20.603 30.080 50.825 30.000 30.377 60.000 10.000 30.000 20.457 60.000 10.000 40.000 10.574 70.608 50.481 20.792 30.394 20.000 10.357 60.503 60.261 50.817 60.504 80.304 30.472 30.115 40.000 10.750 30.677 30.202 10.000 70.509 30.729 10.000 10.519 70.000 90.000 50.000 10.620 70.000 20.000 10.660 30.560 40.486 20.384 50.346 40.952 20.247 70.667 20.436 50.269 30.691 30.000 10.010 30.787 50.889 20.880 40.000 10.810 40.336 30.860 60.000 10.606 30.009 50.248 50.681 40.392 5
Haoyi Zhu, Honghui Yang, Xiaoyang Wu, Di Huang, Sha Zhang, Xianglong He, Tong He, Hengshuang Zhao, Chunhua Shen, Yu Qiao, Wanli Ouyang: PonderV2: Pave the Way for 3D Foundation Model with A Universal Pre-training Paradigm.
OA-CNN-L_ScanNet2000.333 40.558 20.269 40.124 60.448 80.080 50.272 30.000 10.000 20.000 10.342 50.515 20.524 30.713 100.789 40.158 60.384 50.000 30.806 30.125 30.000 50.496 40.332 30.498 90.227 50.024 20.474 10.000 10.003 20.071 50.487 20.000 50.000 10.110 30.000 20.876 30.013 100.703 10.000 40.076 60.473 60.355 50.906 30.000 10.000 20.476 40.706 10.000 70.672 70.835 60.748 40.015 90.223 30.860 40.000 10.000 60.572 30.000 60.509 50.313 40.662 20.398 70.396 20.411 80.276 10.527 20.711 10.000 30.076 70.946 30.166 40.000 10.022 40.160 30.183 60.493 60.699 50.637 30.403 30.330 70.406 60.526 30.024 20.000 10.392 60.000 60.016 100.000 50.196 20.915 40.112 50.557 50.197 20.352 60.877 20.000 50.000 10.592 80.103 80.000 90.067 10.000 10.089 20.735 30.625 50.130 50.568 30.836 40.271 30.534 50.043 80.799 40.001 20.445 20.000 10.000 30.024 10.661 20.000 10.262 10.000 10.591 40.517 90.373 50.788 50.021 40.000 10.455 10.517 50.320 40.823 50.200 100.001 100.150 40.100 50.000 10.736 40.668 40.103 80.052 40.662 10.720 30.000 10.602 50.112 40.002 40.000 10.637 50.000 20.000 10.621 50.569 20.398 40.412 40.234 50.949 30.363 20.492 90.495 40.251 40.665 50.000 10.001 70.805 30.833 40.794 60.000 10.821 20.314 40.843 70.000 10.560 40.245 20.262 30.713 20.370 7
PTv3 ScanNet2000.393 10.592 10.330 10.216 10.520 10.109 20.108 100.000 10.337 10.000 10.310 90.394 60.494 70.753 60.848 10.256 10.717 20.000 30.842 10.192 20.065 20.449 50.346 10.546 30.190 60.000 40.384 30.000 10.000 30.218 10.505 10.791 10.000 10.136 10.000 20.903 10.073 80.687 20.000 40.168 10.551 20.387 40.941 10.000 10.000 20.397 60.654 30.000 70.714 30.759 80.752 30.118 30.264 10.926 10.000 10.048 10.575 10.000 60.597 10.366 10.755 10.469 10.474 10.798 10.140 60.617 10.692 20.000 30.592 10.971 10.188 20.000 10.133 30.593 10.349 10.650 10.717 40.699 10.455 10.790 10.523 30.636 10.301 10.000 10.622 20.000 60.017 90.259 10.000 30.921 20.337 10.733 10.210 10.514 10.860 60.407 10.000 10.688 10.109 60.000 90.000 40.000 10.151 10.671 40.782 10.115 60.641 10.903 10.349 10.616 10.088 40.832 10.000 30.480 10.000 10.428 10.000 20.497 50.000 10.000 40.000 10.662 20.690 10.612 10.828 10.575 10.000 10.404 40.644 10.325 30.887 10.728 10.009 90.134 50.026 100.000 10.761 10.731 10.172 30.077 20.528 20.727 20.000 10.603 40.220 20.022 20.000 10.740 10.000 20.000 10.661 10.586 10.566 10.436 30.531 10.978 10.457 10.708 10.583 20.141 70.748 10.000 10.026 10.822 10.871 30.879 50.000 10.851 10.405 20.914 10.000 10.682 10.000 80.281 10.738 10.463 3
LGroundpermissive0.272 80.485 80.184 80.106 80.476 50.077 60.218 50.000 10.000 20.000 10.547 10.295 80.540 20.746 70.745 80.058 90.112 90.005 10.658 60.077 100.000 50.322 80.178 90.512 70.190 60.199 10.277 80.000 10.000 30.173 30.399 60.000 50.000 10.039 90.000 20.858 80.085 50.676 50.002 20.103 30.498 40.323 70.703 80.000 10.000 20.296 80.549 60.216 10.702 40.768 70.718 70.028 60.092 90.786 90.000 10.000 60.453 90.022 40.251 100.252 60.572 80.348 80.321 50.514 40.063 80.279 90.552 80.000 30.019 90.932 80.132 90.000 10.000 60.000 90.156 100.457 80.623 70.518 70.265 90.358 60.381 80.395 80.000 30.000 10.127 100.012 30.051 10.000 50.000 30.886 90.014 70.437 100.179 30.244 80.826 80.000 50.000 10.599 60.136 10.085 30.000 40.000 10.000 40.565 60.612 70.143 20.207 80.566 80.232 80.446 80.127 20.708 80.000 30.384 50.000 10.000 30.000 20.402 70.000 10.059 20.000 10.525 100.566 70.229 80.659 80.000 50.000 10.265 80.446 70.147 90.720 100.597 50.066 70.000 60.187 20.000 10.726 60.467 100.134 70.000 70.413 80.629 70.000 10.363 90.055 60.022 20.000 10.626 60.000 20.000 10.323 80.479 100.154 90.117 80.028 90.901 80.243 80.415 100.295 100.143 60.610 90.000 10.000 80.777 70.397 100.324 90.000 10.778 80.179 70.702 90.000 10.274 100.404 10.233 60.622 80.398 4
David Rozenberszki, Or Litany, Angela Dai: Language-Grounded Indoor 3D Semantic Segmentation in the Wild. arXiv
AWCS0.305 70.508 70.225 70.142 40.463 70.063 80.195 60.000 10.000 20.000 10.467 20.551 10.504 40.773 40.764 70.142 70.029 100.000 30.626 70.100 70.000 50.360 70.179 80.507 80.137 80.006 30.300 70.000 10.000 30.172 40.364 80.512 30.000 10.056 70.000 20.865 70.093 30.634 100.000 40.071 80.396 80.296 90.876 50.000 10.000 20.373 70.436 90.063 60.749 10.877 30.721 50.131 20.124 70.804 80.000 10.000 60.515 60.010 50.452 70.252 60.578 70.417 40.179 100.484 60.171 40.337 70.606 70.000 30.115 40.937 70.142 60.000 10.008 50.000 90.157 90.484 70.402 100.501 80.339 50.553 30.529 20.478 70.000 30.000 10.404 50.001 50.022 70.077 40.000 30.894 80.219 30.628 40.093 80.305 70.886 10.233 30.000 10.603 50.112 40.023 60.000 40.000 10.000 40.741 20.664 30.097 80.253 70.782 70.264 50.523 60.154 10.707 90.000 30.411 40.000 10.000 30.000 20.332 90.000 10.000 40.000 10.602 30.595 60.185 90.656 90.159 30.000 10.355 70.424 80.154 80.729 80.516 60.220 50.620 20.084 60.000 10.707 70.651 60.173 20.014 60.381 100.582 90.000 10.619 20.049 70.000 50.000 10.702 20.000 20.000 10.302 90.489 80.317 60.334 60.392 20.922 70.254 60.533 80.394 60.129 100.613 80.000 10.000 80.820 20.649 70.749 70.000 10.782 70.282 50.863 40.000 10.288 90.006 60.220 70.633 70.542 1
CSC-Pretrainpermissive0.249 100.455 100.171 90.079 100.418 90.059 90.186 70.000 10.000 20.000 10.335 70.250 90.316 90.766 50.697 100.142 70.170 70.003 20.553 80.112 50.097 10.201 100.186 70.476 100.081 90.000 40.216 100.000 10.000 30.001 100.314 100.000 50.000 10.055 80.000 20.832 100.094 20.659 80.002 20.076 60.310 100.293 100.664 100.000 10.000 20.175 100.634 40.130 20.552 100.686 100.700 100.076 40.110 80.770 100.000 10.000 60.430 100.000 60.319 80.166 90.542 100.327 90.205 90.332 90.052 90.375 60.444 100.000 30.012 100.930 100.203 10.000 10.000 60.046 60.175 70.413 90.592 80.471 90.299 80.152 100.340 90.247 100.000 30.000 10.225 80.058 20.037 20.000 50.207 10.862 100.014 70.548 70.033 90.233 90.816 90.000 50.000 10.542 90.123 30.121 10.019 20.000 10.000 40.463 90.454 100.045 100.128 100.557 90.235 70.441 90.063 70.484 100.000 30.308 100.000 10.000 30.000 20.318 100.000 10.000 40.000 10.545 90.543 80.164 100.734 60.000 50.000 10.215 100.371 90.198 70.743 70.205 90.062 80.000 60.079 70.000 10.683 90.547 90.142 60.000 70.441 50.579 100.000 10.464 80.098 50.041 10.000 10.590 90.000 20.000 10.373 60.494 70.174 80.105 90.001 100.895 90.222 90.537 70.307 90.180 50.625 70.000 10.000 80.591 100.609 90.398 80.000 10.766 100.014 100.638 100.000 10.377 70.004 70.206 90.609 100.465 2
Ji Hou, Benjamin Graham, Matthias Nießner, Saining Xie: Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts. CVPR 2021
Minkowski 34Dpermissive0.253 90.463 90.154 100.102 90.381 100.084 40.134 90.000 10.000 20.000 10.386 40.141 100.279 100.737 90.703 90.014 100.164 80.000 30.663 50.092 90.000 50.224 90.291 50.531 50.056 100.000 40.242 90.000 10.000 30.013 80.331 90.000 50.000 10.035 100.001 10.858 80.059 90.650 90.000 40.056 90.353 90.299 80.670 90.000 10.000 20.284 90.484 80.071 50.594 90.720 90.710 80.027 70.068 100.813 70.000 10.005 50.492 80.164 10.274 90.111 100.571 90.307 100.293 60.307 100.150 50.163 100.531 90.002 20.545 20.932 80.093 100.000 10.000 60.002 80.159 80.368 100.581 90.440 100.228 100.406 40.282 100.294 90.000 30.000 10.189 90.060 10.036 30.000 50.000 30.897 70.000 100.525 80.025 100.205 100.771 100.000 50.000 10.593 70.108 70.044 40.000 40.000 10.000 40.282 100.589 80.094 90.169 90.466 100.227 90.419 100.125 30.757 70.002 10.334 90.000 10.000 30.000 20.357 80.000 10.000 40.000 10.582 60.513 100.337 70.612 100.000 50.000 10.250 90.352 100.136 100.724 90.655 30.280 40.000 60.046 90.000 10.606 100.559 80.159 40.102 10.445 40.655 40.000 10.310 100.117 30.000 50.000 10.581 100.026 10.000 10.265 100.483 90.084 100.097 100.044 80.865 100.142 100.588 60.351 80.272 20.596 100.000 10.003 60.622 90.720 60.096 100.000 10.771 90.016 90.772 80.000 10.302 80.194 40.214 80.621 90.197 10
C. Choy, J. Gwak, S. Savarese: 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks. CVPR 2019
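
The semantic table above reports intersection-over-union (IoU) scores per class, together with averages over all classes and over the head, common, and tail groups. As a rough sketch of how such numbers can be computed from per-point labels, the snippet below uses the standard definition IoU = TP / (TP + FP + FN). The function names, the toy labels, and the three-class grouping are illustrative assumptions, not the benchmark's evaluation script, and how the official evaluation treats classes that never occur is not covered here.

```python
from typing import Dict, List, Sequence


def per_class_iou(gt: Sequence[int], pred: Sequence[int], num_classes: int) -> List[float]:
    """Return IoU = TP / (TP + FP + FN) for every class id in [0, num_classes)."""
    tp = [0] * num_classes
    fp = [0] * num_classes
    fn = [0] * num_classes
    for g, p in zip(gt, pred):
        if g == p:
            tp[g] += 1
        else:
            fn[g] += 1  # ground-truth class g was missed
            fp[p] += 1  # predicted class p was wrong
    ious = []
    for c in range(num_classes):
        denom = tp[c] + fp[c] + fn[c]
        # Assumption: a class with no ground truth and no predictions gets NaN.
        ious.append(tp[c] / denom if denom else float("nan"))
    return ious


def mean_iou(ious: Sequence[float], class_ids: Sequence[int]) -> float:
    """Average the IoU over the given class ids, skipping NaN entries."""
    vals = [ious[c] for c in class_ids if ious[c] == ious[c]]
    return sum(vals) / len(vals) if vals else float("nan")


# Toy example with three classes; the grouping below is hypothetical.
gt_labels = [0, 0, 1, 1, 2, 2]
pred_labels = [0, 1, 1, 1, 2, 0]
ious = per_class_iou(gt_labels, pred_labels, num_classes=3)

groups: Dict[str, List[int]] = {"head": [0], "common": [1], "tail": [2]}
avg_iou = mean_iou(ious, [0, 1, 2])
group_iou = {name: mean_iou(ious, ids) for name, ids in groups.items()}
```

Reporting the group averages next to the overall average makes it visible when a method does well on frequent classes but poorly on rare ones, which a single mean over 200 classes can hide.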


This table lists the benchmark results for the ScanNet200 3D semantic instance scenario.




Method | Info | avg | alarm clock | armchair | backpack | bag | ball | bar | basket | bathroom cabinet | bathroom counter | bathroom stall | bathroom stall door | bathroom vanity | bathtub | bed | bench | bicycle | bin | blackboard | blanket | blinds | board | book | bookshelf | bottle | bowl | box | broom | bucket | bulletin board | cabinet | calendar | candle | cart | case of water bottles | cd case | ceiling | ceiling light | chair | clock | closet | closet door | closet rod | closet wall | clothes | clothes dryer | coat rack | coffee kettle | coffee maker | coffee table | column | computer tower | container | copier | couch | counter | crate | cup | curtain | cushion | decoration | desk | dining table | dish rack | dishwasher | divider | door | doorframe | dresser | dumbbell | dustpan | end table | fan | file cabinet | fire alarm | fire extinguisher | fireplace | folded chair | furniture | guitar | guitar case | hair dryer | handicap bar | hat | headphones | ironing board | jacket | keyboard | keyboard piano | kitchen cabinet | kitchen counter | ladder | lamp | laptop | laundry basket | laundry detergent | laundry hamper | ledge | light | light switch | luggage | machine | mailbox | mat | mattress | microwave | mini fridge | mirror | monitor | mouse | music stand | nightstand | object | office chair | ottoman | oven | paper | paper bag | paper cutter | paper towel dispenser | paper towel roll | person | piano | picture | pillar | pillow | pipe | plant | plate | plunger | poster | potted plant | power outlet | power strip | printer | projector | projector screen | purse | rack | radiator | rail | range hood | recycling bin | refrigerator | scale | seat | shelf | shoe | shower | shower curtain | shower curtain rod | shower door | shower floor | shower head | shower wall | sign | sink | soap dish | soap dispenser | sofa chair | speaker | stair rail | stairs | stand | stool | storage bin | storage container | storage organizer | stove | structure | stuffed animal | suitcase | table | telephone | tissue box | toaster | toaster oven | toilet | toilet paper | toilet paper dispenser | toilet paper holder | toilet seat cover dispenser | towel | trash bin | trash can | tray | tube | tv | tv stand | vacuum cleaner | vent | wardrobe | washing machine | water bottle | water cooler | water pitcher | whiteboard | window | windowsill
Mask3D Scannet2000.388 10.542 10.357 10.237 10.610 10.091 10.125 50.000 10.000 10.000 10.065 30.668 10.451 11.000 10.955 10.640 10.500 10.039 10.125 20.063 20.409 10.311 20.291 10.609 30.266 10.000 10.163 10.000 10.008 10.044 20.496 11.000 10.000 10.018 20.000 10.756 10.573 10.808 20.000 10.010 10.042 30.130 30.552 10.042 10.000 11.000 10.725 40.750 10.883 11.000 10.832 40.024 20.107 10.614 30.226 10.250 10.628 20.792 10.677 20.400 10.741 10.278 10.511 10.077 50.111 10.313 20.715 20.302 10.017 30.200 20.000 10.188 10.000 10.178 20.736 11.000 10.615 10.514 10.409 20.380 50.600 10.000 10.000 10.400 10.013 20.254 10.381 10.000 10.123 40.400 10.839 10.258 10.463 10.926 10.265 10.000 10.857 20.099 10.021 20.500 10.027 10.028 11.000 10.502 50.016 10.076 40.500 10.612 10.578 10.005 20.597 20.194 10.497 10.000 10.500 10.000 20.323 40.000 11.000 10.000 10.748 10.708 20.050 40.890 21.000 10.008 20.151 30.301 11.000 11.000 10.792 30.945 11.000 10.511 10.004 20.753 10.776 20.287 20.020 20.003 40.974 30.033 10.412 50.000 10.000 20.000 20.667 10.000 10.000 10.491 10.676 20.352 10.335 10.060 20.822 50.527 21.000 10.517 10.606 10.853 10.000 10.004 10.806 11.000 10.727 10.000 10.042 20.739 20.000 10.399 30.391 10.504 10.591 10.571 1
Jonas Schult, Francis Engelmann, Alexander Hermans, Or Litany, Siyu Tang, Bastian Leibe: Mask3D for 3D Semantic Instance Segmentation. ICRA 2023
TD3D Scannet200permissive0.320 20.501 20.264 20.164 20.506 30.062 20.500 10.000 10.000 10.000 10.208 10.431 20.252 31.000 10.733 30.587 20.000 20.008 20.000 30.106 10.000 20.356 10.123 40.686 10.101 20.000 10.152 20.000 10.000 20.226 10.280 30.000 20.000 10.250 10.000 10.619 20.061 30.841 10.000 10.000 20.167 10.194 10.333 20.000 20.000 10.667 20.820 10.250 30.790 41.000 10.879 20.077 10.094 30.708 10.217 20.049 20.634 10.792 10.331 40.033 50.716 20.159 20.396 20.331 40.099 20.415 10.842 10.000 20.458 10.542 10.000 10.101 20.000 10.218 10.513 20.500 20.458 20.104 20.516 10.456 10.268 40.000 10.000 10.400 10.022 10.233 20.143 20.000 10.677 10.400 10.504 50.095 30.083 50.890 20.061 20.000 10.906 10.076 20.231 10.125 20.000 20.003 20.792 30.881 10.000 20.098 30.125 40.498 50.459 20.063 10.715 10.000 20.241 40.000 10.396 20.063 10.605 10.000 10.000 20.000 10.448 50.629 30.202 20.967 10.250 20.038 10.192 10.185 20.083 41.000 11.000 10.857 20.000 20.470 20.012 10.565 30.798 10.621 10.111 10.500 11.000 10.017 20.509 10.000 10.008 11.000 10.525 20.000 10.000 10.332 30.679 10.264 20.333 20.267 11.000 10.549 10.299 50.387 20.328 30.744 40.000 10.000 20.435 51.000 10.283 40.000 10.196 10.817 10.000 10.472 10.222 30.123 40.560 20.156 2
Maksim Kolodiazhnyi, Anna Vorontsova, Anton Konushin, Danila Rukhovich: Top-Down Beats Bottom-Up in 3D Instance Segmentation. WACV 2024
LGround Inst.permissive0.246 30.413 30.170 30.130 30.455 50.003 50.500 10.000 10.000 10.000 10.017 40.333 40.111 51.000 10.681 40.400 30.000 20.000 31.000 10.003 50.000 20.167 30.190 20.637 20.067 30.000 10.081 30.000 10.000 20.000 30.264 40.000 20.000 10.000 30.000 10.387 40.031 50.754 30.000 10.000 20.151 20.135 20.056 40.000 20.000 10.582 40.589 50.500 20.815 21.000 10.903 10.000 30.097 20.588 40.000 30.000 30.234 30.000 30.500 30.400 10.682 40.156 30.159 40.750 10.046 30.125 40.660 30.000 20.200 20.000 50.000 10.000 30.000 10.164 30.402 30.500 20.373 30.025 30.143 50.426 30.317 20.000 10.000 10.000 30.000 30.063 30.000 30.000 10.000 50.000 40.575 30.250 20.241 20.772 30.000 30.000 10.653 40.034 30.000 30.000 30.000 20.000 31.000 10.561 40.000 20.100 20.500 10.541 40.452 30.000 30.581 30.000 20.364 20.000 10.000 30.000 20.571 20.000 10.000 20.000 10.568 40.511 40.167 30.857 30.000 30.000 30.164 20.112 30.000 50.530 51.000 10.286 30.000 20.125 30.000 30.464 50.706 30.208 40.000 30.125 20.744 40.000 30.500 20.000 10.000 20.000 20.511 30.000 10.000 10.344 20.541 30.068 30.333 20.000 31.000 10.196 40.533 30.318 30.000 40.748 30.000 10.000 20.690 21.000 10.400 30.000 10.000 30.667 30.000 10.333 40.333 20.270 30.399 30.083 4
David Rozenberszki, Or Litany, Angela Dai: Language-Grounded Indoor 3D Semantic Segmentation in the Wild.
Minkowski 34D Inst.permissive0.203 50.369 40.134 50.078 50.479 40.003 40.500 10.000 10.000 10.000 10.100 20.371 30.300 20.667 40.746 20.400 30.000 20.000 30.000 30.031 30.000 20.074 40.165 30.413 50.000 40.000 10.070 40.000 10.000 20.000 30.221 50.000 20.000 10.000 30.000 10.372 50.070 20.706 40.000 10.000 20.000 50.123 40.033 50.000 20.000 10.422 50.732 30.000 40.778 51.000 10.845 30.000 30.090 40.636 20.000 30.000 30.158 40.000 30.250 50.050 40.693 30.123 40.051 50.385 30.009 40.118 50.406 50.000 20.000 40.200 20.000 10.000 30.000 10.133 40.307 50.500 20.251 40.000 40.281 30.402 40.317 20.000 10.000 10.000 30.000 30.060 40.000 30.000 10.396 20.200 30.669 20.021 40.218 40.720 50.000 30.000 10.696 30.025 40.000 30.000 30.000 20.000 30.125 50.596 20.000 20.191 10.500 10.595 20.369 40.000 30.500 40.000 20.143 50.000 10.000 30.000 20.226 50.000 10.000 20.000 10.701 20.511 40.000 50.851 40.000 30.000 30.150 40.052 50.100 30.981 30.500 40.286 30.000 20.000 50.000 30.545 40.522 50.250 30.000 30.000 50.522 50.000 30.500 20.000 10.000 20.000 20.282 50.000 10.000 10.178 50.382 40.018 50.056 40.000 30.997 30.107 50.677 20.313 40.000 40.726 50.000 10.000 20.583 40.903 40.200 50.000 10.000 30.333 40.000 10.442 20.083 40.109 50.387 40.000 5
C. Choy, J. Gwak, S. Savarese: 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks. CVPR 2019
CSC-Pretrain Inst.permissive0.209 40.361 50.157 40.085 40.506 20.007 30.500 10.000 10.000 10.000 10.000 50.093 50.221 40.667 40.524 50.400 30.000 20.000 30.000 30.004 40.000 20.000 50.109 50.589 40.000 40.000 10.059 50.000 10.000 20.000 30.322 20.000 20.000 10.000 30.000 10.405 30.055 40.700 50.000 10.000 20.028 40.091 50.083 30.000 20.000 10.667 20.768 20.000 40.807 31.000 10.776 50.000 30.000 50.340 50.000 30.000 30.103 50.000 30.750 10.200 30.634 50.053 50.246 30.677 20.006 50.198 30.432 40.000 20.000 40.050 40.000 10.000 30.000 10.111 50.356 40.500 20.188 50.000 40.220 40.448 20.050 50.000 10.000 10.000 30.000 30.032 50.000 30.000 10.396 20.000 40.573 40.000 50.228 30.747 40.000 30.000 10.573 50.021 50.000 30.000 30.000 20.000 30.500 40.573 30.000 20.000 50.125 40.592 30.364 50.000 30.450 50.000 20.364 20.000 10.000 30.000 20.340 30.000 10.000 20.000 10.610 30.833 10.221 10.702 50.000 30.000 30.135 50.094 40.125 20.571 40.500 40.143 50.000 20.125 30.000 30.618 20.667 40.115 50.000 30.125 21.000 10.000 30.500 20.000 10.000 20.000 20.502 40.000 10.000 10.312 40.248 50.050 40.000 50.000 30.997 30.420 30.500 40.149 50.451 20.748 20.000 10.000 20.636 30.667 50.600 20.000 10.000 30.278 50.000 10.333 40.000 50.294 20.381 50.110 3
Ji Hou, Benjamin Graham, Matthias Nießner, Saining Xie: Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts. CVPR 2021
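
In the ScanNet200 result rows above, each column holds a score immediately followed by the method's rank in that column, with no separator before the next score (for example `0.388 10.542 1` reads as 0.388 at rank 1, then 0.542 at rank 1). A minimal sketch of splitting such a run back into (score, rank) pairs is given below. It assumes every score has one digit before the decimal point and exactly three after it, which matches the rows shown here; `parse_cells` and `CELL` are hypothetical names introduced for illustration.

```python
import re
from typing import List, Tuple

# A score looks like "0.388" or "1.000"; the rank is the shortest run of digits
# that is followed either by the next score or by the end of the string.
CELL = re.compile(r"(\d\.\d{3}) (\d+?)(?=\d\.\d{3} |$)")


def parse_cells(row: str) -> List[Tuple[float, int]]:
    """Split a fused 'score rank score rank ...' run into (score, rank) pairs.

    Text in front of the first score (such as a method name) is skipped.
    """
    return [(float(score), int(rank)) for score, rank in CELL.findall(row.strip())]


# Hypothetical method name followed by scores copied from the rows above.
cells = parse_cells("SomeMethod0.008 100.171 60.498 11.000 10.955 1")
for score, rank in cells:
    print(f"{score:.3f} (rank {rank})")
```

The lazy rank match combined with the lookahead is what separates a two-digit rank such as 10 from the leading `0` of the next score, and a rank of 1 from a following score of 1.000.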


ScanNet Benchmark

This table lists the benchmark results for the 3D semantic label scenario.


Method | Info | avg iou | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | floor | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | wall | window
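PTv3 ScanNet | | 0.794 (1) | 0.941 (3) | 0.813 (15) | 0.851 (6) | 0.782 (5) | 0.890 (2) | 0.597 (1) | 0.916 (1) | 0.696 (7) | 0.713 (3) | 0.979 (1) | 0.635 (1) | 0.384 (2) | 0.793 (2) | 0.907 (6) | 0.821 (3) | 0.790 (28) | 0.696 (10) | 0.967 (3) | 0.903 (1) | 0.805 (1)
PonderV2 | | 0.785 (2) | 0.978 (1) | 0.800 (23) | 0.833 (20) | 0.788 (3) | 0.853 (14) | 0.545 (14) | 0.910 (4) | 0.713 (1) | 0.705 (4) | 0.979 (1) | 0.596 (5) | 0.390 (1) | 0.769 (10) | 0.832 (38) | 0.821 (3) | 0.792 (27) | 0.730 (1) | 0.975 (1) | 0.897 (3) | 0.785 (3)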
Haoyi Zhu, Honghui Yang, Xiaoyang Wu, Di Huang, Sha Zhang, Xianglong He, Tong He, Hengshuang Zhao, Chunhua Shen, Yu Qiao, Wanli Ouyang: PonderV2: Pave the Way for 3D Foundation Model with A Universal Pre-training Paradigm.
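Mix3D | permissive | 0.781 (3) | 0.964 (2) | 0.855 (1) | 0.843 (14) | 0.781 (6) | 0.858 (10) | 0.575 (5) | 0.831 (28) | 0.685 (11) | 0.714 (2) | 0.979 (1) | 0.594 (6) | 0.310 (23) | 0.801 (1) | 0.892 (13) | 0.841 (2) | 0.819 (3) | 0.723 (4) | 0.940 (11) | 0.887 (5) | 0.725 (20)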
Alexey Nekrasov, Jonas Schult, Or Litany, Bastian Leibe, Francis Engelmann: Mix3D: Out-of-Context Data Augmentation for 3D Scenes. 3DV 2021 (Oral)
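Swin3D | permissive | 0.779 (4) | 0.861 (19) | 0.818 (12) | 0.836 (17) | 0.790 (2) | 0.875 (3) | 0.576 (4) | 0.905 (5) | 0.704 (4) | 0.739 (1) | 0.969 (9) | 0.611 (2) | 0.349 (9) | 0.756 (18) | 0.958 (1) | 0.702 (40) | 0.805 (12) | 0.708 (7) | 0.916 (28) | 0.898 (2) | 0.801 (2)
PPT-SpUNet-Joint | | 0.766 (5) | 0.932 (4) | 0.794 (29) | 0.829 (22) | 0.751 (19) | 0.854 (12) | 0.540 (17) | 0.903 (6) | 0.630 (30) | 0.672 (12) | 0.963 (12) | 0.565 (19) | 0.357 (7) | 0.788 (3) | 0.900 (9) | 0.737 (22) | 0.802 (13) | 0.685 (14) | 0.950 (5) | 0.887 (5) | 0.780 (4)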
Xiaoyang Wu, Zhuotao Tian, Xin Wen, Bohao Peng, Xihui Liu, Kaicheng Yu, Hengshuang Zhao: Towards Large-scale 3D Representation Learning with Multi-dataset Point Prompt Training.
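OctFormer | permissive | 0.766 (5) | 0.925 (6) | 0.808 (19) | 0.849 (8) | 0.786 (4) | 0.846 (24) | 0.566 (8) | 0.876 (12) | 0.690 (9) | 0.674 (11) | 0.960 (14) | 0.576 (15) | 0.226 (61) | 0.753 (20) | 0.904 (7) | 0.777 (10) | 0.815 (5) | 0.722 (5) | 0.923 (24) | 0.877 (11) | 0.776 (6)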
Peng-Shuai Wang: OctFormer: Octree-based Transformers for 3D Point Clouds. SIGGRAPH 2023
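OccuSeg+Semantic | | 0.764 (7) | 0.758 (55) | 0.796 (27) | 0.839 (16) | 0.746 (21) | 0.907 (1) | 0.562 (9) | 0.850 (20) | 0.680 (13) | 0.672 (12) | 0.978 (4) | 0.610 (3) | 0.335 (14) | 0.777 (6) | 0.819 (41) | 0.847 (1) | 0.830 (1) | 0.691 (12) | 0.972 (2) | 0.885 (7) | 0.727 (18)
CU-Hybrid Net | | 0.764 (7) | 0.924 (7) | 0.819 (10) | 0.840 (15) | 0.757 (14) | 0.853 (14) | 0.580 (2) | 0.848 (21) | 0.709 (3) | 0.643 (20) | 0.958 (17) | 0.587 (10) | 0.295 (29) | 0.753 (20) | 0.884 (17) | 0.758 (16) | 0.815 (5) | 0.725 (3) | 0.927 (21) | 0.867 (18) | 0.743 (12)
O-CNN | permissive | 0.762 (9) | 0.924 (7) | 0.823 (6) | 0.844 (13) | 0.770 (8) | 0.852 (16) | 0.577 (3) | 0.847 (23) | 0.711 (2) | 0.640 (24) | 0.958 (17) | 0.592 (7) | 0.217 (67) | 0.762 (14) | 0.888 (14) | 0.758 (16) | 0.813 (8) | 0.726 (2) | 0.932 (19) | 0.868 (17) | 0.744 (11)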
Peng-Shuai Wang, Yang Liu, Yu-Xiao Guo, Chun-Yu Sun, Xin Tong: O-CNN: Octree-based Convolutional Neural Networks for 3D Shape Analysis. SIGGRAPH 2017
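OA-CNN-L_ScanNet20 | | 0.756 (10) | 0.783 (41) | 0.826 (5) | 0.858 (4) | 0.776 (7) | 0.837 (30) | 0.548 (13) | 0.896 (9) | 0.649 (22) | 0.675 (10) | 0.962 (13) | 0.586 (11) | 0.335 (14) | 0.771 (9) | 0.802 (45) | 0.770 (12) | 0.787 (30) | 0.691 (12) | 0.936 (14) | 0.880 (10) | 0.761 (8)
ConDaFormer | | 0.755 (11) | 0.927 (5) | 0.822 (7) | 0.836 (17) | 0.801 (1) | 0.849 (19) | 0.516 (27) | 0.864 (17) | 0.651 (21) | 0.680 (9) | 0.958 (17) | 0.584 (13) | 0.282 (36) | 0.759 (16) | 0.855 (28) | 0.728 (24) | 0.802 (13) | 0.678 (16) | 0.880 (54) | 0.873 (16) | 0.756 (9)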
Lunhao Duan, Shanshan Zhao, Nan Xue, Mingming Gong, Guisong Xia, Dacheng Tao: ConDaFormer: Disassembled Transformer with Local Structure Enhancement for 3D Point Cloud Understanding. NeurIPS 2023
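PNE | | 0.755 (11) | 0.786 (39) | 0.835 (4) | 0.834 (19) | 0.758 (12) | 0.849 (19) | 0.570 (7) | 0.836 (27) | 0.648 (23) | 0.668 (14) | 0.978 (4) | 0.581 (14) | 0.367 (5) | 0.683 (30) | 0.856 (26) | 0.804 (5) | 0.801 (17) | 0.678 (16) | 0.961 (4) | 0.889 (4) | 0.716 (25)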
P. Hermosilla: Point Neighborhood Embeddings.
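DMF-Net | | 0.752 (13) | 0.906 (11) | 0.793 (31) | 0.802 (37) | 0.689 (35) | 0.825 (41) | 0.556 (10) | 0.867 (14) | 0.681 (12) | 0.602 (39) | 0.960 (14) | 0.555 (24) | 0.365 (6) | 0.779 (5) | 0.859 (23) | 0.747 (19) | 0.795 (24) | 0.717 (6) | 0.917 (27) | 0.856 (26) | 0.764 (7)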
C.Yang, Y.Yan, W.Zhao, J.Ye, X.Yang, A.Hussain, B.Dong, K.Huang: Towards Deeper and Better Multi-view Feature Fusion for 3D Semantic Segmentation. ICONIP 2023
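PointTransformerV2 | | 0.752 (13) | 0.742 (62) | 0.809 (18) | 0.872 (1) | 0.758 (12) | 0.860 (9) | 0.552 (11) | 0.891 (10) | 0.610 (37) | 0.687 (5) | 0.960 (14) | 0.559 (22) | 0.304 (26) | 0.766 (12) | 0.926 (3) | 0.767 (13) | 0.797 (20) | 0.644 (29) | 0.942 (9) | 0.876 (14) | 0.722 (22)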
Xiaoyang Wu, Yixing Lao, Li Jiang, Xihui Liu, Hengshuang Zhao: Point Transformer V2: Grouped Vector Attention and Partition-based Pooling. NeurIPS 2022
PointConvFormer0.749 150.793 370.790 320.807 330.750 200.856 110.524 230.881 110.588 490.642 230.977 70.591 80.274 410.781 40.929 20.804 50.796 210.642 300.947 70.885 70.715 26
Wenxuan Wu, Qi Shan, Li Fuxin: PointConvFormer: Revenge of the Point-based Convolution.
BPNetcopyleft0.749 150.909 90.818 120.811 300.752 170.839 290.485 420.842 240.673 140.644 190.957 210.528 330.305 250.773 80.859 230.788 70.818 40.693 110.916 280.856 260.723 21
Wenbo Hu, Hengshuang Zhao, Li Jiang, Jiaya Jia, Tien-Tsin Wong: Bidirectional Projection Network for Cross Dimension Scene Understanding. CVPR 2021 (Oral)
MSP0.748 170.623 880.804 210.859 30.745 220.824 430.501 320.912 30.690 90.685 70.956 220.567 180.320 200.768 110.918 40.720 290.802 130.676 190.921 250.881 90.779 5
StratifiedFormerpermissive0.747 180.901 120.803 220.845 120.757 140.846 240.512 280.825 310.696 70.645 180.956 220.576 150.262 520.744 250.861 220.742 200.770 390.705 80.899 400.860 230.734 13
Xin Lai*, Jianhui Liu*, Li Jiang, Liwei Wang, Hengshuang Zhao, Shu Liu, Xiaojuan Qi, Jiaya Jia: Stratified Transformer for 3D Point Cloud Segmentation. CVPR 2022
Virtual MVFusion0.746 190.771 490.819 100.848 100.702 330.865 80.397 790.899 70.699 50.664 150.948 500.588 90.330 160.746 240.851 320.764 140.796 210.704 90.935 150.866 190.728 16
Abhijit Kundu, Xiaoqi Yin, Alireza Fathi, David Ross, Brian Brewington, Thomas Funkhouser, Caroline Pantofaru: Virtual Multi-view Fusion for 3D Semantic Segmentation. ECCV 2020
VMNetpermissive0.746 190.870 170.838 20.858 40.729 270.850 180.501 320.874 130.587 500.658 160.956 220.564 200.299 270.765 130.900 90.716 320.812 90.631 350.939 120.858 240.709 27
Zeyu HU, Xuyang Bai, Jiaxiang Shang, Runze Zhang, Jiayu Dong, Xin Wang, Guangyuan Sun, Hongbo Fu, Chiew-Lan Tai: VMNet: Voxel-Mesh Network for Geodesic-Aware 3D Semantic Segmentation. ICCV 2021 (Oral)
Retro-FPN0.744 210.842 250.800 230.767 510.740 230.836 320.541 160.914 20.672 150.626 280.958 170.552 250.272 430.777 60.886 160.696 410.801 170.674 210.941 100.858 240.717 23
Peng Xiang*, Xin Wen*, Yu-Shen Liu, Hui Zhang, Yi Fang, Zhizhong Han: Retrospective Feature Pyramid Network for Point Cloud Semantic Segmentation. ICCV 2023
EQ-Net0.743 220.620 890.799 260.849 80.730 260.822 450.493 390.897 80.664 160.681 80.955 250.562 210.378 30.760 150.903 80.738 210.801 170.673 220.907 320.877 110.745 10
Zetong Yang*, Li Jiang*, Yanan Sun, Bernt Schiele, Jiaya Jia: A Unified Query-based Paradigm for Point Cloud Understanding. CVPR 2022
SAT0.742 230.860 200.765 440.819 250.769 90.848 210.533 190.829 290.663 170.631 270.955 250.586 110.274 410.753 200.896 110.729 230.760 460.666 240.921 250.855 280.733 14
LRPNet0.742 230.816 320.806 200.807 330.752 170.828 390.575 50.839 260.699 50.637 250.954 310.520 350.320 200.755 190.834 360.760 150.772 360.676 190.915 300.862 210.717 23
LargeKernel3D0.739 250.909 90.820 90.806 350.740 230.852 160.545 140.826 300.594 480.643 200.955 250.541 270.263 510.723 280.858 250.775 110.767 400.678 160.933 170.848 330.694 32
Yukang Chen*, Jianhui Liu*, Xiangyu Zhang, Xiaojuan Qi, Jiaya Jia: LargeKernel3D: Scaling up Kernels in 3D Sparse CNNs. CVPR 2023
RPN0.736 260.776 450.790 320.851 60.754 160.854 120.491 410.866 150.596 470.686 60.955 250.536 280.342 110.624 450.869 190.787 80.802 130.628 360.927 210.875 150.704 29
MinkowskiNetpermissive0.736 260.859 210.818 120.832 210.709 310.840 280.521 250.853 190.660 190.643 200.951 400.544 260.286 340.731 260.893 120.675 490.772 360.683 150.874 600.852 310.727 18
C. Choy, J. Gwak, S. Savarese: 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks. CVPR 2019
IPCA0.731 280.890 130.837 30.864 20.726 280.873 40.530 220.824 320.489 810.647 170.978 40.609 40.336 130.624 450.733 540.758 160.776 340.570 600.949 60.877 110.728 16
PointTransformer++0.725 290.727 700.811 170.819 250.765 100.841 270.502 310.814 370.621 330.623 300.955 250.556 230.284 350.620 470.866 200.781 90.757 500.648 270.932 190.862 210.709 27
SparseConvNet0.725 290.647 850.821 80.846 110.721 290.869 50.533 190.754 520.603 430.614 320.955 250.572 170.325 180.710 290.870 180.724 270.823 20.628 360.934 160.865 200.683 35
MatchingNet0.724 310.812 340.812 160.810 310.735 250.834 340.495 380.860 180.572 560.602 390.954 310.512 370.280 380.757 170.845 340.725 260.780 320.606 460.937 130.851 320.700 31
INS-Conv-semantic0.717 320.751 580.759 470.812 290.704 320.868 60.537 180.842 240.609 390.608 350.953 340.534 300.293 300.616 480.864 210.719 310.793 250.640 310.933 170.845 370.663 40
PointMetaBase0.714 330.835 260.785 340.821 230.684 370.846 240.531 210.865 160.614 340.596 430.953 340.500 400.246 570.674 310.888 140.692 420.764 420.624 380.849 750.844 380.675 37
contrastBoundarypermissive0.705 340.769 520.775 390.809 320.687 360.820 480.439 670.812 380.661 180.591 450.945 580.515 360.171 850.633 420.856 260.720 290.796 210.668 230.889 470.847 340.689 33
Liyao Tang, Yibing Zhan, Zhe Chen, Baosheng Yu, Dacheng Tao: Contrastive Boundary Learning for Point Cloud Segmentation. CVPR 2022
ClickSeg_Semantic0.703 350.774 470.800 230.793 420.760 110.847 230.471 460.802 410.463 880.634 260.968 110.491 430.271 450.726 270.910 50.706 360.815 50.551 710.878 550.833 390.570 71
RFCR0.702 360.889 140.745 570.813 280.672 400.818 520.493 390.815 360.623 310.610 330.947 520.470 510.249 560.594 510.848 330.705 370.779 330.646 280.892 450.823 450.611 54
Jingyu Gong, Jiachen Xu, Xin Tan, Haichuan Song, Yanyun Qu, Yuan Xie, Lizhuang Ma: Omni-Supervised Point Cloud Segmentation via Gradual Receptive Field Component Reasoning. CVPR 2021
One Thing One Click0.701 370.825 300.796 270.723 580.716 300.832 350.433 690.816 340.634 280.609 340.969 90.418 770.344 100.559 630.833 370.715 330.808 110.560 650.902 370.847 340.680 36
JSENetpermissive0.699 380.881 160.762 450.821 230.667 410.800 640.522 240.792 440.613 350.607 360.935 780.492 420.205 720.576 560.853 300.691 430.758 480.652 260.872 630.828 420.649 44
Zeyu HU, Mingmin Zhen, Xuyang BAI, Hongbo Fu, Chiew-lan Tai: JSENet: Joint Semantic Segmentation and Edge Detection Network for 3D Point Clouds. ECCV 2020
One-Thing-One-Click0.693 390.743 610.794 290.655 810.684 370.822 450.497 370.719 620.622 320.617 310.977 70.447 640.339 120.750 230.664 700.703 390.790 280.596 500.946 80.855 280.647 45
Zhengzhe Liu, Xiaojuan Qi, Chi-Wing Fu: One Thing One Click: A Self-Training Approach for Weakly Supervised 3D Semantic Segmentation. CVPR 2021
PicassoNet-IIpermissive0.692 400.732 660.772 400.786 430.677 390.866 70.517 260.848 210.509 740.626 280.952 380.536 280.225 630.545 690.704 610.689 460.810 100.564 640.903 360.854 300.729 15
Huan Lei, Naveed Akhtar, Mubarak Shah, and Ajmal Mian: Geometric feature learning for 3D meshes.
Feature_GeometricNetpermissive0.690 410.884 150.754 510.795 400.647 470.818 520.422 710.802 410.612 360.604 370.945 580.462 540.189 800.563 620.853 300.726 250.765 410.632 340.904 340.821 480.606 58
Kangcheng Liu, Ben M. Chen: https://arxiv.org/abs/2012.09439. arXiv Preprint
FusionNet0.688 420.704 750.741 610.754 550.656 430.829 370.501 320.741 570.609 390.548 520.950 440.522 340.371 40.633 420.756 490.715 330.771 380.623 390.861 710.814 500.658 41
Feihu Zhang, Jin Fang, Benjamin Wah, Philip Torr: Deep FusionNet for Point Cloud Semantic Segmentation. ECCV 2020
Feature-Geometry Netpermissive0.685 430.866 180.748 540.819 250.645 490.794 670.450 570.802 410.587 500.604 370.945 580.464 530.201 750.554 650.840 350.723 280.732 590.602 480.907 320.822 470.603 61
VACNN++0.684 440.728 690.757 500.776 480.690 340.804 620.464 510.816 340.577 550.587 460.945 580.508 390.276 400.671 320.710 590.663 540.750 530.589 550.881 520.832 410.653 43
KP-FCNN0.684 440.847 240.758 490.784 450.647 470.814 550.473 450.772 470.605 410.594 440.935 780.450 620.181 830.587 520.805 440.690 440.785 310.614 420.882 510.819 490.632 50
H. Thomas, C. Qi, J. Deschaud, B. Marcotegui, F. Goulette, L. Guibas: KPConv: Flexible and Deformable Convolution for Point Clouds. ICCV 2019
DGNet0.684 440.712 740.784 350.782 470.658 420.835 330.499 360.823 330.641 250.597 420.950 440.487 440.281 370.575 570.619 740.647 620.764 420.620 410.871 660.846 360.688 34
Superpoint Network0.683 470.851 230.728 650.800 390.653 450.806 600.468 480.804 390.572 560.602 390.946 550.453 610.239 600.519 740.822 390.689 460.762 450.595 520.895 430.827 430.630 51
PointContrast_LA_SEM0.683 470.757 560.784 350.786 430.639 510.824 430.408 740.775 460.604 420.541 540.934 820.532 310.269 470.552 660.777 470.645 650.793 250.640 310.913 310.824 440.671 38
VI-PointConv0.676 490.770 510.754 510.783 460.621 550.814 550.552 110.758 500.571 580.557 500.954 310.529 320.268 490.530 720.682 650.675 490.719 620.603 470.888 480.833 390.665 39
Xingyi Li, Wenxuan Wu, Xiaoli Z. Fern, Li Fuxin: The Devils in the Point Clouds: Studying the Robustness of Point Cloud Convolutions.
ROSMRF3D0.673 500.789 380.748 540.763 530.635 530.814 550.407 760.747 540.581 540.573 470.950 440.484 450.271 450.607 490.754 500.649 590.774 350.596 500.883 500.823 450.606 58
SALANet0.670 510.816 320.770 420.768 500.652 460.807 590.451 540.747 540.659 200.545 530.924 880.473 500.149 950.571 590.811 430.635 680.746 540.623 390.892 450.794 620.570 71
PointConvpermissive0.666 520.781 420.759 470.699 660.644 500.822 450.475 440.779 450.564 610.504 700.953 340.428 710.203 740.586 540.754 500.661 550.753 510.588 560.902 370.813 520.642 46
Wenxuan Wu, Zhongang Qi, Li Fuxin: PointConv: Deep Convolutional Networks on 3D Point Clouds. CVPR 2019
PointASNLpermissive0.666 520.703 760.781 370.751 570.655 440.830 360.471 460.769 480.474 840.537 560.951 400.475 490.279 390.635 400.698 640.675 490.751 520.553 700.816 820.806 540.703 30
Xu Yan, Chaoda Zheng, Zhen Li, Sheng Wang, Shuguang Cui: PointASNL: Robust Point Clouds Processing using Nonlocal Neural Networks with Adaptive Sampling. CVPR 2020
PPCNN++permissive0.663 540.746 590.708 680.722 590.638 520.820 480.451 540.566 890.599 450.541 540.950 440.510 380.313 220.648 370.819 410.616 730.682 770.590 540.869 670.810 530.656 42
Pyunghwan Ahn, Juyoung Yang, Eojindl Yi, Chanho Lee, Junmo Kim: Projection-based Point Convolution for Efficient Point Cloud Segmentation. IEEE Access
DCM-Net0.658 550.778 430.702 710.806 350.619 560.813 580.468 480.693 700.494 770.524 620.941 700.449 630.298 280.510 760.821 400.675 490.727 610.568 620.826 800.803 560.637 48
Jonas Schult*, Francis Engelmann*, Theodora Kontogianni, Bastian Leibe: DualConvMesh-Net: Joint Geodesic and Euclidean Convolutions on 3D Meshes. CVPR 2020 [Oral]
HPGCNN0.656 560.698 780.743 590.650 820.564 730.820 480.505 300.758 500.631 290.479 740.945 580.480 470.226 610.572 580.774 480.690 440.735 570.614 420.853 740.776 770.597 64
Jisheng Dang, Qingyong Hu, Yulan Guo, Jun Yang: HPGCNN.
SAFNet-segpermissive0.654 570.752 570.734 630.664 790.583 680.815 540.399 780.754 520.639 260.535 580.942 680.470 510.309 240.665 330.539 790.650 580.708 670.635 330.857 730.793 640.642 46
Linqing Zhao, Jiwen Lu, Jie Zhou: Similarity-Aware Fusion Network for 3D Semantic Segmentation. IROS 2021
RandLA-Netpermissive0.645 580.778 430.731 640.699 660.577 690.829 370.446 590.736 580.477 830.523 640.945 580.454 580.269 470.484 830.749 530.618 710.738 550.599 490.827 790.792 670.621 53
PointConv-SFPN0.641 590.776 450.703 700.721 600.557 760.826 400.451 540.672 750.563 620.483 730.943 670.425 740.162 900.644 380.726 550.659 560.709 660.572 590.875 580.786 720.559 77
MVPNetpermissive0.641 590.831 270.715 660.671 760.590 640.781 730.394 800.679 720.642 240.553 510.937 750.462 540.256 530.649 360.406 920.626 690.691 740.666 240.877 560.792 670.608 57
Maximilian Jaritz, Jiayuan Gu, Hao Su: Multi-view PointNet for 3D Scene Understanding. GMDL Workshop, ICCV 2019
PointMRNet0.640 610.717 730.701 720.692 690.576 700.801 630.467 500.716 630.563 620.459 800.953 340.429 700.169 870.581 550.854 290.605 740.710 640.550 720.894 440.793 640.575 69
FPConvpermissive0.639 620.785 400.760 460.713 640.603 590.798 650.392 810.534 940.603 430.524 620.948 500.457 560.250 550.538 700.723 570.598 780.696 720.614 420.872 630.799 570.567 74
Yiqun Lin, Zizheng Yan, Haibin Huang, Dong Du, Ligang Liu, Shuguang Cui, Xiaoguang Han: FPConv: Learning Local Flattening for Point Convolution. CVPR 2020
PD-Net0.638 630.797 360.769 430.641 870.590 640.820 480.461 520.537 930.637 270.536 570.947 520.388 840.206 710.656 340.668 680.647 620.732 590.585 570.868 680.793 640.473 96
PointSPNet0.637 640.734 650.692 790.714 630.576 700.797 660.446 590.743 560.598 460.437 850.942 680.403 800.150 940.626 440.800 460.649 590.697 710.557 680.846 760.777 760.563 75
SConv0.636 650.830 280.697 750.752 560.572 720.780 750.445 610.716 630.529 670.530 590.951 400.446 650.170 860.507 780.666 690.636 670.682 770.541 780.886 490.799 570.594 65
Supervoxel-CNN0.635 660.656 830.711 670.719 610.613 570.757 840.444 640.765 490.534 660.566 480.928 860.478 480.272 430.636 390.531 810.664 530.645 870.508 850.864 700.792 670.611 54
joint point-basedpermissive0.634 670.614 900.778 380.667 780.633 540.825 410.420 720.804 390.467 860.561 490.951 400.494 410.291 310.566 600.458 870.579 840.764 420.559 670.838 770.814 500.598 63
Hung-Yueh Chiang, Yen-Liang Lin, Yueh-Cheng Liu, Winston H. Hsu: A Unified Point-Based Framework for 3D Segmentation. 3DV 2019
PointMTL0.632 680.731 670.688 820.675 730.591 630.784 720.444 640.565 900.610 370.492 710.949 480.456 570.254 540.587 520.706 600.599 770.665 830.612 450.868 680.791 700.579 68
APCF-Net0.631 690.742 620.687 840.672 740.557 760.792 700.408 740.665 760.545 640.508 670.952 380.428 710.186 810.634 410.702 620.620 700.706 680.555 690.873 610.798 590.581 67
Haojia Lin: Adaptive Pyramid Context Fusion for Point Cloud Perception. GRSL
PointNet2-SFPN0.631 690.771 490.692 790.672 740.524 810.837 300.440 660.706 680.538 650.446 820.944 640.421 760.219 660.552 660.751 520.591 800.737 560.543 770.901 390.768 790.557 78
3DSM_DMMF0.631 690.626 870.745 570.801 380.607 580.751 850.506 290.729 610.565 600.491 720.866 1020.434 660.197 780.595 500.630 730.709 350.705 690.560 650.875 580.740 870.491 91
FusionAwareConv0.630 720.604 920.741 610.766 520.590 640.747 860.501 320.734 590.503 760.527 600.919 920.454 580.323 190.550 680.420 910.678 480.688 750.544 750.896 420.795 610.627 52
Jiazhao Zhang, Chenyang Zhu, Lintao Zheng, Kai Xu: Fusion-Aware Point Convolution for Online Semantic 3D Scene Segmentation. CVPR 2020
DenSeR0.628 730.800 350.625 940.719 610.545 780.806 600.445 610.597 840.448 910.519 650.938 740.481 460.328 170.489 820.499 860.657 570.759 470.592 530.881 520.797 600.634 49
SegGroup_sempermissive0.627 740.818 310.747 560.701 650.602 600.764 810.385 850.629 810.490 790.508 670.931 850.409 790.201 750.564 610.725 560.618 710.692 730.539 790.873 610.794 620.548 81
An Tao, Yueqi Duan, Yi Wei, Jiwen Lu, Jie Zhou: SegGroup: Seg-Level Supervision for 3D Instance and Semantic Segmentation. TIP 2022
SIConv0.625 750.830 280.694 770.757 540.563 740.772 790.448 580.647 790.520 700.509 660.949 480.431 690.191 790.496 800.614 750.647 620.672 810.535 810.876 570.783 730.571 70
dtc_net0.625 750.703 760.751 530.794 410.535 790.848 210.480 430.676 740.528 680.469 770.944 640.454 580.004 1070.464 850.636 720.704 380.758 480.548 740.924 230.787 710.492 90
HPEIN0.618 770.729 680.668 850.647 840.597 620.766 800.414 730.680 710.520 700.525 610.946 550.432 670.215 680.493 810.599 760.638 660.617 920.570 600.897 410.806 540.605 60
Li Jiang, Hengshuang Zhao, Shu Liu, Xiaoyong Shen, Chi-Wing Fu, Jiaya Jia: Hierarchical Point-Edge Interaction Network for Point Cloud Semantic Segmentation. ICCV 2019
SPH3D-GCNpermissive0.610 780.858 220.772 400.489 990.532 800.792 700.404 770.643 800.570 590.507 690.935 780.414 780.046 1040.510 760.702 620.602 760.705 690.549 730.859 720.773 780.534 84
Huan Lei, Naveed Akhtar, and Ajmal Mian: Spherical Kernel for Efficient Graph Convolution on 3D Point Clouds. TPAMI 2020
AttAN0.609 790.760 540.667 860.649 830.521 820.793 680.457 530.648 780.528 680.434 870.947 520.401 810.153 930.454 860.721 580.648 610.717 630.536 800.904 340.765 800.485 92
Gege Zhang, Qinghua Ma, Licheng Jiao, Fang Liu and Qigong Sun: AttAN: Attention Adversarial Networks for 3D Point Cloud Semantic Segmentation. IJCAI 2020
wsss-transformer0.600 800.634 860.743 590.697 680.601 610.781 730.437 680.585 870.493 780.446 820.933 830.394 820.011 1060.654 350.661 710.603 750.733 580.526 820.832 780.761 820.480 93
LAP-D0.594 810.720 710.692 790.637 880.456 910.773 780.391 830.730 600.587 500.445 840.940 720.381 850.288 320.434 890.453 890.591 800.649 850.581 580.777 860.749 860.610 56
DPC0.592 820.720 710.700 730.602 920.480 870.762 830.380 860.713 660.585 530.437 850.940 720.369 870.288 320.434 890.509 850.590 820.639 900.567 630.772 870.755 840.592 66
Francis Engelmann, Theodora Kontogianni, Bastian Leibe: Dilated Point Convolutions: On the Receptive Field Size of Point Convolutions on 3D Point Clouds. ICRA 2020
CCRFNet0.589 830.766 530.659 890.683 710.470 900.740 880.387 840.620 830.490 790.476 750.922 900.355 900.245 580.511 750.511 840.571 850.643 880.493 890.872 630.762 810.600 62
ROSMRF0.580 840.772 480.707 690.681 720.563 740.764 810.362 880.515 950.465 870.465 790.936 770.427 730.207 700.438 870.577 770.536 880.675 800.486 900.723 930.779 740.524 86
SD-DETR0.576 850.746 590.609 980.445 1030.517 830.643 990.366 870.714 650.456 890.468 780.870 1010.432 670.264 500.558 640.674 660.586 830.688 750.482 910.739 910.733 890.537 83
SQN_0.1%0.569 860.676 800.696 760.657 800.497 840.779 760.424 700.548 910.515 720.376 920.902 990.422 750.357 70.379 930.456 880.596 790.659 840.544 750.685 960.665 1000.556 79
TextureNetpermissive0.566 870.672 820.664 870.671 760.494 850.719 890.445 610.678 730.411 970.396 900.935 780.356 890.225 630.412 910.535 800.565 860.636 910.464 930.794 850.680 970.568 73
Jingwei Huang, Haotian Zhang, Li Yi, Thomas Funkhouser, Matthias Niessner, Leonidas Guibas: TextureNet: Consistent Local Parametrizations for Learning from High-Resolution Signals on Meshes. CVPR
DVVNet0.562 880.648 840.700 730.770 490.586 670.687 930.333 920.650 770.514 730.475 760.906 960.359 880.223 650.340 950.442 900.422 990.668 820.501 860.708 940.779 740.534 84
Pointnet++ & Featurepermissive0.557 890.735 640.661 880.686 700.491 860.744 870.392 810.539 920.451 900.375 930.946 550.376 860.205 720.403 920.356 950.553 870.643 880.497 870.824 810.756 830.515 87
GMLPs0.538 900.495 1000.693 780.647 840.471 890.793 680.300 950.477 960.505 750.358 940.903 980.327 930.081 1010.472 840.529 820.448 970.710 640.509 830.746 890.737 880.554 80
PanopticFusion-label0.529 910.491 1010.688 820.604 910.386 960.632 1000.225 1050.705 690.434 940.293 1000.815 1030.348 910.241 590.499 790.669 670.507 900.649 850.442 990.796 840.602 1030.561 76
Gaku Narita, Takashi Seno, Tomoya Ishikawa, Yohsuke Kaji: PanopticFusion: Online Volumetric Semantic Mapping at the Level of Stuff and Things. IROS 2019 (to appear)
subcloud_weak0.516 920.676 800.591 1010.609 890.442 920.774 770.335 910.597 840.422 960.357 950.932 840.341 920.094 1000.298 970.528 830.473 950.676 790.495 880.602 1020.721 920.349 103
Online SegFusion0.515 930.607 910.644 920.579 940.434 930.630 1010.353 890.628 820.440 920.410 880.762 1060.307 950.167 880.520 730.403 930.516 890.565 950.447 970.678 970.701 940.514 88
Davide Menini, Suryansh Kumar, Martin R. Oswald, Erik Sandstroem, Cristian Sminchisescu, Luc van Gool: A Real-Time Learning Framework for Joint 3D Reconstruction and Semantic Segmentation. Robotics and Automation Letters Submission
3DMV, FTSDF0.501 940.558 960.608 990.424 1050.478 880.690 920.246 1010.586 860.468 850.450 810.911 940.394 820.160 910.438 870.212 1020.432 980.541 1000.475 920.742 900.727 900.477 94
PCNN0.498 950.559 950.644 920.560 960.420 950.711 910.229 1030.414 970.436 930.352 960.941 700.324 940.155 920.238 1020.387 940.493 910.529 1010.509 830.813 830.751 850.504 89
3DMV0.484 960.484 1020.538 1030.643 860.424 940.606 1040.310 930.574 880.433 950.378 910.796 1040.301 960.214 690.537 710.208 1030.472 960.507 1040.413 1020.693 950.602 1030.539 82
Angela Dai, Matthias Niessner: 3DMV: Joint 3D-Multi-View Prediction for 3D Semantic Scene Segmentation. ECCV'18
PointCNN with RGBpermissive0.458 970.577 940.611 970.356 1070.321 1040.715 900.299 970.376 1010.328 1040.319 980.944 640.285 980.164 890.216 1050.229 1000.484 930.545 990.456 950.755 880.709 930.475 95
Yangyan Li, Rui Bu, Mingchao Sun, Baoquan Chen: PointCNN. NeurIPS 2018
FCPNpermissive0.447 980.679 790.604 1000.578 950.380 970.682 940.291 980.106 1070.483 820.258 1050.920 910.258 1020.025 1050.231 1040.325 960.480 940.560 970.463 940.725 920.666 990.231 107
Dario Rethage, Johanna Wald, Jürgen Sturm, Nassir Navab, Federico Tombari: Fully-Convolutional Point Networks for Large-Scale Point Clouds. ECCV 2018
DGCNN_reproducecopyleft0.446 990.474 1030.623 950.463 1010.366 990.651 970.310 930.389 1000.349 1020.330 970.937 750.271 1000.126 970.285 980.224 1010.350 1040.577 940.445 980.625 1000.723 910.394 99
Yue Wang, Yongbin Sun, Ziwei Liu, Sanjay E. Sarma, Michael M. Bronstein, Justin M. Solomon: Dynamic Graph CNN for Learning on Point Clouds. TOG 2019
SurfaceConvPF0.442 1000.505 990.622 960.380 1060.342 1020.654 960.227 1040.397 990.367 1000.276 1020.924 880.240 1030.198 770.359 940.262 980.366 1010.581 930.435 1000.640 990.668 980.398 98
Hao Pan, Shilin Liu, Yang Liu, Xin Tong: Convolutional Neural Networks on 3D Surfaces Using Parallel Frames.
PNET20.442 1000.548 970.548 1020.597 930.363 1000.628 1020.300 950.292 1020.374 990.307 990.881 1000.268 1010.186 810.238 1020.204 1040.407 1000.506 1050.449 960.667 980.620 1020.462 97
Tangent Convolutionspermissive0.438 1020.437 1050.646 910.474 1000.369 980.645 980.353 890.258 1040.282 1060.279 1010.918 930.298 970.147 960.283 990.294 970.487 920.562 960.427 1010.619 1010.633 1010.352 102
Maxim Tatarchenko, Jaesik Park, Vladlen Koltun, Qian-Yi Zhou: Tangent Convolutions for Dense Prediction in 3D. CVPR 2018
3DWSSS0.425 1030.525 980.647 900.522 970.324 1030.488 1070.077 1080.712 670.353 1010.401 890.636 1080.281 990.176 840.340 950.565 780.175 1080.551 980.398 1030.370 1080.602 1030.361 101
SPLAT Netcopyleft0.393 1040.472 1040.511 1040.606 900.311 1050.656 950.245 1020.405 980.328 1040.197 1060.927 870.227 1050.000 1090.001 1090.249 990.271 1070.510 1020.383 1050.593 1030.699 950.267 105
Hang Su, Varun Jampani, Deqing Sun, Subhransu Maji, Evangelos Kalogerakis, Ming-Hsuan Yang, Jan Kautz: SPLATNet: Sparse Lattice Networks for Point Cloud Processing. CVPR 2018
ScanNet+FTSDF0.383 1050.297 1070.491 1050.432 1040.358 1010.612 1030.274 990.116 1060.411 970.265 1030.904 970.229 1040.079 1020.250 1000.185 1050.320 1050.510 1020.385 1040.548 1040.597 1060.394 99
PointNet++permissive0.339 1060.584 930.478 1060.458 1020.256 1070.360 1080.250 1000.247 1050.278 1070.261 1040.677 1070.183 1060.117 980.212 1060.145 1070.364 1020.346 1080.232 1080.548 1040.523 1070.252 106
Charles R. Qi, Li Yi, Hao Su, Leonidas J. Guibas: PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space.
SSC-UNetpermissive0.308 1070.353 1060.290 1080.278 1080.166 1080.553 1050.169 1070.286 1030.147 1080.148 1080.908 950.182 1070.064 1030.023 1080.018 1090.354 1030.363 1060.345 1060.546 1060.685 960.278 104
ScanNetpermissive0.306 1080.203 1080.366 1070.501 980.311 1050.524 1060.211 1060.002 1090.342 1030.189 1070.786 1050.145 1080.102 990.245 1010.152 1060.318 1060.348 1070.300 1070.460 1070.437 1080.182 108
Angela Dai, Angel X. Chang, Manolis Savva, Maciej Halber, Thomas Funkhouser, Matthias Nießner: ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes. CVPR'17
ERROR0.054 1090.000 1090.041 1090.172 1090.030 1090.062 1090.001 1090.035 1080.004 1090.051 1090.143 1090.019 1090.003 1080.041 1070.050 1080.003 1090.054 1090.018 1090.005 1090.264 1090.082 109


This table lists the benchmark results for the 3D semantic instance scenario.




Method | Info | avg ap 50% | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | window
OneFormer3D0.801 11.000 10.973 20.909 40.698 70.928 20.582 10.668 260.685 100.780 20.687 70.698 90.702 101.000 10.794 50.900 20.784 80.986 430.635 3
UniPerception0.800 21.000 10.930 50.872 80.727 20.862 150.454 100.764 130.820 10.746 50.706 50.750 10.772 70.926 350.764 90.818 180.826 10.997 330.660 2
ExtMask3D0.789 31.000 10.988 10.756 250.706 50.912 30.429 110.647 310.806 40.755 40.673 90.689 100.772 81.000 10.789 60.852 60.811 31.000 10.617 8
Queryformer0.787 41.000 10.933 40.601 390.754 10.886 80.558 30.661 280.767 50.665 100.716 30.639 160.808 21.000 10.844 10.897 30.804 41.000 10.624 5
MAFT0.786 51.000 10.894 100.807 150.694 90.893 60.486 60.674 240.740 60.786 10.704 60.727 30.739 91.000 10.707 150.849 80.756 151.000 10.685 1
Mask3D0.780 61.000 10.786 330.716 300.696 80.885 90.500 50.714 190.810 30.672 90.715 40.679 120.809 11.000 10.831 30.833 120.787 71.000 10.602 12
Jonas Schult, Francis Engelmann, Alexander Hermans, Or Litany, Siyu Tang, Bastian Leibe: Mask3D for 3D Semantic Instance Segmentation. ICRA 2023
SPFormerpermissive0.770 70.903 460.903 70.806 160.609 210.886 70.568 20.815 60.705 90.711 60.655 100.652 150.685 151.000 10.789 70.809 190.776 111.000 10.583 17
Sun Jiahao, Qing Chunmei, Tan Junpeng, Xu Xiangmin: Superpoint Transformer for 3D Scene Instance Segmentation. AAAI 2023 [Oral]
SoftGroup++0.769 81.000 10.803 260.937 10.684 100.865 120.213 260.870 20.664 130.571 160.758 10.702 70.807 31.000 10.653 220.902 10.792 61.000 10.626 4
SIM3D0.766 91.000 10.948 30.582 450.599 230.882 100.510 40.701 210.632 170.772 30.685 80.687 110.782 61.000 10.833 20.756 290.798 51.000 10.622 6
SoftGrouppermissive0.761 101.000 10.808 220.845 100.716 30.862 140.243 230.824 40.655 150.620 110.734 20.699 80.791 50.981 290.716 130.844 90.769 121.000 10.594 15
Thang Vu, Kookhoi Kim, Tung M. Luu, Xuan Thanh Nguyen, Chang D. Yoo: SoftGroup for 3D Instance Segmentation on Point Clouds. CVPR 2022 [Oral]
ISBNetpermissive0.757 111.000 10.904 60.731 280.678 110.895 40.458 80.644 330.670 120.710 70.620 170.732 20.650 171.000 10.756 100.778 220.779 91.000 10.614 9
Tuan Duc Ngo, Binh-Son Hua, Khoi Nguyen: ISBNet: a 3D Point Cloud Instance Segmentation Network with Instance-aware Sampling and Box-aware Dynamic Convolution. CVPR 2023
TD3Dpermissive0.751 121.000 10.774 340.867 90.621 170.934 10.404 120.706 200.812 20.605 140.633 150.626 170.690 141.000 10.640 240.820 150.777 101.000 10.612 10
Maksim Kolodiazhnyi, Anna Vorontsova, Anton Konushin, Danila Rukhovich: Top-Down Beats Bottom-Up in 3D Instance Segmentation. WACV 2024
PBNetpermissive0.747 131.000 10.818 180.837 120.713 40.844 170.457 90.647 310.711 80.614 120.617 180.657 140.650 171.000 10.692 160.822 140.765 141.000 10.595 14
W. Zhao, Y. Yan, C. Yang, J. Ye, X. Yang, K. Huang: Divide and Conquer: 3D Instance Segmentation With Point-Wise Binarization. ICCV 2023
GraphCut0.732 141.000 10.788 310.724 290.642 150.859 160.248 220.787 110.618 190.596 150.653 120.722 50.583 371.000 10.766 80.861 40.825 21.000 10.504 28
IPCA-Inst0.731 151.000 10.788 320.884 70.698 60.788 320.252 210.760 140.646 160.511 240.637 140.665 130.804 41.000 10.644 230.778 230.747 171.000 10.561 21
TopoSeg0.725 161.000 10.806 250.933 20.668 130.758 360.272 200.734 180.630 180.549 200.654 110.606 180.697 130.966 320.612 280.839 100.754 161.000 10.573 18
DKNet0.718 171.000 10.814 190.782 190.619 180.872 110.224 240.751 160.569 230.677 80.585 220.724 40.633 280.981 290.515 380.819 160.736 181.000 10.617 7
Yizheng Wu, Min Shi, Shuaiyuan Du, Hao Lu, Zhiguo Cao, Weicai Zhong: 3D Instances as 1D Kernels. ECCV 2022
SSEC0.707 181.000 10.850 120.924 30.648 140.747 390.162 280.862 30.572 220.520 220.624 160.549 210.649 261.000 10.560 330.706 390.768 131.000 10.591 16
HAISpermissive0.699 191.000 10.849 130.820 130.675 120.808 260.279 180.757 150.465 290.517 230.596 200.559 200.600 311.000 10.654 210.767 250.676 220.994 390.560 22
Shaoyu Chen, Jiemin Fang, Qian Zhang, Wenyu Liu, Xinggang Wang: Hierarchical Aggregation for 3D Instance Segmentation. ICCV 2021
SSTNetpermissive0.698 201.000 10.697 500.888 60.556 290.803 270.387 130.626 350.417 330.556 190.585 230.702 60.600 311.000 10.824 40.720 380.692 201.000 10.509 27
Zhihao Liang, Zhihao Li, Songcen Xu, Mingkui Tan, Kui Jia: Instance Segmentation in 3D Scenes using Semantic Superpoint Tree Networks. ICCV 2021
DualGroup0.694 211.000 10.799 280.811 140.622 160.817 210.376 140.805 90.590 210.487 270.568 260.525 250.650 170.835 450.600 290.829 130.655 241.000 10.526 24
DANCENET0.680 221.000 10.807 230.733 270.600 220.768 350.375 150.543 430.538 240.610 130.599 190.498 260.632 300.981 290.739 120.856 50.633 300.882 540.454 37
SphereSeg0.680 221.000 10.856 110.744 260.618 190.893 50.151 290.651 300.713 70.537 210.579 250.430 350.651 161.000 10.389 480.744 330.697 190.991 410.601 13
Box2Mask0.677 241.000 10.847 140.771 210.509 380.816 220.277 190.558 420.482 260.562 180.640 130.448 310.700 111.000 10.666 170.852 70.578 370.997 330.488 32
Julian Chibane, Francis Engelmann, Tuan Anh Tran, Gerard Pons-Moll: Box2Mask: Weakly Supervised 3D Semantic Instance Segmentation Using Bounding Boxes. ECCV 2022
OccuSeg+instance0.672 251.000 10.758 420.682 330.576 270.842 180.477 70.504 480.524 250.567 170.585 240.451 300.557 391.000 10.751 110.797 200.563 401.000 10.467 36
Lei Han, Tian Zheng, Lan Xu, Lu Fang: OccuSeg: Occupancy-aware 3D Instance Segmentation. CVPR 2020
Mask-Group0.664 261.000 10.822 170.764 240.616 200.815 230.139 330.694 230.597 200.459 310.566 270.599 190.600 310.516 550.715 140.819 170.635 281.000 10.603 11
Min Zhong, Xinghao Chen, Xiaokang Chen, Gang Zeng, Yunhe Wang: MaskGroup: Hierarchical Point Grouping and Masking for 3D Instance Segmentation. ICME 2022
INS-Conv-instance0.657 271.000 10.760 400.667 350.581 250.863 130.323 160.655 290.477 270.473 290.549 290.432 340.650 171.000 10.655 200.738 340.585 360.944 460.472 35
CSC-Pretrained0.648 281.000 10.810 200.768 220.523 360.813 240.143 320.819 50.389 360.422 400.511 330.443 320.650 171.000 10.624 260.732 350.634 291.000 10.375 44
PE0.645 291.000 10.773 360.798 180.538 310.786 330.088 410.799 100.350 400.435 380.547 300.545 220.646 270.933 340.562 320.761 280.556 450.997 330.501 30
Biao Zhang, Peter Wonka: Point Cloud Instance Segmentation using Probabilistic Embeddings. CVPR 2021
RPGN0.643 301.000 10.758 410.582 460.539 300.826 200.046 450.765 120.372 380.436 370.588 210.539 240.650 171.000 10.577 300.750 310.653 260.997 330.495 31
Shichao Dong, Guosheng Lin, Tzu-Yi Hung: Learning Regional Purity for Instance Segmentation on 3D Point Clouds. ECCV 2022
Dyco3Dcopyleft0.641 311.000 10.841 150.893 50.531 330.802 280.115 380.588 400.448 300.438 350.537 320.430 360.550 400.857 370.534 360.764 270.657 230.987 420.568 19
Tong He, Chunhua Shen, Anton van den Hengel: DyCo3D: Robust Instance Segmentation of 3D Point Clouds through Dynamic Convolution. CVPR 2021
GICN0.638 321.000 10.895 90.800 170.480 420.676 440.144 310.737 170.354 390.447 320.400 460.365 410.700 111.000 10.569 310.836 110.599 321.000 10.473 34
PointGroup0.636 331.000 10.765 370.624 370.505 400.797 290.116 370.696 220.384 370.441 330.559 280.476 280.596 341.000 10.666 170.756 300.556 440.997 330.513 26
Li Jiang, Hengshuang Zhao, Shaoshuai Shi, Shu Liu, Chi-Wing Fu, Jiaya Jia: PointGroup: Dual-Set Point Grouping for 3D Instance Segmentation. CVPR 2020 [oral]
DD-UNet+Group0.635 340.667 480.797 300.714 310.562 280.774 340.146 300.810 80.429 320.476 280.546 310.399 380.633 281.000 10.632 250.722 370.609 311.000 10.514 25
H. Liu, R. Liu, K. Yang, J. Zhang, K. Peng, R. Stiefelhagen: HIDA: Towards Holistic Indoor Understanding for the Visually Impaired via Semantic Instance Segmentation with a Wearable Solid-State LiDAR Sensor. ICCVW 2021
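DENet | | 0.629 (35) | 1.000 (1) | 0.797 (29) | 0.608 (38) | 0.589 (24) | 0.627 (48) | 0.219 (25) | 0.882 (1) | 0.310 (42) | 0.402 (45) | 0.383 (48) | 0.396 (39) | 0.650 (17) | 1.000 (1) | 0.663 (19) | 0.543 (56) | 0.691 (21) | 1.000 (1) | 0.568 (20)
3D-MPA | | 0.611 (36) | 1.000 (1) | 0.833 (16) | 0.765 (23) | 0.526 (35) | 0.756 (37) | 0.136 (35) | 0.588 (40) | 0.470 (28) | 0.438 (36) | 0.432 (42) | 0.358 (43) | 0.650 (17) | 0.857 (37) | 0.429 (44) | 0.765 (26) | 0.557 (43) | 1.000 (1) | 0.430 (39)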
Francis Engelmann, Martin Bokeloh, Alireza Fathi, Bastian Leibe, Matthias Nießner: 3D-MPA: Multi Proposal Aggregation for 3D Semantic Instance Segmentation. CVPR 2020
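OSIS | | 0.605 (37) | 1.000 (1) | 0.801 (27) | 0.599 (40) | 0.535 (32) | 0.728 (41) | 0.286 (17) | 0.436 (52) | 0.679 (11) | 0.491 (25) | 0.433 (40) | 0.256 (45) | 0.404 (52) | 0.857 (37) | 0.620 (27) | 0.724 (36) | 0.510 (50) | 1.000 (1) | 0.539 (23)
AOIA | | 0.601 (38) | 1.000 (1) | 0.761 (39) | 0.687 (32) | 0.485 (41) | 0.828 (19) | 0.008 (52) | 0.663 (27) | 0.405 (35) | 0.405 (44) | 0.425 (43) | 0.490 (27) | 0.596 (34) | 0.714 (48) | 0.553 (35) | 0.779 (21) | 0.597 (33) | 0.992 (40) | 0.424 (41)
PCJC | | 0.578 (39) | 1.000 (1) | 0.810 (21) | 0.583 (44) | 0.449 (45) | 0.813 (25) | 0.042 (46) | 0.603 (38) | 0.341 (41) | 0.490 (26) | 0.465 (37) | 0.410 (37) | 0.650 (17) | 0.835 (45) | 0.264 (54) | 0.694 (43) | 0.561 (41) | 0.889 (51) | 0.504 (29)
SSEN | | 0.575 (40) | 1.000 (1) | 0.761 (38) | 0.473 (48) | 0.477 (43) | 0.795 (30) | 0.066 (42) | 0.529 (45) | 0.658 (14) | 0.460 (30) | 0.461 (38) | 0.380 (40) | 0.331 (54) | 0.859 (36) | 0.401 (47) | 0.692 (45) | 0.653 (25) | 1.000 (1) | 0.348 (46)
Dongsu Zhang, Junha Chun, Sang Kyun Cha, Young Min Kim: Spatial Semantic Embedding Network: Fast 3D Instance Segmentation with Deep Metric Learning. arXiv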
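RWSeg | | 0.567 (41) | 0.528 (58) | 0.708 (49) | 0.626 (36) | 0.580 (26) | 0.745 (40) | 0.063 (43) | 0.627 (34) | 0.240 (46) | 0.400 (46) | 0.497 (34) | 0.464 (29) | 0.515 (41) | 1.000 (1) | 0.475 (40) | 0.745 (32) | 0.571 (38) | 1.000 (1) | 0.429 (40)
NeuralBF | | 0.555 (42) | 0.667 (48) | 0.896 (8) | 0.843 (11) | 0.517 (37) | 0.751 (38) | 0.029 (47) | 0.519 (46) | 0.414 (34) | 0.439 (34) | 0.465 (36) | 0.000 (64) | 0.484 (43) | 0.857 (37) | 0.287 (52) | 0.693 (44) | 0.651 (27) | 1.000 (1) | 0.485 (33)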
Weiwei Sun, Daniel Rebain, Renjie Liao, Vladimir Tankovich, Soroosh Yazdani, Kwang Moo Yi, Andrea Tagliasacchi: NeuralBF: Neural Bilateral Filtering for Top-down Instance Segmentation on Point Clouds. WACV 2023
MTML | | 0.549 (43) | 1.000 (1) | 0.807 (24) | 0.588 (43) | 0.327 (50) | 0.647 (46) | 0.004 (54) | 0.815 (7) | 0.180 (49) | 0.418 (41) | 0.364 (50) | 0.182 (48) | 0.445 (46) | 1.000 (1) | 0.442 (43) | 0.688 (46) | 0.571 (39) | 1.000 (1) | 0.396 (42)
Jean Lahoud, Bernard Ghanem, Marc Pollefeys, Martin R. Oswald: 3D Instance Segmentation via Multi-task Metric Learning. ICCV 2019 [oral]
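ClickSeg_Instance | | 0.539 (44) | 1.000 (1) | 0.621 (53) | 0.300 (51) | 0.530 (34) | 0.698 (42) | 0.127 (36) | 0.533 (44) | 0.222 (47) | 0.430 (39) | 0.400 (45) | 0.365 (41) | 0.574 (38) | 0.938 (33) | 0.472 (41) | 0.659 (48) | 0.543 (46) | 0.944 (46) | 0.347 (47)
One_Thing_One_Click | permissive | 0.529 (45) | 0.667 (48) | 0.718 (45) | 0.777 (20) | 0.399 (46) | 0.683 (43) | 0.000 (57) | 0.669 (25) | 0.138 (52) | 0.391 (47) | 0.374 (49) | 0.539 (23) | 0.360 (53) | 0.641 (52) | 0.556 (34) | 0.774 (24) | 0.593 (34) | 0.997 (33) | 0.251 (52)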
Zhengzhe Liu, Xiaojuan Qi, Chi-Wing Fu: One Thing One Click: A Self-Training Approach for Weakly Supervised 3D Semantic Segmentation. CVPR 2021
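Sparse R-CNN | | 0.515 (46) | 1.000 (1) | 0.538 (58) | 0.282 (52) | 0.468 (44) | 0.790 (31) | 0.173 (27) | 0.345 (54) | 0.429 (31) | 0.413 (43) | 0.484 (35) | 0.176 (49) | 0.595 (36) | 0.591 (53) | 0.522 (37) | 0.668 (47) | 0.476 (51) | 0.986 (44) | 0.327 (48)
Occipital-SCS | | 0.512 (47) | 1.000 (1) | 0.716 (46) | 0.509 (47) | 0.506 (39) | 0.611 (49) | 0.092 (40) | 0.602 (39) | 0.177 (50) | 0.346 (50) | 0.383 (47) | 0.165 (50) | 0.442 (47) | 0.850 (44) | 0.386 (49) | 0.618 (52) | 0.543 (47) | 0.889 (51) | 0.389 (43)
3D-BoNet | | 0.488 (48) | 1.000 (1) | 0.672 (52) | 0.590 (42) | 0.301 (52) | 0.484 (59) | 0.098 (39) | 0.620 (36) | 0.306 (43) | 0.341 (51) | 0.259 (54) | 0.125 (52) | 0.434 (49) | 0.796 (47) | 0.402 (46) | 0.499 (58) | 0.513 (49) | 0.909 (50) | 0.439 (38)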
Bo Yang, Jianan Wang, Ronald Clark, Qingyong Hu, Sen Wang, Andrew Markham, Niki Trigoni: Learning Object Bounding Boxes for 3D Instance Segmentation on Point Clouds. NeurIPS 2019 Spotlight
PanopticFusion-inst | | 0.478 (49) | 0.667 (48) | 0.712 (48) | 0.595 (41) | 0.259 (55) | 0.550 (55) | 0.000 (57) | 0.613 (37) | 0.175 (51) | 0.250 (56) | 0.434 (39) | 0.437 (33) | 0.411 (51) | 0.857 (37) | 0.485 (39) | 0.591 (55) | 0.267 (61) | 0.944 (46) | 0.359 (45)
Gaku Narita, Takashi Seno, Tomoya Ishikawa, Yohsuke Kaji: PanopticFusion: Online Volumetric Semantic Mapping at the Level of Stuff and Things. IROS 2019 (to appear)
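SPG_WSIS | | 0.470 (50) | 0.667 (48) | 0.685 (51) | 0.677 (34) | 0.372 (48) | 0.562 (53) | 0.000 (57) | 0.482 (49) | 0.244 (45) | 0.316 (53) | 0.298 (51) | 0.052 (59) | 0.442 (48) | 0.857 (37) | 0.267 (53) | 0.702 (40) | 0.559 (42) | 1.000 (1) | 0.287 (50)
SALoss-ResNet | | 0.459 (51) | 1.000 (1) | 0.737 (44) | 0.159 (62) | 0.259 (54) | 0.587 (51) | 0.138 (34) | 0.475 (50) | 0.217 (48) | 0.416 (42) | 0.408 (44) | 0.128 (51) | 0.315 (55) | 0.714 (48) | 0.411 (45) | 0.536 (57) | 0.590 (35) | 0.873 (55) | 0.304 (49)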
Zhidong Liang, Ming Yang, Hao Li, Chunxiang Wang: 3D Instance Embedding Learning With a Structure-Aware Loss Function for Point Cloud Segmentation. IEEE Robotics and Automation Letters (IROS2020)
MASC | permissive | 0.447 (52) | 0.528 (58) | 0.555 (56) | 0.381 (49) | 0.382 (47) | 0.633 (47) | 0.002 (55) | 0.509 (47) | 0.260 (44) | 0.361 (49) | 0.432 (41) | 0.327 (44) | 0.451 (45) | 0.571 (54) | 0.367 (50) | 0.639 (50) | 0.386 (52) | 0.980 (45) | 0.276 (51)
Chen Liu, Yasutaka Furukawa: MASC: Multi-scale Affinity with Sparse Convolution for 3D Instance Segmentation.
SegGroup_ins | permissive | 0.445 (53) | 0.667 (48) | 0.773 (35) | 0.185 (59) | 0.317 (51) | 0.656 (45) | 0.000 (57) | 0.407 (53) | 0.134 (53) | 0.381 (48) | 0.267 (53) | 0.217 (47) | 0.476 (44) | 0.714 (48) | 0.452 (42) | 0.629 (51) | 0.514 (48) | 1.000 (1) | 0.222 (55)
An Tao, Yueqi Duan, Yi Wei, Jiwen Lu, Jie Zhou: SegGroup: Seg-Level Supervision for 3D Instance and Semantic Segmentation. TIP 2022
3D-SIS | permissive | 0.382 (54) | 1.000 (1) | 0.432 (61) | 0.245 (54) | 0.190 (56) | 0.577 (52) | 0.013 (51) | 0.263 (56) | 0.033 (59) | 0.320 (52) | 0.240 (55) | 0.075 (55) | 0.422 (50) | 0.857 (37) | 0.117 (59) | 0.699 (41) | 0.271 (60) | 0.883 (53) | 0.235 (54)
Ji Hou, Angela Dai, Matthias Niessner: 3D-SIS: 3D Semantic Instance Segmentation of RGB-D Scans. CVPR 2019
Hier3D | copyleft | 0.323 (55) | 0.667 (48) | 0.542 (57) | 0.264 (53) | 0.157 (59) | 0.550 (54) | 0.000 (57) | 0.205 (59) | 0.009 (61) | 0.270 (55) | 0.218 (56) | 0.075 (55) | 0.500 (42) | 0.688 (51) | 0.007 (65) | 0.698 (42) | 0.301 (57) | 0.459 (62) | 0.200 (56)
Tan: HCFS3D: Hierarchical Coupled Feature Selection Network for 3D Semantic and Instance Segmentation.
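UNet-backbone | | 0.319 (56) | 0.667 (48) | 0.715 (47) | 0.233 (55) | 0.189 (57) | 0.479 (60) | 0.008 (52) | 0.218 (57) | 0.067 (58) | 0.201 (58) | 0.173 (57) | 0.107 (53) | 0.123 (60) | 0.438 (56) | 0.150 (56) | 0.615 (53) | 0.355 (53) | 0.916 (49) | 0.093 (64)
R-PointNet | | 0.306 (57) | 0.500 (60) | 0.405 (62) | 0.311 (50) | 0.348 (49) | 0.589 (50) | 0.054 (44) | 0.068 (62) | 0.126 (54) | 0.283 (54) | 0.290 (52) | 0.028 (60) | 0.219 (58) | 0.214 (59) | 0.331 (51) | 0.396 (62) | 0.275 (58) | 0.821 (57) | 0.245 (53)
Region-18class | | 0.284 (58) | 0.250 (64) | 0.751 (43) | 0.228 (57) | 0.270 (53) | 0.521 (56) | 0.000 (57) | 0.468 (51) | 0.008 (63) | 0.205 (57) | 0.127 (58) | 0.000 (64) | 0.068 (62) | 0.070 (63) | 0.262 (55) | 0.652 (49) | 0.323 (55) | 0.740 (58) | 0.173 (57)
SemRegionNet-20cls | | 0.250 (59) | 0.333 (61) | 0.613 (54) | 0.229 (56) | 0.163 (58) | 0.493 (57) | 0.000 (57) | 0.304 (55) | 0.107 (55) | 0.147 (61) | 0.100 (60) | 0.052 (58) | 0.231 (56) | 0.119 (61) | 0.039 (61) | 0.445 (60) | 0.325 (54) | 0.654 (59) | 0.141 (60)
tmp | | 0.248 (60) | 0.667 (48) | 0.437 (60) | 0.188 (58) | 0.153 (60) | 0.491 (58) | 0.000 (57) | 0.208 (58) | 0.094 (57) | 0.153 (60) | 0.099 (61) | 0.057 (57) | 0.217 (59) | 0.119 (61) | 0.039 (61) | 0.466 (59) | 0.302 (56) | 0.640 (60) | 0.140 (61)
3D-BEVIS | | 0.248 (60) | 0.667 (48) | 0.566 (55) | 0.076 (63) | 0.035 (65) | 0.394 (63) | 0.027 (49) | 0.035 (64) | 0.098 (56) | 0.099 (63) | 0.030 (64) | 0.025 (61) | 0.098 (61) | 0.375 (58) | 0.126 (58) | 0.604 (54) | 0.181 (63) | 0.854 (56) | 0.171 (58)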
Cathrin Elich, Francis Engelmann, Jonas Schult, Theodora Kontogianni, Bastian Leibe: 3D-BEVIS: Birds-Eye-View Instance Segmentation.
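Sem_Recon_ins | | 0.227 (62) | 0.764 (47) | 0.486 (59) | 0.069 (64) | 0.098 (62) | 0.426 (62) | 0.017 (50) | 0.067 (63) | 0.015 (60) | 0.172 (59) | 0.100 (59) | 0.096 (54) | 0.054 (64) | 0.183 (60) | 0.135 (57) | 0.366 (63) | 0.260 (62) | 0.614 (61) | 0.168 (59)
ASIS | | 0.199 (63) | 0.333 (61) | 0.253 (64) | 0.167 (61) | 0.140 (61) | 0.438 (61) | 0.000 (57) | 0.177 (60) | 0.008 (62) | 0.121 (62) | 0.069 (62) | 0.004 (63) | 0.231 (57) | 0.429 (57) | 0.036 (63) | 0.445 (61) | 0.273 (59) | 0.333 (64) | 0.119 (63)
Sgpn_scannet | | 0.143 (64) | 0.208 (65) | 0.390 (63) | 0.169 (60) | 0.065 (63) | 0.275 (64) | 0.029 (48) | 0.069 (61) | 0.000 (64) | 0.087 (64) | 0.043 (63) | 0.014 (62) | 0.027 (65) | 0.000 (64) | 0.112 (60) | 0.351 (64) | 0.168 (64) | 0.438 (63) | 0.138 (62)
MaskRCNN 2d->3d Proj | | 0.058 (65) | 0.333 (61) | 0.002 (65) | 0.000 (65) | 0.053 (64) | 0.002 (65) | 0.002 (56) | 0.021 (65) | 0.000 (64) | 0.045 (65) | 0.024 (65) | 0.238 (46) | 0.065 (63) | 0.000 (64) | 0.014 (64) | 0.107 (65) | 0.020 (65) | 0.110 (65) | 0.006 (65)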


This table lists the benchmark results for the 2D semantic label scenario.


Method | Info | avg iou | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | floor | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | wall | window
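Virtual MVFusion (R) | | 0.745 (1) | 0.861 (1) | 0.839 (1) | 0.881 (1) | 0.672 (1) | 0.512 (1) | 0.422 (15) | 0.898 (1) | 0.723 (1) | 0.714 (1) | 0.954 (2) | 0.454 (1) | 0.509 (1) | 0.773 (1) | 0.895 (1) | 0.756 (1) | 0.820 (1) | 0.653 (1) | 0.935 (1) | 0.891 (1) | 0.728 (1)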
Abhijit Kundu, Xiaoqi Yin, Alireza Fathi, David Ross, Brian Brewington, Thomas Funkhouser, Caroline Pantofaru: Virtual Multi-view Fusion for 3D Semantic Segmentation. ECCV 2020
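BPNet_2D | copyleft | 0.670 (2) | 0.822 (3) | 0.795 (3) | 0.836 (2) | 0.659 (2) | 0.481 (2) | 0.451 (11) | 0.769 (3) | 0.656 (3) | 0.567 (3) | 0.931 (3) | 0.395 (4) | 0.390 (4) | 0.700 (3) | 0.534 (3) | 0.689 (9) | 0.770 (2) | 0.574 (3) | 0.865 (6) | 0.831 (3) | 0.675 (4)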
Wenbo Hu, Hengshuang Zhao, Li Jiang, Jiaya Jia and Tien-Tsin Wong: Bidirectional Projection Network for Cross Dimension Scene Understanding. CVPR 2021 (Oral)
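CU-Hybrid-2D Net | | 0.636 (3) | 0.825 (2) | 0.820 (2) | 0.179 (20) | 0.648 (3) | 0.463 (3) | 0.549 (2) | 0.742 (6) | 0.676 (2) | 0.628 (2) | 0.961 (1) | 0.420 (2) | 0.379 (5) | 0.684 (6) | 0.381 (15) | 0.732 (2) | 0.723 (3) | 0.599 (2) | 0.827 (13) | 0.851 (2) | 0.634 (6)
CMX | | 0.613 (4) | 0.681 (7) | 0.725 (9) | 0.502 (12) | 0.634 (5) | 0.297 (15) | 0.478 (9) | 0.830 (2) | 0.651 (4) | 0.537 (6) | 0.924 (4) | 0.375 (5) | 0.315 (12) | 0.686 (5) | 0.451 (12) | 0.714 (4) | 0.543 (18) | 0.504 (5) | 0.894 (4) | 0.823 (4) | 0.688 (3)
DMMF_3d | | 0.605 (5) | 0.651 (8) | 0.744 (7) | 0.782 (3) | 0.637 (4) | 0.387 (4) | 0.536 (3) | 0.732 (7) | 0.590 (6) | 0.540 (5) | 0.856 (18) | 0.359 (9) | 0.306 (13) | 0.596 (11) | 0.539 (2) | 0.627 (18) | 0.706 (4) | 0.497 (7) | 0.785 (18) | 0.757 (16) | 0.476 (19)
MCA-Net | | 0.595 (6) | 0.533 (17) | 0.756 (6) | 0.746 (4) | 0.590 (8) | 0.334 (7) | 0.506 (6) | 0.670 (12) | 0.587 (7) | 0.500 (10) | 0.905 (8) | 0.366 (8) | 0.352 (8) | 0.601 (10) | 0.506 (6) | 0.669 (15) | 0.648 (7) | 0.501 (6) | 0.839 (12) | 0.769 (12) | 0.516 (18)
RFBNet | | 0.592 (7) | 0.616 (9) | 0.758 (5) | 0.659 (5) | 0.581 (9) | 0.330 (8) | 0.469 (10) | 0.655 (15) | 0.543 (12) | 0.524 (7) | 0.924 (4) | 0.355 (10) | 0.336 (10) | 0.572 (14) | 0.479 (8) | 0.671 (13) | 0.648 (7) | 0.480 (9) | 0.814 (16) | 0.814 (5) | 0.614 (9)
FAN_NV_RVC | | 0.586 (8) | 0.510 (18) | 0.764 (4) | 0.079 (23) | 0.620 (7) | 0.330 (8) | 0.494 (7) | 0.753 (4) | 0.573 (8) | 0.556 (4) | 0.884 (13) | 0.405 (3) | 0.303 (14) | 0.718 (2) | 0.452 (11) | 0.672 (12) | 0.658 (5) | 0.509 (4) | 0.898 (3) | 0.813 (6) | 0.727 (2)
DCRedNet | | 0.583 (9) | 0.682 (6) | 0.723 (10) | 0.542 (11) | 0.510 (17) | 0.310 (12) | 0.451 (11) | 0.668 (13) | 0.549 (11) | 0.520 (8) | 0.920 (6) | 0.375 (5) | 0.446 (2) | 0.528 (17) | 0.417 (13) | 0.670 (14) | 0.577 (15) | 0.478 (10) | 0.862 (7) | 0.806 (7) | 0.628 (8)
MIX6D_RVC | | 0.582 (10) | 0.695 (4) | 0.687 (14) | 0.225 (18) | 0.632 (6) | 0.328 (10) | 0.550 (1) | 0.748 (5) | 0.623 (5) | 0.494 (13) | 0.890 (11) | 0.350 (12) | 0.254 (20) | 0.688 (4) | 0.454 (10) | 0.716 (3) | 0.597 (14) | 0.489 (8) | 0.881 (5) | 0.768 (13) | 0.575 (12)
SSMA | copyleft | 0.577 (11) | 0.695 (4) | 0.716 (12) | 0.439 (14) | 0.563 (11) | 0.314 (11) | 0.444 (13) | 0.719 (8) | 0.551 (10) | 0.503 (9) | 0.887 (12) | 0.346 (13) | 0.348 (9) | 0.603 (9) | 0.353 (17) | 0.709 (5) | 0.600 (12) | 0.457 (12) | 0.901 (2) | 0.786 (8) | 0.599 (11)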
Abhinav Valada, Rohit Mohan, Wolfram Burgard: Self-Supervised Model Adaptation for Multimodal Semantic Segmentation. International Journal of Computer Vision, 2019
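UNIV_CNP_RVC_UE | | 0.566 (12) | 0.569 (16) | 0.686 (16) | 0.435 (15) | 0.524 (14) | 0.294 (16) | 0.421 (16) | 0.712 (9) | 0.543 (12) | 0.463 (15) | 0.872 (14) | 0.320 (14) | 0.363 (7) | 0.611 (8) | 0.477 (9) | 0.686 (10) | 0.627 (9) | 0.443 (15) | 0.862 (7) | 0.775 (11) | 0.639 (5)
EMSAFormer | | 0.564 (13) | 0.581 (13) | 0.736 (8) | 0.564 (10) | 0.546 (13) | 0.219 (20) | 0.517 (4) | 0.675 (11) | 0.486 (17) | 0.427 (19) | 0.904 (9) | 0.352 (11) | 0.320 (11) | 0.589 (12) | 0.528 (4) | 0.708 (6) | 0.464 (21) | 0.413 (19) | 0.847 (11) | 0.786 (8) | 0.611 (10)
SN_RN152pyrx8_RVC | copyleft | 0.546 (14) | 0.572 (14) | 0.663 (18) | 0.638 (7) | 0.518 (15) | 0.298 (14) | 0.366 (21) | 0.633 (18) | 0.510 (15) | 0.446 (17) | 0.864 (16) | 0.296 (17) | 0.267 (17) | 0.542 (16) | 0.346 (18) | 0.704 (7) | 0.575 (16) | 0.431 (16) | 0.853 (10) | 0.766 (14) | 0.630 (7)
UDSSEG_RVC | | 0.545 (15) | 0.610 (11) | 0.661 (19) | 0.588 (8) | 0.556 (12) | 0.268 (18) | 0.482 (8) | 0.642 (17) | 0.572 (9) | 0.475 (14) | 0.836 (20) | 0.312 (15) | 0.367 (6) | 0.630 (7) | 0.189 (20) | 0.639 (17) | 0.495 (20) | 0.452 (13) | 0.826 (14) | 0.756 (17) | 0.541 (14)
segfomer with 6d | | 0.542 (16) | 0.594 (12) | 0.687 (14) | 0.146 (21) | 0.579 (10) | 0.308 (13) | 0.515 (5) | 0.703 (10) | 0.472 (18) | 0.498 (11) | 0.868 (15) | 0.369 (7) | 0.282 (15) | 0.589 (12) | 0.390 (14) | 0.701 (8) | 0.556 (17) | 0.416 (18) | 0.860 (9) | 0.759 (15) | 0.539 (16)
FuseNet | permissive | 0.535 (17) | 0.570 (15) | 0.681 (17) | 0.182 (19) | 0.512 (16) | 0.290 (17) | 0.431 (14) | 0.659 (14) | 0.504 (16) | 0.495 (12) | 0.903 (10) | 0.308 (16) | 0.428 (3) | 0.523 (18) | 0.365 (16) | 0.676 (11) | 0.621 (11) | 0.470 (11) | 0.762 (19) | 0.779 (10) | 0.541 (14)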
Caner Hazirbas, Lingni Ma, Csaba Domokos, Daniel Cremers: FuseNet: Incorporating Depth into Semantic Segmentation via Fusion-based CNN Architecture. ACCV 2016
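AdapNet++ | copyleft | 0.503 (18) | 0.613 (10) | 0.722 (11) | 0.418 (16) | 0.358 (23) | 0.337 (6) | 0.370 (20) | 0.479 (21) | 0.443 (19) | 0.368 (21) | 0.907 (7) | 0.207 (20) | 0.213 (22) | 0.464 (21) | 0.525 (5) | 0.618 (19) | 0.657 (6) | 0.450 (14) | 0.788 (17) | 0.721 (20) | 0.408 (22)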
Abhinav Valada, Rohit Mohan, Wolfram Burgard: Self-Supervised Model Adaptation for Multimodal Semantic Segmentation. International Journal of Computer Vision, 2019
3DMV (2d proj) | | 0.498 (19) | 0.481 (21) | 0.612 (20) | 0.579 (9) | 0.456 (19) | 0.343 (5) | 0.384 (18) | 0.623 (19) | 0.525 (14) | 0.381 (20) | 0.845 (19) | 0.254 (19) | 0.264 (19) | 0.557 (15) | 0.182 (21) | 0.581 (21) | 0.598 (13) | 0.429 (17) | 0.760 (20) | 0.661 (22) | 0.446 (21)
Angela Dai, Matthias Niessner: 3DMV: Joint 3D-Multi-View Prediction for 3D Semantic Scene Segmentation. ECCV 2018
MSeg1080_RVC | permissive | 0.485 (20) | 0.505 (19) | 0.709 (13) | 0.092 (22) | 0.427 (20) | 0.241 (19) | 0.411 (17) | 0.654 (16) | 0.385 (23) | 0.457 (16) | 0.861 (17) | 0.053 (23) | 0.279 (16) | 0.503 (19) | 0.481 (7) | 0.645 (16) | 0.626 (10) | 0.365 (21) | 0.748 (21)