Presenting the ScanNet200 Benchmark

We present the ScanNet200 benchmark, which studies an order of magnitude more class categories than the previous version of ScanNet. The scene geometry is shared between the two benchmarks, but the parsing of the surface annotations allows for a larger vocabulary and a more realistic setting for in-the-wild 3D understanding methods.

The ScanNet200 benchmark includes both finer-grained categories and a large number of previously unaddressed classes. This makes for a much more challenging setting that covers the diversity of semantic classes naturally observed in the raw ScanNet RGB-D data, including the class imbalances that occur naturally. The difference in category frequencies between ScanNet and ScanNet200 can be seen in the figure above.
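The result tables on this page report separate head, common, and tail scores, which group classes by how frequently they occur. As a minimal sketch of how such a grouping could be derived from label frequencies, the snippet below ranks classes by count and cuts the ranking into three parts. The function name `split_by_frequency`, the group sizes, and the toy labels are hypothetical and are not the benchmark's official split.

```python
from collections import Counter


def split_by_frequency(labels, n_head, n_common):
    """Rank classes by occurrence count and cut the ranking into three groups."""
    counts = Counter(labels)
    # Most frequent first; ties are broken by name so the result is deterministic.
    ranked = sorted(counts, key=lambda c: (-counts[c], c))
    head = ranked[:n_head]
    common = ranked[n_head:n_head + n_common]
    tail = ranked[n_head + n_common:]
    return head, common, tail


# Toy per-point labels with a strong class imbalance (hypothetical data).
labels = ["wall"] * 50 + ["chair"] * 20 + ["lamp"] * 5 + ["plunger"] * 1
head, common, tail = split_by_frequency(labels, n_head=1, n_common=2)
print(head, common, tail)
```

With real data, the counts would come from the annotated surface points rather than a hand-built list.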

ScanNet200 Benchmark

This table lists the benchmark results for the ScanNet200 3D semantic label scenario.
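For orientation, the IoU columns can be read as per-class intersection over union, averaged over all classes or over one class group. The sketch below is a simplified illustration, not the benchmark's evaluation script: the helper names, the toy labels, and the choice to skip classes that appear in neither prediction nor ground truth are assumptions.

```python
def per_class_iou(pred, gt, classes):
    """Compute IoU = TP / (TP + FP + FN) per class from paired per-point labels."""
    ious = {}
    for c in classes:
        tp = sum(1 for p, g in zip(pred, gt) if p == c and g == c)
        fp = sum(1 for p, g in zip(pred, gt) if p == c and g != c)
        fn = sum(1 for p, g in zip(pred, gt) if p != c and g == c)
        denom = tp + fp + fn
        # Assumption: a class absent from both prediction and ground truth is skipped.
        if denom > 0:
            ious[c] = tp / denom
    return ious


def mean_iou(ious, group):
    """Average the IoU over the classes of one group (e.g. head, common, or tail)."""
    vals = [ious[c] for c in group if c in ious]
    return sum(vals) / len(vals) if vals else float("nan")


# Hypothetical labels for five surface points.
gt = ["wall", "wall", "chair", "chair", "lamp"]
pred = ["wall", "chair", "chair", "chair", "wall"]
ious = per_class_iou(pred, gt, ["wall", "chair", "lamp"])
avg = mean_iou(ious, ["wall", "chair", "lamp"])
print(ious, avg)
```

Passing a subset of class names as `group` gives the corresponding group average.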




Each cell gives the score followed by the method's rank in that column.
Method | Info | avg iou | head iou | common iou | tail iou | alarm clock | armchair | backpack | bag | ball | bar | basket | bathroom cabinet | bathroom counter | bathroom stall | bathroom stall door | bathroom vanity | bathtub | bed | bench | bicycle | bin | blackboard | blanket | blinds | board | book | bookshelf | bottle | bowl | box | broom | bucket | bulletin board | cabinet | calendar | candle | cart | case of water bottles | cd case | ceiling | ceiling light | chair | clock | closet | closet door | closet rod | closet wall | clothes | clothes dryer | coat rack | coffee kettle | coffee maker | coffee table | column | computer tower | container | copier | couch | counter | crate | cup | curtain | cushion | decoration | desk | dining table | dish rack | dishwasher | divider | door | doorframe | dresser | dumbbell | dustpan | end table | fan | file cabinet | fire alarm | fire extinguisher | fireplace | floor | folded chair | furniture | guitar | guitar case | hair dryer | handicap bar | hat | headphones | ironing board | jacket | keyboard | keyboard piano | kitchen cabinet | kitchen counter | ladder | lamp | laptop | laundry basket | laundry detergent | laundry hamper | ledge | light | light switch | luggage | machine | mailbox | mat | mattress | microwave | mini fridge | mirror | monitor | mouse | music stand | nightstand | object | office chair | ottoman | oven | paper | paper bag | paper cutter | paper towel dispenser | paper towel roll | person | piano | picture | pillar | pillow | pipe | plant | plate | plunger | poster | potted plant | power outlet | power strip | printer | projector | projector screen | purse | rack | radiator | rail | range hood | recycling bin | refrigerator | scale | seat | shelf | shoe | shower | shower curtain | shower curtain rod | shower door | shower floor | shower head | shower wall | sign | sink | soap dish | soap dispenser | sofa chair | speaker | stair rail | stairs | stand | stool | storage bin | storage container | storage organizer | stove | structure | stuffed animal | suitcase | table | telephone | tissue box | toaster | toaster oven | toilet | toilet paper | toilet paper dispenser | toilet paper holder | toilet seat cover dispenser | towel | trash bin | trash can | tray | tube | tv | tv stand | vacuum cleaner | vent | wall | wardrobe | washing machine | water bottle | water cooler | water pitcher | whiteboard | window | windowsill
DITR0.449 10.629 10.392 10.289 10.650 10.168 20.862 10.000 30.313 30.000 10.580 10.568 20.564 30.766 70.867 10.238 50.949 10.000 30.866 30.300 10.000 90.664 10.482 10.508 120.317 10.420 10.551 20.000 10.000 30.486 20.519 10.662 40.000 10.385 10.000 30.901 30.079 90.727 10.000 70.160 30.606 30.417 40.967 20.000 10.000 20.498 50.596 110.130 20.728 30.998 10.805 10.000 170.314 10.934 20.000 10.278 40.636 10.000 70.403 120.367 10.741 20.484 10.500 21.000 10.113 120.828 10.815 10.000 70.733 20.969 40.374 20.000 10.579 11.000 10.230 50.617 50.983 10.729 10.423 40.855 10.508 60.622 20.018 30.000 10.591 30.034 40.028 100.066 110.869 10.904 70.334 20.651 50.716 10.514 20.871 60.315 30.000 10.664 30.128 30.014 100.000 40.000 10.392 20.851 20.817 10.153 30.823 10.991 10.318 30.680 10.134 30.913 10.157 20.448 40.000 10.000 80.000 30.826 10.978 10.091 60.000 10.660 40.647 30.571 20.804 40.001 90.000 10.480 30.700 10.421 50.947 10.433 140.411 30.148 60.262 50.000 10.849 10.709 60.138 100.150 20.714 30.889 10.000 10.698 10.222 40.000 70.000 10.720 20.000 20.000 10.805 10.600 10.642 30.268 90.904 10.982 20.477 10.632 60.718 20.139 90.776 20.000 10.178 10.886 10.962 10.839 80.000 10.851 20.043 120.869 40.000 10.710 10.315 60.348 30.753 20.397 8
Karim Abou Zeid, Kadir Yilmaz, Daan de Geus, Alexander Hermans, David Adrian, Timm Linder, Bastian Leibe: DINO in the Room: Leveraging 2D Foundation Models for 3D Segmentation.
ALS-MinkowskiNetcopyleft0.414 20.610 20.322 30.271 20.542 20.153 30.159 110.000 30.000 70.000 10.404 40.503 50.532 60.672 160.804 50.285 10.888 20.000 30.900 20.226 20.087 20.598 40.342 50.671 10.217 100.087 30.449 40.000 10.000 30.253 30.477 61.000 10.000 10.118 50.000 30.905 10.071 130.710 20.076 20.047 160.665 10.376 80.981 10.000 10.000 20.466 70.632 70.113 40.769 10.956 40.795 20.031 90.314 10.936 10.000 10.390 20.601 30.000 70.458 80.366 20.719 30.440 50.564 10.699 40.314 10.464 70.784 20.200 10.283 60.973 10.142 90.000 10.250 70.285 60.220 70.718 10.752 60.723 20.460 10.248 150.475 100.463 130.000 40.000 10.446 80.021 50.025 110.285 10.000 40.972 10.149 80.769 10.230 30.535 10.879 20.252 80.000 10.693 10.129 20.000 140.000 40.000 10.447 10.958 10.662 90.159 20.598 30.780 110.344 20.646 30.106 60.893 30.135 30.455 30.000 10.194 30.259 10.726 30.475 40.000 90.000 10.741 10.865 10.571 20.817 30.445 30.000 10.506 20.630 30.230 120.916 20.728 10.635 11.000 10.252 60.000 10.804 20.697 70.137 110.043 70.717 20.807 30.000 10.510 130.245 20.000 70.000 10.709 30.000 20.000 10.703 20.572 40.646 20.223 100.531 50.984 10.397 30.813 10.798 10.135 120.800 10.000 10.097 20.832 20.752 80.842 70.000 10.852 10.149 90.846 100.000 10.666 50.359 50.252 80.777 10.690 2
Guangda Ji, Silvan Weder, Francis Engelmann, Marc Pollefeys, Hermann Blum: ARKit LabelMaker: A New Scale for Indoor 3D Scene Understanding. arXiv
PTv3 ScanNet2000.393 30.592 30.330 20.216 30.520 30.109 50.108 160.000 30.337 10.000 10.310 120.394 90.494 110.753 90.848 20.256 30.717 80.000 30.842 40.192 50.065 30.449 100.346 40.546 60.190 130.000 90.384 70.000 10.000 30.218 40.505 20.791 30.000 10.136 40.000 30.903 20.073 120.687 60.000 70.168 20.551 50.387 70.941 30.000 10.000 20.397 120.654 30.000 100.714 50.759 150.752 70.118 40.264 40.926 30.000 10.048 60.575 50.000 70.597 20.366 20.755 10.469 20.474 30.798 20.140 100.617 30.692 70.000 70.592 40.971 20.188 40.000 10.133 90.593 20.349 10.650 30.717 80.699 30.455 20.790 20.523 40.636 10.301 10.000 10.622 20.000 110.017 150.259 30.000 40.921 30.337 10.733 20.210 40.514 20.860 80.407 10.000 10.688 20.109 80.000 140.000 40.000 10.151 50.671 80.782 20.115 130.641 20.903 20.349 10.616 40.088 70.832 80.000 60.480 20.000 10.428 10.000 30.497 100.000 50.000 90.000 10.662 30.690 20.612 10.828 10.575 10.000 10.404 70.644 20.325 70.887 40.728 10.009 160.134 70.026 170.000 10.761 30.731 40.172 60.077 40.528 80.727 70.000 10.603 50.220 50.022 30.000 10.740 10.000 20.000 10.661 40.586 20.566 40.436 40.531 50.978 30.457 20.708 30.583 60.141 70.748 30.000 10.026 50.822 30.871 40.879 50.000 10.851 20.405 20.914 10.000 10.682 30.000 150.281 40.738 30.463 6
Xiaoyang Wu, Li Jiang, Peng-Shuai Wang, Zhijian Liu, Xihui Liu, Yu Qiao, Wanli Ouyang, Tong He, Hengshuang Zhao: Point Transformer V3: Simpler, Faster, Stronger. CVPR 2024 (Oral)
ODIN - Sem200permissive0.368 40.562 40.297 40.207 40.380 170.196 10.828 20.000 30.321 20.000 10.400 50.775 10.460 130.501 170.769 120.065 150.870 30.000 30.913 10.213 30.000 90.000 170.389 20.554 40.312 30.000 90.591 10.000 10.000 30.491 10.487 30.894 20.000 10.378 20.303 10.796 170.088 60.669 130.081 10.216 10.256 170.334 130.898 70.000 10.000 20.370 140.599 100.000 100.581 160.988 20.749 80.090 60.242 50.921 40.000 10.202 50.609 20.000 70.655 10.214 130.654 90.346 150.408 70.485 90.169 80.631 20.704 60.000 70.814 10.940 100.127 160.000 10.000 120.462 40.227 60.641 40.885 30.657 50.434 30.000 170.550 20.393 150.000 40.000 10.590 40.000 110.048 20.077 90.000 40.784 160.131 100.557 100.316 20.359 80.833 140.373 20.000 10.661 40.108 90.001 120.000 40.000 10.301 30.612 110.565 150.129 100.482 80.468 160.274 50.561 80.376 10.912 20.181 10.440 60.000 10.166 40.000 30.641 50.000 50.426 20.000 10.642 50.626 70.259 110.787 80.429 40.000 10.589 10.523 80.246 110.857 60.000 170.228 90.000 110.265 40.000 10.752 60.832 10.090 160.157 10.791 10.578 160.000 10.373 150.539 10.000 70.000 10.685 50.000 20.000 10.632 80.575 30.663 10.152 110.358 90.926 130.397 30.454 150.610 40.119 150.685 70.000 10.000 120.803 80.740 90.441 140.000 10.800 100.000 170.871 30.000 10.220 170.487 10.862 10.682 60.054 17
Ayush Jain, Pushkal Katara, Nikolaos Gkanatsios, Adam W. Harley, Gabriel Sarch, Kriti Aggarwal, Vishrav Chaudhary, Katerina Fragkiadaki: ODIN: A Single Model for 2D and 3D Segmentation. CVPR 2024
BFANet ScanNet200permissive0.360 50.553 70.293 50.193 50.483 100.096 60.266 60.000 30.000 70.000 10.298 130.255 120.661 10.810 50.810 30.194 100.785 70.000 30.000 170.161 60.000 90.494 90.382 30.574 30.258 50.000 90.372 90.000 10.000 30.043 140.436 80.000 110.000 10.239 30.000 30.901 30.105 10.689 40.025 40.128 40.614 20.436 10.493 170.000 10.000 20.526 40.546 130.109 50.651 140.953 50.753 60.101 50.143 130.897 50.000 10.431 10.469 150.000 70.522 60.337 50.661 60.459 30.409 60.666 50.102 140.508 60.757 40.000 70.060 140.970 30.497 10.000 10.376 30.511 30.262 40.688 20.921 20.617 100.321 120.590 60.491 90.556 40.000 40.000 10.481 50.093 10.043 30.284 20.000 40.875 140.135 90.669 40.124 130.394 60.849 110.298 40.000 10.476 170.088 130.042 70.000 40.000 10.254 40.653 100.741 60.215 10.573 50.852 50.266 100.654 20.056 120.835 60.000 60.492 10.000 10.000 80.000 30.612 90.000 50.000 90.000 10.616 60.469 170.460 50.698 140.516 20.000 10.378 80.563 40.476 40.863 50.574 90.330 60.000 110.282 30.000 10.760 40.710 50.233 10.000 100.641 50.814 20.000 10.585 100.053 110.000 70.000 10.629 100.000 20.000 10.678 30.528 130.534 50.129 140.596 40.973 40.264 120.772 20.526 100.139 90.707 40.000 10.000 120.764 140.591 160.848 60.000 10.827 40.338 30.806 120.000 10.568 90.151 100.358 20.659 100.510 4
Weiguang Zhao, Rui Zhang, Qiufeng Wang, Guangliang Cheng, Kaizhu Huang: BFANet: Revisiting 3D Semantic Segmentation with Boundary Feature Analysis. CVPR 2025
PonderV2 ScanNet2000.346 60.552 80.270 80.175 90.497 70.070 120.239 70.000 30.000 70.000 10.232 170.412 80.584 20.842 30.804 50.212 70.540 100.000 30.433 160.106 100.000 90.590 50.290 120.548 50.243 70.000 90.356 110.000 10.000 30.062 100.398 130.441 100.000 10.104 100.000 30.888 50.076 110.682 90.030 30.094 70.491 110.351 120.869 100.000 10.063 10.403 110.700 20.000 100.660 130.881 90.761 30.050 80.186 100.852 130.000 10.007 90.570 80.100 20.565 30.326 60.641 100.431 60.290 140.621 60.259 30.408 110.622 100.125 20.082 120.950 50.179 50.000 10.263 60.424 50.193 90.558 70.880 40.545 130.375 70.727 30.445 120.499 80.000 40.000 10.475 70.002 90.034 60.083 80.000 40.924 20.290 40.636 60.115 140.400 50.874 40.186 100.000 10.611 80.128 30.113 20.000 40.000 10.000 110.584 120.636 100.103 140.385 100.843 60.283 40.603 60.080 80.825 100.000 60.377 100.000 10.000 80.000 30.457 110.000 50.000 90.000 10.574 120.608 90.481 40.792 50.394 50.000 10.357 100.503 110.261 100.817 130.504 120.304 70.472 40.115 110.000 10.750 70.677 90.202 20.000 100.509 90.729 60.000 10.519 120.000 140.000 70.000 10.620 120.000 20.000 10.660 60.560 70.486 60.384 60.346 100.952 50.247 140.667 40.436 120.269 30.691 60.000 10.010 70.787 100.889 30.880 40.000 10.810 70.336 40.860 80.000 10.606 80.009 110.248 90.681 70.392 9
Haoyi Zhu, Honghui Yang, Xiaoyang Wu, Di Huang, Sha Zhang, Xianglong He, Tong He, Hengshuang Zhao, Chunhua Shen, Yu Qiao, Wanli Ouyang: PonderV2: Pave the Way for 3D Foundation Model with A Universal Pre-training Paradigm.
CeCo0.340 70.551 90.247 130.181 60.475 120.057 150.142 120.000 30.000 70.000 10.387 60.463 60.499 90.924 20.774 110.213 60.257 130.000 30.546 150.100 110.006 80.615 20.177 170.534 70.246 60.000 90.400 50.000 10.338 10.006 160.484 50.609 50.000 10.083 110.000 30.873 90.089 50.661 140.000 70.048 150.560 40.408 60.892 80.000 10.000 20.586 10.616 80.000 100.692 80.900 80.721 120.162 10.228 60.860 110.000 10.000 110.575 50.083 30.550 40.347 40.624 130.410 100.360 90.740 30.109 130.321 150.660 80.000 70.121 90.939 130.143 80.000 10.400 20.003 130.190 110.564 60.652 100.615 110.421 50.304 130.579 10.547 50.000 40.000 10.296 140.000 110.030 90.096 70.000 40.916 40.037 130.551 120.171 90.376 70.865 70.286 50.000 10.633 50.102 120.027 80.011 30.000 10.000 110.474 140.742 50.133 70.311 130.824 80.242 130.503 140.068 90.828 90.000 60.429 70.000 10.063 50.000 30.781 20.000 50.000 90.000 10.665 20.633 60.450 60.818 20.000 100.000 10.429 50.532 70.226 130.825 110.510 110.377 50.709 20.079 140.000 10.753 50.683 80.102 150.063 50.401 160.620 130.000 10.619 30.000 140.000 70.000 10.595 130.000 20.000 10.345 140.564 60.411 80.603 10.384 80.945 90.266 110.643 50.367 140.304 10.663 100.000 10.010 70.726 150.767 70.898 30.000 10.784 130.435 10.861 70.000 10.447 110.000 150.257 70.656 110.377 10
Zhisheng Zhong, Jiequan Cui, Yibo Yang, Xiaoyang Wu, Xiaojuan Qi, Xiangyu Zhang, Jiaya Jia: Understanding Imbalanced Semantic Segmentation Through Neural Collapse. CVPR 2023
L3DETR-ScanNet_2000.336 80.533 110.279 60.155 100.508 50.073 110.101 170.000 30.058 60.000 10.294 140.233 140.548 40.927 10.788 100.264 20.463 110.000 30.638 120.098 130.014 70.411 120.226 130.525 100.225 90.010 70.397 60.000 10.000 30.192 60.380 140.598 60.000 10.117 60.000 30.883 60.082 80.689 40.000 70.032 170.549 60.417 40.910 50.000 10.000 20.448 80.613 90.000 100.697 70.960 30.759 40.158 20.293 30.883 70.000 10.312 30.583 40.079 40.422 110.068 170.660 70.418 70.298 120.430 120.114 110.526 50.776 30.051 30.679 30.946 60.152 70.000 10.183 80.000 150.211 80.511 100.409 160.565 120.355 80.448 80.512 50.557 30.000 40.000 10.420 90.000 110.007 170.104 60.000 40.125 170.330 30.514 150.146 120.321 130.860 80.174 110.000 10.629 60.075 140.000 140.000 40.000 10.002 100.671 80.712 70.141 60.339 120.856 40.261 120.529 100.067 100.835 60.000 60.369 120.000 10.259 20.000 30.629 60.000 50.487 10.000 10.579 110.646 40.107 170.720 110.122 70.000 10.333 140.505 100.303 90.908 30.503 130.565 20.074 80.324 10.000 10.740 80.661 110.109 130.000 100.427 130.563 170.000 10.579 110.108 80.000 70.000 10.664 60.000 20.000 10.641 70.539 110.416 70.515 20.256 110.940 120.312 60.209 170.620 30.138 110.636 110.000 10.000 120.775 130.861 50.765 120.000 10.801 90.119 110.860 80.000 10.687 20.001 140.192 140.679 90.699 1
Yanmin Wu, Qiankun Gao, Renrui Zhang, Jian Zhang: Language-Assisted 3D Scene Understanding. arXiv23.12
IMFSegNet0.334 90.532 130.251 110.179 70.486 90.041 160.139 130.003 10.283 40.000 10.274 150.191 150.457 140.704 140.795 70.197 90.830 60.000 30.710 90.055 160.064 40.518 60.305 100.458 170.216 120.027 50.284 130.000 10.000 30.044 120.406 100.561 70.000 10.080 120.000 30.873 90.021 150.683 80.000 70.076 90.494 100.363 90.648 160.000 10.000 20.425 90.649 40.000 100.668 120.908 70.740 110.010 140.206 80.862 100.000 10.000 110.560 90.000 70.359 130.237 110.631 120.408 110.411 40.322 150.246 40.439 100.599 130.047 40.213 70.940 100.139 110.000 10.369 50.124 100.188 120.495 110.624 110.626 80.320 140.595 40.495 80.496 100.000 40.000 10.340 120.014 60.032 70.135 50.000 40.903 80.277 60.612 80.196 70.344 120.848 130.260 60.000 10.574 130.073 160.062 40.000 40.000 10.091 60.839 30.776 30.123 120.392 90.756 120.274 50.518 120.029 160.842 40.000 60.357 130.000 10.035 70.000 30.444 120.793 20.245 50.000 10.512 160.512 150.159 150.713 130.000 100.000 10.336 130.484 120.569 20.852 90.615 60.120 120.068 100.228 80.000 10.733 100.773 20.190 40.000 100.608 60.792 40.000 10.597 70.000 140.025 20.000 10.573 170.000 20.000 10.508 110.555 80.363 100.139 120.610 20.947 80.305 70.594 90.527 90.009 170.633 130.000 10.060 30.820 50.604 150.799 90.000 10.799 110.034 140.784 130.000 10.618 60.424 20.134 160.646 130.214 14
GSTran0.334 100.533 120.250 120.179 80.487 80.041 160.139 130.003 10.273 50.000 10.273 160.189 160.465 120.704 140.794 80.198 80.831 50.000 30.712 80.055 160.063 50.518 60.306 90.459 160.217 100.028 40.282 140.000 10.000 30.044 120.405 110.558 80.000 10.080 120.000 30.873 90.020 160.684 70.000 70.075 120.496 90.363 90.651 150.000 10.000 20.425 90.648 50.000 100.669 110.914 60.741 100.009 150.200 90.864 90.000 10.000 110.560 90.000 70.357 140.233 120.633 110.408 110.411 40.320 160.242 50.440 90.598 140.047 40.205 80.940 100.139 110.000 10.372 40.138 90.191 100.495 110.618 130.624 90.321 120.595 40.496 70.499 80.000 40.000 10.340 120.014 60.032 70.136 40.000 40.903 80.279 50.601 90.198 50.345 110.849 110.260 60.000 10.573 140.072 170.060 50.000 40.000 10.089 70.838 40.775 40.125 110.381 110.752 130.274 50.517 130.032 150.841 50.000 60.354 140.000 10.047 60.000 30.439 130.787 30.252 40.000 10.512 160.507 160.158 160.717 120.000 100.000 10.337 120.483 130.570 10.853 80.614 70.121 110.070 90.229 70.000 10.732 110.773 20.193 30.000 100.606 70.791 50.000 10.593 90.000 140.010 50.000 10.574 160.000 20.000 10.507 120.554 90.361 110.136 130.608 30.948 70.304 80.593 100.533 80.011 160.634 120.000 10.060 30.821 40.613 130.797 100.000 10.799 110.036 130.782 140.000 10.609 70.423 30.133 170.647 120.213 15
OA-CNN-L_ScanNet2000.333 110.558 50.269 90.124 130.448 140.080 90.272 50.000 30.000 70.000 10.342 80.515 40.524 70.713 130.789 90.158 120.384 120.000 30.806 60.125 70.000 90.496 80.332 70.498 140.227 80.024 60.474 30.000 10.003 20.071 90.487 30.000 110.000 10.110 80.000 30.876 70.013 170.703 30.000 70.076 90.473 120.355 110.906 60.000 10.000 20.476 60.706 10.000 100.672 100.835 130.748 90.015 130.223 70.860 110.000 10.000 110.572 70.000 70.509 70.313 70.662 40.398 130.396 80.411 130.276 20.527 40.711 50.000 70.076 130.946 60.166 60.000 10.022 100.160 70.183 130.493 130.699 90.637 60.403 60.330 120.406 130.526 60.024 20.000 10.392 110.000 110.016 160.000 120.196 30.915 50.112 120.557 100.197 60.352 100.877 30.000 120.000 10.592 120.103 110.000 140.067 10.000 10.089 70.735 70.625 110.130 90.568 60.836 70.271 80.534 90.043 130.799 110.001 50.445 50.000 10.000 80.024 20.661 40.000 50.262 30.000 10.591 80.517 130.373 80.788 70.021 80.000 10.455 40.517 90.320 80.823 120.200 160.001 170.150 50.100 120.000 10.736 90.668 100.103 140.052 60.662 40.720 80.000 10.602 60.112 70.002 60.000 10.637 90.000 20.000 10.621 100.569 50.398 90.412 50.234 120.949 60.363 50.492 140.495 110.251 40.665 90.000 10.001 110.805 70.833 60.794 110.000 10.821 50.314 50.843 110.000 10.560 100.245 70.262 60.713 40.370 11
PPT-SpUNet-F.T.0.332 120.556 60.270 70.123 140.519 40.091 70.349 40.000 30.000 70.000 10.339 90.383 100.498 100.833 40.807 40.241 40.584 90.000 30.755 70.124 80.000 90.608 30.330 80.530 90.314 20.000 90.374 80.000 10.000 30.197 50.459 70.000 110.000 10.117 60.000 30.876 70.095 20.682 90.000 70.086 80.518 70.433 20.930 40.000 10.000 20.563 30.542 140.077 70.715 40.858 110.756 50.008 160.171 120.874 80.000 10.039 70.550 110.000 70.545 50.256 80.657 80.453 40.351 100.449 110.213 60.392 120.611 110.000 70.037 150.946 60.138 130.000 10.000 120.063 110.308 20.537 80.796 50.673 40.323 110.392 100.400 140.509 70.000 40.000 10.649 10.000 110.023 120.000 120.000 40.914 60.002 160.506 160.163 110.359 80.872 50.000 120.000 10.623 70.112 60.001 120.000 40.000 10.021 90.753 50.565 150.150 40.579 40.806 90.267 90.616 40.042 140.783 130.000 60.374 110.000 10.000 80.000 30.620 80.000 50.000 90.000 10.572 130.634 50.350 90.792 50.000 100.000 10.376 90.535 60.378 60.855 70.672 30.074 130.000 110.185 100.000 10.727 120.660 120.076 170.000 100.432 120.646 100.000 10.594 80.006 130.000 70.000 10.658 70.000 20.000 10.661 40.549 100.300 140.291 80.045 140.942 110.304 80.600 80.572 70.135 120.695 50.000 10.008 90.793 90.942 20.899 20.000 10.816 60.181 70.897 20.000 10.679 40.223 80.264 50.691 50.345 12
Xiaoyang Wu, Zhuotao Tian, Xin Wen, Bohao Peng, Xihui Liu, Kaicheng Yu, Hengshuang Zhao: Towards Large-scale 3D Representation Learning with Multi-dataset Point Prompt Training. CVPR 2024
OctFormer ScanNet200permissive0.326 130.539 100.265 100.131 120.499 60.110 40.522 30.000 30.000 70.000 10.318 110.427 70.455 150.743 110.765 130.175 110.842 40.000 30.828 50.204 40.033 60.429 110.335 60.601 20.312 30.000 90.357 100.000 10.000 30.047 110.423 90.000 110.000 10.105 90.000 30.873 90.079 90.670 120.000 70.117 50.471 130.432 30.829 110.000 10.000 20.584 20.417 170.089 60.684 90.837 120.705 160.021 120.178 110.892 60.000 10.028 80.505 130.000 70.457 90.200 140.662 40.412 90.244 150.496 80.000 170.451 80.626 90.000 70.102 110.943 90.138 130.000 10.000 120.149 80.291 30.534 90.722 70.632 70.331 100.253 140.453 110.487 110.000 40.000 10.479 60.000 110.022 130.000 120.000 40.900 100.128 110.684 30.164 100.413 40.854 100.000 120.000 10.512 160.074 150.003 110.000 40.000 10.000 110.469 150.613 120.132 80.529 70.871 30.227 160.582 70.026 170.787 120.000 60.339 150.000 10.000 80.000 30.626 70.000 50.029 80.000 10.587 90.612 80.411 70.724 100.000 100.000 10.407 60.552 50.513 30.849 100.655 40.408 40.000 110.296 20.000 10.686 150.645 140.145 80.022 80.414 140.633 110.000 10.637 20.224 30.000 70.000 10.650 80.000 20.000 10.622 90.535 120.343 120.483 30.230 130.943 100.289 100.618 70.596 50.140 80.679 80.000 10.022 60.783 110.620 120.906 10.000 10.806 80.137 100.865 50.000 10.378 120.000 150.168 150.680 80.227 13
Peng-Shuai Wang: OctFormer: Octree-based Transformers for 3D Point Clouds. SIGGRAPH 2023
AWCS0.305 140.508 140.225 140.142 110.463 130.063 130.195 90.000 30.000 70.000 10.467 30.551 30.504 80.773 60.764 140.142 130.029 170.000 30.626 130.100 110.000 90.360 130.179 150.507 130.137 150.006 80.300 120.000 10.000 30.172 80.364 150.512 90.000 10.056 140.000 30.865 130.093 40.634 170.000 70.071 130.396 140.296 160.876 90.000 10.000 20.373 130.436 160.063 90.749 20.877 100.721 120.131 30.124 140.804 150.000 10.000 110.515 120.010 60.452 100.252 90.578 140.417 80.179 170.484 100.171 70.337 140.606 120.000 70.115 100.937 140.142 90.000 10.008 110.000 150.157 160.484 140.402 170.501 150.339 90.553 70.529 30.478 120.000 40.000 10.404 100.001 100.022 130.077 90.000 40.894 120.219 70.628 70.093 150.305 140.886 10.233 90.000 10.603 90.112 60.023 90.000 40.000 10.000 110.741 60.664 80.097 150.253 140.782 100.264 110.523 110.154 20.707 160.000 60.411 80.000 10.000 80.000 30.332 160.000 50.000 90.000 10.602 70.595 100.185 130.656 160.159 60.000 10.355 110.424 150.154 150.729 150.516 100.220 100.620 30.084 130.000 10.707 140.651 130.173 50.014 90.381 170.582 140.000 10.619 30.049 120.000 70.000 10.702 40.000 20.000 10.302 160.489 150.317 130.334 70.392 70.922 140.254 130.533 130.394 130.129 140.613 150.000 10.000 120.820 50.649 110.749 130.000 10.782 140.282 60.863 60.000 10.288 150.006 120.220 110.633 140.542 3
: Long-Tailed 3D Semantic Segmentation with Adaptive Weight Constraint and Sampling. ICRA 2024
LGroundpermissive0.272 150.485 150.184 150.106 150.476 110.077 100.218 80.000 30.000 70.000 10.547 20.295 110.540 50.746 100.745 150.058 160.112 160.005 10.658 110.077 150.000 90.322 140.178 160.512 110.190 130.199 20.277 150.000 10.000 30.173 70.399 120.000 110.000 10.039 160.000 30.858 140.085 70.676 110.002 50.103 60.498 80.323 140.703 120.000 10.000 20.296 150.549 120.216 10.702 60.768 140.718 140.028 100.092 160.786 160.000 10.000 110.453 160.022 50.251 170.252 90.572 150.348 140.321 110.514 70.063 150.279 160.552 150.000 70.019 160.932 150.132 150.000 10.000 120.000 150.156 170.457 150.623 120.518 140.265 160.358 110.381 150.395 140.000 40.000 10.127 170.012 80.051 10.000 120.000 40.886 130.014 140.437 170.179 80.244 150.826 150.000 120.000 10.599 100.136 10.085 30.000 40.000 10.000 110.565 130.612 130.143 50.207 150.566 140.232 150.446 150.127 40.708 150.000 60.384 90.000 10.000 80.000 30.402 140.000 50.059 70.000 10.525 150.566 110.229 120.659 150.000 100.000 10.265 150.446 140.147 160.720 170.597 80.066 140.000 110.187 90.000 10.726 130.467 170.134 120.000 100.413 150.629 120.000 10.363 160.055 100.022 30.000 10.626 110.000 20.000 10.323 150.479 170.154 160.117 150.028 160.901 150.243 150.415 160.295 170.143 60.610 160.000 10.000 120.777 120.397 170.324 160.000 10.778 150.179 80.702 160.000 10.274 160.404 40.233 100.622 150.398 7
David Rozenberszki, Or Litany, Angela Dai: Language-Grounded Indoor 3D Semantic Segmentation in the Wild. arXiv
Minkowski 34Dpermissive0.253 160.463 160.154 170.102 160.381 160.084 80.134 150.000 30.000 70.000 10.386 70.141 170.279 170.737 120.703 160.014 170.164 150.000 30.663 100.092 140.000 90.224 150.291 110.531 80.056 170.000 90.242 160.000 10.000 30.013 150.331 160.000 110.000 10.035 170.001 20.858 140.059 140.650 160.000 70.056 140.353 150.299 150.670 130.000 10.000 20.284 160.484 150.071 80.594 150.720 160.710 150.027 110.068 170.813 140.000 10.005 100.492 140.164 10.274 160.111 160.571 160.307 170.293 130.307 170.150 90.163 170.531 160.002 60.545 50.932 150.093 170.000 10.000 120.002 140.159 150.368 170.581 150.440 170.228 170.406 90.282 170.294 160.000 40.000 10.189 160.060 20.036 50.000 120.000 40.897 110.000 170.525 140.025 170.205 170.771 170.000 120.000 10.593 110.108 90.044 60.000 40.000 10.000 110.282 170.589 140.094 160.169 160.466 170.227 160.419 170.125 50.757 140.002 40.334 160.000 10.000 80.000 30.357 150.000 50.000 90.000 10.582 100.513 140.337 100.612 170.000 100.000 10.250 160.352 170.136 170.724 160.655 40.280 80.000 110.046 160.000 10.606 170.559 150.159 70.102 30.445 100.655 90.000 10.310 170.117 60.000 70.000 10.581 150.026 10.000 10.265 170.483 160.084 170.097 170.044 150.865 170.142 170.588 110.351 150.272 20.596 170.000 10.003 100.622 160.720 100.096 170.000 10.771 160.016 150.772 150.000 10.302 140.194 90.214 120.621 160.197 16
C. Choy, J. Gwak, S. Savarese: 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks. CVPR 2019
CSC-Pretrainpermissive0.249 170.455 170.171 160.079 170.418 150.059 140.186 100.000 30.000 70.000 10.335 100.250 130.316 160.766 70.697 170.142 130.170 140.003 20.553 140.112 90.097 10.201 160.186 140.476 150.081 160.000 90.216 170.000 10.000 30.001 170.314 170.000 110.000 10.055 150.000 30.832 160.094 30.659 150.002 50.076 90.310 160.293 170.664 140.000 10.000 20.175 170.634 60.130 20.552 170.686 170.700 170.076 70.110 150.770 170.000 10.000 110.430 170.000 70.319 150.166 150.542 170.327 160.205 160.332 140.052 160.375 130.444 170.000 70.012 170.930 170.203 30.000 10.000 120.046 120.175 140.413 160.592 140.471 160.299 150.152 160.340 160.247 170.000 40.000 10.225 150.058 30.037 40.000 120.207 20.862 150.014 140.548 130.033 160.233 160.816 160.000 120.000 10.542 150.123 50.121 10.019 20.000 10.000 110.463 160.454 170.045 170.128 170.557 150.235 140.441 160.063 110.484 170.000 60.308 170.000 10.000 80.000 30.318 170.000 50.000 90.000 10.545 140.543 120.164 140.734 90.000 100.000 10.215 170.371 160.198 140.743 140.205 150.062 150.000 110.079 140.000 10.683 160.547 160.142 90.000 100.441 110.579 150.000 10.464 140.098 90.041 10.000 10.590 140.000 20.000 10.373 130.494 140.174 150.105 160.001 170.895 160.222 160.537 120.307 160.180 50.625 140.000 10.000 120.591 170.609 140.398 150.000 10.766 170.014 160.638 170.000 10.377 130.004 130.206 130.609 170.465 5
Ji Hou, Benjamin Graham, Matthias Nießner, Saining Xie: Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts. CVPR 2021


This table lists the benchmark results for the ScanNet200 3D semantic instance scenario.
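The "ap 50%" columns refer to average precision with an IoU threshold of 0.5 between predicted and ground-truth instances. The following is a minimal single-class sketch of that idea, with masks represented as sets of point indices. The function names and the toy data are hypothetical, and the official evaluation may apply additional rules that are not reproduced here.

```python
def mask_iou(a, b):
    """IoU of two instance masks given as collections of point indices."""
    a, b = set(a), set(b)
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def average_precision(preds, gts, iou_thresh=0.5):
    """AP for one class. preds: list of (confidence, mask); gts: list of masks."""
    if not gts:
        return float("nan")
    matched, hits = set(), []
    # Visit predictions from most to least confident, matching each to the
    # best still-unmatched ground-truth instance.
    for _, mask in sorted(preds, key=lambda p: -p[0]):
        best_iou, best_j = 0.0, None
        for j, gt in enumerate(gts):
            if j in matched:
                continue
            iou = mask_iou(mask, gt)
            if iou > best_iou:
                best_iou, best_j = iou, j
        if best_j is not None and best_iou >= iou_thresh:
            matched.add(best_j)
            hits.append(True)
        else:
            hits.append(False)
    # Precision and recall after each prediction.
    tp, precisions, recalls = 0, [], []
    for k, hit in enumerate(hits, start=1):
        tp += hit
        precisions.append(tp / k)
        recalls.append(tp / len(gts))
    # Make precision non-increasing, then integrate it over recall.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


# Two ground-truth instances and three predictions (hypothetical data).
gts = [{0, 1, 2, 3}, {10, 11, 12, 13}]
preds = [(0.9, {0, 1, 2}), (0.8, {20, 21}), (0.7, {10, 11, 12, 13, 14})]
ap = average_precision(preds, gts)
print(ap)
```

Averaging this value over classes, or over the classes of one group, would give aggregate figures of the kind shown in the first columns.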




Each cell gives the score followed by the method's rank in that column.
Method | Info | avg ap 50% | head ap 50% | common ap 50% | tail ap 50% | alarm clock | armchair | backpack | bag | ball | bar | basket | bathroom cabinet | bathroom counter | bathroom stall | bathroom stall door | bathroom vanity | bathtub | bed | bench | bicycle | bin | blackboard | blanket | blinds | board | book | bookshelf | bottle | bowl | box | broom | bucket | bulletin board | cabinet | calendar | candle | cart | case of water bottles | cd case | ceiling | ceiling light | chair | clock | closet | closet door | closet rod | closet wall | clothes | clothes dryer | coat rack | coffee kettle | coffee maker | coffee table | column | computer tower | container | copier | couch | counter | crate | cup | curtain | cushion | decoration | desk | dining table | dish rack | dishwasher | divider | door | doorframe | dresser | dumbbell | dustpan | end table | fan | file cabinet | fire alarm | fire extinguisher | fireplace | folded chair | furniture | guitar | guitar case | hair dryer | handicap bar | hat | headphones | ironing board | jacket | keyboard | keyboard piano | kitchen cabinet | kitchen counter | ladder | lamp | laptop | laundry basket | laundry detergent | laundry hamper | ledge | light | light switch | luggage | machine | mailbox | mat | mattress | microwave | mini fridge | mirror | monitor | mouse | music stand | nightstand | object | office chair | ottoman | oven | paper | paper bag | paper cutter | paper towel dispenser | paper towel roll | person | piano | picture | pillar | pillow | pipe | plant | plate | plunger | poster | potted plant | power outlet | power strip | printer | projector | projector screen | purse | rack | radiator | rail | range hood | recycling bin | refrigerator | scale | seat | shelf | shoe | shower | shower curtain | shower curtain rod | shower door | shower floor | shower head | shower wall | sign | sink | soap dish | soap dispenser | sofa chair | speaker | stair rail | stairs | stand | stool | storage bin | storage container | storage organizer | stove | structure | stuffed animal | suitcase | table | telephone | tissue box | toaster | toaster oven | toilet | toilet paper | toilet paper dispenser | toilet paper holder | toilet seat cover dispenser | towel | trash bin | trash can | tray | tube | tv | tv stand | vacuum cleaner | vent | wardrobe | washing machine | water bottle | water cooler | water pitcher | whiteboard | window | windowsill
Mask3D Scannet2000.388 10.542 10.357 20.237 20.610 10.091 20.125 60.000 10.000 20.000 10.065 30.668 10.451 11.000 10.955 10.640 10.500 20.039 10.125 30.063 30.409 10.311 20.291 10.609 40.266 20.000 10.163 20.000 10.008 10.044 30.496 21.000 10.000 10.018 30.000 20.756 10.573 10.808 20.000 20.010 20.042 40.130 40.552 20.042 10.000 11.000 10.725 40.750 10.883 11.000 10.832 40.024 30.107 20.614 30.226 10.250 10.628 20.792 10.677 30.400 10.741 20.278 20.511 20.077 60.111 20.313 30.715 20.302 10.017 40.200 20.000 10.188 10.000 20.178 30.736 21.000 10.615 10.514 10.409 20.380 60.600 10.000 10.000 10.400 20.013 20.254 10.381 10.000 10.123 50.400 10.839 20.258 20.463 10.926 10.265 20.000 10.857 20.099 10.021 20.500 10.027 10.028 21.000 10.502 60.016 20.076 50.500 10.612 10.578 10.005 30.597 30.194 20.497 10.000 10.500 10.000 20.323 50.000 11.000 10.000 10.748 10.708 20.050 50.890 21.000 10.008 20.151 40.301 21.000 11.000 10.792 30.945 11.000 10.511 10.004 20.753 10.776 30.287 20.020 20.003 50.974 30.033 10.412 60.000 20.000 20.000 20.667 20.000 10.000 10.491 20.676 20.352 20.335 10.060 30.822 60.527 31.000 10.517 20.606 10.853 20.000 10.004 10.806 11.000 10.727 10.000 10.042 20.739 20.000 10.399 30.391 10.504 20.591 10.571 1
Jonas Schult, Francis Engelmann, Alexander Hermans, Or Litany, Siyu Tang, Bastian Leibe: Mask3D for 3D Semantic Instance Segmentation. ICRA 2023
ODIN - Ins200permissive0.381 20.507 20.375 10.237 10.484 40.108 10.500 10.000 10.125 10.000 10.058 40.647 20.385 20.667 40.853 20.542 31.000 10.000 31.000 10.093 20.000 20.028 50.274 20.682 20.550 10.000 10.269 10.000 10.000 20.714 10.566 11.000 10.000 10.500 10.125 10.585 30.066 30.653 60.083 10.049 10.264 10.227 10.667 10.000 20.000 10.278 60.723 50.250 30.786 51.000 10.744 60.039 20.209 10.494 50.000 30.250 10.446 30.500 30.750 10.200 30.780 10.333 10.602 10.469 30.163 10.406 20.530 40.000 20.668 10.200 20.000 10.000 30.500 10.313 10.769 11.000 10.511 20.196 20.286 30.393 50.337 20.000 10.000 10.600 10.000 30.174 30.226 20.000 10.579 20.200 30.887 10.750 10.428 20.782 30.438 10.000 10.795 30.063 30.003 30.500 10.000 20.333 11.000 10.742 20.083 10.585 10.417 40.448 60.496 20.055 20.734 10.472 10.174 50.000 10.250 30.000 20.688 10.000 11.000 10.000 10.631 30.667 30.275 10.694 61.000 10.000 30.328 10.422 10.000 51.000 10.500 40.638 30.000 20.391 30.000 30.582 30.800 10.208 50.000 30.246 20.667 50.000 30.638 10.167 10.000 20.000 20.778 10.000 10.000 10.563 10.614 30.841 10.333 20.250 20.938 50.569 10.500 40.695 10.264 40.863 10.000 10.000 20.550 51.000 10.668 20.000 10.000 30.667 30.000 10.333 40.333 20.665 10.434 30.264 2
TD3D Scannet200permissive0.320 30.501 30.264 30.164 30.506 30.062 30.500 10.000 10.000 20.000 10.208 10.431 30.252 41.000 10.733 40.587 20.000 30.008 20.000 40.106 10.000 20.356 10.123 50.686 10.101 30.000 10.152 30.000 10.000 20.226 20.280 40.000 30.000 10.250 20.000 20.619 20.061 40.841 10.000 20.000 30.167 20.194 20.333 30.000 20.000 10.667 20.820 10.250 30.790 41.000 10.879 20.077 10.094 40.708 10.217 20.049 30.634 10.792 10.331 50.033 60.716 30.159 30.396 30.331 50.099 30.415 10.842 10.000 20.458 20.542 10.000 10.101 20.000 20.218 20.513 30.500 30.458 30.104 30.516 10.456 10.268 50.000 10.000 10.400 20.022 10.233 20.143 30.000 10.677 10.400 10.504 60.095 40.083 60.890 20.061 30.000 10.906 10.076 20.231 10.125 30.000 20.003 30.792 40.881 10.000 30.098 40.125 50.498 50.459 30.063 10.715 20.000 30.241 40.000 10.396 20.063 10.605 20.000 10.000 30.000 10.448 60.629 40.202 30.967 10.250 30.038 10.192 20.185 30.083 41.000 11.000 10.857 20.000 20.470 20.012 10.565 40.798 20.621 10.111 10.500 11.000 10.017 20.509 20.000 20.008 11.000 10.525 30.000 10.000 10.332 40.679 10.264 30.333 20.267 11.000 10.549 20.299 60.387 30.328 30.744 50.000 10.000 20.435 61.000 10.283 50.000 10.196 10.817 10.000 10.472 10.222 40.123 50.560 20.156 3
Maksim Kolodiazhnyi, Anna Vorontsova, Anton Konushin, Danila Rukhovich: Top-Down Beats Bottom-Up in 3D Instance Segmentation. WACV 2024
LGround Inst.permissive0.246 40.413 40.170 40.130 40.455 60.003 60.500 10.000 10.000 20.000 10.017 50.333 50.111 61.000 10.681 50.400 40.000 30.000 31.000 10.003 60.000 20.167 30.190 30.637 30.067 40.000 10.081 40.000 10.000 20.000 40.264 50.000 30.000 10.000 40.000 20.387 50.031 60.754 30.000 20.000 30.151 30.135 30.056 50.000 20.000 10.582 40.589 60.500 20.815 21.000 10.903 10.000 40.097 30.588 40.000 30.000 40.234 40.000 40.500 40.400 10.682 50.156 40.159 50.750 10.046 40.125 50.660 30.000 20.200 30.000 60.000 10.000 30.000 20.164 40.402 40.500 30.373 40.025 40.143 60.426 30.317 30.000 10.000 10.000 40.000 30.063 40.000 40.000 10.000 60.000 50.575 40.250 30.241 30.772 40.000 40.000 10.653 50.034 40.000 40.000 40.000 20.000 41.000 10.561 50.000 30.100 30.500 10.541 40.452 40.000 40.581 40.000 30.364 20.000 10.000 40.000 20.571 30.000 10.000 30.000 10.568 50.511 50.167 40.857 30.000 40.000 30.164 30.112 40.000 50.530 61.000 10.286 40.000 20.125 40.000 30.464 60.706 40.208 40.000 30.125 30.744 40.000 30.500 30.000 20.000 20.000 20.511 40.000 10.000 10.344 30.541 40.068 40.333 20.000 41.000 10.196 50.533 30.318 40.000 50.748 40.000 10.000 20.690 21.000 10.400 40.000 10.000 30.667 30.000 10.333 40.333 20.270 40.399 40.083 5
David Rozenberszki, Or Litany, Angela Dai: Language-Grounded Indoor 3D Semantic Segmentation in the Wild.
CSC-Pretrain Inst.permissive0.209 50.361 60.157 50.085 50.506 20.007 40.500 10.000 10.000 20.000 10.000 60.093 60.221 50.667 40.524 60.400 40.000 30.000 30.000 40.004 50.000 20.000 60.109 60.589 50.000 50.000 10.059 60.000 10.000 20.000 40.322 30.000 30.000 10.000 40.000 20.405 40.055 50.700 50.000 20.000 30.028 50.091 60.083 40.000 20.000 10.667 20.768 20.000 50.807 31.000 10.776 50.000 40.000 60.340 60.000 30.000 40.103 60.000 40.750 10.200 30.634 60.053 60.246 40.677 20.006 60.198 40.432 50.000 20.000 50.050 50.000 10.000 30.000 20.111 60.356 50.500 30.188 60.000 50.220 50.448 20.050 60.000 10.000 10.000 40.000 30.032 60.000 40.000 10.396 30.000 50.573 50.000 60.228 40.747 50.000 40.000 10.573 60.021 60.000 40.000 40.000 20.000 40.500 50.573 40.000 30.000 60.125 50.592 30.364 60.000 40.450 60.000 30.364 20.000 10.000 40.000 20.340 40.000 10.000 30.000 10.610 40.833 10.221 20.702 50.000 40.000 30.135 60.094 50.125 20.571 50.500 40.143 60.000 20.125 40.000 30.618 20.667 50.115 60.000 30.125 31.000 10.000 30.500 30.000 20.000 20.000 20.502 50.000 10.000 10.312 50.248 60.050 50.000 60.000 40.997 30.420 40.500 40.149 60.451 20.748 30.000 10.000 20.636 30.667 60.600 30.000 10.000 30.278 60.000 10.333 40.000 60.294 30.381 60.110 4
Ji Hou, Benjamin Graham, Matthias Nießner, Saining Xie: Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts. CVPR 2021
Minkowski 34D Inst.permissive0.203 60.369 50.134 60.078 60.479 50.003 50.500 10.000 10.000 20.000 10.100 20.371 40.300 30.667 40.746 30.400 40.000 30.000 30.000 40.031 40.000 20.074 40.165 40.413 60.000 50.000 10.070 50.000 10.000 20.000 40.221 60.000 30.000 10.000 40.000 20.372 60.070 20.706 40.000 20.000 30.000 60.123 50.033 60.000 20.000 10.422 50.732 30.000 50.778 61.000 10.845 30.000 40.090 50.636 20.000 30.000 40.158 50.000 40.250 60.050 50.693 40.123 50.051 60.385 40.009 50.118 60.406 60.000 20.000 50.200 20.000 10.000 30.000 20.133 50.307 60.500 30.251 50.000 50.281 40.402 40.317 30.000 10.000 10.000 40.000 30.060 50.000 40.000 10.396 30.200 30.669 30.021 50.218 50.720 60.000 40.000 10.696 40.025 50.000 40.000 40.000 20.000 40.125 60.596 30.000 30.191 20.500 10.595 20.369 50.000 40.500 50.000 30.143 60.000 10.000 40.000 20.226 60.000 10.000 30.000 10.701 20.511 50.000 60.851 40.000 40.000 30.150 50.052 60.100 30.981 40.500 40.286 40.000 20.000 60.000 30.545 50.522 60.250 30.000 30.000 60.522 60.000 30.500 30.000 20.000 20.000 20.282 60.000 10.000 10.178 60.382 50.018 60.056 50.000 40.997 30.107 60.677 20.313 50.000 50.726 60.000 10.000 20.583 40.903 50.200 60.000 10.000 30.333 50.000 10.442 20.083 50.109 60.387 50.000 6
C. Choy, J. Gwak, S. Savarese: 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks. CVPR 2019
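The rows in these tables lost their cell boundaries when the page was flattened to text: each cell is a three-decimal score followed by a space and an integer rank, and the rank runs straight into the next score. The following is a minimal sketch for splitting such a row back into cells, assuming every score has exactly one digit before the decimal point and three after. `split_row` and `CELL` are illustrative names, not part of any ScanNet tooling. The returned label still carries any licence tag (such as `permissive`) that is fused to the method name.

```python
import re

# One cell: a score with three decimals, a space, then an integer rank.
# The rank is matched lazily so that it stops where the next score begins
# (or at the end of the row).
CELL = re.compile(r"(\d\.\d{3}) (\d+?)(?=\d\.\d{3} |$)")


def split_row(row):
    """Split a flattened leaderboard row into (label, [(score, rank), ...])."""
    row = row.strip()
    # The first "d.ddd <digit>" pattern marks where the numeric cells start.
    first = re.search(r"\d\.\d{3} \d", row)
    if first is None:
        return row, []
    label = row[:first.start()].strip()
    cells = [(float(s), int(r)) for s, r in CELL.findall(row[first.start():])]
    return label, cells


label, cells = split_row("DiffSeg3D20.745 280.725 780.814 20")
print(label)   # DiffSeg3D2
print(cells)   # [(0.745, 28), (0.725, 78), (0.814, 20)]
```

Because a score always starts with a single digit, a fused run such as `11.000` can only be read as rank 1 followed by score 1.000, so the split is unambiguous under the stated assumption.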


ScanNet Benchmark

This table lists the benchmark results for the 3D semantic label scenario.
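For readers unfamiliar with the metric, here is a minimal sketch of per-class intersection over union, assuming the usual definition IoU = TP / (TP + FP + FN) and an "avg iou" column that is the unweighted mean over classes. The function names and the flat label lists are illustrative; the official evaluation script may differ, for example in how it handles unlabeled points.

```python
def per_class_iou(gt, pred, num_classes):
    """Return IoU per class, or None for classes absent from both gt and pred."""
    tp = [0] * num_classes
    fp = [0] * num_classes
    fn = [0] * num_classes
    for g, p in zip(gt, pred):
        if g == p:
            tp[g] += 1
        else:
            fn[g] += 1  # ground-truth class g was missed
            fp[p] += 1  # class p was predicted wrongly
    ious = []
    for c in range(num_classes):
        denom = tp[c] + fp[c] + fn[c]
        ious.append(tp[c] / denom if denom else None)
    return ious


def mean_iou(ious):
    """Unweighted mean over the classes that have a defined IoU."""
    valid = [v for v in ious if v is not None]
    return sum(valid) / len(valid) if valid else 0.0


# Toy example with two classes and four labeled points.
ious = per_class_iou(gt=[0, 0, 1, 1], pred=[0, 1, 1, 1], num_classes=2)
print(ious)            # class 0: 1/2, class 1: 2/3
print(mean_iou(ious))
```

Averaging per class rather than per point means a rare class counts as much as a frequent one.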


Method | Info | avg iou | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | floor | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | wall | window
PTv3-PPT-ALCcopyleft0.798 10.911 110.812 220.854 80.770 120.856 150.555 170.943 10.660 260.735 20.979 10.606 70.492 10.792 40.934 40.841 20.819 60.716 90.947 100.906 10.822 1
Guangda Ji, Silvan Weder, Francis Engelmann, Marc Pollefeys, Hermann Blum: ARKit LabelMaker: A New Scale for Indoor 3D Scene Understanding. arXiv
DITR ScanNet0.797 20.727 760.869 10.882 10.785 60.868 70.578 50.943 10.744 10.727 30.979 10.627 20.364 90.824 10.949 20.779 150.844 10.757 10.982 10.905 20.802 3
Karim Abou Zeid, Kadir Yilmaz, Daan de Geus, Alexander Hermans, David Adrian, Timm Linder, Bastian Leibe: DINO in the Room: Leveraging 2D Foundation Models for 3D Segmentation.
PTv3 ScanNet0.794 30.941 30.813 210.851 110.782 70.890 20.597 10.916 60.696 110.713 50.979 10.635 10.384 30.793 30.907 100.821 50.790 360.696 140.967 40.903 30.805 2
Xiaoyang Wu, Li Jiang, Peng-Shuai Wang, Zhijian Liu, Xihui Liu, Yu Qiao, Wanli Ouyang, Tong He, Hengshuang Zhao: Point Transformer V3: Simpler, Faster, Stronger. CVPR 2024 (Oral)
PonderV20.785 40.978 10.800 300.833 290.788 40.853 200.545 210.910 90.713 30.705 60.979 10.596 90.390 20.769 150.832 450.821 50.792 350.730 20.975 20.897 60.785 7
Haoyi Zhu, Honghui Yang, Xiaoyang Wu, Di Huang, Sha Zhang, Xianglong He, Tong He, Hengshuang Zhao, Chunhua Shen, Yu Qiao, Wanli Ouyang: PonderV2: Pave the Way for 3D Foundation Model with A Universal Pre-training Paradigm.
Mix3Dpermissive0.781 50.964 20.855 20.843 200.781 80.858 130.575 80.831 380.685 170.714 40.979 10.594 100.310 300.801 20.892 190.841 20.819 60.723 60.940 150.887 80.725 28
Alexey Nekrasov, Jonas Schult, Or Litany, Bastian Leibe, Francis Engelmann: Mix3D: Out-of-Context Data Augmentation for 3D Scenes. 3DV 2021 (Oral)
Swin3Dpermissive0.779 60.861 230.818 160.836 260.790 30.875 40.576 70.905 100.704 70.739 10.969 120.611 30.349 120.756 250.958 10.702 510.805 190.708 100.916 390.898 50.801 4
TTT-KD0.773 70.646 970.818 160.809 410.774 100.878 30.581 30.943 10.687 150.704 70.978 60.607 60.336 190.775 110.912 80.838 40.823 40.694 150.967 40.899 40.794 6
Lisa Weijler, Muhammad Jehanzeb Mirza, Leon Sick, Can Ekkazan, Pedro Hermosilla: TTT-KD: Test-Time Training for 3D Semantic Segmentation through Knowledge Distillation from Foundation Models.
ResLFE_HDS0.772 80.939 40.824 70.854 80.771 110.840 350.564 130.900 120.686 160.677 140.961 180.537 360.348 130.769 150.903 120.785 130.815 90.676 260.939 160.880 130.772 11
PPT-SpUNet-Joint0.766 90.932 50.794 360.829 310.751 260.854 180.540 250.903 110.630 390.672 170.963 160.565 260.357 100.788 50.900 140.737 310.802 200.685 200.950 80.887 80.780 8
Xiaoyang Wu, Zhuotao Tian, Xin Wen, Bohao Peng, Xihui Liu, Kaicheng Yu, Hengshuang Zhao: Towards Large-scale 3D Representation Learning with Multi-dataset Point Prompt Training. CVPR 2024
OctFormerpermissive0.766 90.925 70.808 260.849 130.786 50.846 300.566 120.876 190.690 130.674 160.960 190.576 220.226 720.753 270.904 110.777 160.815 90.722 70.923 310.877 160.776 10
Peng-Shuai Wang: OctFormer: Octree-based Transformers for 3D Point Clouds. SIGGRAPH 2023
CU-Hybrid Net0.764 110.924 80.819 140.840 230.757 210.853 200.580 40.848 300.709 50.643 270.958 230.587 160.295 380.753 270.884 230.758 230.815 90.725 50.927 270.867 270.743 19
OccuSeg+Semantic0.764 110.758 610.796 340.839 240.746 300.907 10.562 140.850 290.680 190.672 170.978 60.610 40.335 210.777 90.819 490.847 10.830 30.691 170.972 30.885 100.727 26
O-CNNpermissive0.762 130.924 80.823 80.844 190.770 120.852 220.577 60.847 320.711 40.640 310.958 230.592 110.217 780.762 200.888 200.758 230.813 130.726 40.932 250.868 260.744 18
Peng-Shuai Wang, Yang Liu, Yu-Xiao Guo, Chun-Yu Sun, Xin Tong: O-CNN: Octree-based Convolutional Neural Networks for 3D Shape Analysis. SIGGRAPH 2017
DiffSegNet0.758 140.725 780.789 410.843 200.762 170.856 150.562 140.920 40.657 290.658 210.958 230.589 140.337 180.782 60.879 240.787 110.779 410.678 220.926 290.880 130.799 5
DTC0.757 150.843 290.820 120.847 160.791 20.862 110.511 380.870 220.707 60.652 230.954 400.604 80.279 480.760 210.942 30.734 320.766 500.701 130.884 610.874 220.736 20
OA-CNN-L_ScanNet200.756 160.783 470.826 60.858 60.776 90.837 390.548 200.896 150.649 310.675 150.962 170.586 170.335 210.771 140.802 540.770 190.787 380.691 170.936 200.880 130.761 13
ConDaFormer0.755 170.927 60.822 100.836 260.801 10.849 250.516 350.864 260.651 300.680 130.958 230.584 190.282 450.759 230.855 350.728 340.802 200.678 220.880 660.873 230.756 16
Lunhao Duan, Shanshan Zhao, Nan Xue, Mingming Gong, Guisong Xia, Dacheng Tao: ConDaFormer: Disassembled Transformer with Local Structure Enhancement for 3D Point Cloud Understanding. NeurIPS 2023
LSK3DNetpermissive0.755 170.899 160.823 80.843 200.764 160.838 380.584 20.845 330.717 20.638 330.956 300.580 210.229 710.640 480.900 140.750 260.813 130.729 30.920 350.872 240.757 14
Tuo Feng, Wenguan Wang, Fan Ma, Yi Yang: LSK3DNet: Towards Effective and Efficient 3D Perception with Large Sparse Kernels. CVPR 2024
PNE0.755 170.786 450.835 50.834 280.758 190.849 250.570 100.836 370.648 320.668 190.978 60.581 200.367 70.683 390.856 330.804 80.801 240.678 220.961 60.889 70.716 35
P. Hermosilla: Point Neighborhood Embeddings.
PointTransformerV20.752 200.742 680.809 250.872 20.758 190.860 120.552 180.891 170.610 460.687 80.960 190.559 300.304 330.766 180.926 60.767 200.797 280.644 380.942 130.876 190.722 31
Xiaoyang Wu, Yixing Lao, Li Jiang, Xihui Liu, Hengshuang Zhao: Point Transformer V2: Grouped Vector Attention and Partition-based Pooling. NeurIPS 2022
DMF-Net0.752 200.906 140.793 380.802 470.689 450.825 520.556 160.867 230.681 180.602 500.960 190.555 320.365 80.779 80.859 300.747 270.795 320.717 80.917 380.856 350.764 12
C.Yang, Y.Yan, W.Zhao, J.Ye, X.Yang, A.Hussain, B.Dong, K.Huang: Towards Deeper and Better Multi-view Feature Fusion for 3D Semantic Segmentation. ICONIP 2023
PointConvFormer0.749 220.793 430.790 390.807 430.750 280.856 150.524 310.881 180.588 580.642 300.977 100.591 120.274 510.781 70.929 50.804 80.796 290.642 390.947 100.885 100.715 36
Wenxuan Wu, Qi Shan, Li Fuxin: PointConvFormer: Revenge of the Point-based Convolution.
BPNetcopyleft0.749 220.909 120.818 160.811 390.752 240.839 370.485 530.842 340.673 210.644 260.957 280.528 420.305 320.773 120.859 300.788 100.818 80.693 160.916 390.856 350.723 30
Wenbo Hu, Hengshuang Zhao, Li Jiang, Jiaya Jia, Tien-Tsin Wong: Bidirectional Projection Network for Cross Dimension Scene Understanding. CVPR 2021 (Oral)
MSP0.748 240.623 1000.804 280.859 50.745 310.824 540.501 420.912 80.690 130.685 100.956 300.567 250.320 270.768 170.918 70.720 390.802 200.676 260.921 330.881 120.779 9
StratifiedFormerpermissive0.747 250.901 150.803 290.845 180.757 210.846 300.512 370.825 410.696 110.645 250.956 300.576 220.262 620.744 330.861 290.742 290.770 480.705 110.899 510.860 320.734 21
Xin Lai*, Jianhui Liu*, Li Jiang, Liwei Wang, Hengshuang Zhao, Shu Liu, Xiaojuan Qi, Jiaya Jia: Stratified Transformer for 3D Point Cloud Segmentation. CVPR 2022
VMNetpermissive0.746 260.870 210.838 30.858 60.729 360.850 240.501 420.874 200.587 590.658 210.956 300.564 270.299 350.765 190.900 140.716 420.812 150.631 440.939 160.858 330.709 37
Zeyu Hu, Xuyang Bai, Jiaxiang Shang, Runze Zhang, Jiayu Dong, Xin Wang, Guangyuan Sun, Hongbo Fu, Chiew-Lan Tai: VMNet: Voxel-Mesh Network for Geodesic-Aware 3D Semantic Segmentation. ICCV 2021 (Oral)
Virtual MVFusion0.746 260.771 550.819 140.848 150.702 430.865 100.397 900.899 130.699 90.664 200.948 620.588 150.330 230.746 320.851 390.764 210.796 290.704 120.935 210.866 280.728 24
Abhijit Kundu, Xiaoqi Yin, Alireza Fathi, David Ross, Brian Brewington, Thomas Funkhouser, Caroline Pantofaru: Virtual Multi-view Fusion for 3D Semantic Segmentation. ECCV 2020
DiffSeg3D20.745 280.725 780.814 200.837 250.751 260.831 460.514 360.896 150.674 200.684 110.960 190.564 270.303 340.773 120.820 480.713 450.798 270.690 190.923 310.875 200.757 14
ODINpermissive0.744 290.658 930.752 640.870 30.714 400.843 330.569 110.919 50.703 80.622 400.949 590.591 120.343 150.736 340.784 560.816 70.838 20.672 310.918 370.854 390.725 28
Ayush Jain, Pushkal Katara, Nikolaos Gkanatsios, Adam W. Harley, Gabriel Sarch, Kriti Aggarwal, Vishrav Chaudhary, Katerina Fragkiadaki: ODIN: A Single Model for 2D and 3D Segmentation. CVPR 2024
Retro-FPN0.744 290.842 300.800 300.767 610.740 320.836 410.541 230.914 70.672 220.626 370.958 230.552 330.272 530.777 90.886 220.696 520.801 240.674 290.941 140.858 330.717 33
Peng Xiang*, Xin Wen*, Yu-Shen Liu, Hui Zhang, Yi Fang, Zhizhong Han: Retrospective Feature Pyramid Network for Point Cloud Semantic Segmentation. ICCV 2023
EQ-Net0.743 310.620 1010.799 330.849 130.730 350.822 560.493 500.897 140.664 230.681 120.955 340.562 290.378 40.760 210.903 120.738 300.801 240.673 300.907 430.877 160.745 17
Zetong Yang*, Li Jiang*, Yanan Sun, Bernt Schiele, Jiaya Jia: A Unified Query-based Paradigm for Point Cloud Understanding. CVPR 2022
SAT0.742 320.860 240.765 550.819 340.769 140.848 270.533 270.829 390.663 240.631 360.955 340.586 170.274 510.753 270.896 170.729 330.760 560.666 330.921 330.855 370.733 22
LRPNet0.742 320.816 380.806 270.807 430.752 240.828 500.575 80.839 360.699 90.637 340.954 400.520 450.320 270.755 260.834 430.760 220.772 450.676 260.915 410.862 300.717 33
LargeKernel3D0.739 340.909 120.820 120.806 450.740 320.852 220.545 210.826 400.594 570.643 270.955 340.541 350.263 610.723 370.858 320.775 180.767 490.678 220.933 230.848 430.694 42
Yukang Chen*, Jianhui Liu*, Xiangyu Zhang, Xiaojuan Qi, Jiaya Jia: LargeKernel3D: Scaling up Kernels in 3D Sparse CNNs. CVPR 2023
RPN0.736 350.776 510.790 390.851 110.754 230.854 180.491 520.866 240.596 560.686 90.955 340.536 370.342 160.624 550.869 260.787 110.802 200.628 450.927 270.875 200.704 39
MinkowskiNetpermissive0.736 350.859 250.818 160.832 300.709 410.840 350.521 330.853 280.660 260.643 270.951 510.544 340.286 430.731 350.893 180.675 600.772 450.683 210.874 720.852 410.727 26
C. Choy, J. Gwak, S. Savarese: 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks. CVPR 2019
IPCA0.731 370.890 170.837 40.864 40.726 370.873 50.530 300.824 420.489 920.647 240.978 60.609 50.336 190.624 550.733 640.758 230.776 430.570 700.949 90.877 160.728 24
online3d0.727 380.715 830.777 480.854 80.748 290.858 130.497 470.872 210.572 650.639 320.957 280.523 430.297 370.750 300.803 530.744 280.810 160.587 660.938 180.871 250.719 32
PointTransformer++0.725 390.727 760.811 240.819 340.765 150.841 340.502 410.814 470.621 420.623 390.955 340.556 310.284 440.620 570.866 270.781 140.757 600.648 360.932 250.862 300.709 37
SparseConvNet0.725 390.647 960.821 110.846 170.721 380.869 60.533 270.754 630.603 520.614 420.955 340.572 240.325 250.710 380.870 250.724 370.823 40.628 450.934 220.865 290.683 45
MatchingNet0.724 410.812 400.812 220.810 400.735 340.834 430.495 490.860 270.572 650.602 500.954 400.512 470.280 470.757 240.845 410.725 360.780 400.606 550.937 190.851 420.700 41
INS-Conv-semantic0.717 420.751 640.759 580.812 380.704 420.868 70.537 260.842 340.609 480.608 460.953 440.534 390.293 390.616 580.864 280.719 410.793 330.640 400.933 230.845 470.663 50
PointMetaBase0.714 430.835 310.785 430.821 320.684 470.846 300.531 290.865 250.614 430.596 540.953 440.500 500.246 670.674 400.888 200.692 530.764 520.624 470.849 870.844 480.675 47
contrastBoundarypermissive0.705 440.769 580.775 490.809 410.687 460.820 590.439 780.812 480.661 250.591 560.945 700.515 460.171 970.633 520.856 330.720 390.796 290.668 320.889 580.847 440.689 43
Liyao Tang, Yibing Zhan, Zhe Chen, Baosheng Yu, Dacheng Tao: Contrastive Boundary Learning for Point Cloud Segmentation. CVPR 2022
ClickSeg_Semantic0.703 450.774 530.800 300.793 520.760 180.847 290.471 570.802 510.463 990.634 350.968 140.491 530.271 550.726 360.910 90.706 470.815 90.551 820.878 670.833 490.570 82
RFCR0.702 460.889 180.745 690.813 370.672 500.818 630.493 500.815 460.623 400.610 440.947 640.470 620.249 660.594 620.848 400.705 480.779 410.646 370.892 560.823 550.611 65
Jingyu Gong, Jiachen Xu, Xin Tan, Haichuan Song, Yanyun Qu, Yuan Xie, Lizhuang Ma: Omni-Supervised Point Cloud Segmentation via Gradual Receptive Field Component Reasoning. CVPR 2021
One Thing One Click0.701 470.825 350.796 340.723 680.716 390.832 450.433 800.816 440.634 370.609 450.969 120.418 880.344 140.559 740.833 440.715 430.808 180.560 760.902 480.847 440.680 46
JSENetpermissive0.699 480.881 200.762 560.821 320.667 510.800 760.522 320.792 540.613 440.607 470.935 900.492 520.205 840.576 670.853 370.691 540.758 580.652 350.872 750.828 520.649 54
Zeyu Hu, Mingmin Zhen, Xuyang Bai, Hongbo Fu, Chiew-Lan Tai: JSENet: Joint Semantic Segmentation and Edge Detection Network for 3D Point Clouds. ECCV 2020
One-Thing-One-Click0.693 490.743 670.794 360.655 910.684 470.822 560.497 470.719 730.622 410.617 410.977 100.447 750.339 170.750 300.664 810.703 500.790 360.596 590.946 120.855 370.647 55
Zhengzhe Liu, Xiaojuan Qi, Chi-Wing Fu: One Thing One Click: A Self-Training Approach for Weakly Supervised 3D Semantic Segmentation. CVPR 2021
PicassoNet-IIpermissive0.692 500.732 720.772 500.786 530.677 490.866 90.517 340.848 300.509 850.626 370.952 490.536 370.225 740.545 800.704 710.689 570.810 160.564 750.903 470.854 390.729 23
Huan Lei, Naveed Akhtar, Mubarak Shah, Ajmal Mian: Geometric feature learning for 3D meshes.
Feature_GeometricNetpermissive0.690 510.884 190.754 620.795 500.647 580.818 630.422 820.802 510.612 450.604 480.945 700.462 650.189 920.563 730.853 370.726 350.765 510.632 430.904 450.821 580.606 69
Kangcheng Liu, Ben M. Chen: https://arxiv.org/abs/2012.09439. arXiv Preprint
FusionNet0.688 520.704 850.741 730.754 650.656 530.829 480.501 420.741 680.609 480.548 630.950 550.522 440.371 50.633 520.756 590.715 430.771 470.623 480.861 830.814 610.658 51
Feihu Zhang, Jin Fang, Benjamin Wah, Philip Torr: Deep FusionNet for Point Cloud Semantic Segmentation. ECCV 2020
Feature-Geometry Netpermissive0.685 530.866 220.748 660.819 340.645 600.794 790.450 680.802 510.587 590.604 480.945 700.464 640.201 870.554 760.840 420.723 380.732 710.602 570.907 430.822 570.603 72
VACNN++0.684 540.728 750.757 610.776 580.690 440.804 740.464 620.816 440.577 640.587 570.945 700.508 490.276 500.671 410.710 690.663 650.750 640.589 640.881 640.832 510.653 53
DGNet0.684 540.712 840.784 440.782 570.658 520.835 420.499 460.823 430.641 340.597 530.950 550.487 550.281 460.575 680.619 850.647 730.764 520.620 500.871 780.846 460.688 44
KP-FCNN0.684 540.847 280.758 600.784 550.647 580.814 660.473 560.772 570.605 500.594 550.935 900.450 730.181 950.587 630.805 520.690 550.785 390.614 510.882 630.819 590.632 61
H. Thomas, C. Qi, J. Deschaud, B. Marcotegui, F. Goulette, L. Guibas: KPConv: Flexible and Deformable Convolution for Point Clouds. ICCV 2019
Superpoint Network0.683 570.851 270.728 770.800 490.653 550.806 720.468 590.804 490.572 650.602 500.946 670.453 720.239 700.519 850.822 460.689 570.762 550.595 610.895 540.827 530.630 62
PointContrast_LA_SEM0.683 570.757 620.784 440.786 530.639 620.824 540.408 850.775 560.604 510.541 650.934 940.532 400.269 570.552 770.777 570.645 760.793 330.640 400.913 420.824 540.671 48
VI-PointConv0.676 590.770 570.754 620.783 560.621 660.814 660.552 180.758 610.571 680.557 610.954 400.529 410.268 590.530 830.682 750.675 600.719 740.603 560.888 590.833 490.665 49
Xingyi Li, Wenxuan Wu, Xiaoli Z. Fern, Li Fuxin: The Devils in the Point Clouds: Studying the Robustness of Point Cloud Convolutions.
ROSMRF3D0.673 600.789 440.748 660.763 630.635 640.814 660.407 870.747 650.581 630.573 580.950 550.484 560.271 550.607 590.754 600.649 700.774 440.596 590.883 620.823 550.606 69
SALANet0.670 610.816 380.770 530.768 600.652 560.807 710.451 650.747 650.659 280.545 640.924 1000.473 610.149 1070.571 700.811 510.635 800.746 650.623 480.892 560.794 740.570 82
O3DSeg0.668 620.822 360.771 520.496 1110.651 570.833 440.541 230.761 600.555 740.611 430.966 150.489 540.370 60.388 1040.580 880.776 170.751 620.570 700.956 70.817 600.646 56
PointASNLpermissive0.666 630.703 860.781 460.751 670.655 540.830 470.471 570.769 580.474 950.537 670.951 510.475 600.279 480.635 500.698 740.675 600.751 620.553 810.816 940.806 650.703 40
Xu Yan, Chaoda Zheng, Zhen Li, Sheng Wang, Shuguang Cui: PointASNL: Robust Point Clouds Processing using Nonlocal Neural Networks with Adaptive Sampling. CVPR 2020
PointConvpermissive0.666 630.781 480.759 580.699 760.644 610.822 560.475 550.779 550.564 710.504 820.953 440.428 820.203 860.586 650.754 600.661 660.753 610.588 650.902 480.813 630.642 57
Wenxuan Wu, Zhongang Qi, Li Fuxin: PointConv: Deep Convolutional Networks on 3D Point Clouds. CVPR 2019
PPCNN++permissive0.663 650.746 650.708 800.722 690.638 630.820 590.451 650.566 1010.599 540.541 650.950 550.510 480.313 290.648 460.819 490.616 850.682 890.590 630.869 790.810 640.656 52
Pyunghwan Ahn, Juyoung Yang, Eojindl Yi, Chanho Lee, Junmo Kim: Projection-based Point Convolution for Efficient Point Cloud Segmentation. IEEE Access
DCM-Net0.658 660.778 490.702 830.806 450.619 670.813 690.468 590.693 810.494 880.524 730.941 820.449 740.298 360.510 870.821 470.675 600.727 730.568 730.826 920.803 670.637 59
Jonas Schult*, Francis Engelmann*, Theodora Kontogianni, Bastian Leibe: DualConvMesh-Net: Joint Geodesic and Euclidean Convolutions on 3D Meshes. CVPR 2020 (Oral)
HPGCNN0.656 670.698 880.743 710.650 920.564 840.820 590.505 400.758 610.631 380.479 860.945 700.480 580.226 720.572 690.774 580.690 550.735 690.614 510.853 860.776 890.597 75
Jisheng Dang, Qingyong Hu, Yulan Guo, Jun Yang: HPGCNN.
SAFNet-segpermissive0.654 680.752 630.734 750.664 890.583 790.815 650.399 890.754 630.639 350.535 690.942 800.470 620.309 310.665 420.539 910.650 690.708 790.635 420.857 850.793 760.642 57
Linqing Zhao, Jiwen Lu, Jie Zhou: Similarity-Aware Fusion Network for 3D Semantic Segmentation. IROS 2021
RandLA-Netpermissive0.645 690.778 490.731 760.699 760.577 800.829 480.446 700.736 690.477 940.523 750.945 700.454 690.269 570.484 940.749 630.618 830.738 670.599 580.827 910.792 790.621 64
PointConv-SFPN0.641 700.776 510.703 820.721 700.557 870.826 510.451 650.672 860.563 720.483 850.943 790.425 850.162 1020.644 470.726 650.659 670.709 780.572 690.875 700.786 840.559 88
MVPNetpermissive0.641 700.831 320.715 780.671 860.590 750.781 850.394 910.679 830.642 330.553 620.937 870.462 650.256 630.649 450.406 1040.626 810.691 860.666 330.877 680.792 790.608 68
Maximilian Jaritz, Jiayuan Gu, Hao Su: Multi-view PointNet for 3D Scene Understanding. GMDL Workshop, ICCV 2019
PointMRNet0.640 720.717 820.701 840.692 790.576 810.801 750.467 610.716 740.563 720.459 920.953 440.429 810.169 990.581 660.854 360.605 860.710 760.550 830.894 550.793 760.575 80
FPConvpermissive0.639 730.785 460.760 570.713 740.603 700.798 770.392 930.534 1060.603 520.524 730.948 620.457 670.250 650.538 810.723 670.598 900.696 840.614 510.872 750.799 690.567 85
Yiqun Lin, Zizheng Yan, Haibin Huang, Dong Du, Ligang Liu, Shuguang Cui, Xiaoguang Han: FPConv: Learning Local Flattening for Point Convolution. CVPR 2020
PD-Net0.638 740.797 420.769 540.641 970.590 750.820 590.461 630.537 1050.637 360.536 680.947 640.388 950.206 830.656 430.668 790.647 730.732 710.585 670.868 800.793 760.473 108
PointSPNet0.637 750.734 710.692 910.714 730.576 810.797 780.446 700.743 670.598 550.437 970.942 800.403 910.150 1060.626 540.800 550.649 700.697 830.557 790.846 880.777 880.563 86
SConv0.636 760.830 330.697 870.752 660.572 830.780 870.445 720.716 740.529 780.530 700.951 510.446 760.170 980.507 890.666 800.636 790.682 890.541 890.886 600.799 690.594 76
Supervoxel-CNN0.635 770.656 940.711 790.719 710.613 680.757 960.444 750.765 590.534 770.566 590.928 980.478 590.272 530.636 490.531 930.664 640.645 990.508 970.864 820.792 790.611 65
joint point-basedpermissive0.634 780.614 1020.778 470.667 880.633 650.825 520.420 830.804 490.467 970.561 600.951 510.494 510.291 400.566 710.458 990.579 960.764 520.559 780.838 890.814 610.598 74
Hung-Yueh Chiang, Yen-Liang Lin, Yueh-Cheng Liu, Winston H. Hsu: A Unified Point-Based Framework for 3D Segmentation. 3DV 2019
PointMTL0.632 790.731 730.688 940.675 830.591 740.784 840.444 750.565 1020.610 460.492 830.949 590.456 680.254 640.587 630.706 700.599 890.665 950.612 540.868 800.791 820.579 79
3DSM_DMMF0.631 800.626 990.745 690.801 480.607 690.751 970.506 390.729 720.565 700.491 840.866 1140.434 770.197 900.595 610.630 840.709 460.705 810.560 760.875 700.740 990.491 103
PointNet2-SFPN0.631 800.771 550.692 910.672 840.524 930.837 390.440 770.706 790.538 760.446 940.944 760.421 870.219 770.552 770.751 620.591 920.737 680.543 880.901 500.768 910.557 89
APCF-Net0.631 800.742 680.687 960.672 840.557 870.792 820.408 850.665 880.545 750.508 790.952 490.428 820.186 930.634 510.702 720.620 820.706 800.555 800.873 730.798 710.581 78
Haojia Lin: Adaptive Pyramid Context Fusion for Point Cloud Perception. GRSL
FusionAwareConv0.630 830.604 1040.741 730.766 620.590 750.747 980.501 420.734 700.503 870.527 710.919 1040.454 690.323 260.550 790.420 1030.678 590.688 870.544 860.896 530.795 730.627 63
Jiazhao Zhang, Chenyang Zhu, Lintao Zheng, Kai Xu: Fusion-Aware Point Convolution for Online Semantic 3D Scene Segmentation. CVPR 2020
DenSeR0.628 840.800 410.625 1060.719 710.545 900.806 720.445 720.597 960.448 1020.519 770.938 860.481 570.328 240.489 930.499 980.657 680.759 570.592 620.881 640.797 720.634 60
SegGroup_sempermissive0.627 850.818 370.747 680.701 750.602 710.764 930.385 970.629 930.490 900.508 790.931 970.409 900.201 870.564 720.725 660.618 830.692 850.539 900.873 730.794 740.548 92
An Tao, Yueqi Duan, Yi Wei, Jiwen Lu, Jie Zhou: SegGroup: Seg-Level Supervision for 3D Instance and Semantic Segmentation. TIP 2022
SIConv0.625 860.830 330.694 890.757 640.563 850.772 910.448 690.647 910.520 810.509 780.949 590.431 800.191 910.496 910.614 860.647 730.672 930.535 930.876 690.783 850.571 81
dtc_net0.625 860.703 860.751 650.794 510.535 910.848 270.480 540.676 850.528 790.469 890.944 760.454 690.004 1190.464 960.636 830.704 490.758 580.548 850.924 300.787 830.492 102
Weakly-Openseg v30.625 860.924 80.787 420.620 990.555 890.811 700.393 920.666 870.382 1100.520 760.953 440.250 1140.208 810.604 600.670 770.644 770.742 660.538 910.919 360.803 670.513 100
HPEIN0.618 890.729 740.668 970.647 940.597 730.766 920.414 840.680 820.520 810.525 720.946 670.432 780.215 790.493 920.599 870.638 780.617 1040.570 700.897 520.806 650.605 71
Li Jiang, Hengshuang Zhao, Shu Liu, Xiaoyong Shen, Chi-Wing Fu, Jiaya Jia: Hierarchical Point-Edge Interaction Network for Point Cloud Semantic Segmentation. ICCV 2019
SPH3D-GCNpermissive0.610 900.858 260.772 500.489 1120.532 920.792 820.404 880.643 920.570 690.507 810.935 900.414 890.046 1160.510 870.702 720.602 880.705 810.549 840.859 840.773 900.534 95
Huan Lei, Naveed Akhtar, Ajmal Mian: Spherical Kernel for Efficient Graph Convolution on 3D Point Clouds. TPAMI 2020
AttAN0.609 910.760 600.667 980.649 930.521 940.793 800.457 640.648 900.528 790.434 990.947 640.401 920.153 1050.454 970.721 680.648 720.717 750.536 920.904 450.765 920.485 104
Gege Zhang, Qinghua Ma, Licheng Jiao, Fang Liu, Qigong Sun: AttAN: Attention Adversarial Networks for 3D Point Cloud Semantic Segmentation. IJCAI 2020
wsss-transformer0.600 920.634 980.743 710.697 780.601 720.781 850.437 790.585 990.493 890.446 940.933 950.394 930.011 1180.654 440.661 820.603 870.733 700.526 940.832 900.761 940.480 105
LAP-D0.594 930.720 800.692 910.637 980.456 1030.773 900.391 950.730 710.587 590.445 960.940 840.381 960.288 410.434 1000.453 1010.591 920.649 970.581 680.777 980.749 980.610 67
DPC0.592 940.720 800.700 850.602 1030.480 990.762 950.380 980.713 770.585 620.437 970.940 840.369 980.288 410.434 1000.509 970.590 940.639 1020.567 740.772 990.755 960.592 77
Francis Engelmann, Theodora Kontogianni, Bastian Leibe: Dilated Point Convolutions: On the Receptive Field Size of Point Convolutions on 3D Point Clouds. ICRA 2020
CCRFNet0.589 950.766 590.659 1010.683 810.470 1020.740 1000.387 960.620 950.490 900.476 870.922 1020.355 1010.245 680.511 860.511 960.571 970.643 1000.493 1010.872 750.762 930.600 73
ROSMRF0.580 960.772 540.707 810.681 820.563 850.764 930.362 1000.515 1070.465 980.465 910.936 890.427 840.207 820.438 980.577 890.536 1000.675 920.486 1020.723 1050.779 860.524 97
SD-DETR0.576 970.746 650.609 1100.445 1160.517 950.643 1110.366 990.714 760.456 1000.468 900.870 1130.432 780.264 600.558 750.674 760.586 950.688 870.482 1030.739 1030.733 1010.537 94
SQN_0.1%0.569 980.676 900.696 880.657 900.497 960.779 880.424 810.548 1030.515 830.376 1040.902 1110.422 860.357 100.379 1050.456 1000.596 910.659 960.544 860.685 1080.665 1120.556 90
TextureNetpermissive0.566 990.672 920.664 990.671 860.494 970.719 1010.445 720.678 840.411 1080.396 1020.935 900.356 1000.225 740.412 1020.535 920.565 980.636 1030.464 1050.794 970.680 1090.568 84
Jingwei Huang, Haotian Zhang, Li Yi, Thomas Funkhouser, Matthias Niessner, Leonidas Guibas: TextureNet: Consistent Local Parametrizations for Learning from High-Resolution Signals on Meshes. CVPR
DVVNet0.562 1000.648 950.700 850.770 590.586 780.687 1050.333 1040.650 890.514 840.475 880.906 1080.359 990.223 760.340 1070.442 1020.422 1110.668 940.501 980.708 1060.779 860.534 95
Pointnet++ & Featurepermissive0.557 1010.735 700.661 1000.686 800.491 980.744 990.392 930.539 1040.451 1010.375 1050.946 670.376 970.205 840.403 1030.356 1070.553 990.643 1000.497 990.824 930.756 950.515 98
GMLPs0.538 1020.495 1120.693 900.647 940.471 1010.793 800.300 1070.477 1080.505 860.358 1060.903 1100.327 1040.081 1130.472 950.529 940.448 1090.710 760.509 950.746 1010.737 1000.554 91
PanopticFusion-label0.529 1030.491 1130.688 940.604 1020.386 1080.632 1120.225 1180.705 800.434 1050.293 1120.815 1160.348 1020.241 690.499 900.669 780.507 1020.649 970.442 1110.796 960.602 1160.561 87
Gaku Narita, Takashi Seno, Tomoya Ishikawa, Yohsuke Kaji: PanopticFusion: Online Volumetric Semantic Mapping at the Level of Stuff and Things. IROS 2019 (to appear)
subcloud_weak0.516 1040.676 900.591 1130.609 1000.442 1040.774 890.335 1030.597 960.422 1070.357 1070.932 960.341 1030.094 1120.298 1090.528 950.473 1070.676 910.495 1000.602 1140.721 1040.349 116
Online SegFusion0.515 1050.607 1030.644 1040.579 1050.434 1050.630 1130.353 1010.628 940.440 1030.410 1000.762 1190.307 1060.167 1000.520 840.403 1050.516 1010.565 1070.447 1090.678 1090.701 1060.514 99
Davide Menini, Suryansh Kumar, Martin R. Oswald, Erik Sandstroem, Cristian Sminchisescu, Luc van Gool: A Real-Time Learning Framework for Joint 3D Reconstruction and Semantic Segmentation. Robotics and Automation Letters Submission
3DMV, FTSDF0.501 1060.558 1080.608 1110.424 1180.478 1000.690 1040.246 1140.586 980.468 960.450 930.911 1060.394 930.160 1030.438 980.212 1140.432 1100.541 1120.475 1040.742 1020.727 1020.477 106
PCNN0.498 1070.559 1070.644 1040.560 1070.420 1070.711 1030.229 1160.414 1090.436 1040.352 1080.941 820.324 1050.155 1040.238 1140.387 1060.493 1030.529 1130.509 950.813 950.751 970.504 101
3DMV0.484 1080.484 1140.538 1160.643 960.424 1060.606 1160.310 1050.574 1000.433 1060.378 1030.796 1170.301 1070.214 800.537 820.208 1150.472 1080.507 1160.413 1140.693 1070.602 1160.539 93
Angela Dai, Matthias Niessner: 3DMV: Joint 3D-Multi-View Prediction for 3D Semantic Scene Segmentation. ECCV'18
PointCNN with RGBpermissive0.458 1090.577 1060.611 1090.356 1200.321 1160.715 1020.299 1090.376 1130.328 1160.319 1100.944 760.285 1090.164 1010.216 1170.229 1120.484 1050.545 1110.456 1070.755 1000.709 1050.475 107
Yangyan Li, Rui Bu, Mingchao Sun, Baoquan Chen: PointCNN. NeurIPS 2018
FCPNpermissive0.447 1100.679 890.604 1120.578 1060.380 1090.682 1060.291 1100.106 1200.483 930.258 1180.920 1030.258 1130.025 1170.231 1160.325 1080.480 1060.560 1090.463 1060.725 1040.666 1110.231 120
Dario Rethage, Johanna Wald, Jürgen Sturm, Nassir Navab, Federico Tombari: Fully-Convolutional Point Networks for Large-Scale Point Clouds. ECCV 2018
DGCNN_reproducecopyleft0.446 1110.474 1150.623 1070.463 1140.366 1110.651 1090.310 1050.389 1120.349 1140.330 1090.937 870.271 1110.126 1090.285 1100.224 1130.350 1160.577 1060.445 1100.625 1120.723 1030.394 112
Yue Wang, Yongbin Sun, Ziwei Liu, Sanjay E. Sarma, Michael M. Bronstein, Justin M. Solomon: Dynamic Graph CNN for Learning on Point Clouds. TOG 2019
PNET20.442 1120.548 1090.548 1150.597 1040.363 1120.628 1140.300 1070.292 1150.374 1110.307 1110.881 1120.268 1120.186 930.238 1140.204 1160.407 1120.506 1170.449 1080.667 1100.620 1150.462 110
SurfaceConvPF0.442 1120.505 1110.622 1080.380 1190.342 1140.654 1080.227 1170.397 1110.367 1120.276 1140.924 1000.240 1150.198 890.359 1060.262 1100.366 1130.581 1050.435 1120.640 1110.668 1100.398 111
Hao Pan, Shilin Liu, Yang Liu, Xin Tong: Convolutional Neural Networks on 3D Surfaces Using Parallel Frames.
Tangent Convolutionspermissive0.438 1140.437 1170.646 1030.474 1130.369 1100.645 1100.353 1010.258 1170.282 1190.279 1130.918 1050.298 1080.147 1080.283 1110.294 1090.487 1040.562 1080.427 1130.619 1130.633 1140.352 115
Maxim Tatarchenko, Jaesik Park, Vladlen Koltun, Qian-Yi Zhou: Tangent Convolutions for Dense Prediction in 3D. CVPR 2018
3DWSSS0.425 1150.525 1100.647 1020.522 1080.324 1150.488 1200.077 1210.712 780.353 1130.401 1010.636 1210.281 1100.176 960.340 1070.565 900.175 1200.551 1100.398 1150.370 1210.602 1160.361 114
SPLAT Netcopyleft0.393 1160.472 1160.511 1170.606 1010.311 1170.656 1070.245 1150.405 1100.328 1160.197 1190.927 990.227 1170.000 1210.001 1220.249 1110.271 1190.510 1140.383 1170.593 1150.699 1070.267 118
Hang Su, Varun Jampani, Deqing Sun, Subhransu Maji, Evangelos Kalogerakis, Ming-Hsuan Yang, Jan Kautz: SPLATNet: Sparse Lattice Networks for Point Cloud Processing. CVPR 2018
ScanNet+FTSDF0.383 1170.297 1190.491 1180.432 1170.358 1130.612 1150.274 1120.116 1190.411 1080.265 1150.904 1090.229 1160.079 1140.250 1120.185 1170.320 1170.510 1140.385 1160.548 1160.597 1190.394 112
PointNet++permissive0.339 1180.584 1050.478 1190.458 1150.256 1190.360 1210.250 1130.247 1180.278 1200.261 1170.677 1200.183 1180.117 1100.212 1180.145 1190.364 1140.346 1210.232 1210.548 1160.523 1200.252 119
Charles R. Qi, Li Yi, Hao Su, Leonidas J. Guibas: PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space.
GrowSP++0.323 1190.114 1210.589 1140.499 1100.147 1210.555 1170.290 1110.336 1140.290 1180.262 1160.865 1150.102 1210.000 1210.037 1200.000 1220.000 1220.462 1180.381 1180.389 1200.664 1130.473 108
SSC-UNetpermissive0.308 1200.353 1180.290 1210.278 1210.166 1200.553 1180.169 1200.286 1160.147 1210.148 1210.908 1070.182 1190.064 1150.023 1210.018 1210.354 1150.363 1190.345 1190.546 1180.685 1080.278 117
ScanNetpermissive0.306 1210.203 1200.366 1200.501 1090.311 1170.524 1190.211 1190.002 1220.342 1150.189 1200.786 1180.145 1200.102 1110.245 1130.152 1180.318 1180.348 1200.300 1200.460 1190.437 1210.182 121
Angela Dai, Angel X. Chang, Manolis Savva, Maciej Halber, Thomas Funkhouser, Matthias Nießner: ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes. CVPR'17
ERROR0.054 1220.000 1220.041 1220.172 1220.030 1220.062 1230.001 1220.035 1210.004 1220.051 1220.143 1220.019 1220.003 1200.041 1190.050 1200.003 1210.054 1220.018 1220.005 1230.264 1220.082 122
MVF-GNN0.014 1230.000 1220.000 1230.000 1230.007 1230.086 1220.000 1230.000 1230.001 1230.000 1230.029 1230.001 1230.000 1210.000 1230.000 1220.000 1220.000 1230.018 1220.015 1220.115 1230.000 123


This table lists the benchmark results for the 3D semantic instance scenario.




Method | Info | avg ap 50% | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | window
Competitor-MAFT0.816 11.000 10.983 30.872 100.718 50.941 10.588 40.652 390.819 20.776 30.720 50.780 50.769 121.000 10.797 110.813 290.798 81.000 10.659 4
PointRel0.816 11.000 10.971 80.908 60.743 20.923 80.573 80.714 220.695 180.734 100.747 20.725 120.809 11.000 10.814 90.899 30.820 41.000 10.610 18
Relation3D (PointRel): Enhancing Relation Modeling for Point Cloud Instance Segmentation.
Spherical Mask(CtoF)0.812 31.000 10.973 70.852 140.718 60.917 100.574 60.677 300.748 110.729 140.715 80.795 20.809 11.000 10.831 40.854 90.787 121.000 10.638 7
EV3D0.811 41.000 10.968 90.852 140.717 70.921 90.574 70.677 300.748 110.730 130.703 130.795 20.809 11.000 10.831 40.854 90.778 161.000 10.638 8
SIM3D0.803 51.000 10.967 100.863 130.692 190.924 70.552 120.732 210.667 230.732 120.662 170.796 10.789 91.000 10.803 100.864 60.766 211.000 10.643 6
OneFormer3Dcopyleft0.801 61.000 10.973 60.909 50.698 150.928 50.582 50.668 350.685 190.780 20.687 150.698 200.702 151.000 10.794 130.900 20.784 140.986 530.635 9
Maxim Kolodiazhnyi, Anna Vorontsova, Anton Konushin, Danila Rukhovich: OneFormer3D: One Transformer for Unified Point Cloud Segmentation.
UniPerception0.800 71.000 10.930 120.872 100.727 40.862 250.454 200.764 130.820 10.746 70.706 110.750 70.772 100.926 470.764 190.818 270.826 20.997 400.660 3
Competitor-SPFormer0.800 71.000 10.986 20.845 160.705 130.915 110.532 140.733 200.757 100.733 110.708 100.698 190.648 370.981 400.890 10.830 190.796 90.997 400.644 5
InsSSM0.799 91.000 10.915 140.710 420.729 30.925 60.664 10.670 330.770 70.766 40.739 30.737 80.700 161.000 10.792 140.829 210.815 50.997 400.625 11
Lei Yao, Yi Wang, Moyun Liu, Lap-Pui Chau: SGIFormer: Semantic-guided and Geometric-enhanced Interleaving Transformer for 3D Instance Segmentation. TCSVT, 2024
DCD0.798 101.000 10.878 210.792 280.693 180.936 20.596 20.685 290.663 250.736 80.717 60.788 40.693 211.000 10.825 70.840 150.837 11.000 10.689 1
TST3D0.795 111.000 10.929 130.918 40.709 100.884 200.596 30.704 250.769 80.734 90.644 220.699 180.751 131.000 10.794 120.876 50.757 240.997 400.550 34
Duc Tran Dang Trung, Byeongkeun Kang, Yeejin Lee: MSTA3D: Multi-scale Twin-attention for 3D Instance Segmentation. ACM Multimedia 2024
MG-Former0.791 121.000 10.980 50.837 190.626 270.897 130.543 130.759 150.800 60.766 50.659 180.769 60.697 191.000 10.791 150.707 500.791 111.000 10.610 17
ExtMask3D0.789 131.000 10.988 10.756 350.706 120.912 120.429 210.647 410.806 50.755 60.673 160.689 210.772 111.000 10.789 160.852 110.811 61.000 10.617 14
Queryformer0.787 141.000 10.933 110.601 520.754 10.886 180.558 110.661 370.767 90.665 200.716 70.639 270.808 51.000 10.844 30.897 40.804 71.000 10.624 12
MAFT0.786 151.000 10.894 190.807 230.694 170.893 160.486 160.674 320.740 130.786 10.704 120.727 110.739 141.000 10.707 260.849 130.756 251.000 10.685 2
KmaxOneFormerNetpermissive0.783 160.903 570.981 40.794 270.706 110.931 40.561 100.701 260.706 160.727 150.697 140.731 100.689 231.000 10.856 20.750 410.761 231.000 10.599 22
Mask3D0.780 171.000 10.786 450.716 400.696 160.885 190.500 150.714 220.810 40.672 190.715 80.679 220.809 11.000 10.831 40.833 180.787 121.000 10.602 20
Jonas Schult, Francis Engelmann, Alexander Hermans, Or Litany, Siyu Tang, Bastian Leibe: Mask3D for 3D Semantic Instance Segmentation. ICRA 2023
SPFormerpermissive0.770 180.903 570.903 160.806 240.609 340.886 170.568 90.815 60.705 170.711 160.655 190.652 260.685 241.000 10.789 170.809 300.776 181.000 10.583 26
Sun Jiahao, Qing Chunmei, Tan Junpeng, Xu Xiangmin: Superpoint Transformer for 3D Scene Instance Segmentation. AAAI 2023 [Oral]
SoftGroup++0.769 191.000 10.803 380.937 10.684 200.865 220.213 370.870 20.664 240.571 270.758 10.702 160.807 61.000 10.653 330.902 10.792 101.000 10.626 10
SoftGrouppermissive0.761 201.000 10.808 340.845 160.716 80.862 240.243 340.824 40.655 270.620 210.734 40.699 170.791 80.981 400.716 230.844 140.769 191.000 10.594 24
Thang Vu, Kookhoi Kim, Tung M. Luu, Xuan Thanh Nguyen, Chang D. Yoo: SoftGroup for 3D Instance Segmentation on Point Clouds. CVPR 2022 [Oral]
ISBNetpermissive0.757 211.000 10.904 150.731 380.678 210.895 140.458 180.644 430.670 220.710 170.620 270.732 90.650 271.000 10.756 200.778 330.779 151.000 10.614 15
Tuan Duc Ngo, Binh-Son Hua, Khoi Nguyen: ISBNet: a 3D Point Cloud Instance Segmentation Network with Instance-aware Sampling and Box-aware Dynamic Convolution. CVPR 2023
TD3Dpermissive0.751 221.000 10.774 460.867 120.621 290.934 30.404 220.706 240.812 30.605 240.633 250.626 280.690 221.000 10.640 350.820 240.777 171.000 10.612 16
Maksim Kolodiazhnyi, Anna Vorontsova, Anton Konushin, Danila Rukhovich: Top-Down Beats Bottom-Up in 3D Instance Segmentation. WACV 2024
PBNetpermissive0.747 231.000 10.818 300.837 200.713 90.844 270.457 190.647 410.711 150.614 220.617 290.657 250.650 271.000 10.692 270.822 230.765 221.000 10.595 23
Weiguang Zhao, Yuyao Yan, Chaolong Yang, Jianan Ye, Xi Yang, Kaizhu Huang: Divide and Conquer: 3D Instance Segmentation With Point-Wise Binarization. ICCV 2023
GraphCut0.732 241.000 10.788 430.724 390.642 260.859 260.248 330.787 110.618 300.596 250.653 210.722 140.583 491.000 10.766 180.861 70.825 31.000 10.504 40
IPCA-Inst0.731 251.000 10.788 440.884 90.698 140.788 430.252 320.760 140.646 280.511 350.637 240.665 240.804 71.000 10.644 340.778 340.747 271.000 10.561 30
TopoSeg0.725 261.000 10.806 370.933 20.668 230.758 480.272 310.734 190.630 290.549 310.654 200.606 290.697 200.966 440.612 390.839 160.754 261.000 10.573 27
DKNet0.718 271.000 10.814 310.782 290.619 310.872 210.224 350.751 170.569 340.677 180.585 340.724 130.633 390.981 400.515 490.819 250.736 281.000 10.617 13
Yizheng Wu, Min Shi, Shuaiyuan Du, Hao Lu, Zhiguo Cao, Weicai Zhong: 3D Instances as 1D Kernels. ECCV 2022
SSEC0.707 281.000 10.850 230.924 30.648 240.747 510.162 390.862 30.572 330.520 330.624 260.549 320.649 361.000 10.560 440.706 510.768 201.000 10.591 25
HAISpermissive0.699 291.000 10.849 240.820 210.675 220.808 370.279 290.757 160.465 400.517 340.596 310.559 310.600 431.000 10.654 320.767 360.676 320.994 490.560 31
Shaoyu Chen, Jiemin Fang, Qian Zhang, Wenyu Liu, Xinggang Wang: Hierarchical Aggregation for 3D Instance Segmentation. ICCV 2021
SSTNetpermissive0.698 301.000 10.697 620.888 80.556 410.803 380.387 230.626 450.417 450.556 300.585 350.702 150.600 431.000 10.824 80.720 490.692 301.000 10.509 39
Zhihao Liang, Zhihao Li, Songcen Xu, Mingkui Tan, Kui Jia: Instance Segmentation in 3D Scenes using Semantic Superpoint Tree Networks. ICCV 2021
DualGroup0.694 311.000 10.799 400.811 220.622 280.817 320.376 240.805 90.590 320.487 390.568 380.525 360.650 270.835 570.600 400.829 200.655 351.000 10.526 36
ODIN - Inspermissive0.693 321.000 10.880 200.647 470.620 300.779 450.336 260.501 600.681 200.577 260.595 320.679 230.683 251.000 10.709 250.816 280.637 390.770 690.557 32
Ayush Jain, Pushkal Katara, Nikolaos Gkanatsios, Adam W. Harley, Gabriel Sarch, Kriti Aggarwal, Vishrav Chaudhary, Katerina Fragkiadaki: ODIN: A Single Model for 2D and 3D Segmentation. CVPR 2024
DANCENET0.680 331.000 10.807 350.733 370.600 350.768 470.375 250.543 530.538 350.610 230.599 300.498 370.632 410.981 400.739 220.856 80.633 420.882 640.454 49
SphereSeg0.680 331.000 10.856 220.744 360.618 320.893 150.151 400.651 400.713 140.537 320.579 370.430 460.651 261.000 10.389 600.744 440.697 290.991 510.601 21
Box2Mask0.677 351.000 10.847 250.771 310.509 500.816 330.277 300.558 520.482 370.562 290.640 230.448 420.700 161.000 10.666 280.852 120.578 490.997 400.488 44
Julian Chibane, Francis Engelmann, Tuan Anh Tran, Gerard Pons-Moll: Box2Mask: Weakly Supervised 3D Semantic Instance Segmentation Using Bounding Boxes. ECCV 2022
OccuSeg+instance0.672 361.000 10.758 540.682 440.576 390.842 280.477 170.504 590.524 360.567 280.585 360.451 410.557 511.000 10.751 210.797 310.563 521.000 10.467 48
Lei Han, Tian Zheng, Lan Xu, Lu Fang: OccuSeg: Occupancy-aware 3D Instance Segmentation. CVPR 2020
Mask-Group0.664 371.000 10.822 290.764 340.616 330.815 340.139 440.694 280.597 310.459 430.566 390.599 300.600 430.516 670.715 240.819 260.635 401.000 10.603 19
Min Zhong, Xinghao Chen, Xiaokang Chen, Gang Zeng, Yunhe Wang: MaskGroup: Hierarchical Point Grouping and Masking for 3D Instance Segmentation. ICME 2022
INS-Conv-instance0.657 381.000 10.760 520.667 460.581 370.863 230.323 270.655 380.477 380.473 410.549 410.432 450.650 271.000 10.655 310.738 450.585 480.944 560.472 47
CSC-Pretrained0.648 391.000 10.810 320.768 320.523 480.813 350.143 430.819 50.389 480.422 520.511 450.443 430.650 271.000 10.624 370.732 460.634 411.000 10.375 56
PE0.645 401.000 10.773 480.798 260.538 430.786 440.088 520.799 100.350 520.435 500.547 420.545 330.646 380.933 460.562 430.761 390.556 570.997 400.501 42
Biao Zhang, Peter Wonka: Point Cloud Instance Segmentation using Probabilistic Embeddings. CVPR 2021
RPGN0.643 411.000 10.758 530.582 580.539 420.826 310.046 570.765 120.372 500.436 490.588 330.539 350.650 271.000 10.577 410.750 420.653 370.997 400.495 43
Shichao Dong, Guosheng Lin, Tzu-Yi Hung: Learning Regional Purity for Instance Segmentation on 3D Point Clouds. ECCV 2022
Dyco3Dcopyleft0.641 421.000 10.841 260.893 70.531 450.802 390.115 490.588 500.448 420.438 470.537 440.430 470.550 520.857 490.534 470.764 380.657 340.987 520.568 28
Tong He, Chunhua Shen, Anton van den Hengel: DyCo3D: Robust Instance Segmentation of 3D Point Clouds through Dynamic Convolution. CVPR 2021
GICN0.638 431.000 10.895 180.800 250.480 540.676 560.144 420.737 180.354 510.447 440.400 580.365 530.700 161.000 10.569 420.836 170.599 441.000 10.473 46
PointGroup0.636 441.000 10.765 490.624 490.505 520.797 400.116 480.696 270.384 490.441 450.559 400.476 390.596 461.000 10.666 280.756 400.556 560.997 400.513 38
Li Jiang, Hengshuang Zhao, Shaoshuai Shi, Shu Liu, Chi-Wing Fu, Jiaya Jia: PointGroup: Dual-Set Point Grouping for 3D Instance Segmentation. CVPR 2020 [oral]
DD-UNet+Group0.635 450.667 600.797 420.714 410.562 400.774 460.146 410.810 80.429 440.476 400.546 430.399 490.633 391.000 10.632 360.722 480.609 431.000 10.514 37
H. Liu, R. Liu, K. Yang, J. Zhang, K. Peng, R. Stiefelhagen: HIDA: Towards Holistic Indoor Understanding for the Visually Impaired via Semantic Instance Segmentation with a Wearable Solid-State LiDAR Sensor. ICCVW 2021
Mask3D_evaluation0.631 461.000 10.829 280.606 510.646 250.836 290.068 530.511 570.462 410.507 360.619 280.389 510.610 421.000 10.432 550.828 220.673 330.788 680.552 33
DENet0.629 471.000 10.797 410.608 500.589 360.627 600.219 360.882 10.310 540.402 570.383 600.396 500.650 271.000 10.663 300.543 680.691 311.000 10.568 29
3D-MPA0.611 481.000 10.833 270.765 330.526 470.756 490.136 460.588 500.470 390.438 480.432 540.358 550.650 270.857 490.429 560.765 370.557 551.000 10.430 51
Francis Engelmann, Martin Bokeloh, Alireza Fathi, Bastian Leibe, Matthias Nießner: 3D-MPA: Multi Proposal Aggregation for 3D Semantic Instance Segmentation. CVPR 2020
OSIS0.605 491.000 10.801 390.599 530.535 440.728 530.286 280.436 640.679 210.491 370.433 520.256 570.404 640.857 490.620 380.724 470.510 621.000 10.539 35
AOIA0.601 501.000 10.761 510.687 430.485 530.828 300.008 640.663 360.405 470.405 560.425 550.490 380.596 460.714 600.553 460.779 320.597 450.992 500.424 53
PCJC0.578 511.000 10.810 330.583 570.449 570.813 360.042 580.603 480.341 530.490 380.465 490.410 480.650 270.835 570.264 660.694 550.561 530.889 610.504 41
SSEN0.575 521.000 10.761 500.473 600.477 550.795 410.066 540.529 550.658 260.460 420.461 500.380 520.331 660.859 480.401 590.692 570.653 361.000 10.348 58
Dongsu Zhang, Junha Chun, Sang Kyun Cha, Young Min Kim: Spatial Semantic Embedding Network: Fast 3D Instance Segmentation with Deep Metric Learning. arXiv
RWSeg0.567 530.528 700.708 610.626 480.580 380.745 520.063 550.627 440.240 580.400 580.497 460.464 400.515 531.000 10.475 510.745 430.571 501.000 10.429 52
NeuralBF0.555 540.667 600.896 170.843 180.517 490.751 500.029 590.519 560.414 460.439 460.465 480.000 760.484 550.857 490.287 640.693 560.651 381.000 10.485 45
Weiwei Sun, Daniel Rebain, Renjie Liao, Vladimir Tankovich, Soroosh Yazdani, Kwang Moo Yi, Andrea Tagliasacchi: NeuralBF: Neural Bilateral Filtering for Top-down Instance Segmentation on Point Clouds. WACV 2023
MTML0.549 551.000 10.807 360.588 560.327 620.647 580.004 660.815 70.180 610.418 530.364 620.182 600.445 581.000 10.442 540.688 580.571 511.000 10.396 54
Jean Lahoud, Bernard Ghanem, Marc Pollefeys, Martin R. Oswald: 3D Instance Segmentation via Multi-task Metric Learning. ICCV 2019 [oral]
ClickSeg_Instance0.539 561.000 10.621 650.300 630.530 460.698 540.127 470.533 540.222 590.430 510.400 570.365 530.574 500.938 450.472 520.659 600.543 580.944 560.347 59
One_Thing_One_Clickpermissive0.529 570.667 600.718 570.777 300.399 580.683 550.000 690.669 340.138 640.391 590.374 610.539 340.360 650.641 640.556 450.774 350.593 460.997 400.251 64
Zhengzhe Liu, Xiaojuan Qi, Chi-Wing Fu: One Thing One Click: A Self-Training Approach for Weakly Supervised 3D Semantic Segmentation. CVPR 2021
Sparse R-CNN0.515 581.000 10.538 700.282 640.468 560.790 420.173 380.345 660.429 430.413 550.484 470.176 610.595 480.591 650.522 480.668 590.476 630.986 540.327 60
Occipital-SCS0.512 591.000 10.716 580.509 590.506 510.611 610.092 510.602 490.177 620.346 620.383 590.165 620.442 590.850 560.386 610.618 640.543 590.889 610.389 55
3D-BoNet0.488 601.000 10.672 640.590 550.301 640.484 710.098 500.620 460.306 550.341 630.259 660.125 640.434 610.796 590.402 580.499 700.513 610.909 600.439 50
Bo Yang, Jianan Wang, Ronald Clark, Qingyong Hu, Sen Wang, Andrew Markham, Niki Trigoni: Learning Object Bounding Boxes for 3D Instance Segmentation on Point Clouds. NeurIPS 2019 Spotlight
PanopticFusion-inst0.478 610.667 600.712 600.595 540.259 670.550 670.000 690.613 470.175 630.250 680.434 510.437 440.411 630.857 490.485 500.591 670.267 730.944 560.359 57
Gaku Narita, Takashi Seno, Tomoya Ishikawa, Yohsuke Kaji: PanopticFusion: Online Volumetric Semantic Mapping at the Level of Stuff and Things. IROS 2019 (to appear)
SPG_WSIS0.470 620.667 600.685 630.677 450.372 600.562 650.000 690.482 610.244 570.316 650.298 630.052 710.442 600.857 490.267 650.702 520.559 541.000 10.287 62
SALoss-ResNet0.459 631.000 10.737 560.159 740.259 660.587 630.138 450.475 620.217 600.416 540.408 560.128 630.315 670.714 600.411 570.536 690.590 470.873 650.304 61
Zhidong Liang, Ming Yang, Hao Li, Chunxiang Wang: 3D Instance Embedding Learning With a Structure-Aware Loss Function for Point Cloud Segmentation. IEEE Robotics and Automation Letters (IROS2020)
MASCpermissive0.447 640.528 700.555 680.381 610.382 590.633 590.002 670.509 580.260 560.361 610.432 530.327 560.451 570.571 660.367 620.639 620.386 640.980 550.276 63
Chen Liu, Yasutaka Furukawa: MASC: Multi-scale Affinity with Sparse Convolution for 3D Instance Segmentation.
SegGroup_inspermissive0.445 650.667 600.773 470.185 710.317 630.656 570.000 690.407 650.134 650.381 600.267 650.217 590.476 560.714 600.452 530.629 630.514 601.000 10.222 67
An Tao, Yueqi Duan, Yi Wei, Jiwen Lu, Jie Zhou: SegGroup: Seg-Level Supervision for 3D Instance and Semantic Segmentation. TIP 2022
3D-SISpermissive0.382 661.000 10.432 730.245 660.190 680.577 640.013 630.263 680.033 710.320 640.240 670.075 670.422 620.857 490.117 710.699 530.271 720.883 630.235 66
Ji Hou, Angela Dai, Matthias Niessner: 3D-SIS: 3D Semantic Instance Segmentation of RGB-D Scans. CVPR 2019
Hier3Dcopyleft0.323 670.667 600.542 690.264 650.157 710.550 660.000 690.205 710.009 730.270 670.218 680.075 670.500 540.688 630.007 770.698 540.301 690.459 740.200 68
Tan: HCFS3D: Hierarchical Coupled Feature Selection Network for 3D Semantic and Instance Segmentation.
UNet-backbone0.319 680.667 600.715 590.233 670.189 690.479 720.008 640.218 690.067 700.201 700.173 690.107 650.123 720.438 680.150 680.615 650.355 650.916 590.093 76
R-PointNet0.306 690.500 720.405 740.311 620.348 610.589 620.054 560.068 740.126 660.283 660.290 640.028 720.219 700.214 710.331 630.396 740.275 700.821 670.245 65
Region-18class0.284 700.250 760.751 550.228 690.270 650.521 680.000 690.468 630.008 750.205 690.127 700.000 760.068 740.070 750.262 670.652 610.323 670.740 700.173 69
SemRegionNet-20cls0.250 710.333 730.613 660.229 680.163 700.493 690.000 690.304 670.107 670.147 730.100 720.052 700.231 680.119 730.039 730.445 720.325 660.654 710.141 72
tmp0.248 720.667 600.437 720.188 700.153 720.491 700.000 690.208 700.094 690.153 720.099 730.057 690.217 710.119 730.039 730.466 710.302 680.640 720.140 73
3D-BEVIS0.248 720.667 600.566 670.076 750.035 770.394 750.027 610.035 760.098 680.099 750.030 760.025 730.098 730.375 700.126 700.604 660.181 750.854 660.171 70
Cathrin Elich, Francis Engelmann, Jonas Schult, Theodora Kontogianni, Bastian Leibe: 3D-BEVIS: Birds-Eye-View Instance Segmentation.
Sem_Recon_ins0.227 740.764 590.486 710.069 760.098 740.426 740.017 620.067 750.015 720.172 710.100 710.096 660.054 760.183 720.135 690.366 750.260 740.614 730.168 71
ASIS0.199 750.333 730.253 760.167 730.140 730.438 730.000 690.177 720.008 740.121 740.069 740.004 750.231 690.429 690.036 750.445 730.273 710.333 760.119 75
Sgpn_scannet0.143 760.208 770.390 750.169 720.065 750.275 760.029 600.069 730.000 760.087 760.043 750.014 740.027 770.000 760.112 720.351 760.168 760.438 750.138 74
MaskRCNN 2d->3d Proj0.058 770.333 730.002 770.000 770.053 760.002 770.002 680.021 770.000 760.045 770.024 770.238 580.065 750.000 760.014 760.107 770.020 770.110 770.006 77


This table lists the benchmark results for the 2D semantic label scenario.


Method | Info | avg iou | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | floor | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | wall | window
Virtual MVFusion (R)0.745 10.861 10.839 10.881 10.672 20.512 10.422 170.898 10.723 10.714 10.954 20.454 10.509 10.773 10.895 10.756 10.820 10.653 10.935 10.891 10.728 1
Abhijit Kundu, Xiaoqi Yin, Alireza Fathi, David Ross, Brian Brewington, Thomas Funkhouser, Caroline Pantofaru: Virtual Multi-view Fusion for 3D Semantic Segmentation. ECCV 2020
BPNet_2Dcopyleft0.670 20.822 30.795 30.836 20.659 30.481 20.451 130.769 40.656 30.567 40.931 30.395 60.390 50.700 40.534 40.689 100.770 20.574 30.865 90.831 30.675 5
Wenbo Hu, Hengshuang Zhao, Li Jiang, Jiaya Jia and Tien-Tsin Wong: Bidirectional Projection Network for Cross Dimension Scene Understanding. CVPR 2021 (Oral)
MVF-GNN(2D)0.636 30.606 140.794 40.434 160.688 10.337 80.464 120.798 30.632 50.589 30.908 80.420 20.329 120.743 20.594 20.738 20.676 50.527 40.906 20.818 60.715 3
CU-Hybrid-2D Net0.636 30.825 20.820 20.179 230.648 40.463 30.549 20.742 70.676 20.628 20.961 10.420 20.379 60.684 80.381 180.732 30.723 30.599 20.827 160.851 20.634 7
CMX0.613 50.681 80.725 120.502 120.634 60.297 180.478 100.830 20.651 40.537 70.924 40.375 70.315 140.686 70.451 140.714 50.543 210.504 60.894 70.823 50.688 4
DMMF_3d0.605 60.651 90.744 100.782 30.637 50.387 40.536 30.732 80.590 70.540 60.856 210.359 110.306 150.596 140.539 30.627 200.706 40.497 80.785 210.757 190.476 22
EMSANet0.600 70.716 40.746 90.395 180.614 90.382 50.523 40.713 110.571 110.503 100.922 60.404 50.397 40.655 90.400 160.626 210.663 60.469 130.900 40.827 40.577 14
Daniel Seichter, Söhnke Fischedick, Mona Köhler, Horst-Michael Gross: EMSANet: Efficient Multi-Task RGB-D Scene Analysis for Indoor Environments. IJCNN 2022
MCA-Net0.595 80.533 200.756 80.746 40.590 100.334 100.506 70.670 150.587 80.500 120.905 100.366 100.352 90.601 130.506 80.669 160.648 90.501 70.839 150.769 150.516 21
RFBNet0.592 90.616 110.758 70.659 50.581 110.330 110.469 110.655 180.543 140.524 80.924 40.355 130.336 110.572 170.479 100.671 140.648 90.480 100.814 190.814 70.614 10
FAN_NV_RVC0.586 100.510 210.764 60.079 260.620 80.330 110.494 80.753 50.573 90.556 50.884 160.405 40.303 160.718 30.452 130.672 130.658 70.509 50.898 50.813 80.727 2
DCRedNet0.583 110.682 70.723 130.542 110.510 200.310 150.451 130.668 160.549 130.520 90.920 70.375 70.446 20.528 200.417 150.670 150.577 180.478 110.862 100.806 90.628 9
MIX6D_RVC0.582 120.695 50.687 170.225 210.632 70.328 130.550 10.748 60.623 60.494 150.890 140.350 150.254 230.688 60.454 120.716 40.597 170.489 90.881 80.768 160.575 15
SSMAcopyleft0.577 130.695 50.716 150.439 140.563 140.314 140.444 150.719 90.551 120.503 100.887 150.346 160.348 100.603 120.353 200.709 60.600 150.457 140.901 30.786 110.599 13
Abhinav Valada, Rohit Mohan, Wolfram Burgard: Self-Supervised Model Adaptation for Multimodal Semantic Segmentation. International Journal of Computer Vision, 2019
DMMF0.567 140.623 100.767 50.238 200.571 130.347 60.413 190.719 90.472 200.418 220.895 130.357 120.260 220.696 50.523 70.666 170.642 110.437 180.895 60.793 100.603 12
UNIV_CNP_RVC_UE0.566 150.569 190.686 190.435 150.524 170.294 190.421 180.712 120.543 140.463 170.872 170.320 170.363 80.611 110.477 110.686 110.627 120.443 170.862 100.775 140.639 6
EMSAFormer0.564 160.581 160.736 110.564 100.546 160.219 230.517 50.675 140.486 190.427 210.904 110.352 140.320 130.589 150.528 50.708 70.464 240.413 220.847 140.786 110.611 11
SN_RN152pyrx8_RVCcopyleft0.546 170.572 170.663 210.638 70.518 180.298 170.366 240.633 210.510 170.446 190.864 190.296 200.267 190.542 190.346 210.704 80.575 190.431 190.853 130.766 170.630 8
UDSSEG_RVC0.545 180.610 130.661 220.588 80.556 150.268 210.482 90.642 200.572 100.475 160.836 230.312 180.367 70.630 100.189 230.639 190.495 230.452 150.826 170.756 200.541 17
segfomer with 6d0.542 190.594 150.687 170.146 240.579 120.308 160.515 60.703 130.472 200.498 130.868 180.369 90.282 170.589 150.390 170.701 90.556 200.416 210.860 120.759 180.539 19
FuseNetpermissive0.535 200.570 180.681 200.182 220.512 190.290 200.431 160.659 170.504 180.495 140.903 120.308 190.428 30.523 210.365 190.676 120.621 140.470 120.762 220.779 130.541 17
Caner Hazirbas, Lingni Ma, Csaba Domokos, Daniel Cremers: FuseNet: Incorporating Depth into Semantic Segmentation via Fusion-based CNN Architecture. ACCV 2016
AdapNet++copyleft0.503 210.613 120.722 140.418 170.358 260.337 80.370 230.479 240.443 220.368 240.907 90.207 230.213 250.464 240.525 60.618 220.657 80.450 160.788 200.721 230.408 25
Abhinav Valada, Rohit Mohan, Wolfram Burgard: Self-Supervised Model Adaptation for Multimodal Semantic Segmentation. International Journal of Computer Vision, 2019
3DMV (2d proj)0.498 220.481 240.612 230.579 90.456 220.343 70.384 210.623 220.525 160.381 230.845 220.254 220.264 210.557 180.182 240.581 240.598 160.429 200.760 230.661 250.446 24
Angela Dai, Matthias Niessner: 3DMV: Joint 3D-Multi-View Prediction for 3D Semantic Scene Segmentation. ECCV'18
MSeg1080_RVCpermissive0.485 230.505 220.709 160.092 250.427 230.241 220.411 200.654 190.385 260.457 180.861 200.053 260.279 180.503 220.481 90.645 180.626 130.365 240.748 240.725 220.529 20
John Lambert*, Zhuang Liu*, Ozan Sener, James Hays, Vladlen Koltun: MSeg: A Composite Dataset for Multi-domain Semantic Segmentation. CVPR 2020
ILC-PSPNet0.475 240.490 230.581 240.289 190.507 210.067 260.379 220.610 230.417 240.435 200.822 250.278 210.267 190.503 220.228 220.616 230.533 220.375 230.820 180.729 210.560 16
Enet (reimpl)0.376 250.264 260.452 260.452 130.365 240.181 240.143 260.456 250.409 250.346 250.769 260.164 240.218 240.359 250.123 260.403 260.381 260.313 260.571 250.685 240.472 23
Re-implementation of Adam Paszke, Abhishek Chaurasia, Sangpil Kim, Eugenio Culurciello: ENet: A Deep Neural Network Architecture for Real-Time Semantic Segmentation.
ScanNet (2d proj)permissive0.330 260.293 250.521 250.657 60.361 250.161 250.250 250.004 260.440 230.183 260.836 230.125 250.060 260.319 260.132 250.417 250.412 250.344 250.541 260.427 260.109 26
Angela Dai, Angel X. Chang, Manolis Savva, Maciej Halber, Thomas Funkhouser, Matthias Nießner: ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes. CVPR'17


This table lists the benchmark results for the 2D semantic instance scenario.




Method | Info | avg ap | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | window
EMSANet (Instance) | - | 0.241 1 | 0.401 1 | 0.439 1 | 0.085 1 | 0.242 1 | 0.220 1 | 0.081 1 | 0.289 2 | 0.117 2 | 0.121 1 | 0.182 1 | 0.126 1 | 0.346 1 | 0.181 2 | 0.181 2 | 0.358 1 | 0.156 1 | 0.675 2 | 0.131 1
Daniel Seichter, Söhnke Fischedick, Mona Köhler, Horst-Michael Gross: EMSANet: Efficient Multi-Task RGB-D Scene Analysis for Indoor Environments. IJCNN 2022
UniDet_RVC | - | 0.205 2 | 0.381 2 | 0.323 3 | 0.037 3 | 0.226 3 | 0.177 3 | 0.063 2 | 0.277 3 | 0.120 1 | 0.067 3 | 0.131 3 | 0.074 3 | 0.317 2 | 0.080 3 | 0.235 1 | 0.289 3 | 0.141 3 | 0.678 1 | 0.080 3
FKNet | - | 0.204 3 | 0.334 3 | 0.358 2 | 0.038 2 | 0.234 2 | 0.184 2 | 0.025 3 | 0.318 1 | 0.042 4 | 0.088 2 | 0.141 2 | 0.053 4 | 0.300 3 | 0.207 1 | 0.171 3 | 0.292 2 | 0.149 2 | 0.636 3 | 0.109 2
MaskRCNN_ScanNet | permissive | 0.119 4 | 0.129 4 | 0.212 4 | 0.002 4 | 0.112 4 | 0.148 4 | 0.014 4 | 0.205 4 | 0.044 3 | 0.066 4 | 0.078 4 | 0.095 2 | 0.142 4 | 0.030 4 | 0.128 4 | 0.139 4 | 0.080 4 | 0.459 4 | 0.057 4
Re-implementation of Kaiming He, Georgia Gkioxari, Piotr Dollár, Ross Girshick: Mask R-CNN. ICCV'17


This table lists the benchmark results for the scene type classification scenario.




Method | Info | avg recall | apartment | bathroom | bedroom / hotel | bookstore / library | conference room | copy/mail room | hallway | kitchen | laundry room | living room / lounge | misc | office | storage / basement / garage
LAST-PCL-type | - | 0.780 1 | 0.250 3 | 1.000 1 | 1.000 1 | 1.000 1 | 1.000 1 | 1.000 1 | 0.500 2 | 1.000 1 | 0.500 2 | 0.889 1 | 0.000 2 | 1.000 1 | 1.000 1
Yanmin Wu, Qiankun Gao, Renrui Zhang, and Jian Zhang: Language-Assisted 3D Scene Understanding. arxiv23.12
multi-task | permissive | 0.700 2 | 0.500 1 | 1.000 1 | 0.882 3 | 0.500 3 | 1.000 1 | 1.000 1 | 0.500 2 | 1.000 1 | 1.000 1 | 0.778 2 | 0.000 2 | 0.938 2 | 0.000 3
Shengyu Huang, Mikhail Usvyatsov, Konrad Schindler: Indoor Scene Recognition in 3D. IROS 2020
3DASPP-SCE | - | 0.691 3 | 0.500 1 | 0.938 3 | 0.824 4 | 1.000 1 | 1.000 1 | 0.500 3 | 1.000 1 | 0.857 3 | 0.500 2 | 0.556 4 | 0.000 2 | 0.812 3 | 0.500 2
SE-ResNeXt-SSMA | - | 0.498 4 | 0.000 5 | 0.812 4 | 0.941 2 | 0.500 3 | 0.500 4 | 0.500 3 | 0.500 2 | 0.429 5 | 0.500 2 | 0.667 3 | 0.500 1 | 0.625 4 | 0.000 3
Abhinav Valada, Rohit Mohan, Wolfram Burgard: Self-Supervised Model Adaptation for Multimodal Semantic Segmentation. arXiv
resnet50_scannet | - | 0.353 5 | 0.250 3 | 0.812 4 | 0.529 5 | 0.500 3 | 0.500 4 | 0.000 5 | 0.500 2 | 0.571 4 | 0.000 5 | 0.556 4 | 0.000 2 | 0.375 5 | 0.000 3