Presenting the ScanNet200 Benchmark

We present the ScanNet200 benchmark, which studies an order of magnitude more class categories than the previous version of ScanNet. The scene geometry is shared between the two benchmarks, but the parsing of the surface annotations allows for a larger vocabulary and a more realistic setting for in-the-wild 3D understanding methods.

The ScanNet200 benchmark includes both finer-grained categories and a large number of previously unaddressed classes. This yields a much more challenging setting that covers the diversity of semantic classes naturally observed in the raw ScanNet RGB-D data, including the class imbalances that occur in practice. The difference in category frequencies between ScanNet and ScanNet200 can be seen in the figure above.

ScanNet200 Benchmark

This table lists the benchmark results for the ScanNet200 3D semantic label scenario.




Columns (each cell gives the score followed by the method's rank in that column): Method | Info | avg iou | head iou | common iou | tail iou | alarm clock | armchair | backpack | bag | ball | bar | basket | bathroom cabinet | bathroom counter | bathroom stall | bathroom stall door | bathroom vanity | bathtub | bed | bench | bicycle | bin | blackboard | blanket | blinds | board | book | bookshelf | bottle | bowl | box | broom | bucket | bulletin board | cabinet | calendar | candle | cart | case of water bottles | cd case | ceiling | ceiling light | chair | clock | closet | closet door | closet rod | closet wall | clothes | clothes dryer | coat rack | coffee kettle | coffee maker | coffee table | column | computer tower | container | copier | couch | counter | crate | cup | curtain | cushion | decoration | desk | dining table | dish rack | dishwasher | divider | door | doorframe | dresser | dumbbell | dustpan | end table | fan | file cabinet | fire alarm | fire extinguisher | fireplace | floor | folded chair | furniture | guitar | guitar case | hair dryer | handicap bar | hat | headphones | ironing board | jacket | keyboard | keyboard piano | kitchen cabinet | kitchen counter | ladder | lamp | laptop | laundry basket | laundry detergent | laundry hamper | ledge | light | light switch | luggage | machine | mailbox | mat | mattress | microwave | mini fridge | mirror | monitor | mouse | music stand | nightstand | object | office chair | ottoman | oven | paper | paper bag | paper cutter | paper towel dispenser | paper towel roll | person | piano | picture | pillar | pillow | pipe | plant | plate | plunger | poster | potted plant | power outlet | power strip | printer | projector | projector screen | purse | rack | radiator | rail | range hood | recycling bin | refrigerator | scale | seat | shelf | shoe | shower | shower curtain | shower curtain rod | shower door | shower floor | shower head | shower wall | sign | sink | soap dish | soap dispenser | sofa chair | speaker | stair rail | stairs | stand | stool | storage bin | storage container | storage organizer | stove | structure | stuffed animal | suitcase | table | telephone | tissue box | toaster | toaster oven | toilet | toilet paper | toilet paper dispenser | toilet paper holder | toilet seat cover dispenser | towel | trash bin | trash can | tray | tube | tv | tv stand | vacuum cleaner | vent | wall | wardrobe | washing machine | water bottle | water cooler | water pitcher | whiteboard | window | windowsill
LGroundpermissive0.272 10.485 10.184 10.106 10.476 10.077 20.218 10.000 10.000 10.000 10.547 10.295 10.540 10.746 20.745 10.058 20.112 30.005 10.658 20.077 30.000 20.322 10.178 30.512 20.190 10.199 10.277 10.000 10.000 10.173 10.399 10.000 10.000 10.039 20.000 20.858 10.085 20.676 10.002 10.103 10.498 10.323 10.703 10.000 10.000 10.296 10.549 20.216 10.702 10.768 10.718 10.028 20.092 20.786 20.000 10.000 20.453 20.022 20.251 30.252 10.572 10.348 10.321 10.514 10.063 20.279 20.552 10.000 20.019 20.932 10.132 20.000 10.000 10.000 30.156 30.457 10.623 10.518 10.265 20.358 20.381 10.395 10.000 10.000 10.127 30.012 30.051 10.000 10.000 20.886 20.014 10.437 30.179 10.244 10.826 10.000 10.000 10.599 10.136 10.085 20.000 20.000 10.000 10.565 10.612 10.143 10.207 10.566 10.232 20.446 10.127 10.708 20.000 20.384 10.000 10.000 10.000 10.402 10.000 10.059 10.000 10.525 30.566 10.229 20.659 20.000 10.000 10.265 10.446 10.147 20.720 30.597 20.066 20.000 10.187 10.000 10.726 10.467 30.134 30.000 20.413 30.629 20.000 10.363 20.055 30.022 20.000 10.626 10.000 20.000 10.323 20.479 30.154 20.117 10.028 20.901 10.243 10.415 30.295 30.143 30.610 20.000 10.000 20.777 10.397 30.324 20.000 10.778 10.179 10.702 20.000 10.274 30.404 10.233 10.622 10.398 2
David Rozenberszki, Or Litany, Angela Dai: Language-Grounded Indoor 3D Semantic Segmentation in the Wild. arXiv
CSC-Pretrainpermissive0.249 30.455 30.171 30.079 30.418 20.059 30.186 20.000 10.000 10.000 10.335 30.250 20.316 20.766 10.697 30.142 10.170 10.003 20.553 30.112 10.097 10.201 30.186 20.476 30.081 20.000 20.216 30.000 10.000 10.001 30.314 30.000 10.000 10.055 10.000 20.832 30.094 10.659 20.002 10.076 20.310 30.293 30.664 30.000 10.000 10.175 30.634 10.130 20.552 30.686 30.700 30.076 10.110 10.770 30.000 10.000 20.430 30.000 30.319 10.166 20.542 30.327 20.205 30.332 20.052 30.375 10.444 30.000 20.012 30.930 30.203 10.000 10.000 10.046 10.175 10.413 20.592 20.471 20.299 10.152 30.340 20.247 30.000 10.000 10.225 10.058 20.037 20.000 10.207 10.862 30.014 10.548 10.033 20.233 20.816 20.000 10.000 10.542 30.123 20.121 10.019 10.000 10.000 10.463 20.454 30.045 30.128 30.557 20.235 10.441 20.063 30.484 30.000 20.308 30.000 10.000 10.000 10.318 30.000 10.000 20.000 10.545 20.543 20.164 30.734 10.000 10.000 10.215 30.371 20.198 10.743 10.205 30.062 30.000 10.079 20.000 10.683 20.547 20.142 20.000 20.441 20.579 30.000 10.464 10.098 20.041 10.000 10.590 20.000 20.000 10.373 10.494 10.174 10.105 20.001 30.895 20.222 20.537 20.307 20.180 20.625 10.000 10.000 20.591 30.609 20.398 10.000 10.766 30.014 30.638 30.000 10.377 10.004 30.206 30.609 30.465 1
Ji Hou, Benjamin Graham, Matthias Nießner, Saining Xie: Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts. CVPR 2021
Minkowski 34Dpermissive0.253 20.463 20.154 20.102 20.381 30.084 10.134 30.000 10.000 10.000 10.386 20.141 30.279 30.737 30.703 20.014 30.164 20.000 30.663 10.092 20.000 20.224 20.291 10.531 10.056 30.000 20.242 20.000 10.000 10.013 20.331 20.000 10.000 10.035 30.001 10.858 10.059 30.650 30.000 30.056 30.353 20.299 20.670 20.000 10.000 10.284 20.484 30.071 30.594 20.720 20.710 20.027 30.068 30.813 10.000 10.005 10.492 10.164 10.274 20.111 30.571 20.307 30.293 20.307 30.150 10.163 30.531 20.002 10.545 10.932 10.093 30.000 10.000 10.002 20.159 20.368 30.581 30.440 30.228 30.406 10.282 30.294 20.000 10.000 10.189 20.060 10.036 30.000 10.000 20.897 10.000 30.525 20.025 30.205 30.771 30.000 10.000 10.593 20.108 30.044 30.000 20.000 10.000 10.282 30.589 20.094 20.169 20.466 30.227 30.419 30.125 20.757 10.002 10.334 20.000 10.000 10.000 10.357 20.000 10.000 20.000 10.582 10.513 30.337 10.612 30.000 10.000 10.250 20.352 30.136 30.724 20.655 10.280 10.000 10.046 30.000 10.606 30.559 10.159 10.102 10.445 10.655 10.000 10.310 30.117 10.000 30.000 10.581 30.026 10.000 10.265 30.483 20.084 30.097 30.044 10.865 30.142 30.588 10.351 10.272 10.596 30.000 10.003 10.622 20.720 10.096 30.000 10.771 20.016 20.772 10.000 10.302 20.194 20.214 20.621 20.197 3
C. Choy, J. Gwak, S. Savarese: 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks. CVPR 2019
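The IoU columns above report intersection-over-union per class. As a minimal sketch, assuming the conventional definition IoU = TP / (TP + FP + FN) over labeled points and using a hypothetical helper `per_class_iou` (not the benchmark's own evaluation script), the per-class values could be computed like this:

```python
def per_class_iou(gt_labels, pred_labels, num_classes):
    """Per-class IoU = TP / (TP + FP + FN).

    Hypothetical helper for illustration only. Labels are assumed to be
    integers in [0, num_classes). Classes that appear in neither the ground
    truth nor the predictions get None instead of a score.
    """
    tp = [0] * num_classes
    fp = [0] * num_classes
    fn = [0] * num_classes
    for gt, pred in zip(gt_labels, pred_labels):
        if gt == pred:
            tp[gt] += 1
        else:
            fp[pred] += 1  # predicted class gains a false positive
            fn[gt] += 1    # true class gains a false negative
    ious = []
    for c in range(num_classes):
        denom = tp[c] + fp[c] + fn[c]
        ious.append(tp[c] / denom if denom else None)
    return ious


# Toy example with three classes and six labeled points.
gt = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
ious = per_class_iou(gt, pred, 3)
print(ious)
```

How the benchmark aggregates these per-class values into the avg, head, common, and tail columns is not stated here, so the sketch stops at the per-class scores.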

This table lists the benchmark results for the ScanNet200 3D semantic instance scenario.




Columns: Method | Info | avg ap 50% | head ap 50% | common ap 50% | tail ap 50% | alarm clock | armchair | backpack | bag | ball | bar | basket | bathroom cabinet | bathroom counter | bathroom stall | bathroom stall door | bathroom vanity | bathtub | bed | bench | bicycle | bin | blackboard | blanket | blinds | board | book | bookshelf | bottle | bowl | box | broom | bucket | bulletin board | cabinet | calendar | candle | cart | case of water bottles | cd case | ceiling | ceiling light | chair | clock | closet | closet door | closet rod | closet wall | clothes | clothes dryer | coat rack | coffee kettle | coffee maker | coffee table | column | computer tower | container | copier | couch | counter | crate | cup | curtain | cushion | decoration | desk | dining table | dish rack | dishwasher | divider | door | doorframe | dresser | dumbbell | dustpan | end table | fan | file cabinet | fire alarm | fire extinguisher | fireplace | folded chair | furniture | guitar | guitar case | hair dryer | handicap bar | hat | headphones | ironing board | jacket | keyboard | keyboard piano | kitchen cabinet | kitchen counter | ladder | lamp | laptop | laundry basket | laundry detergent | laundry hamper | ledge | light | light switch | luggage | machine | mailbox | mat | mattress | microwave | mini fridge | mirror | monitor | mouse | music stand | nightstand | object | office chair | ottoman | oven | paper | paper bag | paper cutter | paper towel dispenser | paper towel roll | person | piano | picture | pillar | pillow | pipe | plant | plate | plunger | poster | potted plant | power outlet | power strip | printer | projector | projector screen | purse | rack | radiator | rail | range hood | recycling bin | refrigerator | scale | seat | shelf | shoe | shower | shower curtain | shower curtain rod | shower door | shower floor | shower head | shower wall | sign | sink | soap dish | soap dispenser | sofa chair | speaker | stair rail | stairs | stand | stool | storage bin | storage container | storage organizer | stove | structure | stuffed animal | suitcase | table | telephone | tissue box | toaster | toaster oven | toilet | toilet paper | toilet paper dispenser | toilet paper holder | toilet seat cover dispenser | towel | trash bin | trash can | tray | tube | tv | tv stand | vacuum cleaner | vent | wardrobe | washing machine | water bottle | water cooler | water pitcher | whiteboard | window | windowsill
No results yet.
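The "ap 50%" columns refer to average precision at an IoU overlap of 0.5 between predicted and ground-truth instances. The following is a minimal sketch of that overlap test only, assuming instances are represented as sets of vertex indices. The helper names `mask_iou` and `is_match` are hypothetical, and whether the benchmark treats an overlap of exactly 0.5 as a match is an assumption here.

```python
def mask_iou(pred_vertices, gt_vertices):
    """IoU between two instance masks given as iterables of vertex indices."""
    pred, gt = set(pred_vertices), set(gt_vertices)
    union = len(pred | gt)
    return len(pred & gt) / union if union else 0.0


def is_match(pred_vertices, gt_vertices, threshold=0.5):
    """True if the prediction overlaps the ground truth enough to count.

    Assumption: ">=" is used at the threshold; the official evaluation
    may use a strict comparison instead.
    """
    return mask_iou(pred_vertices, gt_vertices) >= threshold


# Overlap of 3 shared vertices out of 5 total: IoU = 0.6, counts at 50%.
good = is_match({1, 2, 3, 4}, {2, 3, 4, 5})
# Overlap of 2 shared vertices out of 6 total: IoU = 1/3, does not count.
bad = is_match({1, 2, 3, 4}, {3, 4, 5, 6})
print(good, bad)
```

A full AP computation would additionally sort predictions by confidence and integrate the precision-recall curve per class, which this sketch leaves out.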

ScanNet Benchmark

This table lists the benchmark results for the 3D semantic label scenario.


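Columns (each cell gives the score followed by the method's rank in that column): Method | Info | avg iou | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | floor | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | wall | window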
Mix3Dpermissive0.781 10.964 10.855 10.843 80.781 10.858 60.575 20.831 120.685 40.714 10.979 10.594 30.310 140.801 10.892 50.841 20.819 30.723 20.940 50.887 10.725 8
Alexey Nekrasov, Jonas Schult, Or Litany, Bastian Leibe, Francis Engelmann: Mix3D: Out-of-Context Data Augmentation for 3D Scenes. 3DV 2021 (Oral)
OccuSeg+Semantic0.764 20.758 380.796 120.839 90.746 60.907 10.562 30.850 70.680 50.672 30.978 20.610 10.335 70.777 30.819 220.847 10.830 10.691 60.972 10.885 20.727 6
O-CNNpermissive0.762 30.924 20.823 40.844 70.770 20.852 80.577 10.847 80.711 10.640 110.958 70.592 40.217 460.762 60.888 60.758 60.813 50.726 10.932 110.868 60.744 2
Peng-Shuai Wang, Yang Liu, Yu-Xiao Guo, Chun-Yu Sun, Xin Tong: O-CNN: Octree-based Convolutional Neural Networks for 3D Shape Analysis. SIGGRAPH 2017
PointConvFormer0.749 40.793 240.790 150.807 170.750 50.856 70.524 80.881 30.588 290.642 100.977 40.591 50.274 270.781 20.929 10.804 30.796 110.642 130.947 30.885 20.715 10
BPNetcopyleft0.749 40.909 30.818 70.811 140.752 40.839 120.485 210.842 100.673 60.644 80.957 80.528 160.305 160.773 40.859 100.788 40.818 40.693 50.916 120.856 110.723 9
Wenbo Hu, Hengshuang Zhao, Li Jiang, Jiaya Jia, Tien-Tsin Wong: Bidirectional Projection Network for Cross Dimension Scene Understanding. CVPR 2021 (Oral)
StratifiedFormerpermissive0.747 60.901 40.803 100.845 60.757 30.846 100.512 110.825 130.696 30.645 70.956 90.576 70.262 330.744 110.861 90.742 80.770 240.705 30.899 210.860 90.734 3
Xin Lai*, Jianhui Liu*, Li Jiang, Liwei Wang, Hengshuang Zhao, Shu Liu, Xiaojuan Qi, Jiaya Jia: Stratified Transformer for 3D Point Cloud Segmentation. CVPR 2022
Virtual MVFusion0.746 70.771 320.819 60.848 40.702 160.865 50.397 580.899 10.699 20.664 40.948 300.588 60.330 80.746 100.851 150.764 50.796 110.704 40.935 80.866 70.728 4
Abhijit Kundu, Xiaoqi Yin, Alireza Fathi, David Ross, Brian Brewington, Thomas Funkhouser, Caroline Pantofaru: Virtual Multi-view Fusion for 3D Semantic Segmentation. ECCV 2020
VMNetpermissive0.746 70.870 90.838 20.858 20.729 90.850 90.501 140.874 40.587 300.658 50.956 90.564 90.299 170.765 50.900 30.716 160.812 60.631 180.939 60.858 100.709 11
Zeyu Hu, Xuyang Bai, Jiaxiang Shang, Runze Zhang, Jiayu Dong, Xin Wang, Guangyuan Sun, Hongbo Fu, Chiew-Lan Tai: VMNet: Voxel-Mesh Network for Geodesic-Aware 3D Semantic Segmentation. ICCV 2021 (Oral)
EQ-Net0.743 90.620 670.799 110.849 30.730 80.822 230.493 190.897 20.664 70.681 20.955 120.562 100.378 10.760 70.903 20.738 90.801 90.673 80.907 150.877 40.745 1
Zetong Yang*, Li Jiang*, Yanan Sun, Bernt Schiele, Jiaya Jia: A Unified Query-based Paradigm for Point Cloud Understanding. CVPR 2022
MinkowskiNetpermissive0.736 100.859 110.818 70.832 100.709 130.840 110.521 100.853 60.660 90.643 90.951 200.544 110.286 230.731 120.893 40.675 280.772 220.683 70.874 410.852 130.727 6
C. Choy, J. Gwak, S. Savarese: 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks. CVPR 2019
IPCA0.731 110.890 50.837 30.864 10.726 100.873 20.530 70.824 140.489 610.647 60.978 20.609 20.336 60.624 270.733 350.758 60.776 200.570 410.949 20.877 40.728 4
SparseConvNet0.725 120.647 630.821 50.846 50.721 110.869 30.533 60.754 310.603 250.614 140.955 120.572 80.325 100.710 130.870 70.724 130.823 20.628 190.934 90.865 80.683 16
MatchingNet0.724 130.812 210.812 90.810 150.735 70.834 150.495 180.860 50.572 350.602 200.954 140.512 190.280 240.757 80.845 180.725 120.780 180.606 270.937 70.851 140.700 13
INS-Conv-semantic0.717 140.751 410.759 270.812 130.704 150.868 40.537 50.842 100.609 210.608 170.953 160.534 120.293 190.616 280.864 80.719 150.793 140.640 140.933 100.845 180.663 20
contrastBoundarypermissive0.705 150.769 350.775 210.809 160.687 180.820 260.439 450.812 190.661 80.591 230.945 390.515 180.171 630.633 240.856 110.720 140.796 110.668 90.889 290.847 160.689 15
Liyao Tang, Yibing Zhan, Zhe Chen, Baosheng Yu, Dacheng Tao: Contrastive Boundary Learning for Point Cloud Segmentation. CVPR 2022
RFCR0.702 160.889 60.745 350.813 120.672 210.818 300.493 190.815 170.623 160.610 150.947 330.470 310.249 380.594 320.848 160.705 200.779 190.646 120.892 270.823 250.611 36
Jingyu Gong, Jiachen Xu, Xin Tan, Haichuan Song, Yanyun Qu, Yuan Xie, Lizhuang Ma: Omni-Supervised Point Cloud Segmentation via Gradual Receptive Field Component Reasoning. CVPR 2021
One Thing One Click0.701 170.825 180.796 120.723 370.716 120.832 160.433 470.816 150.634 140.609 160.969 60.418 550.344 40.559 440.833 190.715 170.808 70.560 450.902 180.847 160.680 17
JSENetpermissive0.699 180.881 80.762 250.821 110.667 220.800 430.522 90.792 230.613 180.607 180.935 570.492 240.205 510.576 380.853 130.691 230.758 290.652 110.872 440.828 220.649 25
Zeyu Hu, Mingmin Zhen, Xuyang Bai, Hongbo Fu, Chiew-Lan Tai: JSENet: Joint Semantic Segmentation and Edge Detection Network for 3D Point Clouds. ECCV 2020
PicassoNet-IIpermissive0.696 190.704 530.790 150.787 230.709 130.837 130.459 310.815 170.543 450.615 130.956 90.529 140.250 360.551 470.790 270.703 210.799 100.619 220.908 140.848 150.700 13
Huan Lei, Naveed Akhtar, Mubarak Shah, and Ajmal Mian: Geometric feature learning for 3D meshes.
CU-Hybrid Net0.693 200.596 710.789 170.803 190.677 200.800 430.469 250.846 90.554 430.591 230.948 300.500 220.316 120.609 290.847 170.732 100.808 70.593 330.894 250.839 190.652 24
One-Thing-One-Click0.693 200.743 430.794 140.655 610.684 190.822 230.497 170.719 410.622 170.617 120.977 40.447 430.339 50.750 90.664 490.703 210.790 160.596 300.946 40.855 120.647 26
Zhengzhe Liu, Xiaojuan Qi, Chi-Wing Fu: One Thing One Click: A Self-Training Approach for Weakly Supervised 3D Semantic Segmentation. CVPR 2021
Feature_GeometricNetpermissive0.690 220.884 70.754 310.795 220.647 270.818 300.422 490.802 220.612 190.604 190.945 390.462 340.189 580.563 430.853 130.726 110.765 250.632 170.904 160.821 270.606 40
Kangcheng Liu, Ben M. Chen: https://arxiv.org/abs/2012.09439. arXiv Preprint
FusionNet0.688 230.704 530.741 390.754 340.656 230.829 180.501 140.741 360.609 210.548 310.950 240.522 170.371 20.633 240.756 300.715 170.771 230.623 200.861 510.814 290.658 21
Feihu Zhang, Jin Fang, Benjamin Wah, Philip Torr: Deep FusionNet for Point Cloud Semantic Segmentation. ECCV 2020
KP-FCNN0.684 240.847 140.758 290.784 250.647 270.814 330.473 230.772 260.605 230.594 220.935 570.450 410.181 610.587 330.805 250.690 240.785 170.614 230.882 330.819 280.632 31
H. Thomas, C. Qi, J. Deschaud, B. Marcotegui, F. Goulette, L. Guibas: KPConv: Flexible and Deformable Convolution for Point Clouds. ICCV 2019
VACNN++0.684 240.728 490.757 300.776 270.690 170.804 410.464 290.816 150.577 340.587 250.945 390.508 210.276 260.671 140.710 400.663 330.750 320.589 360.881 350.832 210.653 23
Superpoint Network0.683 260.851 130.728 440.800 210.653 250.806 390.468 260.804 200.572 350.602 200.946 360.453 400.239 410.519 530.822 200.689 260.762 270.595 320.895 240.827 230.630 32
PointContrast_LA_SEM0.683 260.757 390.784 180.786 240.639 300.824 220.408 530.775 250.604 240.541 330.934 610.532 130.269 300.552 450.777 280.645 430.793 140.640 140.913 130.824 240.671 18
VI-PointConv0.676 280.770 340.754 310.783 260.621 340.814 330.552 40.758 290.571 370.557 290.954 140.529 140.268 320.530 510.682 450.675 280.719 400.603 280.888 300.833 200.665 19
Xingyi Li, Wenxuan Wu, Xiaoli Z. Fern, Li Fuxin: The Devils in the Point Clouds: Studying the Robustness of Point Cloud Convolutions.
ROSMRF3D0.673 290.789 250.748 330.763 320.635 320.814 330.407 550.747 330.581 330.573 260.950 240.484 250.271 290.607 300.754 310.649 380.774 210.596 300.883 320.823 250.606 40
SALANet0.670 300.816 200.770 230.768 300.652 260.807 380.451 330.747 330.659 100.545 320.924 670.473 300.149 730.571 400.811 240.635 460.746 330.623 200.892 270.794 420.570 52
PointConvpermissive0.666 310.781 270.759 270.699 450.644 290.822 230.475 220.779 240.564 400.504 490.953 160.428 490.203 530.586 350.754 310.661 340.753 300.588 370.902 180.813 310.642 27
Wenxuan Wu, Zhongang Qi, Li Fuxin: PointConv: Deep Convolutional Networks on 3D Point Clouds. CVPR 2019
PointASNLpermissive0.666 310.703 550.781 190.751 360.655 240.830 170.471 240.769 270.474 640.537 350.951 200.475 290.279 250.635 220.698 440.675 280.751 310.553 500.816 620.806 330.703 12
Xu Yan, Chaoda Zheng, Zhen Li, Sheng Wang, Shuguang Cui: PointASNL: Robust Point Clouds Processing using Nonlocal Neural Networks with Adaptive Sampling. CVPR 2020
PPCNN++permissive0.663 330.746 420.708 470.722 380.638 310.820 260.451 330.566 680.599 270.541 330.950 240.510 200.313 130.648 190.819 220.616 520.682 550.590 350.869 470.810 320.656 22
Pyunghwan Ahn, Juyoung Yang, Eojindl Yi, Chanho Lee, Junmo Kim: Projection-based Point Convolution for Efficient Point Cloud Segmentation. IEEE Access
DCM-Net0.658 340.778 280.702 500.806 180.619 350.813 360.468 260.693 480.494 570.524 410.941 490.449 420.298 180.510 550.821 210.675 280.727 390.568 430.826 590.803 350.637 29
Jonas Schult*, Francis Engelmann*, Theodora Kontogianni, Bastian Leibe: DualConvMesh-Net: Joint Geodesic and Euclidean Convolutions on 3D Meshes. CVPR 2020 (Oral)
HPGCNN0.656 350.698 560.743 370.650 620.564 530.820 260.505 130.758 290.631 150.479 540.945 390.480 270.226 420.572 390.774 290.690 240.735 360.614 230.853 540.776 560.597 45
Jisheng Dang, Qingyong Hu, Yulan Guo, Jun Yang: HPGCNN.
SAFNet-segpermissive0.654 360.752 400.734 410.664 580.583 470.815 320.399 570.754 310.639 120.535 370.942 470.470 310.309 150.665 150.539 570.650 370.708 460.635 160.857 530.793 440.642 27
Linqing Zhao, Jiwen Lu, Jie Zhou: Similarity-Aware Fusion Network for 3D Semantic Segmentation. IROS 2021
RandLA-Netpermissive0.645 370.778 280.731 420.699 450.577 480.829 180.446 370.736 370.477 630.523 430.945 390.454 380.269 300.484 620.749 340.618 500.738 340.599 290.827 580.792 470.621 34
MVPNetpermissive0.641 380.831 150.715 450.671 550.590 430.781 530.394 590.679 510.642 110.553 300.937 550.462 340.256 340.649 180.406 710.626 470.691 530.666 100.877 370.792 470.608 39
Maximilian Jaritz, Jiayuan Gu, Hao Su: Multi-view PointNet for 3D Scene Understanding. GMDL Workshop, ICCV 2019
PointConv-SFPN0.641 380.776 300.703 490.721 390.557 560.826 200.451 330.672 530.563 410.483 530.943 460.425 520.162 680.644 200.726 360.659 350.709 450.572 400.875 390.786 510.559 57
PointMRNet0.640 400.717 520.701 510.692 480.576 490.801 420.467 280.716 420.563 410.459 580.953 160.429 480.169 650.581 360.854 120.605 530.710 430.550 510.894 250.793 440.575 50
FPConvpermissive0.639 410.785 260.760 260.713 430.603 380.798 460.392 600.534 730.603 250.524 410.948 300.457 360.250 360.538 490.723 380.598 570.696 510.614 230.872 440.799 360.567 54
Yiqun Lin, Zizheng Yan, Haibin Huang, Dong Du, Ligang Liu, Shuguang Cui, Xiaoguang Han: FPConv: Learning Local Flattening for Point Convolution. CVPR 2020
PD-Net0.638 420.797 230.769 240.641 670.590 430.820 260.461 300.537 720.637 130.536 360.947 330.388 620.206 500.656 160.668 470.647 410.732 380.585 380.868 480.793 440.473 74
PointSPNet0.637 430.734 460.692 580.714 420.576 490.797 470.446 370.743 350.598 280.437 630.942 470.403 580.150 720.626 260.800 260.649 380.697 500.557 480.846 550.777 550.563 55
SConv0.636 440.830 160.697 540.752 350.572 520.780 550.445 390.716 420.529 480.530 380.951 200.446 440.170 640.507 570.666 480.636 450.682 550.541 570.886 310.799 360.594 46
Supervoxel-CNN0.635 450.656 610.711 460.719 400.613 360.757 640.444 420.765 280.534 470.566 270.928 650.478 280.272 280.636 210.531 590.664 320.645 660.508 640.864 500.792 470.611 36
joint point-basedpermissive0.634 460.614 680.778 200.667 570.633 330.825 210.420 500.804 200.467 660.561 280.951 200.494 230.291 200.566 410.458 650.579 620.764 260.559 470.838 560.814 290.598 44
Hung-Yueh Chiang, Yen-Liang Lin, Yueh-Cheng Liu, Winston H. Hsu: A Unified Point-Based Framework for 3D Segmentation. 3DV 2019
MCCNNpermissive0.633 470.866 100.731 420.771 280.576 490.809 370.410 520.684 490.497 560.491 510.949 270.466 330.105 770.581 360.646 510.620 480.680 570.542 560.817 610.795 400.618 35
P. Hermosilla, T. Ritschel, P.P. Vazquez, A. Vinacua, T. Ropinski: Monte Carlo Convolution for Learning on Non-Uniformly Sampled Point Clouds. SIGGRAPH Asia 2018
PointMTL0.632 480.731 470.688 610.675 520.591 420.784 520.444 420.565 690.610 200.492 500.949 270.456 370.254 350.587 330.706 410.599 560.665 620.612 260.868 480.791 500.579 49
3DSM_DMMF0.631 490.626 660.745 350.801 200.607 370.751 650.506 120.729 400.565 390.491 510.866 800.434 450.197 560.595 310.630 520.709 190.705 480.560 450.875 390.740 660.491 69
PointNet2-SFPN0.631 490.771 320.692 580.672 530.524 600.837 130.440 440.706 460.538 460.446 600.944 440.421 540.219 450.552 450.751 330.591 590.737 350.543 550.901 200.768 580.557 58
APCF-Net0.631 490.742 440.687 630.672 530.557 560.792 500.408 530.665 540.545 440.508 460.952 190.428 490.186 590.634 230.702 420.620 480.706 470.555 490.873 420.798 380.581 48
Haojia, Lin: Adaptive Pyramid Context Fusion for Point Cloud Perception. GRSL
FusionAwareConv0.630 520.604 700.741 390.766 310.590 430.747 660.501 140.734 380.503 550.527 390.919 710.454 380.323 110.550 480.420 700.678 270.688 540.544 530.896 230.795 400.627 33
Jiazhao Zhang, Chenyang Zhu, Lintao Zheng, Kai Xu: Fusion-Aware Point Convolution for Online Semantic 3D Scene Segmentation. CVPR 2020
DenSeR0.628 530.800 220.625 740.719 400.545 580.806 390.445 390.597 620.448 700.519 440.938 540.481 260.328 90.489 610.499 640.657 360.759 280.592 340.881 350.797 390.634 30
SegGroup_sempermissive0.627 540.818 190.747 340.701 440.602 390.764 610.385 650.629 590.490 590.508 460.931 640.409 570.201 540.564 420.725 370.618 500.692 520.539 580.873 420.794 420.548 61
An Tao, Yueqi Duan, Yi Wei, Jiwen Lu, Jie Zhou: SegGroup: Seg-Level Supervision for 3D Instance and Semantic Segmentation.
SIConv0.625 550.830 160.694 560.757 330.563 540.772 590.448 360.647 570.520 500.509 450.949 270.431 470.191 570.496 590.614 530.647 410.672 600.535 600.876 380.783 520.571 51
HPEIN0.618 560.729 480.668 640.647 640.597 410.766 600.414 510.680 500.520 500.525 400.946 360.432 460.215 470.493 600.599 540.638 440.617 710.570 410.897 220.806 330.605 42
Li Jiang, Hengshuang Zhao, Shu Liu, Xiaoyong Shen, Chi-Wing Fu, Jiaya Jia: Hierarchical Point-Edge Interaction Network for Point Cloud Semantic Segmentation. ICCV 2019
SPH3D-GCNpermissive0.610 570.858 120.772 220.489 790.532 590.792 500.404 560.643 580.570 380.507 480.935 570.414 560.046 830.510 550.702 420.602 550.705 480.549 520.859 520.773 570.534 63
Huan Lei, Naveed Akhtar, and Ajmal Mian: Spherical Kernel for Efficient Graph Convolution on 3D Point Clouds. TPAMI 2020
AttAN0.609 580.760 370.667 650.649 630.521 610.793 480.457 320.648 560.528 490.434 650.947 330.401 590.153 710.454 640.721 390.648 400.717 410.536 590.904 160.765 590.485 70
Gege Zhang, Qinghua Ma, Licheng Jiao, Fang Liu, and Qigong Sun: AttAN: Attention Adversarial Networks for 3D Point Cloud Semantic Segmentation. IJCAI 2020
wsss-transformer0.600 590.634 640.743 370.697 470.601 400.781 530.437 460.585 660.493 580.446 600.933 620.394 600.011 850.654 170.661 500.603 540.733 370.526 610.832 570.761 610.480 71
LAP-D0.594 600.720 500.692 580.637 680.456 690.773 580.391 620.730 390.587 300.445 620.940 510.381 630.288 210.434 670.453 670.591 590.649 640.581 390.777 660.749 650.610 38
DPC0.592 610.720 500.700 520.602 720.480 650.762 630.380 660.713 440.585 320.437 630.940 510.369 650.288 210.434 670.509 630.590 610.639 690.567 440.772 670.755 630.592 47
Francis Engelmann, Theodora Kontogianni, Bastian Leibe: Dilated Point Convolutions: On the Receptive Field Size of Point Convolutions on 3D Point Clouds. ICRA 2020
CCRFNet0.589 620.766 360.659 680.683 500.470 680.740 680.387 640.620 610.490 590.476 550.922 690.355 690.245 390.511 540.511 620.571 630.643 670.493 680.872 440.762 600.600 43
ROSMRF0.580 630.772 310.707 480.681 510.563 540.764 610.362 670.515 740.465 670.465 570.936 560.427 510.207 490.438 650.577 550.536 670.675 590.486 690.723 720.779 530.524 65
SQN_0.1%0.569 640.676 580.696 550.657 600.497 620.779 560.424 480.548 700.515 520.376 700.902 780.422 530.357 30.379 710.456 660.596 580.659 630.544 530.685 750.665 780.556 59
TextureNetpermissive0.566 650.672 600.664 660.671 550.494 630.719 690.445 390.678 520.411 760.396 680.935 570.356 680.225 430.412 690.535 580.565 640.636 700.464 720.794 650.680 750.568 53
Jingwei Huang, Haotian Zhang, Li Yi, Thomas Funkhouser, Matthias Niessner, Leonidas Guibas: TextureNet: Consistent Local Parametrizations for Learning from High-Resolution Signals on Meshes. CVPR
DVVNet0.562 660.648 620.700 520.770 290.586 460.687 730.333 710.650 550.514 530.475 560.906 750.359 670.223 440.340 740.442 690.422 780.668 610.501 650.708 730.779 530.534 63
Pointnet++ & Featurepermissive0.557 670.735 450.661 670.686 490.491 640.744 670.392 600.539 710.451 690.375 710.946 360.376 640.205 510.403 700.356 740.553 660.643 670.497 660.824 600.756 620.515 66
PointMRNet-lite0.553 680.633 650.648 690.659 590.430 720.800 430.390 630.592 640.454 680.371 720.939 530.368 660.136 750.368 720.448 680.560 650.715 420.486 690.882 330.720 700.462 75
GMLPs0.538 690.495 790.693 570.647 640.471 670.793 480.300 730.477 750.505 540.358 730.903 770.327 720.081 800.472 630.529 600.448 760.710 430.509 620.746 690.737 670.554 60
PanopticFusion-label0.529 700.491 800.688 610.604 710.386 750.632 780.225 830.705 470.434 730.293 780.815 810.348 700.241 400.499 580.669 460.507 690.649 640.442 770.796 640.602 810.561 56
Gaku Narita, Takashi Seno, Tomoya Ishikawa, Yohsuke Kaji: PanopticFusion: Online Volumetric Semantic Mapping at the Level of Stuff and Things. IROS 2019 (to appear)
subcloud_weak0.516 710.676 580.591 790.609 690.442 700.774 570.335 700.597 620.422 750.357 740.932 630.341 710.094 790.298 760.528 610.473 740.676 580.495 670.602 800.721 690.349 81
Online SegFusion0.515 720.607 690.644 720.579 740.434 710.630 790.353 680.628 600.440 710.410 660.762 840.307 740.167 660.520 520.403 720.516 680.565 730.447 760.678 760.701 720.514 67
Davide Menini, Suryansh Kumar, Martin R. Oswald, Erik Sandstroem, Cristian Sminchisescu, Luc van Gool: A Real-Time Learning Framework for Joint 3D Reconstruction and Semantic Segmentation. Robotics and Automation Letters Submission
3DMV, FTSDF0.501 730.558 750.608 770.424 830.478 660.690 720.246 790.586 650.468 650.450 590.911 730.394 600.160 690.438 650.212 800.432 770.541 780.475 710.742 700.727 680.477 72
PCNN0.498 740.559 740.644 720.560 760.420 740.711 710.229 810.414 760.436 720.352 750.941 490.324 730.155 700.238 800.387 730.493 700.529 790.509 620.813 630.751 640.504 68
3DMV0.484 750.484 810.538 810.643 660.424 730.606 820.310 720.574 670.433 740.378 690.796 820.301 750.214 480.537 500.208 810.472 750.507 820.413 800.693 740.602 810.539 62
Angela Dai, Matthias Niessner: 3DMV: Joint 3D-Multi-View Prediction for 3D Semantic Scene Segmentation. ECCV'18
PointCNN with RGBpermissive0.458 760.577 730.611 760.356 850.321 820.715 700.299 750.376 790.328 820.319 760.944 440.285 770.164 670.216 830.229 790.484 720.545 770.456 740.755 680.709 710.475 73
Yangyan Li, Rui Bu, Mingchao Sun, Baoquan Chen: PointCNN. NeurIPS 2018
FCPNpermissive0.447 770.679 570.604 780.578 750.380 760.682 740.291 760.106 850.483 620.258 830.920 700.258 800.025 840.231 820.325 750.480 730.560 750.463 730.725 710.666 770.231 85
Dario Rethage, Johanna Wald, Jürgen Sturm, Nassir Navab, Federico Tombari: Fully-Convolutional Point Networks for Large-Scale Point Clouds. ECCV 2018
PNET20.442 780.548 760.548 800.597 730.363 780.628 800.300 730.292 800.374 780.307 770.881 790.268 790.186 590.238 800.204 820.407 790.506 830.449 750.667 770.620 800.462 75
SurfaceConvPF0.442 780.505 780.622 750.380 840.342 800.654 760.227 820.397 780.367 790.276 800.924 670.240 810.198 550.359 730.262 770.366 800.581 720.435 780.640 780.668 760.398 77
Hao Pan, Shilin Liu, Yang Liu, Xin Tong: Convolutional Neural Networks on 3D Surfaces Using Parallel Frames.
Tangent Convolutionspermissive0.438 800.437 830.646 710.474 800.369 770.645 770.353 680.258 820.282 840.279 790.918 720.298 760.147 740.283 770.294 760.487 710.562 740.427 790.619 790.633 790.352 80
Maxim Tatarchenko, Jaesik Park, Vladlen Koltun, Qian-Yi Zhou: Tangent Convolutions for Dense Prediction in 3D. CVPR 2018
3DWSSS0.425 810.525 770.647 700.522 770.324 810.488 850.077 860.712 450.353 800.401 670.636 860.281 780.176 620.340 740.565 560.175 860.551 760.398 810.370 860.602 810.361 79
SPLAT Netcopyleft0.393 820.472 820.511 820.606 700.311 830.656 750.245 800.405 770.328 820.197 840.927 660.227 830.000 870.001 880.249 780.271 850.510 800.383 830.593 810.699 730.267 83
Hang Su, Varun Jampani, Deqing Sun, Subhransu Maji, Evangelos Kalogerakis, Ming-Hsuan Yang, Jan Kautz: SPLATNet: Sparse Lattice Networks for Point Cloud Processing. CVPR 2018
ScanNet+FTSDF0.383 830.297 850.491 830.432 820.358 790.612 810.274 770.116 840.411 760.265 810.904 760.229 820.079 810.250 780.185 830.320 830.510 800.385 820.548 820.597 840.394 78
PointNet++permissive0.339 840.584 720.478 840.458 810.256 850.360 860.250 780.247 830.278 850.261 820.677 850.183 840.117 760.212 840.145 850.364 810.346 860.232 860.548 820.523 850.252 84
Charles R. Qi, Li Yi, Hao Su, Leonidas J. Guibas: PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space.
SSC-UNetpermissive0.308 850.353 840.290 860.278 860.166 860.553 830.169 850.286 810.147 860.148 860.908 740.182 850.064 820.023 870.018 870.354 820.363 840.345 840.546 840.685 740.278 82
ScanNetpermissive0.306 860.203 860.366 850.501 780.311 830.524 840.211 840.002 870.342 810.189 850.786 830.145 860.102 780.245 790.152 840.318 840.348 850.300 850.460 850.437 860.182 86
Angela Dai, Angel X. Chang, Manolis Savva, Maciej Halber, Thomas Funkhouser, Matthias Nießner: ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes. CVPR'17
ERROR0.054 870.000 870.041 870.172 870.030 870.062 880.001 880.035 860.004 870.051 870.143 870.019 880.003 860.041 860.050 860.003 880.054 870.018 880.005 870.264 870.082 87
Feature-Geometry Netpermissive0.024 880.000 870.000 880.001 880.010 880.098 870.007 870.000 880.000 880.026 880.072 880.059 870.000 870.060 850.000 880.013 870.040 880.045 870.000 880.038 880.006 88

This table lists the benchmark results for the 3D semantic instance scenario.




Method | Info | avg ap 50% | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | window
SoftGroup++0.769 11.000 10.803 150.937 10.684 30.865 30.213 130.870 20.664 20.571 40.758 10.702 40.807 11.000 10.653 120.902 10.792 21.000 10.626 1
SoftGrouppermissive0.761 21.000 10.808 120.845 60.716 10.862 50.243 100.824 30.655 40.620 20.734 20.699 50.791 30.981 190.716 40.844 40.769 31.000 10.594 5
Thang Vu, Kookhoi Kim, Tung M. Luu, Xuan Thanh Nguyen, Chang D. Yoo: SoftGroup for 3D Instance Segmentation on Point Clouds. CVPR 2022 [Oral]
GraphCut0.732 31.000 10.788 180.724 180.642 60.859 60.248 90.787 90.618 80.596 30.653 40.722 20.583 231.000 10.766 20.861 20.825 11.000 10.504 15
IPCA-Inst0.731 41.000 10.788 190.884 50.698 20.788 200.252 80.760 110.646 50.511 120.637 60.665 60.804 21.000 10.644 130.778 100.747 51.000 10.561 10
TopoSeg0.725 51.000 10.806 140.933 20.668 50.758 230.272 60.734 150.630 60.549 80.654 30.606 70.697 60.966 210.612 160.839 50.754 41.000 10.573 6
DKNet0.718 61.000 10.814 90.782 100.619 70.872 20.224 110.751 130.569 100.677 10.585 90.724 10.633 150.981 190.515 230.819 70.736 61.000 10.617 2
SDSC0.700 71.000 10.848 40.763 160.609 100.792 180.262 70.824 30.627 70.535 100.547 170.493 130.600 171.000 10.712 60.731 220.689 101.000 10.563 9
HAISpermissive0.699 81.000 10.849 30.820 70.675 40.808 130.279 40.757 120.465 150.517 110.596 70.559 90.600 171.000 10.654 110.767 120.676 110.994 260.560 11
Shaoyu Chen, Jiemin Fang, Qian Zhang, Wenyu Liu, Xinggang Wang: Hierarchical Aggregation for 3D Instance Segmentation. ICCV 2021
SSTNetpermissive0.698 91.000 10.697 340.888 40.556 160.803 140.387 20.626 220.417 190.556 70.585 100.702 30.600 171.000 10.824 10.720 240.692 81.000 10.509 14
Zhihao Liang, Zhihao Li, Songcen Xu, Mingkui Tan, Kui Jia: Instance Segmentation in 3D Scenes using Semantic Superpoint Tree Networks. ICCV2021
SphereSeg0.680 101.000 10.856 20.744 170.618 80.893 10.151 150.651 200.713 10.537 90.579 120.430 210.651 71.000 10.389 320.744 190.697 70.991 270.601 4
MaskVoteNet_Coarse0.677 111.000 10.847 50.771 120.509 220.816 90.277 50.558 290.482 120.562 60.640 50.448 170.700 41.000 10.666 70.852 30.578 220.997 210.488 19
OccuSeg+instance0.672 121.000 10.758 260.682 200.576 140.842 70.477 10.504 320.524 110.567 50.585 110.451 160.557 241.000 10.751 30.797 90.563 251.000 10.467 22
Lei Han, Tian Zheng, Lan Xu, Lu Fang: OccuSeg: Occupancy-aware 3D Instance Segmentation. CVPR2020
Mask-Group0.664 131.000 10.822 80.764 150.616 90.815 100.139 190.694 170.597 90.459 170.566 130.599 80.600 170.516 380.715 50.819 80.635 151.000 10.603 3
Min Zhong, Xinghao Chen, Xiaokang Chen, Gang Zeng, Yunhe Wang: MaskGroup: Hierarchical Point Grouping and Masking for 3D Instance Segmentation. ICME 2022
INS-Conv-instance0.657 141.000 10.760 240.667 220.581 120.863 40.323 30.655 190.477 130.473 150.549 150.432 200.650 81.000 10.655 100.738 200.585 210.944 310.472 21
CSC-Pretrained0.648 151.000 10.810 100.768 130.523 210.813 110.143 180.819 50.389 200.422 240.511 200.443 180.650 81.000 10.624 150.732 210.634 161.000 10.375 28
PE0.645 161.000 10.773 210.798 90.538 180.786 210.088 260.799 80.350 240.435 230.547 160.545 100.646 140.933 220.562 190.761 150.556 300.997 210.501 17
Biao Zhang, Peter Wonka: Point Cloud Instance Segmentation using Probabilistic Embeddings. CVPR 2021
RPGN0.643 171.000 10.758 250.582 300.539 170.826 80.046 300.765 100.372 220.436 220.588 80.539 120.650 81.000 10.577 170.750 170.653 140.997 210.495 18
Dyco3Dcopyleft0.641 181.000 10.841 60.893 30.531 190.802 150.115 230.588 270.448 160.438 200.537 190.430 220.550 250.857 240.534 210.764 140.657 120.987 280.568 7
Tong He, Chunhua Shen, Anton van den Hengel: DyCo3D: Robust Instance Segmentation of 3D Point Clouds through Dynamic Convolution. CVPR 2021
GICN0.638 191.000 10.895 10.800 80.480 250.676 270.144 170.737 140.354 230.447 180.400 290.365 270.700 41.000 10.569 180.836 60.599 181.000 10.473 20
PointGroup0.636 201.000 10.765 220.624 240.505 240.797 160.116 220.696 160.384 210.441 190.559 140.476 140.596 211.000 10.666 70.756 160.556 290.997 210.513 13
Li Jiang, Hengshuang Zhao, Shaoshuai Shi, Shu Liu, Chi-Wing Fu, Jiaya Jia: PointGroup: Dual-Set Point Grouping for 3D Instance Segmentation. CVPR 2020 [oral]
DD-UNet+Group0.635 210.667 310.797 170.714 190.562 150.774 220.146 160.810 70.429 180.476 140.546 180.399 240.633 151.000 10.632 140.722 230.609 171.000 10.514 12
H. Liu, R. Liu, K. Yang, J. Zhang, K. Peng, R. Stiefelhagen: HIDA: Towards Holistic Indoor Understanding for the Visually Impaired via Semantic Instance Segmentation with a Wearable Solid-State LiDAR Sensor. ICCVW 2021
DENet0.629 221.000 10.797 160.608 250.589 110.627 310.219 120.882 10.310 260.402 280.383 310.396 250.650 81.000 10.663 90.543 390.691 91.000 10.568 8
3D-MPA0.611 231.000 10.833 70.765 140.526 200.756 240.136 210.588 270.470 140.438 210.432 270.358 280.650 80.857 240.429 280.765 130.557 281.000 10.430 24
Francis Engelmann, Martin Bokeloh, Alireza Fathi, Bastian Leibe, Matthias Nießner: 3D-MPA: Multi Proposal Aggregation for 3D Semantic Instance Segmentation. CVPR 2020
PCJC0.578 241.000 10.810 110.583 290.449 280.813 120.042 310.603 250.341 250.490 130.465 230.410 230.650 80.835 300.264 370.694 280.561 260.889 350.504 16
SSEN0.575 251.000 10.761 230.473 320.477 260.795 170.066 270.529 300.658 30.460 160.461 240.380 260.331 370.859 230.401 310.692 290.653 131.000 10.348 30
Dongsu Zhang, Junha Chun, Sang Kyun Cha, Young Min Kim: Spatial Semantic Embedding Network: Fast 3D Instance Segmentation with Deep Metric Learning. Arxiv
RWSeg0.567 260.528 400.708 330.626 230.580 130.745 250.063 280.627 210.240 300.400 290.497 210.464 150.515 261.000 10.475 250.745 180.571 231.000 10.429 25
MTML0.549 271.000 10.807 130.588 280.327 330.647 290.004 360.815 60.180 320.418 250.364 330.182 320.445 301.000 10.442 270.688 300.571 241.000 10.396 26
Jean Lahoud, Bernard Ghanem, Marc Pollefeys, Martin R. Oswald: 3D Instance Segmentation via Multi-task Metric Learning. ICCV 2019 [oral]
One_Thing_One_Clickpermissive0.529 280.667 310.718 290.777 110.399 290.683 260.000 390.669 180.138 350.391 300.374 320.539 110.360 360.641 350.556 200.774 110.593 190.997 210.251 35
Zhengzhe Liu, Xiaojuan Qi, Chi-Wing Fu: One Thing One Click: A Self-Training Approach for Weakly Supervised 3D Semantic Segmentation. CVPR 2021
Sparse R-CNN0.515 291.000 10.538 410.282 350.468 270.790 190.173 140.345 370.429 170.413 270.484 220.176 330.595 220.591 360.522 220.668 310.476 340.986 290.327 31
Occipital-SCS0.512 301.000 10.716 300.509 310.506 230.611 320.092 250.602 260.177 330.346 330.383 300.165 340.442 310.850 290.386 330.618 350.543 310.889 350.389 27
3D-BoNet0.488 311.000 10.672 360.590 270.301 350.484 420.098 240.620 230.306 270.341 340.259 370.125 360.434 330.796 310.402 300.499 410.513 330.909 340.439 23
Bo Yang, Jianan Wang, Ronald Clark, Qingyong Hu, Sen Wang, Andrew Markham, Niki Trigoni: Learning Object Bounding Boxes for 3D Instance Segmentation on Point Clouds. NeurIPS 2019 Spotlight
PanopticFusion-inst0.478 320.667 310.712 320.595 260.259 380.550 380.000 390.613 240.175 340.250 390.434 250.437 190.411 350.857 240.485 240.591 380.267 440.944 310.359 29
Gaku Narita, Takashi Seno, Tomoya Ishikawa, Yohsuke Kaji: PanopticFusion: Online Volumetric Semantic Mapping at the Level of Stuff and Things. IROS 2019 (to appear)
SPG_WSIS0.470 330.667 310.685 350.677 210.372 310.562 360.000 390.482 330.244 290.316 360.298 340.052 420.442 320.857 240.267 360.702 250.559 271.000 10.287 33
SALoss-ResNet0.459 341.000 10.737 280.159 450.259 370.587 340.138 200.475 340.217 310.416 260.408 280.128 350.315 380.714 320.411 290.536 400.590 200.873 380.304 32
Zhidong Liang, Ming Yang, Hao Li, Chunxiang Wang: 3D Instance Embedding Learning With a Structure-Aware Loss Function for Point Cloud Segmentation. IEEE Robotics and Automation Letters (IROS2020)
MASCpermissive0.447 350.528 400.555 390.381 330.382 300.633 300.002 370.509 310.260 280.361 320.432 260.327 290.451 290.571 370.367 340.639 330.386 350.980 300.276 34
Chen Liu, Yasutaka Furukawa: MASC: Multi-scale Affinity with Sparse Convolution for 3D Instance Segmentation.
SegGroup_inspermissive0.445 360.667 310.773 200.185 420.317 340.656 280.000 390.407 360.134 360.381 310.267 360.217 310.476 280.714 320.452 260.629 340.514 321.000 10.222 38
An Tao, Yueqi Duan, Yi Wei, Jiwen Lu, Jie Zhou: SegGroup: Seg-Level Supervision for 3D Instance and Semantic Segmentation.
3D-SISpermissive0.382 371.000 10.432 430.245 370.190 390.577 350.013 340.263 390.033 420.320 350.240 380.075 380.422 340.857 240.117 410.699 260.271 430.883 370.235 37
Ji Hou, Angela Dai, Matthias Niessner: 3D-SIS: 3D Semantic Instance Segmentation of RGB-D Scans. CVPR 2019
Hier3Dcopyleft0.323 380.667 310.542 400.264 360.157 420.550 370.000 390.205 420.009 430.270 380.218 390.075 380.500 270.688 340.007 470.698 270.301 400.459 440.200 39
Tan: HCFS3D: Hierarchical Coupled Feature Selection Network for 3D Semantic and Instance Segmentation.
UNet-backbone0.319 390.667 310.715 310.233 380.189 400.479 430.008 350.218 400.067 410.201 410.173 400.107 370.123 430.438 390.150 390.615 360.355 360.916 330.093 46
R-PointNet0.306 400.500 420.405 440.311 340.348 320.589 330.054 290.068 450.126 370.283 370.290 350.028 430.219 410.214 420.331 350.396 450.275 410.821 400.245 36
Region-18class0.284 410.250 460.751 270.228 400.270 360.521 390.000 390.468 350.008 450.205 400.127 410.000 470.068 450.070 450.262 380.652 320.323 380.740 410.173 40
SemRegionNet-20cls0.250 420.333 430.613 370.229 390.163 410.493 400.000 390.304 380.107 380.147 430.100 420.052 410.231 390.119 430.039 430.445 430.325 370.654 420.141 42
3D-BEVIS0.248 430.667 310.566 380.076 460.035 470.394 450.027 330.035 460.098 390.099 450.030 460.025 440.098 440.375 410.126 400.604 370.181 450.854 390.171 41
Cathrin Elich, Francis Engelmann, Jonas Schult, Theodora Kontogianni, Bastian Leibe: 3D-BEVIS: Birds-Eye-View Instance Segmentation.
tmp0.248 430.667 310.437 420.188 410.153 430.491 410.000 390.208 410.094 400.153 420.099 430.057 400.217 420.119 430.039 430.466 420.302 390.640 430.140 43
ASIS0.199 450.333 430.253 460.167 440.140 440.438 440.000 390.177 430.008 440.121 440.069 440.004 460.231 400.429 400.036 450.445 440.273 420.333 460.119 45
Sgpn_scannet0.143 460.208 470.390 450.169 430.065 450.275 460.029 320.069 440.000 460.087 460.043 450.014 450.027 470.000 460.112 420.351 460.168 460.438 450.138 44
MaskRCNN 2d->3d Proj0.058 470.333 430.002 470.000 470.053 460.002 470.002 380.021 470.000 460.045 470.024 470.238 300.065 460.000 460.014 460.107 470.020 470.110 470.006 47
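Each row above runs the method name, an optional license tag, and a sequence of score/rank pairs together without separators. The following is a minimal sketch of how such a row could be split back into cells. It assumes that every score is printed with exactly three decimals and a single leading digit, that each score is followed by a space and an integer rank, and that `permissive` and `copyleft` are the only license tags. The names `parse_row`, `CELL` and `LICENSE_TAGS` are illustrative and not part of any benchmark tooling.

```python
import re

# Assumption: every score is printed as d.ddd, followed by a space and an integer rank.
# The lazy rank group stops as soon as the next d.ddd score (or end of row) begins.
CELL = re.compile(r"(\d\.\d{3}) (\d+?)(?=\d\.\d{3} |$)")
LICENSE_TAGS = ("permissive", "copyleft")  # tags seen fused onto method names


def parse_row(row):
    """Split a fused row into (method, license_tag, [(score, rank), ...])."""
    row = row.strip()
    first = re.search(r"\d\.\d{3} ", row)
    if first is None:
        raise ValueError("row contains no score")
    method, tag = row[: first.start()], None
    for candidate in LICENSE_TAGS:
        if method.endswith(candidate):
            method, tag = method[: -len(candidate)], candidate
    cells = [(float(s), int(r)) for s, r in CELL.findall(row[first.start():])]
    return method, tag, cells


# First four cells of the SoftGroup row from the table above.
method, tag, cells = parse_row("SoftGrouppermissive0.761 21.000 10.808 120.845 6")
print(method, tag, cells)
```

The split is unambiguous only because a score never has more than one digit before the decimal point, so any extra leading digits must belong to the preceding rank.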

This table lists the benchmark results for the 2D semantic label scenario.


Method | Info | avg iou | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | floor | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | wall | window
Virtual MVFusion (R) | - | 0.745 (1) | 0.861 (1) | 0.839 (1) | 0.881 (1) | 0.672 (1) | 0.512 (1) | 0.422 (11) | 0.898 (1) | 0.723 (1) | 0.714 (1) | 0.954 (2) | 0.454 (1) | 0.509 (1) | 0.773 (1) | 0.895 (1) | 0.756 (1) | 0.820 (1) | 0.653 (1) | 0.935 (1) | 0.891 (1) | 0.728 (1)
Abhijit Kundu, Xiaoqi Yin, Alireza Fathi, David Ross, Brian Brewington, Thomas Funkhouser, Caroline Pantofaru: Virtual Multi-view Fusion for 3D Semantic Segmentation. ECCV 2020
BPNet_2D | copyleft | 0.670 (2) | 0.822 (3) | 0.795 (3) | 0.836 (2) | 0.659 (2) | 0.481 (2) | 0.451 (7) | 0.769 (3) | 0.656 (3) | 0.567 (3) | 0.931 (3) | 0.395 (3) | 0.390 (4) | 0.700 (2) | 0.534 (3) | 0.689 (6) | 0.770 (2) | 0.574 (3) | 0.865 (4) | 0.831 (3) | 0.675 (3)
Wenbo Hu, Hengshuang Zhao, Li Jiang, Jiaya Jia and Tien-Tsin Wong: Bidirectional Projection Network for Cross Dimension Scene Understanding. CVPR 2021 (Oral)
CU-Hybrid-2D Net | - | 0.636 (3) | 0.825 (2) | 0.820 (2) | 0.179 (17) | 0.648 (3) | 0.463 (3) | 0.549 (1) | 0.742 (4) | 0.676 (2) | 0.628 (2) | 0.961 (1) | 0.420 (2) | 0.379 (5) | 0.684 (4) | 0.381 (11) | 0.732 (2) | 0.723 (3) | 0.599 (2) | 0.827 (9) | 0.851 (2) | 0.634 (4)
CMX | - | 0.613 (4) | 0.681 (6) | 0.725 (8) | 0.502 (11) | 0.634 (5) | 0.297 (13) | 0.478 (5) | 0.830 (2) | 0.651 (4) | 0.537 (5) | 0.924 (4) | 0.375 (4) | 0.315 (10) | 0.686 (3) | 0.451 (9) | 0.714 (3) | 0.543 (15) | 0.504 (5) | 0.894 (3) | 0.823 (4) | 0.688 (2)
DMMF_3d | - | 0.605 (5) | 0.651 (7) | 0.744 (7) | 0.782 (3) | 0.637 (4) | 0.387 (4) | 0.536 (2) | 0.732 (5) | 0.590 (6) | 0.540 (4) | 0.856 (14) | 0.359 (7) | 0.306 (11) | 0.596 (7) | 0.539 (2) | 0.627 (13) | 0.706 (4) | 0.497 (7) | 0.785 (13) | 0.757 (12) | 0.476 (14)
DMMF | - | 0.597 (6) | 0.543 (12) | 0.755 (6) | 0.749 (4) | 0.585 (7) | 0.338 (6) | 0.494 (4) | 0.704 (7) | 0.598 (5) | 0.494 (11) | 0.911 (7) | 0.347 (9) | 0.327 (9) | 0.593 (8) | 0.527 (4) | 0.675 (8) | 0.646 (8) | 0.513 (4) | 0.842 (7) | 0.774 (9) | 0.527 (12)
MCA-Net | - | 0.595 (7) | 0.533 (13) | 0.756 (5) | 0.746 (5) | 0.590 (6) | 0.334 (8) | 0.506 (3) | 0.670 (8) | 0.587 (7) | 0.500 (9) | 0.905 (9) | 0.366 (6) | 0.352 (6) | 0.601 (6) | 0.506 (6) | 0.669 (11) | 0.648 (6) | 0.501 (6) | 0.839 (8) | 0.769 (10) | 0.516 (13)
RFBNet | - | 0.592 (8) | 0.616 (8) | 0.758 (4) | 0.659 (6) | 0.581 (8) | 0.330 (9) | 0.469 (6) | 0.655 (11) | 0.543 (10) | 0.524 (6) | 0.924 (4) | 0.355 (8) | 0.336 (8) | 0.572 (9) | 0.479 (8) | 0.671 (9) | 0.648 (6) | 0.480 (8) | 0.814 (11) | 0.814 (5) | 0.614 (7)
DCRedNet | - | 0.583 (9) | 0.682 (5) | 0.723 (9) | 0.542 (10) | 0.510 (12) | 0.310 (11) | 0.451 (7) | 0.668 (9) | 0.549 (9) | 0.520 (7) | 0.920 (6) | 0.375 (4) | 0.446 (2) | 0.528 (12) | 0.417 (10) | 0.670 (10) | 0.577 (13) | 0.478 (9) | 0.862 (5) | 0.806 (6) | 0.628 (6)
SSMA | copyleft | 0.577 (10) | 0.695 (4) | 0.716 (11) | 0.439 (13) | 0.563 (9) | 0.314 (10) | 0.444 (9) | 0.719 (6) | 0.551 (8) | 0.503 (8) | 0.887 (11) | 0.346 (10) | 0.348 (7) | 0.603 (5) | 0.353 (13) | 0.709 (4) | 0.600 (11) | 0.457 (11) | 0.901 (2) | 0.786 (7) | 0.599 (8)
Abhinav Valada, Rohit Mohan, Wolfram Burgard: Self-Supervised Model Adaptation for Multimodal Semantic Segmentation. International Journal of Computer Vision, 2019
SN_RN152pyrx8_RVC | copyleft | 0.546 (11) | 0.572 (10) | 0.663 (14) | 0.638 (8) | 0.518 (10) | 0.298 (12) | 0.366 (16) | 0.633 (13) | 0.510 (12) | 0.446 (13) | 0.864 (12) | 0.296 (12) | 0.267 (13) | 0.542 (11) | 0.346 (14) | 0.704 (5) | 0.575 (14) | 0.431 (13) | 0.853 (6) | 0.766 (11) | 0.630 (5)
FuseNet | permissive | 0.535 (12) | 0.570 (11) | 0.681 (13) | 0.182 (16) | 0.512 (11) | 0.290 (14) | 0.431 (10) | 0.659 (10) | 0.504 (13) | 0.495 (10) | 0.903 (10) | 0.308 (11) | 0.428 (3) | 0.523 (13) | 0.365 (12) | 0.676 (7) | 0.621 (10) | 0.470 (10) | 0.762 (14) | 0.779 (8) | 0.541 (10)
Caner Hazirbas, Lingni Ma, Csaba Domokos, Daniel Cremers: FuseNet: Incorporating Depth into Semantic Segmentation via Fusion-based CNN Architecture. ACCV 2016
AdapNet++ | copyleft | 0.503 (13) | 0.613 (9) | 0.722 (10) | 0.418 (14) | 0.358 (18) | 0.337 (7) | 0.370 (15) | 0.479 (16) | 0.443 (14) | 0.368 (16) | 0.907 (8) | 0.207 (15) | 0.213 (17) | 0.464 (16) | 0.525 (5) | 0.618 (14) | 0.657 (5) | 0.450 (12) | 0.788 (12) | 0.721 (15) | 0.408 (17)
Abhinav Valada, Rohit Mohan, Wolfram Burgard: Self-Supervised Model Adaptation for Multimodal Semantic Segmentation. International Journal of Computer Vision, 2019
3DMV (2d proj) | - | 0.498 (14) | 0.481 (16) | 0.612 (15) | 0.579 (9) | 0.456 (14) | 0.343 (5) | 0.384 (13) | 0.623 (14) | 0.525 (11) | 0.381 (15) | 0.845 (15) | 0.254 (14) | 0.264 (15) | 0.557 (10) | 0.182 (16) | 0.581 (16) | 0.598 (12) | 0.429 (14) | 0.760 (15) | 0.661 (17) | 0.446 (16)
Angela Dai, Matthias Niessner: 3DMV: Joint 3D-Multi-View Prediction for 3D Semantic Scene Segmentation. ECCV'18
MSeg1080_RVC | permissive | 0.485 (15) | 0.505 (14) | 0.709 (12) | 0.092 (18) | 0.427 (15) | 0.241 (15) | 0.411 (12) | 0.654 (12) | 0.385 (18) | 0.457 (12) | 0.861 (13) | 0.053 (18) | 0.279 (12) | 0.503 (14) | 0.481 (7) | 0.645 (12) | 0.626 (9) | 0.365 (16) | 0.748 (16) | 0.725 (14) | 0.529 (11)
John Lambert*, Zhuang Liu*, Ozan Sener, James Hays, Vladlen Koltun: MSeg: A Composite Dataset for Multi-domain Semantic Segmentation. CVPR 2020
ILC-PSPNet | - | 0.475 (16) | 0.490 (15) | 0.581 (16) | 0.289 (15) | 0.507 (13) | 0.067 (18) | 0.379 (14) | 0.610 (15) | 0.417 (16) | 0.435 (14) | 0.822 (17) | 0.278 (13) | 0.267 (13) | 0.503 (14) | 0.228 (15) | 0.616 (15) | 0.533 (16) | 0.375 (15) | 0.820 (10) | 0.729 (13) | 0.560 (9)
Enet (reimpl) | - | 0.376 (17) | 0.264 (18) | 0.452 (18) | 0.452 (12) | 0.365 (16) | 0.181 (16) | 0.143 (18) | 0.456 (17) | 0.409 (17) | 0.346 (17) | 0.769 (18) | 0.164 (16) | 0.218 (16) | 0.359 (17) | 0.123 (18) | 0.403 (18) | 0.381 (18) | 0.313 (18) | 0.571 (17) | 0.685 (16) | 0.472 (15)
Re-implementation of Adam Paszke, Abhishek Chaurasia, Sangpil Kim, Eugenio Culurciello: ENet: A Deep Neural Network Architecture for Real-Time Semantic Segmentation.
ScanNet (2d proj) | permissive | 0.330 (18) | 0.293 (17) | 0.521 (17) | 0.657 (7) | 0.361 (17) | 0.161 (17) | 0.250 (17) | 0.004 (18) | 0.440 (15) | 0.183 (18) | 0.836 (16) | 0.125 (17) | 0.060 (18) | 0.319 (18) | 0.132 (17) | 0.417 (17) | 0.412 (17) | 0.344 (17) | 0.541 (18) | 0.427 (18) | 0.109 (18)
Angela Dai, Angel X. Chang, Manolis Savva, Maciej Halber, Thomas Funkhouser, Matthias Nießner: ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes. CVPR'17
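The integer accompanying each score reads as the method's rank within that column. Ties appear to share a rank, with the following rank skipped: on floor, CMX and RFBNet both score 0.924 at rank 4, and DCRedNet follows at rank 6. The sketch below reproduces that behaviour for the six highest floor scores from the table above. It assumes the page uses standard competition ranking, which is inferred from the listed numbers and not documented here. The function name `competition_ranks` is illustrative.

```python
def competition_ranks(scores):
    """Rank by descending score; tied scores share a rank and the next rank is skipped."""
    ordered = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    ranks = {}
    prev_score, prev_rank = None, 0
    for position, (name, score) in enumerate(ordered, start=1):
        # A tie reuses the previous rank; otherwise the rank is the 1-based position.
        rank = prev_rank if score == prev_score else position
        ranks[name] = rank
        prev_score, prev_rank = score, rank
    return ranks


# Six highest floor IoU values taken from the table above.
floor_iou = {
    "CU-Hybrid-2D Net": 0.961,
    "Virtual MVFusion (R)": 0.954,
    "BPNet_2D": 0.931,
    "CMX": 0.924,
    "RFBNet": 0.924,
    "DCRedNet": 0.920,
}
ranks = competition_ranks(floor_iou)
print(ranks)
```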

This table lists the benchmark results for the 2D semantic instance scenario.




Method | Info | avg ap | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | window
UniDet_RVC | - | 0.205 (1) | 0.381 (1) | 0.323 (1) | 0.037 (1) | 0.226 (1) | 0.177 (1) | 0.063 (1) | 0.277 (1) | 0.120 (1) | 0.067 (1) | 0.131 (1) | 0.074 (2) | 0.317 (1) | 0.080 (1) | 0.235 (1) | 0.289 (1) | 0.141 (1) | 0.678 (1) | 0.080 (1)
MaskRCNN_ScanNet | permissive | 0.119 (2) | 0.129 (2) | 0.212 (2) | 0.002 (2) | 0.112 (2) | 0.148 (2) | 0.014 (2) | 0.205 (2) | 0.044 (2) | 0.066 (2) | 0.078 (2) | 0.095 (1) | 0.142 (2) | 0.030 (2) | 0.128 (2) | 0.139 (2) | 0.080 (2) | 0.459 (2) | 0.057 (2)
Re-implementation of Kaiming He, Georgia Gkioxari, Piotr Dollár, Ross Girshick: Mask R-CNN. ICCV'17

This table lists the benchmark results for the scene type classification scenario.




Method | Info | avg recall | apartment | bathroom | bedroom / hotel | bookstore / library | conference room | copy/mail room | hallway | kitchen | laundry room | living room / lounge | misc | office | storage / basement / garage
multi-task | permissive | 0.700 (1) | 0.500 (1) | 1.000 (1) | 0.882 (2) | 0.500 (2) | 1.000 (1) | 1.000 (1) | 0.500 (2) | 1.000 (1) | 1.000 (1) | 0.778 (1) | 0.000 (2) | 0.938 (1) | 0.000 (2)
Shengyu Huang, Mikhail Usvyatsov, Konrad Schindler: Indoor Scene Recognition in 3D. IROS 2020
3DASPP-SCE | - | 0.691 (2) | 0.500 (1) | 0.938 (2) | 0.824 (3) | 1.000 (1) | 1.000 (1) | 0.500 (2) | 1.000 (1) | 0.857 (2) | 0.500 (2) | 0.556 (3) | 0.000 (2) | 0.812 (2) | 0.500 (1)
SE-ResNeXt-SSMA | - | 0.498 (3) | 0.000 (4) | 0.812 (3) | 0.941 (1) | 0.500 (2) | 0.500 (3) | 0.500 (2) | 0.500 (2) | 0.429 (4) | 0.500 (2) | 0.667 (2) | 0.500 (1) | 0.625 (3) | 0.000 (2)
Abhinav Valada, Rohit Mohan, Wolfram Burgard: Self-Supervised Model Adaptation for Multimodal Semantic Segmentation. arXiv
resnet50_scannet | - | 0.353 (4) | 0.250 (3) | 0.812 (3) | 0.529 (4) | 0.500 (2) | 0.500 (3) | 0.000 (4) | 0.500 (2) | 0.571 (3) | 0.000 (4) | 0.556 (3) | 0.000 (2) | 0.375 (4) | 0.000 (2)
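The avg recall column appears to be the unweighted mean of the 13 per-class recalls, so a scene type with few test scans weighs as much as a frequent one. This is inferred from the listed numbers and is not stated on the page. A minimal check, with the per-class values copied from the rows above and illustrative variable names:

```python
from statistics import mean

# Per-class recall in column order: apartment, bathroom, bedroom / hotel,
# bookstore / library, conference room, copy/mail room, hallway, kitchen,
# laundry room, living room / lounge, misc, office, storage / basement / garage.
recall = {
    "multi-task": [0.500, 1.000, 0.882, 0.500, 1.000, 1.000, 0.500,
                   1.000, 1.000, 0.778, 0.000, 0.938, 0.000],
    "3DASPP-SCE": [0.500, 0.938, 0.824, 1.000, 1.000, 0.500, 1.000,
                   0.857, 0.500, 0.556, 0.000, 0.812, 0.500],
    "SE-ResNeXt-SSMA": [0.000, 0.812, 0.941, 0.500, 0.500, 0.500, 0.500,
                        0.429, 0.500, 0.667, 0.500, 0.625, 0.000],
    "resnet50_scannet": [0.250, 0.812, 0.529, 0.500, 0.500, 0.000, 0.500,
                         0.571, 0.000, 0.556, 0.000, 0.375, 0.000],
}
# avg recall as listed in the table.
listed_avg = {
    "multi-task": 0.700,
    "3DASPP-SCE": 0.691,
    "SE-ResNeXt-SSMA": 0.498,
    "resnet50_scannet": 0.353,
}

# Unweighted (macro) mean over classes, rounded to the table's three decimals.
recomputed = {name: round(mean(values), 3) for name, values in recall.items()}
for name, value in recomputed.items():
    print(name, value, listed_avg[name])
```

Because the listed per-class values are themselves rounded, a recomputed mean could in general differ from the listed average in the last digit; for these four rows it does not.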