Presenting the ScanNet200 Benchmark

We present the ScanNet200 benchmark, which studies an order of magnitude more class categories than the previous version of ScanNet. The scene geometry is shared between the two tasks, but the parsing of surface annotations allows for a larger vocabulary and a more realistic setting for in-the-wild 3D understanding methods.

The ScanNet200 benchmark includes both finer-grained categories and a large number of previously unaddressed classes. This makes for a much more challenging setting, one that reflects the diversity of semantic classes naturally observed in the raw ScanNet RGB-D observations, including their naturally occurring class imbalances. The difference in category frequencies between ScanNet and ScanNet200 can be seen in the figure above.

ScanNet200 Benchmark

This table lists the benchmark results for the ScanNet200 3D semantic label scenario.




Method | Info | avg iou | head iou | common iou | tail iou | alarm clock | armchair | backpack | bag | ball | bar | basket | bathroom cabinet | bathroom counter | bathroom stall | bathroom stall door | bathroom vanity | bathtub | bed | bench | bicycle | bin | blackboard | blanket | blinds | board | book | bookshelf | bottle | bowl | box | broom | bucket | bulletin board | cabinet | calendar | candle | cart | case of water bottles | cd case | ceiling | ceiling light | chair | clock | closet | closet door | closet rod | closet wall | clothes | clothes dryer | coat rack | coffee kettle | coffee maker | coffee table | column | computer tower | container | copier | couch | counter | crate | cup | curtain | cushion | decoration | desk | dining table | dish rack | dishwasher | divider | door | doorframe | dresser | dumbbell | dustpan | end table | fan | file cabinet | fire alarm | fire extinguisher | fireplace | floor | folded chair | furniture | guitar | guitar case | hair dryer | handicap bar | hat | headphones | ironing board | jacket | keyboard | keyboard piano | kitchen cabinet | kitchen counter | ladder | lamp | laptop | laundry basket | laundry detergent | laundry hamper | ledge | light | light switch | luggage | machine | mailbox | mat | mattress | microwave | mini fridge | mirror | monitor | mouse | music stand | nightstand | object | office chair | ottoman | oven | paper | paper bag | paper cutter | paper towel dispenser | paper towel roll | person | piano | picture | pillar | pillow | pipe | plant | plate | plunger | poster | potted plant | power outlet | power strip | printer | projector | projector screen | purse | rack | radiator | rail | range hood | recycling bin | refrigerator | scale | seat | shelf | shoe | shower | shower curtain | shower curtain rod | shower door | shower floor | shower head | shower wall | sign | sink | soap dish | soap dispenser | sofa chair | speaker | stair rail | stairs | stand | stool | storage bin | storage container | storage organizer | stove | structure | stuffed animal | suitcase | table | telephone | tissue box | toaster | toaster oven | toilet | toilet paper | toilet paper dispenser | toilet paper holder | toilet seat cover dispenser | towel | trash bin | trash can | tray | tube | tv | tv stand | vacuum cleaner | vent | wall | wardrobe | washing machine | water bottle | water cooler | water pitcher | whiteboard | window | windowsill
LGroundpermissive0.272 10.485 10.184 10.106 10.476 10.077 20.218 10.000 10.000 10.000 10.547 10.295 10.540 10.746 20.745 10.058 20.112 30.005 10.658 20.077 30.000 20.322 10.178 30.512 20.190 10.199 10.277 10.000 10.000 10.173 10.399 10.000 10.000 10.039 20.000 20.858 10.085 20.676 10.002 10.103 10.498 10.323 10.703 10.000 10.000 10.296 10.549 20.216 10.702 10.768 10.718 10.028 20.092 20.786 20.000 10.000 20.453 20.022 20.251 30.252 10.572 10.348 10.321 10.514 10.063 20.279 20.552 10.000 20.019 20.932 10.132 20.000 10.000 10.000 30.156 30.457 10.623 10.518 10.265 20.358 20.381 10.395 10.000 10.000 10.127 30.012 30.051 10.000 10.000 20.886 20.014 10.437 30.179 10.244 10.826 10.000 10.000 10.599 10.136 10.085 20.000 20.000 10.000 10.565 10.612 10.143 10.207 10.566 10.232 20.446 10.127 10.708 20.000 20.384 10.000 10.000 10.000 10.402 10.000 10.059 10.000 10.525 30.566 10.229 20.659 20.000 10.000 10.265 10.446 10.147 20.720 30.597 20.066 20.000 10.187 10.000 10.726 10.467 30.134 30.000 20.413 30.629 20.000 10.363 20.055 30.022 20.000 10.626 10.000 20.000 10.323 20.479 30.154 20.117 10.028 20.901 10.243 10.415 30.295 30.143 30.610 20.000 10.000 20.777 10.397 30.324 20.000 10.778 10.179 10.702 20.000 10.274 30.404 10.233 10.622 10.398 2
David Rozenberszki, Or Litany, Angela Dai: Language-Grounded Indoor 3D Semantic Segmentation in the Wild. arXiv
Minkowski 34Dpermissive0.253 20.463 20.154 20.102 20.381 30.084 10.134 30.000 10.000 10.000 10.386 20.141 30.279 30.737 30.703 20.014 30.164 20.000 30.663 10.092 20.000 20.224 20.291 10.531 10.056 30.000 20.242 20.000 10.000 10.013 20.331 20.000 10.000 10.035 30.001 10.858 10.059 30.650 30.000 30.056 30.353 20.299 20.670 20.000 10.000 10.284 20.484 30.071 30.594 20.720 20.710 20.027 30.068 30.813 10.000 10.005 10.492 10.164 10.274 20.111 30.571 20.307 30.293 20.307 30.150 10.163 30.531 20.002 10.545 10.932 10.093 30.000 10.000 10.002 20.159 20.368 30.581 30.440 30.228 30.406 10.282 30.294 20.000 10.000 10.189 20.060 10.036 30.000 10.000 20.897 10.000 30.525 20.025 30.205 30.771 30.000 10.000 10.593 20.108 30.044 30.000 20.000 10.000 10.282 30.589 20.094 20.169 20.466 30.227 30.419 30.125 20.757 10.002 10.334 20.000 10.000 10.000 10.357 20.000 10.000 20.000 10.582 10.513 30.337 10.612 30.000 10.000 10.250 20.352 30.136 30.724 20.655 10.280 10.000 10.046 30.000 10.606 30.559 10.159 10.102 10.445 10.655 10.000 10.310 30.117 10.000 30.000 10.581 30.026 10.000 10.265 30.483 20.084 30.097 30.044 10.865 30.142 30.588 10.351 10.272 10.596 30.000 10.003 10.622 20.720 10.096 30.000 10.771 20.016 20.772 10.000 10.302 20.194 20.214 20.621 20.197 3
C. Choy, J. Gwak, S. Savarese: 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks. CVPR 2019
CSC-Pretrainpermissive0.249 30.455 30.171 30.079 30.418 20.059 30.186 20.000 10.000 10.000 10.335 30.250 20.316 20.766 10.697 30.142 10.170 10.003 20.553 30.112 10.097 10.201 30.186 20.476 30.081 20.000 20.216 30.000 10.000 10.001 30.314 30.000 10.000 10.055 10.000 20.832 30.094 10.659 20.002 10.076 20.310 30.293 30.664 30.000 10.000 10.175 30.634 10.130 20.552 30.686 30.700 30.076 10.110 10.770 30.000 10.000 20.430 30.000 30.319 10.166 20.542 30.327 20.205 30.332 20.052 30.375 10.444 30.000 20.012 30.930 30.203 10.000 10.000 10.046 10.175 10.413 20.592 20.471 20.299 10.152 30.340 20.247 30.000 10.000 10.225 10.058 20.037 20.000 10.207 10.862 30.014 10.548 10.033 20.233 20.816 20.000 10.000 10.542 30.123 20.121 10.019 10.000 10.000 10.463 20.454 30.045 30.128 30.557 20.235 10.441 20.063 30.484 30.000 20.308 30.000 10.000 10.000 10.318 30.000 10.000 20.000 10.545 20.543 20.164 30.734 10.000 10.000 10.215 30.371 20.198 10.743 10.205 30.062 30.000 10.079 20.000 10.683 20.547 20.142 20.000 20.441 20.579 30.000 10.464 10.098 20.041 10.000 10.590 20.000 20.000 10.373 10.494 10.174 10.105 20.001 30.895 20.222 20.537 20.307 20.180 20.625 10.000 10.000 20.591 30.609 20.398 10.000 10.766 30.014 30.638 30.000 10.377 10.004 30.206 30.609 30.465 1
Ji Hou, Benjamin Graham, Matthias Nießner, Saining Xie: Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts. CVPR 2021
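
The `avg iou`, `head iou`, `common iou` and `tail iou` columns summarize per-class intersection-over-union scores. As a rough illustration of the metric, the sketch below computes per-class IoU as TP / (TP + FP + FN) from flat label lists and then averages it over named groups of classes. The function names, the grouping passed in, and the choice to treat never-occurring classes as undefined are assumptions made for this example; it is not the benchmark's evaluation script.

```python
import math


def per_class_iou(gt, pred, num_classes):
    """Return IoU = TP / (TP + FP + FN) for each class id in [0, num_classes)."""
    tp = [0] * num_classes
    fp = [0] * num_classes
    fn = [0] * num_classes
    for g, p in zip(gt, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # class p was predicted where it is not present
            fn[g] += 1  # a point of class g was missed
    ious = []
    for c in range(num_classes):
        denom = tp[c] + fp[c] + fn[c]
        # Assumption: a class that never occurs has an undefined IoU (NaN).
        ious.append(tp[c] / denom if denom else math.nan)
    return ious


def group_mean_iou(ious, groups):
    """Average IoU over each named group of class ids, ignoring NaN entries."""
    result = {}
    for name, class_ids in groups.items():
        valid = [ious[c] for c in class_ids if not math.isnan(ious[c])]
        result[name] = sum(valid) / len(valid) if valid else math.nan
    return result
```

With a hypothetical split such as `{"head": [...], "common": [...], "tail": [...]}`, `group_mean_iou` would give one summary value per frequency group, which is roughly how the three grouped columns can be read.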

This table lists the benchmark results for the ScanNet200 3D semantic instance scenario.




Method | Info | avg ap 50% | head ap 50% | common ap 50% | tail ap 50% | alarm clock | armchair | backpack | bag | ball | bar | basket | bathroom cabinet | bathroom counter | bathroom stall | bathroom stall door | bathroom vanity | bathtub | bed | bench | bicycle | bin | blackboard | blanket | blinds | board | book | bookshelf | bottle | bowl | box | broom | bucket | bulletin board | cabinet | calendar | candle | cart | case of water bottles | cd case | ceiling | ceiling light | chair | clock | closet | closet door | closet rod | closet wall | clothes | clothes dryer | coat rack | coffee kettle | coffee maker | coffee table | column | computer tower | container | copier | couch | counter | crate | cup | curtain | cushion | decoration | desk | dining table | dish rack | dishwasher | divider | door | doorframe | dresser | dumbbell | dustpan | end table | fan | file cabinet | fire alarm | fire extinguisher | fireplace | folded chair | furniture | guitar | guitar case | hair dryer | handicap bar | hat | headphones | ironing board | jacket | keyboard | keyboard piano | kitchen cabinet | kitchen counter | ladder | lamp | laptop | laundry basket | laundry detergent | laundry hamper | ledge | light | light switch | luggage | machine | mailbox | mat | mattress | microwave | mini fridge | mirror | monitor | mouse | music stand | nightstand | object | office chair | ottoman | oven | paper | paper bag | paper cutter | paper towel dispenser | paper towel roll | person | piano | picture | pillar | pillow | pipe | plant | plate | plunger | poster | potted plant | power outlet | power strip | printer | projector | projector screen | purse | rack | radiator | rail | range hood | recycling bin | refrigerator | scale | seat | shelf | shoe | shower | shower curtain | shower curtain rod | shower door | shower floor | shower head | shower wall | sign | sink | soap dish | soap dispenser | sofa chair | speaker | stair rail | stairs | stand | stool | storage bin | storage container | storage organizer | stove | structure | stuffed animal | suitcase | table | telephone | tissue box | toaster | toaster oven | toilet | toilet paper | toilet paper dispenser | toilet paper holder | toilet seat cover dispenser | towel | trash bin | trash can | tray | tube | tv | tv stand | vacuum cleaner | vent | wardrobe | washing machine | water bottle | water cooler | water pitcher | whiteboard | window | windowsill
No results yet.

ScanNet Benchmark

This table lists the benchmark results for the 3D semantic label scenario.


Method | Info | avg iou | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | floor | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | wall | window
Rows are sorted by the cabinet column.
Mix3Dpermissive0.781 10.964 10.855 10.843 80.781 10.858 60.575 20.831 120.685 40.714 10.979 10.594 30.310 140.801 10.892 50.841 20.819 30.723 20.940 50.887 10.725 8
Alexey Nekrasov, Jonas Schult, Or Litany, Bastian Leibe, Francis Engelmann: Mix3D: Out-of-Context Data Augmentation for 3D Scenes. 3DV 2021 (Oral)
O-CNNpermissive0.762 30.924 20.823 40.844 70.770 20.852 80.577 10.847 80.711 10.640 110.958 70.592 40.217 460.762 60.888 60.758 60.813 50.726 10.932 110.868 60.744 2
Peng-Shuai Wang, Yang Liu, Yu-Xiao Guo, Chun-Yu Sun, Xin Tong: O-CNN: Octree-based Convolutional Neural Networks for 3D Shape Analysis. SIGGRAPH 2017
StratifiedFormerpermissive0.747 60.901 40.803 100.845 60.757 30.846 100.512 110.825 130.696 30.645 70.956 90.576 70.262 330.744 110.861 90.742 80.770 240.705 30.899 210.860 90.734 3
Xin Lai*, Jianhui Liu*, Li Jiang, Liwei Wang, Hengshuang Zhao, Shu Liu, Xiaojuan Qi, Jiaya Jia: Stratified Transformer for 3D Point Cloud Segmentation. CVPR 2022
BPNetcopyleft0.749 40.909 30.818 70.811 140.752 40.839 120.485 210.842 100.673 60.644 80.957 80.528 160.305 160.773 40.859 100.788 40.818 40.693 50.916 120.856 110.723 9
Wenbo Hu, Hengshuang Zhao, Li Jiang, Jiaya Jia, Tien-Tsin Wong: Bidirectional Projection Network for Cross Dimension Scene Understanding. CVPR 2021 (Oral)
PointConvFormer0.749 40.793 240.790 150.807 170.750 50.856 70.524 80.881 30.588 290.642 100.977 40.591 50.274 270.781 20.929 10.804 30.796 110.642 130.947 30.885 20.715 10
OccuSeg+Semantic0.764 20.758 380.796 120.839 90.746 60.907 10.562 30.850 70.680 50.672 30.978 20.610 10.335 70.777 30.819 220.847 10.830 10.691 60.972 10.885 20.727 6
MatchingNet0.724 130.812 210.812 90.810 150.735 70.834 150.495 180.860 50.572 350.602 200.954 140.512 190.280 240.757 80.845 180.725 120.780 180.606 270.937 70.851 140.700 13
EQ-Net0.743 90.620 670.799 110.849 30.730 80.822 230.493 190.897 20.664 70.681 20.955 120.562 100.378 10.760 70.903 20.738 90.801 90.673 80.907 150.877 40.745 1
Zetong Yang*, Li Jiang*, Yanan Sun, Bernt Schiele, Jiaya Jia: A Unified Query-based Paradigm for Point Cloud Understanding. CVPR 2022
VMNetpermissive0.746 70.870 90.838 20.858 20.729 90.850 90.501 140.874 40.587 300.658 50.956 90.564 90.299 170.765 50.900 30.716 160.812 60.631 180.939 60.858 100.709 11
Zeyu Hu, Xuyang Bai, Jiaxiang Shang, Runze Zhang, Jiayu Dong, Xin Wang, Guangyuan Sun, Hongbo Fu, Chiew-Lan Tai: VMNet: Voxel-Mesh Network for Geodesic-Aware 3D Semantic Segmentation. ICCV 2021 (Oral)
IPCA0.731 110.890 50.837 30.864 10.726 100.873 20.530 70.824 140.489 610.647 60.978 20.609 20.336 60.624 270.733 350.758 60.776 200.570 410.949 20.877 40.728 4
SparseConvNet0.725 120.647 630.821 50.846 50.721 110.869 30.533 60.754 310.603 250.614 140.955 120.572 80.325 100.710 130.870 70.724 130.823 20.628 190.934 90.865 80.683 16
One Thing One Click0.701 170.825 180.796 120.723 370.716 120.832 160.433 470.816 150.634 140.609 160.969 60.418 550.344 40.559 440.833 190.715 170.808 70.560 450.902 180.847 160.680 17
MinkowskiNetpermissive0.736 100.859 110.818 70.832 100.709 130.840 110.521 100.853 60.660 90.643 90.951 200.544 110.286 230.731 120.893 40.675 280.772 220.683 70.874 410.852 130.727 6
C. Choy, J. Gwak, S. Savarese: 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks. CVPR 2019
PicassoNet-IIpermissive0.696 190.704 530.790 150.787 230.709 130.837 130.459 310.815 170.543 450.615 130.956 90.529 140.250 360.551 470.790 270.703 210.799 100.619 220.908 140.848 150.700 13
Huan Lei, Naveed Akhtar, Mubarak Shah, and Ajmal Mian: Geometric feature learning for 3D meshes.
INS-Conv-semantic0.717 140.751 410.759 270.812 130.704 150.868 40.537 50.842 100.609 210.608 170.953 160.534 120.293 190.616 280.864 80.719 150.793 140.640 140.933 100.845 180.663 20
Virtual MVFusion0.746 70.771 320.819 60.848 40.702 160.865 50.397 580.899 10.699 20.664 40.948 300.588 60.330 80.746 100.851 150.764 50.796 110.704 40.935 80.866 70.728 4
Abhijit Kundu, Xiaoqi Yin, Alireza Fathi, David Ross, Brian Brewington, Thomas Funkhouser, Caroline Pantofaru: Virtual Multi-view Fusion for 3D Semantic Segmentation. ECCV 2020
VACNN++0.684 240.728 490.757 300.776 270.690 170.804 410.464 290.816 150.577 340.587 250.945 390.508 210.276 260.671 140.710 400.663 330.750 320.589 360.881 350.832 210.653 23
contrastBoundarypermissive0.705 150.769 350.775 210.809 160.687 180.820 260.439 450.812 190.661 80.591 230.945 390.515 180.171 630.633 240.856 110.720 140.796 110.668 90.889 290.847 160.689 15
Liyao Tang, Yibing Zhan, Zhe Chen, Baosheng Yu, Dacheng Tao: Contrastive Boundary Learning for Point Cloud Segmentation. CVPR 2022
One-Thing-One-Click0.693 200.743 430.794 140.655 610.684 190.822 230.497 170.719 410.622 170.617 120.977 40.447 430.339 50.750 90.664 490.703 210.790 160.596 300.946 40.855 120.647 26
Zhengzhe Liu, Xiaojuan Qi, Chi-Wing Fu: One Thing One Click: A Self-Training Approach for Weakly Supervised 3D Semantic Segmentation. CVPR 2021
CU-Hybrid Net0.693 200.596 710.789 170.803 190.677 200.800 430.469 250.846 90.554 430.591 230.948 300.500 220.316 120.609 290.847 170.732 100.808 70.593 330.894 250.839 190.652 24
RFCR0.702 160.889 60.745 350.813 120.672 210.818 300.493 190.815 170.623 160.610 150.947 330.470 310.249 380.594 320.848 160.705 200.779 190.646 120.892 270.823 250.611 36
Jingyu Gong, Jiachen Xu, Xin Tan, Haichuan Song, Yanyun Qu, Yuan Xie, Lizhuang Ma: Omni-Supervised Point Cloud Segmentation via Gradual Receptive Field Component Reasoning. CVPR 2021
JSENetpermissive0.699 180.881 80.762 250.821 110.667 220.800 430.522 90.792 230.613 180.607 180.935 570.492 240.205 510.576 380.853 130.691 230.758 290.652 110.872 440.828 220.649 25
Zeyu Hu, Mingmin Zhen, Xuyang Bai, Hongbo Fu, Chiew-Lan Tai: JSENet: Joint Semantic Segmentation and Edge Detection Network for 3D Point Clouds. ECCV 2020
FusionNet0.688 230.704 530.741 390.754 340.656 230.829 180.501 140.741 360.609 210.548 310.950 240.522 170.371 20.633 240.756 300.715 170.771 230.623 200.861 510.814 290.658 21
Feihu Zhang, Jin Fang, Benjamin Wah, Philip Torr: Deep FusionNet for Point Cloud Semantic Segmentation. ECCV 2020
PointASNLpermissive0.666 310.703 550.781 190.751 360.655 240.830 170.471 240.769 270.474 640.537 350.951 200.475 290.279 250.635 220.698 440.675 280.751 310.553 500.816 620.806 330.703 12
Xu Yan, Chaoda Zheng, Zhen Li, Sheng Wang, Shuguang Cui: PointASNL: Robust Point Clouds Processing using Nonlocal Neural Networks with Adaptive Sampling. CVPR 2020
Superpoint Network0.683 260.851 130.728 440.800 210.653 250.806 390.468 260.804 200.572 350.602 200.946 360.453 400.239 410.519 530.822 200.689 260.762 270.595 320.895 240.827 230.630 32
SALANet0.670 300.816 200.770 230.768 300.652 260.807 380.451 330.747 330.659 100.545 320.924 670.473 300.149 730.571 400.811 240.635 460.746 330.623 200.892 270.794 420.570 52
KP-FCNN0.684 240.847 140.758 290.784 250.647 270.814 330.473 230.772 260.605 230.594 220.935 570.450 410.181 610.587 330.805 250.690 240.785 170.614 230.882 330.819 280.632 31
H. Thomas, C. Qi, J. Deschaud, B. Marcotegui, F. Goulette, L. Guibas: KPConv: Flexible and Deformable Convolution for Point Clouds. ICCV 2019
Feature_GeometricNetpermissive0.690 220.884 70.754 310.795 220.647 270.818 300.422 490.802 220.612 190.604 190.945 390.462 340.189 580.563 430.853 130.726 110.765 250.632 170.904 160.821 270.606 40
Kangcheng Liu, Ben M. Chen: https://arxiv.org/abs/2012.09439. arXiv Preprint
PointConvpermissive0.666 310.781 270.759 270.699 450.644 290.822 230.475 220.779 240.564 400.504 490.953 160.428 490.203 530.586 350.754 310.661 340.753 300.588 370.902 180.813 310.642 27
Wenxuan Wu, Zhongang Qi, Li Fuxin: PointConv: Deep Convolutional Networks on 3D Point Clouds. CVPR 2019
PointContrast_LA_SEM0.683 260.757 390.784 180.786 240.639 300.824 220.408 530.775 250.604 240.541 330.934 610.532 130.269 300.552 450.777 280.645 430.793 140.640 140.913 130.824 240.671 18
PPCNN++permissive0.663 330.746 420.708 470.722 380.638 310.820 260.451 330.566 680.599 270.541 330.950 240.510 200.313 130.648 190.819 220.616 520.682 550.590 350.869 470.810 320.656 22
Pyunghwan Ahn, Juyoung Yang, Eojindl Yi, Chanho Lee, Junmo Kim: Projection-based Point Convolution for Efficient Point Cloud Segmentation. IEEE Access
ROSMRF3D0.673 290.789 250.748 330.763 320.635 320.814 330.407 550.747 330.581 330.573 260.950 240.484 250.271 290.607 300.754 310.649 380.774 210.596 300.883 320.823 250.606 40
joint point-basedpermissive0.634 460.614 680.778 200.667 570.633 330.825 210.420 500.804 200.467 660.561 280.951 200.494 230.291 200.566 410.458 650.579 620.764 260.559 470.838 560.814 290.598 44
Hung-Yueh Chiang, Yen-Liang Lin, Yueh-Cheng Liu, Winston H. Hsu: A Unified Point-Based Framework for 3D Segmentation. 3DV 2019
VI-PointConv0.676 280.770 340.754 310.783 260.621 340.814 330.552 40.758 290.571 370.557 290.954 140.529 140.268 320.530 510.682 450.675 280.719 400.603 280.888 300.833 200.665 19
Xingyi Li, Wenxuan Wu, Xiaoli Z. Fern, Li Fuxin: The Devils in the Point Clouds: Studying the Robustness of Point Cloud Convolutions.
DCM-Net0.658 340.778 280.702 500.806 180.619 350.813 360.468 260.693 480.494 570.524 410.941 490.449 420.298 180.510 550.821 210.675 280.727 390.568 430.826 590.803 350.637 29
Jonas Schult*, Francis Engelmann*, Theodora Kontogianni, Bastian Leibe: DualConvMesh-Net: Joint Geodesic and Euclidean Convolutions on 3D Meshes. CVPR 2020 (Oral)
Supervoxel-CNN0.635 450.656 610.711 460.719 400.613 360.757 640.444 420.765 280.534 470.566 270.928 650.478 280.272 280.636 210.531 590.664 320.645 660.508 640.864 500.792 470.611 36
3DSM_DMMF0.631 490.626 660.745 350.801 200.607 370.751 650.506 120.729 400.565 390.491 510.866 800.434 450.197 560.595 310.630 520.709 190.705 480.560 450.875 390.740 660.491 69
FPConvpermissive0.639 410.785 260.760 260.713 430.603 380.798 460.392 600.534 730.603 250.524 410.948 300.457 360.250 360.538 490.723 380.598 570.696 510.614 230.872 440.799 360.567 54
Yiqun Lin, Zizheng Yan, Haibin Huang, Dong Du, Ligang Liu, Shuguang Cui, Xiaoguang Han: FPConv: Learning Local Flattening for Point Convolution. CVPR 2020
SegGroup_sempermissive0.627 540.818 190.747 340.701 440.602 390.764 610.385 650.629 590.490 590.508 460.931 640.409 570.201 540.564 420.725 370.618 500.692 520.539 580.873 420.794 420.548 61
An Tao, Yueqi Duan, Yi Wei, Jiwen Lu, Jie Zhou: SegGroup: Seg-Level Supervision for 3D Instance and Semantic Segmentation.
wsss-transformer0.600 590.634 640.743 370.697 470.601 400.781 530.437 460.585 660.493 580.446 600.933 620.394 600.011 850.654 170.661 500.603 540.733 370.526 610.832 570.761 610.480 71
HPEIN0.618 560.729 480.668 640.647 640.597 410.766 600.414 510.680 500.520 500.525 400.946 360.432 460.215 470.493 600.599 540.638 440.617 710.570 410.897 220.806 330.605 42
Li Jiang, Hengshuang Zhao, Shu Liu, Xiaoyong Shen, Chi-Wing Fu, Jiaya Jia: Hierarchical Point-Edge Interaction Network for Point Cloud Semantic Segmentation. ICCV 2019
PointMTL0.632 480.731 470.688 610.675 520.591 420.784 520.444 420.565 690.610 200.492 500.949 270.456 370.254 350.587 330.706 410.599 560.665 620.612 260.868 480.791 500.579 49
MVPNetpermissive0.641 380.831 150.715 450.671 550.590 430.781 530.394 590.679 510.642 110.553 300.937 550.462 340.256 340.649 180.406 710.626 470.691 530.666 100.877 370.792 470.608 39
Maximilian Jaritz, Jiayuan Gu, Hao Su: Multi-view PointNet for 3D Scene Understanding. GMDL Workshop, ICCV 2019
FusionAwareConv0.630 520.604 700.741 390.766 310.590 430.747 660.501 140.734 380.503 550.527 390.919 710.454 380.323 110.550 480.420 700.678 270.688 540.544 530.896 230.795 400.627 33
Jiazhao Zhang, Chenyang Zhu, Lintao Zheng, Kai Xu: Fusion-Aware Point Convolution for Online Semantic 3D Scene Segmentation. CVPR 2020
PD-Net0.638 420.797 230.769 240.641 670.590 430.820 260.461 300.537 720.637 130.536 360.947 330.388 620.206 500.656 160.668 470.647 410.732 380.585 380.868 480.793 440.473 74
DVVNet0.562 660.648 620.700 520.770 290.586 460.687 730.333 710.650 550.514 530.475 560.906 750.359 670.223 440.340 740.442 690.422 780.668 610.501 650.708 730.779 530.534 63
SAFNet-segpermissive0.654 360.752 400.734 410.664 580.583 470.815 320.399 570.754 310.639 120.535 370.942 470.470 310.309 150.665 150.539 570.650 370.708 460.635 160.857 530.793 440.642 27
Linqing Zhao, Jiwen Lu, Jie Zhou: Similarity-Aware Fusion Network for 3D Semantic Segmentation. IROS 2021
RandLA-Netpermissive0.645 370.778 280.731 420.699 450.577 480.829 180.446 370.736 370.477 630.523 430.945 390.454 380.269 300.484 620.749 340.618 500.738 340.599 290.827 580.792 470.621 34
PointMRNet0.640 400.717 520.701 510.692 480.576 490.801 420.467 280.716 420.563 410.459 580.953 160.429 480.169 650.581 360.854 120.605 530.710 430.550 510.894 250.793 440.575 50
PointSPNet0.637 430.734 460.692 580.714 420.576 490.797 470.446 370.743 350.598 280.437 630.942 470.403 580.150 720.626 260.800 260.649 380.697 500.557 480.846 550.777 550.563 55
MCCNNpermissive0.633 470.866 100.731 420.771 280.576 490.809 370.410 520.684 490.497 560.491 510.949 270.466 330.105 770.581 360.646 510.620 480.680 570.542 560.817 610.795 400.618 35
P. Hermosilla, T. Ritschel, P.P. Vazquez, A. Vinacua, T. Ropinski: Monte Carlo Convolution for Learning on Non-Uniformly Sampled Point Clouds. SIGGRAPH Asia 2018
SConv0.636 440.830 160.697 540.752 350.572 520.780 550.445 390.716 420.529 480.530 380.951 200.446 440.170 640.507 570.666 480.636 450.682 550.541 570.886 310.799 360.594 46
HPGCNN0.656 350.698 560.743 370.650 620.564 530.820 260.505 130.758 290.631 150.479 540.945 390.480 270.226 420.572 390.774 290.690 240.735 360.614 230.853 540.776 560.597 45
Jisheng Dang, Qingyong Hu, Yulan Guo, Jun Yang: HPGCNN.
ROSMRF0.580 630.772 310.707 480.681 510.563 540.764 610.362 670.515 740.465 670.465 570.936 560.427 510.207 490.438 650.577 550.536 670.675 590.486 690.723 720.779 530.524 65
SIConv0.625 550.830 160.694 560.757 330.563 540.772 590.448 360.647 570.520 500.509 450.949 270.431 470.191 570.496 590.614 530.647 410.672 600.535 600.876 380.783 520.571 51
PointConv-SFPN0.641 380.776 300.703 490.721 390.557 560.826 200.451 330.672 530.563 410.483 530.943 460.425 520.162 680.644 200.726 360.659 350.709 450.572 400.875 390.786 510.559 57
APCF-Net0.631 490.742 440.687 630.672 530.557 560.792 500.408 530.665 540.545 440.508 460.952 190.428 490.186 590.634 230.702 420.620 480.706 470.555 490.873 420.798 380.581 48
Haojia Lin: Adaptive Pyramid Context Fusion for Point Cloud Perception. GRSL
DenSeR0.628 530.800 220.625 740.719 400.545 580.806 390.445 390.597 620.448 700.519 440.938 540.481 260.328 90.489 610.499 640.657 360.759 280.592 340.881 350.797 390.634 30
SPH3D-GCNpermissive0.610 570.858 120.772 220.489 790.532 590.792 500.404 560.643 580.570 380.507 480.935 570.414 560.046 830.510 550.702 420.602 550.705 480.549 520.859 520.773 570.534 63
Huan Lei, Naveed Akhtar, and Ajmal Mian: Spherical Kernel for Efficient Graph Convolution on 3D Point Clouds. TPAMI 2020
PointNet2-SFPN0.631 490.771 320.692 580.672 530.524 600.837 130.440 440.706 460.538 460.446 600.944 440.421 540.219 450.552 450.751 330.591 590.737 350.543 550.901 200.768 580.557 58
AttAN0.609 580.760 370.667 650.649 630.521 610.793 480.457 320.648 560.528 490.434 650.947 330.401 590.153 710.454 640.721 390.648 400.717 410.536 590.904 160.765 590.485 70
Gege Zhang, Qinghua Ma, Licheng Jiao, Fang Liu, Qigong Sun: AttAN: Attention Adversarial Networks for 3D Point Cloud Semantic Segmentation. IJCAI 2020
SQN_0.1%0.569 640.676 580.696 550.657 600.497 620.779 560.424 480.548 700.515 520.376 700.902 780.422 530.357 30.379 710.456 660.596 580.659 630.544 530.685 750.665 780.556 59
TextureNetpermissive0.566 650.672 600.664 660.671 550.494 630.719 690.445 390.678 520.411 760.396 680.935 570.356 680.225 430.412 690.535 580.565 640.636 700.464 720.794 650.680 750.568 53
Jingwei Huang, Haotian Zhang, Li Yi, Thomas Funkhouser, Matthias Nießner, Leonidas Guibas: TextureNet: Consistent Local Parametrizations for Learning from High-Resolution Signals on Meshes. CVPR
Pointnet++ & Featurepermissive0.557 670.735 450.661 670.686 490.491 640.744 670.392 600.539 710.451 690.375 710.946 360.376 640.205 510.403 700.356 740.553 660.643 670.497 660.824 600.756 620.515 66
DPC0.592 610.720 500.700 520.602 720.480 650.762 630.380 660.713 440.585 320.437 630.940 510.369 650.288 210.434 670.509 630.590 610.639 690.567 440.772 670.755 630.592 47
Francis Engelmann, Theodora Kontogianni, Bastian Leibe: Dilated Point Convolutions: On the Receptive Field Size of Point Convolutions on 3D Point Clouds. ICRA 2020
3DMV, FTSDF0.501 730.558 750.608 770.424 830.478 660.690 720.246 790.586 650.468 650.450 590.911 730.394 600.160 690.438 650.212 800.432 770.541 780.475 710.742 700.727 680.477 72
GMLPs0.538 690.495 790.693 570.647 640.471 670.793 480.300 730.477 750.505 540.358 730.903 770.327 720.081 800.472 630.529 600.448 760.710 430.509 620.746 690.737 670.554 60
CCRFNet0.589 620.766 360.659 680.683 500.470 680.740 680.387 640.620 610.490 590.476 550.922 690.355 690.245 390.511 540.511 620.571 630.643 670.493 680.872 440.762 600.600 43
LAP-D0.594 600.720 500.692 580.637 680.456 690.773 580.391 620.730 390.587 300.445 620.940 510.381 630.288 210.434 670.453 670.591 590.649 640.581 390.777 660.749 650.610 38
subcloud_weak0.516 710.676 580.591 790.609 690.442 700.774 570.335 700.597 620.422 750.357 740.932 630.341 710.094 790.298 760.528 610.473 740.676 580.495 670.602 800.721 690.349 81
Online SegFusion0.515 720.607 690.644 720.579 740.434 710.630 790.353 680.628 600.440 710.410 660.762 840.307 740.167 660.520 520.403 720.516 680.565 730.447 760.678 760.701 720.514 67
Davide Menini, Suryansh Kumar, Martin R. Oswald, Erik Sandstroem, Cristian Sminchisescu, Luc van Gool: A Real-Time Learning Framework for Joint 3D Reconstruction and Semantic Segmentation. Robotics and Automation Letters Submission
PointMRNet-lite0.553 680.633 650.648 690.659 590.430 720.800 430.390 630.592 640.454 680.371 720.939 530.368 660.136 750.368 720.448 680.560 650.715 420.486 690.882 330.720 700.462 75
3DMV0.484 750.484 810.538 810.643 660.424 730.606 820.310 720.574 670.433 740.378 690.796 820.301 750.214 480.537 500.208 810.472 750.507 820.413 800.693 740.602 810.539 62
Angela Dai, Matthias Nießner: 3DMV: Joint 3D-Multi-View Prediction for 3D Semantic Scene Segmentation. ECCV 2018
PCNN0.498 740.559 740.644 720.560 760.420 740.711 710.229 810.414 760.436 720.352 750.941 490.324 730.155 700.238 800.387 730.493 700.529 790.509 620.813 630.751 640.504 68
PanopticFusion-label0.529 700.491 800.688 610.604 710.386 750.632 780.225 830.705 470.434 730.293 780.815 810.348 700.241 400.499 580.669 460.507 690.649 640.442 770.796 640.602 810.561 56
Gaku Narita, Takashi Seno, Tomoya Ishikawa, Yohsuke Kaji: PanopticFusion: Online Volumetric Semantic Mapping at the Level of Stuff and Things. IROS 2019 (to appear)
FCPNpermissive0.447 770.679 570.604 780.578 750.380 760.682 740.291 760.106 850.483 620.258 830.920 700.258 800.025 840.231 820.325 750.480 730.560 750.463 730.725 710.666 770.231 85
Dario Rethage, Johanna Wald, Jürgen Sturm, Nassir Navab, Federico Tombari: Fully-Convolutional Point Networks for Large-Scale Point Clouds. ECCV 2018
Tangent Convolutionspermissive0.438 800.437 830.646 710.474 800.369 770.645 770.353 680.258 820.282 840.279 790.918 720.298 760.147 740.283 770.294 760.487 710.562 740.427 790.619 790.633 790.352 80
Maxim Tatarchenko, Jaesik Park, Vladlen Koltun, Qian-Yi Zhou: Tangent Convolutions for Dense Prediction in 3D. CVPR 2018
PNET20.442 780.548 760.548 800.597 730.363 780.628 800.300 730.292 800.374 780.307 770.881 790.268 790.186 590.238 800.204 820.407 790.506 830.449 750.667 770.620 800.462 75
ScanNet+FTSDF0.383 830.297 850.491 830.432 820.358 790.612 810.274 770.116 840.411 760.265 810.904 760.229 820.079 810.250 780.185 830.320 830.510 800.385 820.548 820.597 840.394 78
SurfaceConvPF0.442 780.505 780.622 750.380 840.342 800.654 760.227 820.397 780.367 790.276 800.924 670.240 810.198 550.359 730.262 770.366 800.581 720.435 780.640 780.668 760.398 77
Hao Pan, Shilin Liu, Yang Liu, Xin Tong: Convolutional Neural Networks on 3D Surfaces Using Parallel Frames.
3DWSSS0.425 810.525 770.647 700.522 770.324 810.488 850.077 860.712 450.353 800.401 670.636 860.281 780.176 620.340 740.565 560.175 860.551 760.398 810.370 860.602 810.361 79
PointCNN with RGBpermissive0.458 760.577 730.611 760.356 850.321 820.715 700.299 750.376 790.328 820.319 760.944 440.285 770.164 670.216 830.229 790.484 720.545 770.456 740.755 680.709 710.475 73
Yangyan Li, Rui Bu, Mingchao Sun, Baoquan Chen: PointCNN. NeurIPS 2018
SPLAT Netcopyleft0.393 820.472 820.511 820.606 700.311 830.656 750.245 800.405 770.328 820.197 840.927 660.227 830.000 870.001 880.249 780.271 850.510 800.383 830.593 810.699 730.267 83
Hang Su, Varun Jampani, Deqing Sun, Subhransu Maji, Evangelos Kalogerakis, Ming-Hsuan Yang, Jan Kautz: SPLATNet: Sparse Lattice Networks for Point Cloud Processing. CVPR 2018
ScanNetpermissive0.306 860.203 860.366 850.501 780.311 830.524 840.211 840.002 870.342 810.189 850.786 830.145 860.102 780.245 790.152 840.318 840.348 850.300 850.460 850.437 860.182 86
Angela Dai, Angel X. Chang, Manolis Savva, Maciej Halber, Thomas Funkhouser, Matthias Nießner: ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes. CVPR'17
PointNet++permissive0.339 840.584 720.478 840.458 810.256 850.360 860.250 780.247 830.278 850.261 820.677 850.183 840.117 760.212 840.145 850.364 810.346 860.232 860.548 820.523 850.252 84
Charles R. Qi, Li Yi, Hao Su, Leonidas J. Guibas: PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space.
SSC-UNetpermissive0.308 850.353 840.290 860.278 860.166 860.553 830.169 850.286 810.147 860.148 860.908 740.182 850.064 820.023 870.018 870.354 820.363 840.345 840.546 840.685 740.278 82
ERROR0.054 870.000 870.041 870.172 870.030 870.062 880.001 880.035 860.004 870.051 870.143 870.019 880.003 860.041 860.050 860.003 880.054 870.018 880.005 870.264 870.082 87
Feature-Geometry Netpermissive0.024 880.000 870.000 880.001 880.010 880.098 870.007 870.000 880.000 880.026 880.072 880.059 870.000 870.060 850.000 880.013 870.040 880.045 870.000 880.038 880.006 88

This table lists the benchmark results for the 3D semantic instance scenario.




Rows are sorted by the cabinet column; in each row, every score is immediately followed by the method's rank for that column.
Method | Info | avg ap 50% | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | window
SoftGrouppermissive0.761 31.000 10.808 130.845 60.716 10.862 50.243 110.824 30.655 50.620 20.734 20.699 50.791 40.981 200.716 50.844 40.769 31.000 10.594 5
Thang Vu, Kookhoi Kim, Tung M. Luu, Xuan Thanh Nguyen, Chang D. Yoo: SoftGroup for 3D Instance Segmentation on Point Clouds. CVPR 2022 [Oral]
IPCA-Inst0.731 51.000 10.788 200.884 50.698 20.788 210.252 90.760 110.646 60.511 130.637 70.665 60.804 31.000 10.644 140.778 110.747 51.000 10.561 11
SoftGroup++0.769 11.000 10.803 160.937 10.684 30.865 30.213 140.870 20.664 30.571 50.758 10.702 40.807 21.000 10.653 130.902 10.792 21.000 10.626 1
Mask3D0.765 21.000 10.893 20.721 190.682 40.855 70.449 20.710 160.775 10.601 30.707 30.646 70.811 11.000 10.817 20.807 90.725 71.000 10.578 6
HAISpermissive0.699 91.000 10.849 40.820 70.675 50.808 140.279 50.757 120.465 160.517 120.596 80.559 100.600 181.000 10.654 120.767 130.676 120.994 270.560 12
Shaoyu Chen, Jiemin Fang, Qian Zhang, Wenyu Liu, Xinggang Wang: Hierarchical Aggregation for 3D Instance Segmentation. ICCV 2021
TopoSeg0.725 61.000 10.806 150.933 20.668 60.758 240.272 70.734 150.630 70.549 90.654 40.606 80.697 70.966 220.612 170.839 50.754 41.000 10.573 7
GraphCut0.732 41.000 10.788 190.724 180.642 70.859 60.248 100.787 90.618 90.596 40.653 50.722 20.583 241.000 10.766 30.861 20.825 11.000 10.504 16
DKNet0.718 71.000 10.814 100.782 100.619 80.872 20.224 120.751 130.569 110.677 10.585 100.724 10.633 160.981 200.515 240.819 70.736 61.000 10.617 2
SphereSeg0.680 111.000 10.856 30.744 170.618 90.893 10.151 160.651 210.713 20.537 100.579 130.430 220.651 81.000 10.389 330.744 200.697 80.991 280.601 4
Mask-Group0.664 141.000 10.822 90.764 150.616 100.815 110.139 200.694 180.597 100.459 180.566 140.599 90.600 180.516 390.715 60.819 80.635 161.000 10.603 3
Min Zhong, Xinghao Chen, Xiaokang Chen, Gang Zeng, Yunhe Wang: MaskGroup: Hierarchical Point Grouping and Masking for 3D Instance Segmentation. ICME 2022
SDSC0.700 81.000 10.848 50.763 160.609 110.792 190.262 80.824 30.627 80.535 110.547 180.493 140.600 181.000 10.712 70.731 230.689 111.000 10.563 10
DENet0.629 231.000 10.797 170.608 260.589 120.627 320.219 130.882 10.310 270.402 290.383 320.396 260.650 91.000 10.663 100.543 400.691 101.000 10.568 9
INS-Conv-instance0.657 151.000 10.760 250.667 230.581 130.863 40.323 40.655 200.477 140.473 160.549 160.432 210.650 91.000 10.655 110.738 210.585 220.944 320.472 22
RWSeg0.567 270.528 410.708 340.626 240.580 140.745 260.063 290.627 220.240 310.400 300.497 220.464 160.515 271.000 10.475 260.745 190.571 241.000 10.429 26
OccuSeg+instance0.672 131.000 10.758 270.682 210.576 150.842 80.477 10.504 330.524 120.567 60.585 120.451 170.557 251.000 10.751 40.797 100.563 261.000 10.467 23
Lei Han, Tian Zheng, Lan Xu, Lu Fang: OccuSeg: Occupancy-aware 3D Instance Segmentation. CVPR 2020
DD-UNet+Group0.635 220.667 320.797 180.714 200.562 160.774 230.146 170.810 70.429 190.476 150.546 190.399 250.633 161.000 10.632 150.722 240.609 181.000 10.514 13
H. Liu, R. Liu, K. Yang, J. Zhang, K. Peng, R. Stiefelhagen: HIDA: Towards Holistic Indoor Understanding for the Visually Impaired via Semantic Instance Segmentation with a Wearable Solid-State LiDAR Sensor. ICCVW 2021
SSTNetpermissive0.698 101.000 10.697 350.888 40.556 170.803 150.387 30.626 230.417 200.556 80.585 110.702 30.600 181.000 10.824 10.720 250.692 91.000 10.509 15
Zhihao Liang, Zhihao Li, Songcen Xu, Mingkui Tan, Kui Jia: Instance Segmentation in 3D Scenes using Semantic Superpoint Tree Networks. ICCV 2021
RPGN0.643 181.000 10.758 260.582 310.539 180.826 90.046 310.765 100.372 230.436 230.588 90.539 130.650 91.000 10.577 180.750 180.653 150.997 220.495 19
PE0.645 171.000 10.773 220.798 90.538 190.786 220.088 270.799 80.350 250.435 240.547 170.545 110.646 150.933 230.562 200.761 160.556 310.997 220.501 18
Biao Zhang, Peter Wonka: Point Cloud Instance Segmentation using Probabilistic Embeddings. CVPR 2021
Dyco3Dcopyleft0.641 191.000 10.841 70.893 30.531 200.802 160.115 240.588 280.448 170.438 210.537 200.430 230.550 260.857 250.534 220.764 150.657 130.987 290.568 8
Tong He, Chunhua Shen, Anton van den Hengel: DyCo3D: Robust Instance Segmentation of 3D Point Clouds through Dynamic Convolution. CVPR 2021
3D-MPA0.611 241.000 10.833 80.765 140.526 210.756 250.136 220.588 280.470 150.438 220.432 280.358 290.650 90.857 250.429 290.765 140.557 291.000 10.430 25
Francis Engelmann, Martin Bokeloh, Alireza Fathi, Bastian Leibe, Matthias Nießner: 3D-MPA: Multi Proposal Aggregation for 3D Semantic Instance Segmentation. CVPR 2020
CSC-Pretrained0.648 161.000 10.810 110.768 130.523 220.813 120.143 190.819 50.389 210.422 250.511 210.443 190.650 91.000 10.624 160.732 220.634 171.000 10.375 29
MaskVoteNet_Coarse0.677 121.000 10.847 60.771 120.509 230.816 100.277 60.558 300.482 130.562 70.640 60.448 180.700 51.000 10.666 80.852 30.578 230.997 220.488 20
Occipital-SCS0.512 311.000 10.716 310.509 320.506 240.611 330.092 260.602 270.177 340.346 340.383 310.165 350.442 320.850 300.386 340.618 360.543 320.889 360.389 28
PointGroup0.636 211.000 10.765 230.624 250.505 250.797 170.116 230.696 170.384 220.441 200.559 150.476 150.596 221.000 10.666 80.756 170.556 300.997 220.513 14
Li Jiang, Hengshuang Zhao, Shaoshuai Shi, Shu Liu, Chi-Wing Fu, Jiaya Jia: PointGroup: Dual-Set Point Grouping for 3D Instance Segmentation. CVPR 2020 [oral]
GICN0.638 201.000 10.895 10.800 80.480 260.676 280.144 180.737 140.354 240.447 190.400 300.365 280.700 51.000 10.569 190.836 60.599 191.000 10.473 21
SSEN0.575 261.000 10.761 240.473 330.477 270.795 180.066 280.529 310.658 40.460 170.461 250.380 270.331 380.859 240.401 320.692 300.653 141.000 10.348 31
Dongsu Zhang, Junha Chun, Sang Kyun Cha, Young Min Kim: Spatial Semantic Embedding Network: Fast 3D Instance Segmentation with Deep Metric Learning. arXiv
Sparse R-CNN0.515 301.000 10.538 420.282 360.468 280.790 200.173 150.345 380.429 180.413 280.484 230.176 340.595 230.591 370.522 230.668 320.476 350.986 300.327 32
PCJC0.578 251.000 10.810 120.583 300.449 290.813 130.042 320.603 260.341 260.490 140.465 240.410 240.650 90.835 310.264 380.694 290.561 270.889 360.504 17
One_Thing_One_Clickpermissive0.529 290.667 320.718 300.777 110.399 300.683 270.000 400.669 190.138 360.391 310.374 330.539 120.360 370.641 360.556 210.774 120.593 200.997 220.251 36
Zhengzhe Liu, Xiaojuan Qi, Chi-Wing Fu: One Thing One Click: A Self-Training Approach for Weakly Supervised 3D Semantic Segmentation. CVPR 2021
MASCpermissive0.447 360.528 410.555 400.381 340.382 310.633 310.002 380.509 320.260 290.361 330.432 270.327 300.451 300.571 380.367 350.639 340.386 360.980 310.276 35
Chen Liu, Yasutaka Furukawa: MASC: Multi-scale Affinity with Sparse Convolution for 3D Instance Segmentation.
SPG_WSIS0.470 340.667 320.685 360.677 220.372 320.562 370.000 400.482 340.244 300.316 370.298 350.052 430.442 330.857 250.267 370.702 260.559 281.000 10.287 34
R-PointNet0.306 410.500 430.405 450.311 350.348 330.589 340.054 300.068 460.126 380.283 380.290 360.028 440.219 420.214 430.331 360.396 460.275 420.821 410.245 37
MTML0.549 281.000 10.807 140.588 290.327 340.647 300.004 370.815 60.180 330.418 260.364 340.182 330.445 311.000 10.442 280.688 310.571 251.000 10.396 27
Jean Lahoud, Bernard Ghanem, Marc Pollefeys, Martin R. Oswald: 3D Instance Segmentation via Multi-task Metric Learning. ICCV 2019 [oral]
SegGroup_inspermissive0.445 370.667 320.773 210.185 430.317 350.656 290.000 400.407 370.134 370.381 320.267 370.217 320.476 290.714 330.452 270.629 350.514 331.000 10.222 39
An Tao, Yueqi Duan, Yi Wei, Jiwen Lu, Jie Zhou: SegGroup: Seg-Level Supervision for 3D Instance and Semantic Segmentation.
3D-BoNet0.488 321.000 10.672 370.590 280.301 360.484 430.098 250.620 240.306 280.341 350.259 380.125 370.434 340.796 320.402 310.499 420.513 340.909 350.439 24
Bo Yang, Jianan Wang, Ronald Clark, Qingyong Hu, Sen Wang, Andrew Markham, Niki Trigoni: Learning Object Bounding Boxes for 3D Instance Segmentation on Point Clouds. NeurIPS 2019 Spotlight
Region-18class0.284 420.250 470.751 280.228 410.270 370.521 400.000 400.468 360.008 460.205 410.127 420.000 480.068 460.070 460.262 390.652 330.323 390.740 420.173 41
SALoss-ResNet0.459 351.000 10.737 290.159 460.259 380.587 350.138 210.475 350.217 320.416 270.408 290.128 360.315 390.714 330.411 300.536 410.590 210.873 390.304 33
Zhidong Liang, Ming Yang, Hao Li, Chunxiang Wang: 3D Instance Embedding Learning With a Structure-Aware Loss Function for Point Cloud Segmentation. IEEE Robotics and Automation Letters (IROS2020)
PanopticFusion-inst0.478 330.667 320.712 330.595 270.259 390.550 390.000 400.613 250.175 350.250 400.434 260.437 200.411 360.857 250.485 250.591 390.267 450.944 320.359 30
Gaku Narita, Takashi Seno, Tomoya Ishikawa, Yohsuke Kaji: PanopticFusion: Online Volumetric Semantic Mapping at the Level of Stuff and Things. IROS 2019 (to appear)
3D-SISpermissive0.382 381.000 10.432 440.245 380.190 400.577 360.013 350.263 400.033 430.320 360.240 390.075 390.422 350.857 250.117 420.699 270.271 440.883 380.235 38
Ji Hou, Angela Dai, Matthias Niessner: 3D-SIS: 3D Semantic Instance Segmentation of RGB-D Scans. CVPR 2019
UNet-backbone0.319 400.667 320.715 320.233 390.189 410.479 440.008 360.218 410.067 420.201 420.173 410.107 380.123 440.438 400.150 400.615 370.355 370.916 340.093 47
SemRegionNet-20cls0.250 430.333 440.613 380.229 400.163 420.493 410.000 400.304 390.107 390.147 440.100 430.052 420.231 400.119 440.039 440.445 440.325 380.654 430.141 43
Hier3Dcopyleft0.323 390.667 320.542 410.264 370.157 430.550 380.000 400.205 430.009 440.270 390.218 400.075 390.500 280.688 350.007 480.698 280.301 410.459 450.200 40
Tan: HCFS3D: Hierarchical Coupled Feature Selection Network for 3D Semantic and Instance Segmentation.
tmp0.248 440.667 320.437 430.188 420.153 440.491 420.000 400.208 420.094 410.153 430.099 440.057 410.217 430.119 440.039 440.466 430.302 400.640 440.140 44
ASIS0.199 460.333 440.253 470.167 450.140 450.438 450.000 400.177 440.008 450.121 450.069 450.004 470.231 410.429 410.036 460.445 450.273 430.333 470.119 46
Sgpn_scannet0.143 470.208 480.390 460.169 440.065 460.275 470.029 330.069 450.000 470.087 470.043 460.014 460.027 480.000 470.112 430.351 470.168 470.438 460.138 45
MaskRCNN 2d->3d Proj0.058 480.333 440.002 480.000 480.053 470.002 480.002 390.021 480.000 470.045 480.024 480.238 310.065 470.000 470.014 470.107 480.020 480.110 480.006 48
3D-BEVIS0.248 440.667 320.566 390.076 470.035 480.394 460.027 340.035 470.098 400.099 460.030 470.025 450.098 450.375 420.126 410.604 380.181 460.854 400.171 42
Cathrin Elich, Francis Engelmann, Jonas Schult, Theodora Kontogianni, Bastian Leibe: 3D-BEVIS: Birds-Eye-View Instance Segmentation.
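
The rows above run the method name, an optional license tag, and a sequence of score/rank pairs together without separators. A minimal Python sketch for pulling such a row apart is given below. It assumes that every score is printed as one digit, a dot, and three decimals, and that the only license tags are `permissive` and `copyleft`; the function name `parse_row` and the regular expression are illustrative and are not part of the benchmark's own tooling.

```python
import re
from statistics import mean

# Assumption: each score looks like d.ddd and is followed by a space and an
# integer rank; the next score starts directly after the rank digits.
CELL = re.compile(r"(\d\.\d{3}) (\d+?)(?=\d\.\d{3} |$)")

# Assumption: these are the only license tags fused onto method names.
LICENSE_TAGS = ("permissive", "copyleft")


def parse_row(line):
    """Split one fused row into (method, license, [(score, rank), ...])."""
    line = line.strip()
    matches = list(CELL.finditer(line))
    if not matches:
        raise ValueError("no score/rank cells found")
    # Everything before the first score is the method name (plus license tag).
    method = line[: matches[0].start()].strip()
    license_tag = None
    for tag in LICENSE_TAGS:
        if method.endswith(tag):
            method, license_tag = method[: -len(tag)], tag
            break
    cells = [(float(m.group(1)), int(m.group(2))) for m in matches]
    return method, license_tag, cells


# The SoftGroup row from the 3D semantic instance table, copied verbatim.
row = ("SoftGrouppermissive0.761 31.000 10.808 130.845 60.716 10.862 50.243 11"
       "0.824 30.655 50.620 20.734 20.699 50.791 40.981 200.716 50.844 40.769 3"
       "1.000 10.594 5")
method, license_tag, cells = parse_row(row)
avg_score, avg_rank = cells[0]
per_class = [score for score, _ in cells[1:]]
print(method, license_tag, avg_score, avg_rank, round(mean(per_class), 3))
# prints: SoftGroup permissive 0.761 3 0.761
```

For this row, the mean of the 18 per-class values reproduces the listed average to three decimals. The lazy rank match with a lookahead is what separates a rank such as 13 from the leading digit of the next score; a method name that itself ends in text shaped like a score would need special handling.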

This table lists the benchmark results for the 2D semantic label scenario.


Rows are sorted by the cabinet column; each cell shows the score with the method's rank for that column in parentheses.
Method | Info | avg iou | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | floor | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | wall | window
Virtual MVFusion (R) | - | 0.745 (1) | 0.861 (1) | 0.839 (1) | 0.881 (1) | 0.672 (1) | 0.512 (1) | 0.422 (11) | 0.898 (1) | 0.723 (1) | 0.714 (1) | 0.954 (2) | 0.454 (1) | 0.509 (1) | 0.773 (1) | 0.895 (1) | 0.756 (1) | 0.820 (1) | 0.653 (1) | 0.935 (1) | 0.891 (1) | 0.728 (1)
Abhijit Kundu, Xiaoqi Yin, Alireza Fathi, David Ross, Brian Brewington, Thomas Funkhouser, Caroline Pantofaru: Virtual Multi-view Fusion for 3D Semantic Segmentation. ECCV 2020
BPNet_2D | copyleft | 0.670 (2) | 0.822 (3) | 0.795 (3) | 0.836 (2) | 0.659 (2) | 0.481 (2) | 0.451 (7) | 0.769 (3) | 0.656 (3) | 0.567 (3) | 0.931 (3) | 0.395 (3) | 0.390 (4) | 0.700 (2) | 0.534 (3) | 0.689 (6) | 0.770 (2) | 0.574 (3) | 0.865 (4) | 0.831 (3) | 0.675 (3)
Wenbo Hu, Hengshuang Zhao, Li Jiang, Jiaya Jia and Tien-Tsin Wong: Bidirectional Projection Network for Cross Dimension Scene Understanding. CVPR 2021 (Oral)
CU-Hybrid-2D Net | - | 0.636 (3) | 0.825 (2) | 0.820 (2) | 0.179 (17) | 0.648 (3) | 0.463 (3) | 0.549 (1) | 0.742 (4) | 0.676 (2) | 0.628 (2) | 0.961 (1) | 0.420 (2) | 0.379 (5) | 0.684 (4) | 0.381 (11) | 0.732 (2) | 0.723 (3) | 0.599 (2) | 0.827 (9) | 0.851 (2) | 0.634 (4)
DMMF_3d | - | 0.605 (5) | 0.651 (7) | 0.744 (7) | 0.782 (3) | 0.637 (4) | 0.387 (4) | 0.536 (2) | 0.732 (5) | 0.590 (6) | 0.540 (4) | 0.856 (14) | 0.359 (7) | 0.306 (11) | 0.596 (7) | 0.539 (2) | 0.627 (13) | 0.706 (4) | 0.497 (7) | 0.785 (13) | 0.757 (12) | 0.476 (14)
CMX | - | 0.613 (4) | 0.681 (6) | 0.725 (8) | 0.502 (11) | 0.634 (5) | 0.297 (13) | 0.478 (5) | 0.830 (2) | 0.651 (4) | 0.537 (5) | 0.924 (4) | 0.375 (4) | 0.315 (10) | 0.686 (3) | 0.451 (9) | 0.714 (3) | 0.543 (15) | 0.504 (5) | 0.894 (3) | 0.823 (4) | 0.688 (2)
MCA-Net | - | 0.595 (7) | 0.533 (13) | 0.756 (5) | 0.746 (5) | 0.590 (6) | 0.334 (8) | 0.506 (3) | 0.670 (8) | 0.587 (7) | 0.500 (9) | 0.905 (9) | 0.366 (6) | 0.352 (6) | 0.601 (6) | 0.506 (6) | 0.669 (11) | 0.648 (6) | 0.501 (6) | 0.839 (8) | 0.769 (10) | 0.516 (13)
DMMF | - | 0.597 (6) | 0.543 (12) | 0.755 (6) | 0.749 (4) | 0.585 (7) | 0.338 (6) | 0.494 (4) | 0.704 (7) | 0.598 (5) | 0.494 (11) | 0.911 (7) | 0.347 (9) | 0.327 (9) | 0.593 (8) | 0.527 (4) | 0.675 (8) | 0.646 (8) | 0.513 (4) | 0.842 (7) | 0.774 (9) | 0.527 (12)
RFBNet | - | 0.592 (8) | 0.616 (8) | 0.758 (4) | 0.659 (6) | 0.581 (8) | 0.330 (9) | 0.469 (6) | 0.655 (11) | 0.543 (10) | 0.524 (6) | 0.924 (4) | 0.355 (8) | 0.336 (8) | 0.572 (9) | 0.479 (8) | 0.671 (9) | 0.648 (6) | 0.480 (8) | 0.814 (11) | 0.814 (5) | 0.614 (7)
SSMA | copyleft | 0.577 (10) | 0.695 (4) | 0.716 (11) | 0.439 (13) | 0.563 (9) | 0.314 (10) | 0.444 (9) | 0.719 (6) | 0.551 (8) | 0.503 (8) | 0.887 (11) | 0.346 (10) | 0.348 (7) | 0.603 (5) | 0.353 (13) | 0.709 (4) | 0.600 (11) | 0.457 (11) | 0.901 (2) | 0.786 (7) | 0.599 (8)
Abhinav Valada, Rohit Mohan, Wolfram Burgard: Self-Supervised Model Adaptation for Multimodal Semantic Segmentation. International Journal of Computer Vision, 2019
SN_RN152pyrx8_RVC | copyleft | 0.546 (11) | 0.572 (10) | 0.663 (14) | 0.638 (8) | 0.518 (10) | 0.298 (12) | 0.366 (16) | 0.633 (13) | 0.510 (12) | 0.446 (13) | 0.864 (12) | 0.296 (12) | 0.267 (13) | 0.542 (11) | 0.346 (14) | 0.704 (5) | 0.575 (14) | 0.431 (13) | 0.853 (6) | 0.766 (11) | 0.630 (5)
FuseNet | permissive | 0.535 (12) | 0.570 (11) | 0.681 (13) | 0.182 (16) | 0.512 (11) | 0.290 (14) | 0.431 (10) | 0.659 (10) | 0.504 (13) | 0.495 (10) | 0.903 (10) | 0.308 (11) | 0.428 (3) | 0.523 (13) | 0.365 (12) | 0.676 (7) | 0.621 (10) | 0.470 (10) | 0.762 (14) | 0.779 (8) | 0.541 (10)
Caner Hazirbas, Lingni Ma, Csaba Domokos, Daniel Cremers: FuseNet: Incorporating Depth into Semantic Segmentation via Fusion-based CNN Architecture. ACCV 2016
DCRedNet | - | 0.583 (9) | 0.682 (5) | 0.723 (9) | 0.542 (10) | 0.510 (12) | 0.310 (11) | 0.451 (7) | 0.668 (9) | 0.549 (9) | 0.520 (7) | 0.920 (6) | 0.375 (4) | 0.446 (2) | 0.528 (12) | 0.417 (10) | 0.670 (10) | 0.577 (13) | 0.478 (9) | 0.862 (5) | 0.806 (6) | 0.628 (6)
ILC-PSPNet | - | 0.475 (16) | 0.490 (15) | 0.581 (16) | 0.289 (15) | 0.507 (13) | 0.067 (18) | 0.379 (14) | 0.610 (15) | 0.417 (16) | 0.435 (14) | 0.822 (17) | 0.278 (13) | 0.267 (13) | 0.503 (14) | 0.228 (15) | 0.616 (15) | 0.533 (16) | 0.375 (15) | 0.820 (10) | 0.729 (13) | 0.560 (9)
3DMV (2d proj) | - | 0.498 (14) | 0.481 (16) | 0.612 (15) | 0.579 (9) | 0.456 (14) | 0.343 (5) | 0.384 (13) | 0.623 (14) | 0.525 (11) | 0.381 (15) | 0.845 (15) | 0.254 (14) | 0.264 (15) | 0.557 (10) | 0.182 (16) | 0.581 (16) | 0.598 (12) | 0.429 (14) | 0.760 (15) | 0.661 (17) | 0.446 (16)
Angela Dai, Matthias Niessner: 3DMV: Joint 3D-Multi-View Prediction for 3D Semantic Scene Segmentation. ECCV'18
MSeg1080_RVC | permissive | 0.485 (15) | 0.505 (14) | 0.709 (12) | 0.092 (18) | 0.427 (15) | 0.241 (15) | 0.411 (12) | 0.654 (12) | 0.385 (18) | 0.457 (12) | 0.861 (13) | 0.053 (18) | 0.279 (12) | 0.503 (14) | 0.481 (7) | 0.645 (12) | 0.626 (9) | 0.365 (16) | 0.748 (16) | 0.725 (14) | 0.529 (11)
John Lambert*, Zhuang Liu*, Ozan Sener, James Hays, Vladlen Koltun: MSeg: A Composite Dataset for Multi-domain Semantic Segmentation. CVPR 2020
Enet (reimpl) | - | 0.376 (17) | 0.264 (18) | 0.452 (18) | 0.452 (12) | 0.365 (16) | 0.181 (16) | 0.143 (18) | 0.456 (17) | 0.409 (17) | 0.346 (17) | 0.769 (18) | 0.164 (16) | 0.218 (16) | 0.359 (17) | 0.123 (18) | 0.403 (18) | 0.381 (18) | 0.313 (18) | 0.571 (17) | 0.685 (16) | 0.472 (15)
Re-implementation of Adam Paszke, Abhishek Chaurasia, Sangpil Kim, Eugenio Culurciello: ENet: A Deep Neural Network Architecture for Real-Time Semantic Segmentation.
ScanNet (2d proj) | permissive | 0.330 (18) | 0.293 (17) | 0.521 (17) | 0.657 (7) | 0.361 (17) | 0.161 (17) | 0.250 (17) | 0.004 (18) | 0.440 (15) | 0.183 (18) | 0.836 (16) | 0.125 (17) | 0.060 (18) | 0.319 (18) | 0.132 (17) | 0.417 (17) | 0.412 (17) | 0.344 (17) | 0.541 (18) | 0.427 (18) | 0.109 (18)
Angela Dai, Angel X. Chang, Manolis Savva, Maciej Halber, Thomas Funkhouser, Matthias Nießner: ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes. CVPR'17
AdapNet++ | copyleft | 0.503 (13) | 0.613 (9) | 0.722 (10) | 0.418 (14) | 0.358 (18) | 0.337 (7) | 0.370 (15) | 0.479 (16) | 0.443 (14) | 0.368 (16) | 0.907 (8) | 0.207 (15) | 0.213 (17) | 0.464 (16) | 0.525 (5) | 0.618 (14) | 0.657 (5) | 0.450 (12) | 0.788 (12) | 0.721 (15) | 0.408 (17)
Abhinav Valada, Rohit Mohan, Wolfram Burgard: Self-Supervised Model Adaptation for Multimodal Semantic Segmentation. International Journal of Computer Vision, 2019

This table lists the benchmark results for the 2D semantic instance scenario.




Rows are sorted by the cabinet column; each cell shows the score with the method's rank for that column in parentheses.
Method | Info | avg ap | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | window
UniDet_RVC | - | 0.205 (1) | 0.381 (1) | 0.323 (1) | 0.037 (1) | 0.226 (1) | 0.177 (1) | 0.063 (1) | 0.277 (1) | 0.120 (1) | 0.067 (1) | 0.131 (1) | 0.074 (2) | 0.317 (1) | 0.080 (1) | 0.235 (1) | 0.289 (1) | 0.141 (1) | 0.678 (1) | 0.080 (1)
MaskRCNN_ScanNet | permissive | 0.119 (2) | 0.129 (2) | 0.212 (2) | 0.002 (2) | 0.112 (2) | 0.148 (2) | 0.014 (2) | 0.205 (2) | 0.044 (2) | 0.066 (2) | 0.078 (2) | 0.095 (1) | 0.142 (2) | 0.030 (2) | 0.128 (2) | 0.139 (2) | 0.080 (2) | 0.459 (2) | 0.057 (2)
Re-implementation of Kaiming He, Georgia Gkioxari, Piotr Dollár, Ross Girshick: Mask R-CNN. ICCV'17

This table lists the benchmark results for the scene type classification scenario.




Rows are sorted by the bedroom / hotel column; each cell shows the score with the method's rank for that column in parentheses.
Method | Info | avg recall | apartment | bathroom | bedroom / hotel | bookstore / library | conference room | copy/mail room | hallway | kitchen | laundry room | living room / lounge | misc | office | storage / basement / garage
SE-ResNeXt-SSMA | - | 0.498 (3) | 0.000 (4) | 0.812 (3) | 0.941 (1) | 0.500 (2) | 0.500 (3) | 0.500 (2) | 0.500 (2) | 0.429 (4) | 0.500 (2) | 0.667 (2) | 0.500 (1) | 0.625 (3) | 0.000 (2)
Abhinav Valada, Rohit Mohan, Wolfram Burgard: Self-Supervised Model Adaptation for Multimodal Semantic Segmentation. arXiv
multi-task | permissive | 0.700 (1) | 0.500 (1) | 1.000 (1) | 0.882 (2) | 0.500 (2) | 1.000 (1) | 1.000 (1) | 0.500 (2) | 1.000 (1) | 1.000 (1) | 0.778 (1) | 0.000 (2) | 0.938 (1) | 0.000 (2)
Shengyu Huang, Mikhail Usvyatsov, Konrad Schindler: Indoor Scene Recognition in 3D. IROS 2020
3DASPP-SCE | - | 0.691 (2) | 0.500 (1) | 0.938 (2) | 0.824 (3) | 1.000 (1) | 1.000 (1) | 0.500 (2) | 1.000 (1) | 0.857 (2) | 0.500 (2) | 0.556 (3) | 0.000 (2) | 0.812 (2) | 0.500 (1)
resnet50_scannet | - | 0.353 (4) | 0.250 (3) | 0.812 (3) | 0.529 (4) | 0.500 (2) | 0.500 (3) | 0.000 (4) | 0.500 (2) | 0.571 (3) | 0.000 (4) | 0.556 (3) | 0.000 (2) | 0.375 (4) | 0.000 (2)