The 3D semantic instance prediction task involves detecting and segmenting the objects in a 3D scan mesh.

Evaluation and metrics

Our evaluation ranks all methods according to the average precision for each class. We report the mean average precision (AP) at overlap 0.25 (AP 25%), at overlap 0.5 (AP 50%), and averaged over overlaps from 0.5 to 0.95 in steps of 0.05 (AP). Note that multiple predictions of the same ground truth instance are penalized as false positives.
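As an illustration of the metric described above, the following Python sketch computes AP for one class at a single overlap threshold and then averages it over a list of thresholds. It is a simplified sketch, not the benchmark's official evaluation script: the function names, the greedy best-overlap matching, the all-point area under the precision-recall curve, and the inclusion of the 0.95 endpoint are all assumptions made for illustration.

```python
def average_precision(scores, ious, iou_threshold):
    """AP for one class at one overlap threshold (simplified sketch).

    scores: confidence of each predicted instance.
    ious:   ious[p][g] is the overlap between prediction p and
            ground-truth instance g.
    """
    num_gt = len(ious[0]) if ious else 0
    if num_gt == 0:
        return 0.0
    # Visit predictions from most to least confident.
    order = sorted(range(len(scores)), key=lambda p: -scores[p])
    matched = [False] * num_gt
    hits = []  # 1 for a true positive, 0 for a false positive
    for p in order:
        g = max(range(num_gt), key=lambda k: ious[p][k])
        if ious[p][g] >= iou_threshold and not matched[g]:
            matched[g] = True
            hits.append(1)
        else:
            # Too little overlap, or a duplicate prediction of an
            # instance that is already matched: false positive.
            hits.append(0)
    # Precision and recall after each prediction.
    precisions, recalls, tp = [], [], 0
    for n, hit in enumerate(hits, start=1):
        tp += hit
        precisions.append(tp / n)
        recalls.append(tp / num_gt)
    # Make precision non-increasing (right-to-left running maximum).
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Area under the precision-recall curve.
    ap, prev_recall = 0.0, 0.0
    for prec, rec in zip(precisions, recalls):
        ap += (rec - prev_recall) * prec
        prev_recall = rec
    return ap


def mean_ap_over_thresholds(scores, ious, thresholds):
    """Average the per-threshold AP over a list of overlap thresholds."""
    return sum(average_precision(scores, ious, t) for t in thresholds) / len(thresholds)


# Thresholds 0.50, 0.55, ..., 0.95 (the inclusive endpoint is an assumption).
THRESHOLDS = [round(0.5 + 0.05 * k, 2) for k in range(10)]

# Two ground-truth instances; the second prediction duplicates the first.
scores = [0.9, 0.8, 0.7]
ious = [[0.8, 0.0], [0.6, 0.0], [0.0, 0.55]]
ap50 = average_precision(scores, ious, 0.5)   # about 0.833 instead of 1.0
ap = mean_ap_over_thresholds(scores, ious, THRESHOLDS)
```

In this toy example the duplicate prediction of the first instance counts as a false positive, which lowers AP 50% from 1.0 to about 0.833.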



This table lists the benchmark results for the 3D semantic instance scenario.




Each cell gives the score followed by the method's rank for that column in parentheses.

Method Info | avg AP 50% | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | window
Spherical Mask(CtoF) | 0.812 (1) | 1.000 (1) | 0.973 (3) | 0.852 (11) | 0.718 (4) | 0.917 (4) | 0.574 (4) | 0.677 (24) | 0.748 (9) | 0.729 (7) | 0.715 (5) | 0.795 (1) | 0.809 (1) | 1.000 (1) | 0.831 (2) | 0.854 (7) | 0.787 (7) | 1.000 (1) | 0.638 (3)
OneFormer3D [copyleft] | 0.801 (2) | 1.000 (1) | 0.973 (2) | 0.909 (5) | 0.698 (10) | 0.928 (2) | 0.582 (3) | 0.668 (29) | 0.685 (14) | 0.780 (2) | 0.687 (9) | 0.698 (13) | 0.702 (11) | 1.000 (1) | 0.794 (7) | 0.900 (2) | 0.784 (9) | 0.986 (46) | 0.635 (4)
  Maxim Kolodiazhnyi, Anna Vorontsova, Anton Konushin, Danila Rukhovich: OneFormer3D: One Transformer for Unified Point Cloud Segmentation.
UniPerception | 0.800 (3) | 1.000 (1) | 0.930 (5) | 0.872 (9) | 0.727 (3) | 0.862 (18) | 0.454 (13) | 0.764 (13) | 0.820 (1) | 0.746 (5) | 0.706 (7) | 0.750 (2) | 0.772 (7) | 0.926 (39) | 0.764 (12) | 0.818 (22) | 0.826 (1) | 0.997 (34) | 0.660 (2)
InsSSM | 0.799 (4) | 1.000 (1) | 0.915 (7) | 0.710 (35) | 0.729 (2) | 0.925 (3) | 0.664 (1) | 0.670 (27) | 0.770 (5) | 0.766 (3) | 0.739 (2) | 0.737 (4) | 0.700 (12) | 1.000 (1) | 0.792 (8) | 0.829 (16) | 0.815 (3) | 0.997 (34) | 0.625 (7)
TST3D | 0.795 (5) | 1.000 (1) | 0.929 (6) | 0.918 (4) | 0.709 (7) | 0.884 (13) | 0.596 (2) | 0.704 (21) | 0.769 (6) | 0.734 (6) | 0.644 (15) | 0.699 (12) | 0.751 (9) | 1.000 (1) | 0.794 (6) | 0.876 (4) | 0.757 (16) | 0.997 (34) | 0.550 (26)
ExtMask3D | 0.789 (6) | 1.000 (1) | 0.988 (1) | 0.756 (28) | 0.706 (8) | 0.912 (5) | 0.429 (14) | 0.647 (34) | 0.806 (4) | 0.755 (4) | 0.673 (10) | 0.689 (14) | 0.772 (8) | 1.000 (1) | 0.789 (9) | 0.852 (8) | 0.811 (4) | 1.000 (1) | 0.617 (10)
Queryformer | 0.787 (7) | 1.000 (1) | 0.933 (4) | 0.601 (44) | 0.754 (1) | 0.886 (11) | 0.558 (6) | 0.661 (31) | 0.767 (7) | 0.665 (13) | 0.716 (4) | 0.639 (19) | 0.808 (3) | 1.000 (1) | 0.844 (1) | 0.897 (3) | 0.804 (5) | 1.000 (1) | 0.624 (8)
MAFT | 0.786 (8) | 1.000 (1) | 0.894 (12) | 0.807 (17) | 0.694 (12) | 0.893 (8) | 0.486 (9) | 0.674 (25) | 0.740 (10) | 0.786 (1) | 0.704 (8) | 0.727 (6) | 0.739 (10) | 1.000 (1) | 0.707 (18) | 0.849 (10) | 0.756 (17) | 1.000 (1) | 0.685 (1)
Mask3D | 0.780 (9) | 1.000 (1) | 0.786 (36) | 0.716 (33) | 0.696 (11) | 0.885 (12) | 0.500 (7) | 0.714 (19) | 0.810 (3) | 0.672 (12) | 0.715 (5) | 0.679 (15) | 0.809 (1) | 1.000 (1) | 0.831 (2) | 0.833 (14) | 0.787 (7) | 1.000 (1) | 0.602 (14)
  Jonas Schult, Francis Engelmann, Alexander Hermans, Or Litany, Siyu Tang, Bastian Leibe: Mask3D for 3D Semantic Instance Segmentation. ICRA 2023
SPFormer [permissive] | 0.770 (10) | 0.903 (50) | 0.903 (9) | 0.806 (18) | 0.609 (26) | 0.886 (10) | 0.568 (5) | 0.815 (6) | 0.705 (13) | 0.711 (8) | 0.655 (12) | 0.652 (18) | 0.685 (17) | 1.000 (1) | 0.789 (10) | 0.809 (24) | 0.776 (12) | 1.000 (1) | 0.583 (19)
  Sun Jiahao, Qing Chunmei, Tan Junpeng, Xu Xiangmin: Superpoint Transformer for 3D Scene Instance Segmentation. AAAI 2023 [Oral]
SoftGroup++ | 0.769 (11) | 1.000 (1) | 0.803 (29) | 0.937 (1) | 0.684 (13) | 0.865 (15) | 0.213 (29) | 0.870 (2) | 0.664 (17) | 0.571 (19) | 0.758 (1) | 0.702 (10) | 0.807 (4) | 1.000 (1) | 0.653 (25) | 0.902 (1) | 0.792 (6) | 1.000 (1) | 0.626 (6)
SoftGroup [permissive] | 0.761 (12) | 1.000 (1) | 0.808 (25) | 0.845 (12) | 0.716 (5) | 0.862 (17) | 0.243 (26) | 0.824 (4) | 0.655 (19) | 0.620 (14) | 0.734 (3) | 0.699 (11) | 0.791 (6) | 0.981 (33) | 0.716 (16) | 0.844 (11) | 0.769 (13) | 1.000 (1) | 0.594 (17)
  Thang Vu, Kookhoi Kim, Tung M. Luu, Xuan Thanh Nguyen, Chang D. Yoo: SoftGroup for 3D Instance Segmentation on Point Clouds. CVPR 2022 [Oral]
ISBNet [permissive] | 0.757 (13) | 1.000 (1) | 0.904 (8) | 0.731 (31) | 0.678 (14) | 0.895 (6) | 0.458 (11) | 0.644 (36) | 0.670 (16) | 0.710 (9) | 0.620 (20) | 0.732 (5) | 0.650 (20) | 1.000 (1) | 0.756 (13) | 0.778 (27) | 0.779 (10) | 1.000 (1) | 0.614 (11)
  Tuan Duc Ngo, Binh-Son Hua, Khoi Nguyen: ISBNet: a 3D Point Cloud Instance Segmentation Network with Instance-aware Sampling and Box-aware Dynamic Convolution. CVPR 2023
SIM3D | 0.751 (14) | 1.000 (1) | 0.611 (58) | 0.797 (21) | 0.620 (22) | 0.888 (9) | 0.497 (8) | 0.671 (26) | 0.754 (8) | 0.708 (10) | 0.655 (11) | 0.744 (3) | 0.674 (18) | 1.000 (1) | 0.801 (5) | 0.817 (23) | 0.661 (26) | 1.000 (1) | 0.626 (5)
TD3D [permissive] | 0.751 (14) | 1.000 (1) | 0.774 (37) | 0.867 (10) | 0.621 (21) | 0.934 (1) | 0.404 (15) | 0.706 (20) | 0.812 (2) | 0.605 (17) | 0.633 (18) | 0.626 (20) | 0.690 (16) | 1.000 (1) | 0.640 (27) | 0.820 (19) | 0.777 (11) | 1.000 (1) | 0.612 (12)
  Maksim Kolodiazhnyi, Anna Vorontsova, Anton Konushin, Danila Rukhovich: Top-Down Beats Bottom-Up in 3D Instance Segmentation. WACV 2024
PBNet [permissive] | 0.747 (16) | 1.000 (1) | 0.818 (21) | 0.837 (14) | 0.713 (6) | 0.844 (20) | 0.457 (12) | 0.647 (34) | 0.711 (12) | 0.614 (15) | 0.617 (22) | 0.657 (17) | 0.650 (20) | 1.000 (1) | 0.692 (19) | 0.822 (18) | 0.765 (15) | 1.000 (1) | 0.595 (16)
  W. Zhao, Y. Yan, C. Yang, J. Ye, X. Yang, K. Huang: Divide and Conquer: 3D Instance Segmentation With Point-Wise Binarization. ICCV 2023
GraphCut | 0.732 (17) | 1.000 (1) | 0.788 (34) | 0.724 (32) | 0.642 (19) | 0.859 (19) | 0.248 (25) | 0.787 (11) | 0.618 (22) | 0.596 (18) | 0.653 (14) | 0.722 (8) | 0.583 (41) | 1.000 (1) | 0.766 (11) | 0.861 (5) | 0.825 (2) | 1.000 (1) | 0.504 (32)
IPCA-Inst | 0.731 (18) | 1.000 (1) | 0.788 (35) | 0.884 (8) | 0.698 (9) | 0.788 (36) | 0.252 (24) | 0.760 (14) | 0.646 (20) | 0.511 (27) | 0.637 (17) | 0.665 (16) | 0.804 (5) | 1.000 (1) | 0.644 (26) | 0.778 (28) | 0.747 (19) | 1.000 (1) | 0.561 (23)
TopoSeg | 0.725 (19) | 1.000 (1) | 0.806 (28) | 0.933 (2) | 0.668 (16) | 0.758 (40) | 0.272 (23) | 0.734 (18) | 0.630 (21) | 0.549 (23) | 0.654 (13) | 0.606 (21) | 0.697 (15) | 0.966 (36) | 0.612 (31) | 0.839 (12) | 0.754 (18) | 1.000 (1) | 0.573 (20)
DKNet | 0.718 (20) | 1.000 (1) | 0.814 (22) | 0.782 (22) | 0.619 (23) | 0.872 (14) | 0.224 (27) | 0.751 (16) | 0.569 (26) | 0.677 (11) | 0.585 (26) | 0.724 (7) | 0.633 (31) | 0.981 (33) | 0.515 (41) | 0.819 (20) | 0.736 (20) | 1.000 (1) | 0.617 (9)
  Yizheng Wu, Min Shi, Shuaiyuan Du, Hao Lu, Zhiguo Cao, Weicai Zhong: 3D Instances as 1D Kernels. ECCV 2022
SSEC | 0.707 (21) | 1.000 (1) | 0.850 (14) | 0.924 (3) | 0.648 (17) | 0.747 (43) | 0.162 (31) | 0.862 (3) | 0.572 (25) | 0.520 (25) | 0.624 (19) | 0.549 (24) | 0.649 (29) | 1.000 (1) | 0.560 (36) | 0.706 (43) | 0.768 (14) | 1.000 (1) | 0.591 (18)
HAIS [permissive] | 0.699 (22) | 1.000 (1) | 0.849 (15) | 0.820 (15) | 0.675 (15) | 0.808 (30) | 0.279 (21) | 0.757 (15) | 0.465 (32) | 0.517 (26) | 0.596 (24) | 0.559 (23) | 0.600 (35) | 1.000 (1) | 0.654 (24) | 0.767 (30) | 0.676 (24) | 0.994 (42) | 0.560 (24)
  Shaoyu Chen, Jiemin Fang, Qian Zhang, Wenyu Liu, Xinggang Wang: Hierarchical Aggregation for 3D Instance Segmentation. ICCV 2021
SSTNet [permissive] | 0.698 (23) | 1.000 (1) | 0.697 (53) | 0.888 (7) | 0.556 (33) | 0.803 (31) | 0.387 (16) | 0.626 (38) | 0.417 (37) | 0.556 (22) | 0.585 (27) | 0.702 (9) | 0.600 (35) | 1.000 (1) | 0.824 (4) | 0.720 (42) | 0.692 (22) | 1.000 (1) | 0.509 (31)
  Zhihao Liang, Zhihao Li, Songcen Xu, Mingkui Tan, Kui Jia: Instance Segmentation in 3D Scenes using Semantic Superpoint Tree Networks. ICCV 2021
DualGroup | 0.694 (24) | 1.000 (1) | 0.799 (31) | 0.811 (16) | 0.622 (20) | 0.817 (25) | 0.376 (17) | 0.805 (9) | 0.590 (24) | 0.487 (31) | 0.568 (30) | 0.525 (28) | 0.650 (20) | 0.835 (49) | 0.600 (32) | 0.829 (15) | 0.655 (28) | 1.000 (1) | 0.526 (28)
DANCENET | 0.680 (25) | 1.000 (1) | 0.807 (26) | 0.733 (30) | 0.600 (27) | 0.768 (39) | 0.375 (18) | 0.543 (46) | 0.538 (27) | 0.610 (16) | 0.599 (23) | 0.498 (29) | 0.632 (33) | 0.981 (33) | 0.739 (15) | 0.856 (6) | 0.633 (34) | 0.882 (57) | 0.454 (41)
SphereSeg | 0.680 (25) | 1.000 (1) | 0.856 (13) | 0.744 (29) | 0.618 (24) | 0.893 (7) | 0.151 (32) | 0.651 (33) | 0.713 (11) | 0.537 (24) | 0.579 (29) | 0.430 (38) | 0.651 (19) | 1.000 (1) | 0.389 (52) | 0.744 (37) | 0.697 (21) | 0.991 (44) | 0.601 (15)
Box2Mask | 0.677 (27) | 1.000 (1) | 0.847 (16) | 0.771 (24) | 0.509 (42) | 0.816 (26) | 0.277 (22) | 0.558 (45) | 0.482 (29) | 0.562 (21) | 0.640 (16) | 0.448 (34) | 0.700 (12) | 1.000 (1) | 0.666 (20) | 0.852 (9) | 0.578 (41) | 0.997 (34) | 0.488 (36)
  Julian Chibane, Francis Engelmann, Tuan Anh Tran, Gerard Pons-Moll: Box2Mask: Weakly Supervised 3D Semantic Instance Segmentation Using Bounding Boxes. ECCV 2022
OccuSeg+instance | 0.672 (28) | 1.000 (1) | 0.758 (45) | 0.682 (37) | 0.576 (31) | 0.842 (21) | 0.477 (10) | 0.504 (52) | 0.524 (28) | 0.567 (20) | 0.585 (28) | 0.451 (33) | 0.557 (43) | 1.000 (1) | 0.751 (14) | 0.797 (25) | 0.563 (44) | 1.000 (1) | 0.467 (40)
  Lei Han, Tian Zheng, Lan Xu, Lu Fang: OccuSeg: Occupancy-aware 3D Instance Segmentation. CVPR 2020
Mask-Group | 0.664 (29) | 1.000 (1) | 0.822 (20) | 0.764 (27) | 0.616 (25) | 0.815 (27) | 0.139 (36) | 0.694 (23) | 0.597 (23) | 0.459 (35) | 0.566 (31) | 0.599 (22) | 0.600 (35) | 0.516 (59) | 0.715 (17) | 0.819 (21) | 0.635 (32) | 1.000 (1) | 0.603 (13)
  Min Zhong, Xinghao Chen, Xiaokang Chen, Gang Zeng, Yunhe Wang: MaskGroup: Hierarchical Point Grouping and Masking for 3D Instance Segmentation. ICME 2022
INS-Conv-instance | 0.657 (30) | 1.000 (1) | 0.760 (43) | 0.667 (39) | 0.581 (29) | 0.863 (16) | 0.323 (19) | 0.655 (32) | 0.477 (30) | 0.473 (33) | 0.549 (33) | 0.432 (37) | 0.650 (20) | 1.000 (1) | 0.655 (23) | 0.738 (38) | 0.585 (40) | 0.944 (49) | 0.472 (39)
CSC-Pretrained | 0.648 (31) | 1.000 (1) | 0.810 (23) | 0.768 (25) | 0.523 (40) | 0.813 (28) | 0.143 (35) | 0.819 (5) | 0.389 (40) | 0.422 (44) | 0.511 (37) | 0.443 (35) | 0.650 (20) | 1.000 (1) | 0.624 (29) | 0.732 (39) | 0.634 (33) | 1.000 (1) | 0.375 (48)
PE | 0.645 (32) | 1.000 (1) | 0.773 (39) | 0.798 (20) | 0.538 (35) | 0.786 (37) | 0.088 (44) | 0.799 (10) | 0.350 (44) | 0.435 (42) | 0.547 (34) | 0.545 (25) | 0.646 (30) | 0.933 (38) | 0.562 (35) | 0.761 (33) | 0.556 (49) | 0.997 (34) | 0.501 (34)
  Biao Zhang, Peter Wonka: Point Cloud Instance Segmentation using Probabilistic Embeddings. CVPR 2021
RPGN | 0.643 (33) | 1.000 (1) | 0.758 (44) | 0.582 (50) | 0.539 (34) | 0.826 (24) | 0.046 (49) | 0.765 (12) | 0.372 (42) | 0.436 (41) | 0.588 (25) | 0.539 (27) | 0.650 (20) | 1.000 (1) | 0.577 (33) | 0.750 (35) | 0.653 (30) | 0.997 (34) | 0.495 (35)
  Shichao Dong, Guosheng Lin, Tzu-Yi Hung: Learning Regional Purity for Instance Segmentation on 3D Point Clouds. ECCV 2022
Dyco3D [copyleft] | 0.641 (34) | 1.000 (1) | 0.841 (17) | 0.893 (6) | 0.531 (37) | 0.802 (32) | 0.115 (41) | 0.588 (43) | 0.448 (34) | 0.438 (39) | 0.537 (36) | 0.430 (39) | 0.550 (44) | 0.857 (41) | 0.534 (39) | 0.764 (32) | 0.657 (27) | 0.987 (45) | 0.568 (21)
  Tong He, Chunhua Shen, Anton van den Hengel: DyCo3D: Robust Instance Segmentation of 3D Point Clouds through Dynamic Convolution. CVPR 2021
GICN | 0.638 (35) | 1.000 (1) | 0.895 (11) | 0.800 (19) | 0.480 (46) | 0.676 (48) | 0.144 (34) | 0.737 (17) | 0.354 (43) | 0.447 (36) | 0.400 (50) | 0.365 (45) | 0.700 (12) | 1.000 (1) | 0.569 (34) | 0.836 (13) | 0.599 (36) | 1.000 (1) | 0.473 (38)
PointGroup | 0.636 (36) | 1.000 (1) | 0.765 (40) | 0.624 (41) | 0.505 (44) | 0.797 (33) | 0.116 (40) | 0.696 (22) | 0.384 (41) | 0.441 (37) | 0.559 (32) | 0.476 (31) | 0.596 (38) | 1.000 (1) | 0.666 (20) | 0.756 (34) | 0.556 (48) | 0.997 (34) | 0.513 (30)
  Li Jiang, Hengshuang Zhao, Shaoshuai Shi, Shu Liu, Chi-Wing Fu, Jiaya Jia: PointGroup: Dual-Set Point Grouping for 3D Instance Segmentation. CVPR 2020 [oral]
DD-UNet+Group | 0.635 (37) | 0.667 (52) | 0.797 (33) | 0.714 (34) | 0.562 (32) | 0.774 (38) | 0.146 (33) | 0.810 (8) | 0.429 (36) | 0.476 (32) | 0.546 (35) | 0.399 (41) | 0.633 (31) | 1.000 (1) | 0.632 (28) | 0.722 (41) | 0.609 (35) | 1.000 (1) | 0.514 (29)
  H. Liu, R. Liu, K. Yang, J. Zhang, K. Peng, R. Stiefelhagen: HIDA: Towards Holistic Indoor Understanding for the Visually Impaired via Semantic Instance Segmentation with a Wearable Solid-State LiDAR Sensor. ICCVW 2021
Mask3D_evaluation | 0.631 (38) | 1.000 (1) | 0.829 (19) | 0.606 (43) | 0.646 (18) | 0.836 (22) | 0.068 (45) | 0.511 (50) | 0.462 (33) | 0.507 (28) | 0.619 (21) | 0.389 (43) | 0.610 (34) | 1.000 (1) | 0.432 (47) | 0.828 (17) | 0.673 (25) | 0.788 (61) | 0.552 (25)
DENet | 0.629 (39) | 1.000 (1) | 0.797 (32) | 0.608 (42) | 0.589 (28) | 0.627 (52) | 0.219 (28) | 0.882 (1) | 0.310 (46) | 0.402 (49) | 0.383 (52) | 0.396 (42) | 0.650 (20) | 1.000 (1) | 0.663 (22) | 0.543 (60) | 0.691 (23) | 1.000 (1) | 0.568 (22)
3D-MPA | 0.611 (40) | 1.000 (1) | 0.833 (18) | 0.765 (26) | 0.526 (39) | 0.756 (41) | 0.136 (38) | 0.588 (43) | 0.470 (31) | 0.438 (40) | 0.432 (46) | 0.358 (47) | 0.650 (20) | 0.857 (41) | 0.429 (48) | 0.765 (31) | 0.557 (47) | 1.000 (1) | 0.430 (43)
  Francis Engelmann, Martin Bokeloh, Alireza Fathi, Bastian Leibe, Matthias Nießner: 3D-MPA: Multi Proposal Aggregation for 3D Semantic Instance Segmentation. CVPR 2020
OSIS | 0.605 (41) | 1.000 (1) | 0.801 (30) | 0.599 (45) | 0.535 (36) | 0.728 (45) | 0.286 (20) | 0.436 (56) | 0.679 (15) | 0.491 (29) | 0.433 (44) | 0.256 (49) | 0.404 (56) | 0.857 (41) | 0.620 (30) | 0.724 (40) | 0.510 (54) | 1.000 (1) | 0.539 (27)
AOIA | 0.601 (42) | 1.000 (1) | 0.761 (42) | 0.687 (36) | 0.485 (45) | 0.828 (23) | 0.008 (56) | 0.663 (30) | 0.405 (39) | 0.405 (48) | 0.425 (47) | 0.490 (30) | 0.596 (38) | 0.714 (52) | 0.553 (38) | 0.779 (26) | 0.597 (37) | 0.992 (43) | 0.424 (45)
PCJC | 0.578 (43) | 1.000 (1) | 0.810 (24) | 0.583 (49) | 0.449 (49) | 0.813 (29) | 0.042 (50) | 0.603 (41) | 0.341 (45) | 0.490 (30) | 0.465 (41) | 0.410 (40) | 0.650 (20) | 0.835 (49) | 0.264 (58) | 0.694 (47) | 0.561 (45) | 0.889 (54) | 0.504 (33)
SSEN | 0.575 (44) | 1.000 (1) | 0.761 (41) | 0.473 (52) | 0.477 (47) | 0.795 (34) | 0.066 (46) | 0.529 (48) | 0.658 (18) | 0.460 (34) | 0.461 (42) | 0.380 (44) | 0.331 (58) | 0.859 (40) | 0.401 (51) | 0.692 (49) | 0.653 (29) | 1.000 (1) | 0.348 (50)
  Dongsu Zhang, Junha Chun, Sang Kyun Cha, Young Min Kim: Spatial Semantic Embedding Network: Fast 3D Instance Segmentation with Deep Metric Learning. arXiv
RWSeg | 0.567 (45) | 0.528 (62) | 0.708 (52) | 0.626 (40) | 0.580 (30) | 0.745 (44) | 0.063 (47) | 0.627 (37) | 0.240 (50) | 0.400 (50) | 0.497 (38) | 0.464 (32) | 0.515 (45) | 1.000 (1) | 0.475 (43) | 0.745 (36) | 0.571 (42) | 1.000 (1) | 0.429 (44)
NeuralBF | 0.555 (46) | 0.667 (52) | 0.896 (10) | 0.843 (13) | 0.517 (41) | 0.751 (42) | 0.029 (51) | 0.519 (49) | 0.414 (38) | 0.439 (38) | 0.465 (40) | 0.000 (68) | 0.484 (47) | 0.857 (41) | 0.287 (56) | 0.693 (48) | 0.651 (31) | 1.000 (1) | 0.485 (37)
  Weiwei Sun, Daniel Rebain, Renjie Liao, Vladimir Tankovich, Soroosh Yazdani, Kwang Moo Yi, Andrea Tagliasacchi: NeuralBF: Neural Bilateral Filtering for Top-down Instance Segmentation on Point Clouds. WACV 2023
MTML | 0.549 (47) | 1.000 (1) | 0.807 (27) | 0.588 (48) | 0.327 (54) | 0.647 (50) | 0.004 (58) | 0.815 (7) | 0.180 (53) | 0.418 (45) | 0.364 (54) | 0.182 (52) | 0.445 (50) | 1.000 (1) | 0.442 (46) | 0.688 (50) | 0.571 (43) | 1.000 (1) | 0.396 (46)
  Jean Lahoud, Bernard Ghanem, Marc Pollefeys, Martin R. Oswald: 3D Instance Segmentation via Multi-task Metric Learning. ICCV 2019 [oral]
ClickSeg_Instance | 0.539 (48) | 1.000 (1) | 0.621 (56) | 0.300 (55) | 0.530 (38) | 0.698 (46) | 0.127 (39) | 0.533 (47) | 0.222 (51) | 0.430 (43) | 0.400 (49) | 0.365 (45) | 0.574 (42) | 0.938 (37) | 0.472 (44) | 0.659 (52) | 0.543 (50) | 0.944 (49) | 0.347 (51)
One_Thing_One_Click [permissive] | 0.529 (49) | 0.667 (52) | 0.718 (48) | 0.777 (23) | 0.399 (50) | 0.683 (47) | 0.000 (61) | 0.669 (28) | 0.138 (56) | 0.391 (51) | 0.374 (53) | 0.539 (26) | 0.360 (57) | 0.641 (56) | 0.556 (37) | 0.774 (29) | 0.593 (38) | 0.997 (34) | 0.251 (56)
  Zhengzhe Liu, Xiaojuan Qi, Chi-Wing Fu: One Thing One Click: A Self-Training Approach for Weakly Supervised 3D Semantic Segmentation. CVPR 2021
Sparse R-CNN | 0.515 (50) | 1.000 (1) | 0.538 (62) | 0.282 (56) | 0.468 (48) | 0.790 (35) | 0.173 (30) | 0.345 (58) | 0.429 (35) | 0.413 (47) | 0.484 (39) | 0.176 (53) | 0.595 (40) | 0.591 (57) | 0.522 (40) | 0.668 (51) | 0.476 (55) | 0.986 (47) | 0.327 (52)
Occipital-SCS | 0.512 (51) | 1.000 (1) | 0.716 (49) | 0.509 (51) | 0.506 (43) | 0.611 (53) | 0.092 (43) | 0.602 (42) | 0.177 (54) | 0.346 (54) | 0.383 (51) | 0.165 (54) | 0.442 (51) | 0.850 (48) | 0.386 (53) | 0.618 (56) | 0.543 (51) | 0.889 (54) | 0.389 (47)
3D-BoNet | 0.488 (52) | 1.000 (1) | 0.672 (55) | 0.590 (47) | 0.301 (56) | 0.484 (63) | 0.098 (42) | 0.620 (39) | 0.306 (47) | 0.341 (55) | 0.259 (58) | 0.125 (56) | 0.434 (53) | 0.796 (51) | 0.402 (50) | 0.499 (62) | 0.513 (53) | 0.909 (53) | 0.439 (42)
  Bo Yang, Jianan Wang, Ronald Clark, Qingyong Hu, Sen Wang, Andrew Markham, Niki Trigoni: Learning Object Bounding Boxes for 3D Instance Segmentation on Point Clouds. NeurIPS 2019 Spotlight
PanopticFusion-inst | 0.478 (53) | 0.667 (52) | 0.712 (51) | 0.595 (46) | 0.259 (59) | 0.550 (59) | 0.000 (61) | 0.613 (40) | 0.175 (55) | 0.250 (60) | 0.434 (43) | 0.437 (36) | 0.411 (55) | 0.857 (41) | 0.485 (42) | 0.591 (59) | 0.267 (65) | 0.944 (49) | 0.359 (49)
  Gaku Narita, Takashi Seno, Tomoya Ishikawa, Yohsuke Kaji: PanopticFusion: Online Volumetric Semantic Mapping at the Level of Stuff and Things. IROS 2019 (to appear)
SPG_WSIS | 0.470 (54) | 0.667 (52) | 0.685 (54) | 0.677 (38) | 0.372 (52) | 0.562 (57) | 0.000 (61) | 0.482 (53) | 0.244 (49) | 0.316 (57) | 0.298 (55) | 0.052 (63) | 0.442 (52) | 0.857 (41) | 0.267 (57) | 0.702 (44) | 0.559 (46) | 1.000 (1) | 0.287 (54)
SALoss-ResNet | 0.459 (55) | 1.000 (1) | 0.737 (47) | 0.159 (66) | 0.259 (58) | 0.587 (55) | 0.138 (37) | 0.475 (54) | 0.217 (52) | 0.416 (46) | 0.408 (48) | 0.128 (55) | 0.315 (59) | 0.714 (52) | 0.411 (49) | 0.536 (61) | 0.590 (39) | 0.873 (58) | 0.304 (53)
  Zhidong Liang, Ming Yang, Hao Li, Chunxiang Wang: 3D Instance Embedding Learning With a Structure-Aware Loss Function for Point Cloud Segmentation. IEEE Robotics and Automation Letters (IROS2020)
MASC [permissive] | 0.447 (56) | 0.528 (62) | 0.555 (60) | 0.381 (53) | 0.382 (51) | 0.633 (51) | 0.002 (59) | 0.509 (51) | 0.260 (48) | 0.361 (53) | 0.432 (45) | 0.327 (48) | 0.451 (49) | 0.571 (58) | 0.367 (54) | 0.639 (54) | 0.386 (56) | 0.980 (48) | 0.276 (55)
  Chen Liu, Yasutaka Furukawa: MASC: Multi-scale Affinity with Sparse Convolution for 3D Instance Segmentation.
SegGroup_ins [permissive] | 0.445 (57) | 0.667 (52) | 0.773 (38) | 0.185 (63) | 0.317 (55) | 0.656 (49) | 0.000 (61) | 0.407 (57) | 0.134 (57) | 0.381 (52) | 0.267 (57) | 0.217 (51) | 0.476 (48) | 0.714 (52) | 0.452 (45) | 0.629 (55) | 0.514 (52) | 1.000 (1) | 0.222 (59)
  An Tao, Yueqi Duan, Yi Wei, Jiwen Lu, Jie Zhou: SegGroup: Seg-Level Supervision for 3D Instance and Semantic Segmentation. TIP 2022
3D-SIS [permissive] | 0.382 (58) | 1.000 (1) | 0.432 (65) | 0.245 (58) | 0.190 (60) | 0.577 (56) | 0.013 (55) | 0.263 (60) | 0.033 (63) | 0.320 (56) | 0.240 (59) | 0.075 (59) | 0.422 (54) | 0.857 (41) | 0.117 (63) | 0.699 (45) | 0.271 (64) | 0.883 (56) | 0.235 (58)
  Ji Hou, Angela Dai, Matthias Niessner: 3D-SIS: 3D Semantic Instance Segmentation of RGB-D Scans. CVPR 2019
Hier3D [copyleft] | 0.323 (59) | 0.667 (52) | 0.542 (61) | 0.264 (57) | 0.157 (63) | 0.550 (58) | 0.000 (61) | 0.205 (63) | 0.009 (65) | 0.270 (59) | 0.218 (60) | 0.075 (59) | 0.500 (46) | 0.688 (55) | 0.007 (69) | 0.698 (46) | 0.301 (61) | 0.459 (66) | 0.200 (60)
  Tan: HCFS3D: Hierarchical Coupled Feature Selection Network for 3D Semantic and Instance Segmentation.
UNet-backbone | 0.319 (60) | 0.667 (52) | 0.715 (50) | 0.233 (59) | 0.189 (61) | 0.479 (64) | 0.008 (56) | 0.218 (61) | 0.067 (62) | 0.201 (62) | 0.173 (61) | 0.107 (57) | 0.123 (64) | 0.438 (60) | 0.150 (60) | 0.615 (57) | 0.355 (57) | 0.916 (52) | 0.093 (68)
R-PointNet | 0.306 (61) | 0.500 (64) | 0.405 (66) | 0.311 (54) | 0.348 (53) | 0.589 (54) | 0.054 (48) | 0.068 (66) | 0.126 (58) | 0.283 (58) | 0.290 (56) | 0.028 (64) | 0.219 (62) | 0.214 (63) | 0.331 (55) | 0.396 (66) | 0.275 (62) | 0.821 (60) | 0.245 (57)
Region-18class | 0.284 (62) | 0.250 (68) | 0.751 (46) | 0.228 (61) | 0.270 (57) | 0.521 (60) | 0.000 (61) | 0.468 (55) | 0.008 (67) | 0.205 (61) | 0.127 (62) | 0.000 (68) | 0.068 (66) | 0.070 (67) | 0.262 (59) | 0.652 (53) | 0.323 (59) | 0.740 (62) | 0.173 (61)
SemRegionNet-20cls | 0.250 (63) | 0.333 (65) | 0.613 (57) | 0.229 (60) | 0.163 (62) | 0.493 (61) | 0.000 (61) | 0.304 (59) | 0.107 (59) | 0.147 (65) | 0.100 (64) | 0.052 (62) | 0.231 (60) | 0.119 (65) | 0.039 (65) | 0.445 (64) | 0.325 (58) | 0.654 (63) | 0.141 (64)
tmp | 0.248 (64) | 0.667 (52) | 0.437 (64) | 0.188 (62) | 0.153 (64) | 0.491 (62) | 0.000 (61) | 0.208 (62) | 0.094 (61) | 0.153 (64) | 0.099 (65) | 0.057 (61) | 0.217 (63) | 0.119 (65) | 0.039 (65) | 0.466 (63) | 0.302 (60) | 0.640 (64) | 0.140 (65)
3D-BEVIS | 0.248 (64) | 0.667 (52) | 0.566 (59) | 0.076 (67) | 0.035 (69) | 0.394 (67) | 0.027 (53) | 0.035 (68) | 0.098 (60) | 0.099 (67) | 0.030 (68) | 0.025 (65) | 0.098 (65) | 0.375 (62) | 0.126 (62) | 0.604 (58) | 0.181 (67) | 0.854 (59) | 0.171 (62)
  Cathrin Elich, Francis Engelmann, Jonas Schult, Theodora Kontogianni, Bastian Leibe: 3D-BEVIS: Birds-Eye-View Instance Segmentation.
Sem_Recon_ins | 0.227 (66) | 0.764 (51) | 0.486 (63) | 0.069 (68) | 0.098 (66) | 0.426 (66) | 0.017 (54) | 0.067 (67) | 0.015 (64) | 0.172 (63) | 0.100 (63) | 0.096 (58) | 0.054 (68) | 0.183 (64) | 0.135 (61) | 0.366 (67) | 0.260 (66) | 0.614 (65) | 0.168 (63)
ASIS | 0.199 (67) | 0.333 (65) | 0.253 (68) | 0.167 (65) | 0.140 (65) | 0.438 (65) | 0.000 (61) | 0.177 (64) | 0.008 (66) | 0.121 (66) | 0.069 (66) | 0.004 (67) | 0.231 (61) | 0.429 (61) | 0.036 (67) | 0.445 (65) | 0.273 (63) | 0.333 (68) | 0.119 (67)
Sgpn_scannet | 0.143 (68) | 0.208 (69) | 0.390 (67) | 0.169 (64) | 0.065 (67) | 0.275 (68) | 0.029 (52) | 0.069 (65) | 0.000 (68) | 0.087 (68) | 0.043 (67) | 0.014 (66) | 0.027 (69) | 0.000 (68) | 0.112 (64) | 0.351 (68) | 0.168 (68) | 0.438 (67) | 0.138 (66)
MaskRCNN 2d->3d Proj | 0.058 (69) | 0.333 (65) | 0.002 (69) | 0.000 (69) | 0.053 (68) | 0.002 (69) | 0.002 (60) | 0.021 (69) | 0.000 (68) | 0.045 (69) | 0.024 (69) | 0.238 (50) | 0.065 (67) | 0.000 (68) | 0.014 (68) | 0.107 (69) | 0.020 (69) | 0.110 (69) | 0.006 (69)