The 3D semantic instance prediction task involves detecting and segmenting the objects in a 3D scan mesh.

Evaluation and metrics

Our evaluation ranks all methods according to the average precision for each class. We report the mean average precision (AP) at an overlap of 0.25 (AP 25%), at an overlap of 0.5 (AP 50%), and averaged over overlaps in the range [0.5:0.95:0.05] (AP). Note that multiple predictions of the same ground truth instance are penalized as false positives.
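The metric described above can be illustrated with a minimal sketch. It is not the benchmark's evaluation script. The following are all assumptions made for illustration: the `Prediction` tuple layout, the function names, the strict `>` overlap test, the simple area-under-the-precision-recall-curve integration, and the inclusion of the 0.95 endpoint in the overlap range. Each prediction is assumed to carry the index of its best-overlapping ground truth instance; a second prediction that hits an already matched instance is counted as a false positive.

```python
from typing import List, Tuple

# Hypothetical layout: (confidence, index of best-overlapping ground truth
# instance or -1 if none, overlap (IoU) with that instance).
Prediction = Tuple[float, int, float]


def average_precision(preds: List[Prediction], num_gt: int,
                      overlap_threshold: float) -> float:
    """Area under the interpolated precision-recall curve for one class."""
    if num_gt == 0:
        return 0.0
    matched = set()
    hits = []
    # Visit predictions from most to least confident.
    for _conf, gt_idx, overlap in sorted(preds, key=lambda p: p[0], reverse=True):
        is_hit = (gt_idx >= 0 and overlap > overlap_threshold
                  and gt_idx not in matched)
        if is_hit:
            matched.add(gt_idx)
        # Misses and duplicate detections both count as false positives.
        hits.append(is_hit)

    precisions, recalls = [], []
    true_positives = 0
    for rank, is_hit in enumerate(hits, start=1):
        true_positives += is_hit
        precisions.append(true_positives / rank)
        recalls.append(true_positives / num_gt)

    # Make precision non-increasing in recall (standard interpolation).
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    ap, prev_recall = 0.0, 0.0
    for precision, recall in zip(precisions, recalls):
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap


def mean_over_overlaps(preds: List[Prediction], num_gt: int,
                       thresholds: List[float]) -> float:
    """Average the AP over a list of overlap thresholds."""
    return sum(average_precision(preds, num_gt, t) for t in thresholds) / len(thresholds)


# Assumed reading of [0.5:0.95:0.05]: 0.50, 0.55, ..., 0.95 inclusive.
OVERLAPS = [round(0.5 + 0.05 * i, 2) for i in range(10)]

# Two ground truth instances; the second prediction duplicates instance 0.
with_duplicate = [(0.9, 0, 0.8), (0.8, 0, 0.7), (0.7, 1, 0.6)]
without_duplicate = [(0.9, 0, 0.8), (0.7, 1, 0.6)]

ap25 = average_precision(with_duplicate, num_gt=2, overlap_threshold=0.25)
ap50 = average_precision(with_duplicate, num_gt=2, overlap_threshold=0.5)
ap50_clean = average_precision(without_duplicate, num_gt=2, overlap_threshold=0.5)
ap_range = mean_over_overlaps(with_duplicate, num_gt=2, thresholds=OVERLAPS)
print(ap25, ap50, ap50_clean, ap_range)
```

In this toy input the duplicate prediction lowers AP 50% relative to the duplicate-free list, which is the penalty the note above refers to.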



This table lists the benchmark results for the 3D semantic instance scenario.




Each cell gives the AP 25% score followed by the method's rank for that column in parentheses. Rows are sorted by the table column.

Method [Info] | avg AP 25% | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | window
UniPerception | 0.884 (6) | 1.000 (1) | 0.979 (20) | 0.872 (16) | 0.869 (3) | 0.892 (28) | 0.806 (12) | 0.890 (6) | 0.835 (29) | 0.892 (9) | 0.755 (14) | 0.811 (2) | 0.779 (10) | 0.955 (48) | 0.951 (8) | 0.876 (22) | 0.914 (1) | 0.997 (40) | 0.840 (6)
DCD | 0.885 (5) | 1.000 (1) | 0.933 (40) | 0.856 (22) | 0.832 (15) | 0.959 (7) | 0.930 (2) | 0.858 (10) | 0.802 (37) | 0.859 (18) | 0.767 (9) | 0.796 (9) | 0.709 (20) | 1.000 (1) | 0.971 (7) | 0.871 (28) | 0.904 (2) | 1.000 (1) | 0.874 (1)
InsSSM | 0.883 (7) | 1.000 (1) | 0.996 (6) | 0.800 (39) | 0.865 (4) | 0.960 (6) | 0.808 (11) | 0.852 (16) | 0.940 (6) | 0.899 (8) | 0.785 (4) | 0.810 (3) | 0.700 (22) | 1.000 (1) | 0.912 (20) | 0.851 (43) | 0.895 (3) | 0.997 (40) | 0.827 (9)
  Lei Yao, Yi Wang, Moyun Liu, Lap-Pui Chau: SGIFormer: Semantic-guided and Geometric-enhanced Interleaving Transformer for 3D Instance Segmentation. TCSVT, 2024
PointRel | 0.901 (1) | 1.000 (1) | 0.978 (23) | 0.928 (3) | 0.879 (1) | 0.962 (5) | 0.882 (5) | 0.749 (37) | 0.947 (3) | 0.912 (2) | 0.802 (3) | 0.753 (18) | 0.820 (2) | 1.000 (1) | 0.984 (4) | 0.919 (5) | 0.894 (4) | 1.000 (1) | 0.815 (15)
  Relation3D (PointRel): Enhancing Relation Modeling for Point Cloud Instance Segmentation.
Queryformer | 0.874 (15) | 1.000 (1) | 0.978 (22) | 0.809 (37) | 0.876 (2) | 0.936 (18) | 0.702 (23) | 0.716 (42) | 0.920 (11) | 0.875 (16) | 0.766 (10) | 0.772 (14) | 0.818 (6) | 1.000 (1) | 0.995 (1) | 0.916 (6) | 0.892 (5) | 1.000 (1) | 0.767 (24)
SoftGroup++ | 0.874 (15) | 1.000 (1) | 0.972 (26) | 0.947 (1) | 0.839 (14) | 0.898 (27) | 0.556 (41) | 0.913 (2) | 0.881 (21) | 0.756 (25) | 0.828 (2) | 0.748 (20) | 0.821 (1) | 1.000 (1) | 0.937 (14) | 0.937 (1) | 0.887 (6) | 1.000 (1) | 0.821 (12)
Competitor-SPFormer | 0.881 (9) | 1.000 (1) | 1.000 (1) | 0.845 (24) | 0.854 (7) | 0.962 (4) | 0.714 (22) | 0.857 (12) | 0.904 (14) | 0.902 (6) | 0.782 (7) | 0.789 (12) | 0.662 (28) | 1.000 (1) | 0.988 (3) | 0.874 (25) | 0.886 (7) | 0.997 (40) | 0.847 (3)
OneFormer3D [copyleft] | 0.896 (2) | 1.000 (1) | 1.000 (1) | 0.913 (6) | 0.858 (6) | 0.951 (10) | 0.786 (15) | 0.837 (19) | 0.916 (12) | 0.908 (4) | 0.778 (8) | 0.803 (6) | 0.750 (14) | 1.000 (1) | 0.976 (6) | 0.926 (4) | 0.882 (8) | 0.995 (48) | 0.849 (2)
  Maxim Kolodiazhnyi, Anna Vorontsova, Anton Konushin, Danila Rukhovich: OneFormer3D: One Transformer for Unified Point Cloud Segmentation.
GraphCut | 0.832 (26) | 1.000 (1) | 0.922 (49) | 0.724 (58) | 0.798 (25) | 0.902 (26) | 0.701 (24) | 0.856 (14) | 0.859 (24) | 0.715 (30) | 0.706 (26) | 0.748 (19) | 0.640 (42) | 1.000 (1) | 0.934 (16) | 0.862 (37) | 0.880 (9) | 1.000 (1) | 0.729 (30)
ExtMask3D | 0.867 (18) | 1.000 (1) | 1.000 (1) | 0.756 (54) | 0.816 (22) | 0.940 (15) | 0.795 (13) | 0.760 (35) | 0.862 (23) | 0.888 (14) | 0.739 (19) | 0.763 (16) | 0.774 (11) | 1.000 (1) | 0.929 (18) | 0.878 (21) | 0.879 (10) | 1.000 (1) | 0.819 (14)
SSEC | 0.820 (29) | 1.000 (1) | 0.983 (18) | 0.924 (4) | 0.826 (17) | 0.817 (52) | 0.415 (56) | 0.899 (5) | 0.793 (39) | 0.673 (38) | 0.731 (22) | 0.636 (33) | 0.653 (29) | 1.000 (1) | 0.939 (12) | 0.804 (60) | 0.878 (11) | 1.000 (1) | 0.780 (20)
Mask3D | 0.870 (17) | 1.000 (1) | 0.985 (17) | 0.782 (47) | 0.818 (20) | 0.938 (17) | 0.760 (17) | 0.749 (37) | 0.923 (9) | 0.877 (15) | 0.760 (11) | 0.785 (13) | 0.820 (2) | 1.000 (1) | 0.912 (20) | 0.864 (36) | 0.878 (12) | 0.983 (54) | 0.825 (10)
  Jonas Schult, Francis Engelmann, Alexander Hermans, Or Litany, Siyu Tang, Bastian Leibe: Mask3D for 3D Semantic Instance Segmentation. ICRA 2023
Spherical Mask(CtoF) | 0.875 (13) | 1.000 (1) | 0.991 (15) | 0.873 (14) | 0.850 (9) | 0.946 (13) | 0.691 (26) | 0.752 (36) | 0.926 (7) | 0.889 (12) | 0.759 (12) | 0.794 (10) | 0.820 (2) | 1.000 (1) | 0.912 (20) | 0.900 (9) | 0.878 (12) | 1.000 (1) | 0.769 (23)
SoftGroup [permissive] | 0.865 (19) | 1.000 (1) | 0.969 (27) | 0.860 (20) | 0.860 (5) | 0.913 (23) | 0.558 (38) | 0.899 (4) | 0.911 (13) | 0.760 (24) | 0.828 (1) | 0.736 (22) | 0.802 (8) | 0.981 (45) | 0.919 (19) | 0.875 (23) | 0.877 (14) | 1.000 (1) | 0.820 (13)
  Thang Vu, Kookhoi Kim, Tung M. Luu, Xuan Thanh Nguyen, Chang D. Yoo: SoftGroup for 3D Instance Segmentation on Point Clouds. CVPR 2022 [Oral]
MG-Former | 0.887 (4) | 1.000 (1) | 0.991 (14) | 0.837 (26) | 0.801 (24) | 0.935 (19) | 0.887 (4) | 0.857 (11) | 0.946 (4) | 0.891 (10) | 0.748 (18) | 0.805 (5) | 0.739 (16) | 1.000 (1) | 0.993 (2) | 0.809 (58) | 0.876 (15) | 1.000 (1) | 0.842 (4)
TD3D [permissive] | 0.875 (13) | 1.000 (1) | 0.976 (24) | 0.877 (12) | 0.783 (30) | 0.970 (2) | 0.889 (3) | 0.828 (20) | 0.945 (5) | 0.803 (23) | 0.713 (25) | 0.720 (25) | 0.709 (19) | 1.000 (1) | 0.936 (15) | 0.934 (3) | 0.873 (16) | 1.000 (1) | 0.791 (19)
  Maksim Kolodiazhnyi, Anna Vorontsova, Anton Konushin, Danila Rukhovich: Top-Down Beats Bottom-Up in 3D Instance Segmentation. WACV 2024
SPFormer [permissive] | 0.851 (21) | 1.000 (1) | 0.994 (10) | 0.806 (38) | 0.774 (32) | 0.942 (14) | 0.637 (30) | 0.849 (17) | 0.859 (25) | 0.889 (11) | 0.720 (24) | 0.730 (23) | 0.665 (27) | 1.000 (1) | 0.911 (27) | 0.868 (33) | 0.873 (17) | 1.000 (1) | 0.796 (17)
  Sun Jiahao, Qing Chunmei, Tan Junpeng, Xu Xiangmin: Superpoint Transformer for 3D Scene Instance Segmentation. AAAI 2023 [Oral]
Competitor-MAFT | 0.896 (2) | 1.000 (1) | 1.000 (1) | 0.872 (16) | 0.847 (11) | 0.967 (3) | 0.955 (1) | 0.778 (33) | 0.901 (15) | 0.919 (1) | 0.784 (5) | 0.812 (1) | 0.770 (12) | 1.000 (1) | 0.949 (9) | 0.865 (34) | 0.868 (18) | 1.000 (1) | 0.840 (5)
IPCA-Inst | 0.851 (21) | 1.000 (1) | 0.968 (28) | 0.884 (11) | 0.842 (13) | 0.862 (40) | 0.693 (25) | 0.812 (25) | 0.888 (20) | 0.677 (37) | 0.783 (6) | 0.698 (26) | 0.807 (7) | 1.000 (1) | 0.911 (27) | 0.865 (35) | 0.865 (19) | 1.000 (1) | 0.757 (27)
EV3D | 0.877 (12) | 1.000 (1) | 0.996 (8) | 0.873 (14) | 0.854 (8) | 0.950 (11) | 0.691 (26) | 0.783 (31) | 0.926 (7) | 0.889 (13) | 0.754 (15) | 0.794 (11) | 0.820 (2) | 1.000 (1) | 0.912 (20) | 0.900 (9) | 0.860 (20) | 1.000 (1) | 0.779 (21)
SphereSeg | 0.835 (24) | 1.000 (1) | 0.963 (31) | 0.891 (9) | 0.794 (27) | 0.954 (8) | 0.822 (8) | 0.710 (43) | 0.961 (2) | 0.721 (29) | 0.693 (31) | 0.530 (48) | 0.653 (30) | 1.000 (1) | 0.867 (39) | 0.857 (39) | 0.859 (21) | 0.991 (51) | 0.771 (22)
SIM3D | 0.878 (11) | 1.000 (1) | 0.972 (25) | 0.863 (19) | 0.817 (21) | 0.952 (9) | 0.821 (9) | 0.783 (30) | 0.890 (18) | 0.902 (7) | 0.735 (21) | 0.797 (7) | 0.799 (9) | 1.000 (1) | 0.931 (17) | 0.893 (13) | 0.853 (22) | 1.000 (1) | 0.792 (18)
TopoSeg | 0.832 (26) | 1.000 (1) | 0.981 (19) | 0.933 (2) | 0.819 (19) | 0.826 (49) | 0.524 (47) | 0.841 (18) | 0.811 (34) | 0.681 (36) | 0.759 (13) | 0.687 (27) | 0.727 (17) | 0.981 (45) | 0.911 (27) | 0.883 (17) | 0.853 (23) | 1.000 (1) | 0.756 (28)
KmaxOneFormerNet [permissive] | 0.883 (7) | 1.000 (1) | 1.000 (1) | 0.798 (40) | 0.848 (10) | 0.971 (1) | 0.853 (6) | 0.903 (3) | 0.827 (32) | 0.910 (3) | 0.748 (17) | 0.809 (4) | 0.724 (18) | 1.000 (1) | 0.980 (5) | 0.855 (40) | 0.844 (24) | 1.000 (1) | 0.832 (7)
MAFT | 0.860 (20) | 1.000 (1) | 0.990 (16) | 0.810 (36) | 0.829 (16) | 0.949 (12) | 0.809 (10) | 0.688 (48) | 0.836 (28) | 0.904 (5) | 0.751 (16) | 0.796 (8) | 0.741 (15) | 1.000 (1) | 0.864 (40) | 0.848 (45) | 0.837 (25) | 1.000 (1) | 0.828 (8)
ISBNet [permissive] | 0.835 (24) | 1.000 (1) | 0.950 (34) | 0.731 (56) | 0.819 (18) | 0.918 (21) | 0.790 (14) | 0.740 (39) | 0.851 (27) | 0.831 (19) | 0.661 (33) | 0.742 (21) | 0.650 (31) | 1.000 (1) | 0.937 (13) | 0.814 (57) | 0.836 (26) | 1.000 (1) | 0.765 (25)
  Tuan Duc Ngo, Binh-Son Hua, Khoi Nguyen: ISBNet: a 3D Point Cloud Instance Segmentation Network with Instance-aware Sampling and Box-aware Dynamic Convolution. CVPR 2023
TST3D | 0.879 (10) | 1.000 (1) | 0.994 (9) | 0.921 (5) | 0.807 (23) | 0.939 (16) | 0.771 (16) | 0.887 (7) | 0.923 (10) | 0.862 (17) | 0.722 (23) | 0.768 (15) | 0.756 (13) | 1.000 (1) | 0.910 (30) | 0.904 (7) | 0.836 (27) | 0.999 (39) | 0.824 (11)
  Duc Tran Dang Trung, Byeongkeun Kang, Yeejin Lee: MSTA3D: Multi-scale Twin-attention for 3D Instance Segmentation. ACM Multimedia 2024
Mask3D_evaluation | 0.843 (23) | 1.000 (1) | 0.955 (33) | 0.847 (23) | 0.795 (26) | 0.932 (20) | 0.750 (19) | 0.780 (32) | 0.891 (17) | 0.818 (20) | 0.737 (20) | 0.633 (35) | 0.703 (21) | 1.000 (1) | 0.902 (32) | 0.870 (29) | 0.820 (28) | 0.941 (62) | 0.805 (16)
PBNet [permissive] | 0.825 (28) | 1.000 (1) | 0.963 (30) | 0.837 (28) | 0.843 (12) | 0.865 (35) | 0.822 (7) | 0.647 (51) | 0.878 (22) | 0.733 (27) | 0.639 (40) | 0.683 (28) | 0.650 (31) | 1.000 (1) | 0.853 (41) | 0.870 (30) | 0.820 (29) | 1.000 (1) | 0.744 (29)
  Weiguang Zhao, Yuyao Yan, Chaolong Yang, Jianan Ye, Xi Yang, Kaizhu Huang: Divide and Conquer: 3D Instance Segmentation With Point-Wise Binarization. ICCV 2023
DENet | 0.786 (38) | 1.000 (1) | 0.929 (42) | 0.736 (55) | 0.750 (43) | 0.720 (65) | 0.755 (18) | 0.934 (1) | 0.794 (38) | 0.590 (52) | 0.561 (53) | 0.537 (46) | 0.650 (31) | 1.000 (1) | 0.882 (33) | 0.804 (61) | 0.789 (30) | 1.000 (1) | 0.719 (33)
RPGN | 0.806 (31) | 1.000 (1) | 0.992 (12) | 0.789 (42) | 0.723 (49) | 0.891 (29) | 0.650 (29) | 0.810 (26) | 0.832 (30) | 0.665 (40) | 0.699 (29) | 0.658 (29) | 0.700 (22) | 1.000 (1) | 0.881 (34) | 0.832 (49) | 0.774 (31) | 0.997 (40) | 0.613 (50)
  Shichao Dong, Guosheng Lin, Tzu-Yi Hung: Learning Regional Purity for Instance Segmentation on 3D Point Clouds. ECCV 2022
HAIS [permissive] | 0.803 (32) | 1.000 (1) | 0.994 (10) | 0.820 (32) | 0.759 (37) | 0.855 (41) | 0.554 (42) | 0.882 (8) | 0.827 (33) | 0.615 (46) | 0.676 (32) | 0.638 (32) | 0.646 (40) | 1.000 (1) | 0.912 (20) | 0.797 (63) | 0.767 (32) | 0.994 (49) | 0.726 (31)
  Shaoyu Chen, Jiemin Fang, Qian Zhang, Wenyu Liu, Xinggang Wang: Hierarchical Aggregation for 3D Instance Segmentation. ICCV 2021
DKNet | 0.815 (30) | 1.000 (1) | 0.930 (41) | 0.844 (25) | 0.765 (36) | 0.915 (22) | 0.534 (45) | 0.805 (27) | 0.805 (36) | 0.807 (22) | 0.654 (34) | 0.763 (17) | 0.650 (31) | 1.000 (1) | 0.794 (53) | 0.881 (18) | 0.766 (33) | 1.000 (1) | 0.758 (26)
  Yizheng Wu, Min Shi, Shuaiyuan Du, Hao Lu, Zhiguo Cao, Weicai Zhong: 3D Instances as 1D Kernels. ECCV 2022
CSC-Pretrained | 0.791 (35) | 1.000 (1) | 0.996 (6) | 0.829 (31) | 0.767 (34) | 0.889 (31) | 0.600 (33) | 0.819 (23) | 0.770 (44) | 0.594 (51) | 0.620 (44) | 0.541 (45) | 0.700 (22) | 1.000 (1) | 0.941 (10) | 0.889 (15) | 0.763 (34) | 1.000 (1) | 0.526 (60)
Dyco3D [copyleft] | 0.761 (46) | 1.000 (1) | 0.935 (38) | 0.893 (8) | 0.752 (42) | 0.863 (38) | 0.600 (33) | 0.588 (57) | 0.742 (48) | 0.641 (42) | 0.633 (41) | 0.546 (43) | 0.550 (55) | 0.857 (54) | 0.789 (55) | 0.853 (41) | 0.762 (35) | 0.987 (52) | 0.699 (36)
  Tong He, Chunhua Shen, Anton van den Hengel: DyCo3D: Robust Instance Segmentation of 3D Point Clouds through Dynamic Convolution. CVPR 2021
SSEN | 0.724 (52) | 1.000 (1) | 0.926 (44) | 0.781 (48) | 0.661 (57) | 0.845 (44) | 0.596 (36) | 0.529 (63) | 0.764 (47) | 0.653 (41) | 0.489 (64) | 0.461 (54) | 0.500 (62) | 0.859 (53) | 0.765 (56) | 0.872 (27) | 0.761 (36) | 1.000 (1) | 0.577 (54)
  Dongsu Zhang, Junha Chun, Sang Kyun Cha, Young Min Kim: Spatial Semantic Embedding Network: Fast 3D Instance Segmentation with Deep Metric Learning. arXiv
SSTNet [permissive] | 0.789 (36) | 1.000 (1) | 0.840 (63) | 0.888 (10) | 0.717 (50) | 0.835 (45) | 0.717 (21) | 0.684 (49) | 0.627 (59) | 0.724 (28) | 0.652 (36) | 0.727 (24) | 0.600 (48) | 1.000 (1) | 0.912 (20) | 0.822 (52) | 0.757 (37) | 1.000 (1) | 0.691 (38)
  Zhihao Liang, Zhihao Li, Songcen Xu, Mingkui Tan, Kui Jia: Instance Segmentation in 3D Scenes using Semantic Superpoint Tree Networks. ICCV 2021
Box2Mask | 0.803 (32) | 1.000 (1) | 0.962 (32) | 0.874 (13) | 0.707 (53) | 0.887 (32) | 0.686 (28) | 0.598 (56) | 0.961 (1) | 0.715 (31) | 0.694 (30) | 0.469 (53) | 0.700 (22) | 1.000 (1) | 0.912 (20) | 0.902 (8) | 0.753 (38) | 0.997 (40) | 0.637 (44)
  Julian Chibane, Francis Engelmann, Tuan Anh Tran, Gerard Pons-Moll: Box2Mask: Weakly Supervised 3D Semantic Instance Segmentation Using Bounding Boxes. ECCV 2022
AOIA | 0.767 (43) | 1.000 (1) | 0.937 (36) | 0.810 (35) | 0.740 (45) | 0.906 (24) | 0.550 (43) | 0.800 (29) | 0.706 (51) | 0.577 (56) | 0.624 (42) | 0.544 (44) | 0.596 (53) | 0.857 (54) | 0.879 (38) | 0.880 (19) | 0.750 (39) | 0.992 (50) | 0.658 (40)
DualGroup | 0.782 (40) | 1.000 (1) | 0.927 (43) | 0.811 (34) | 0.772 (33) | 0.853 (42) | 0.631 (32) | 0.805 (27) | 0.773 (41) | 0.613 (47) | 0.611 (45) | 0.610 (36) | 0.650 (31) | 0.835 (62) | 0.881 (34) | 0.879 (20) | 0.750 (40) | 1.000 (1) | 0.675 (39)
NeuralBF | 0.718 (53) | 1.000 (1) | 0.945 (35) | 0.901 (7) | 0.754 (39) | 0.817 (51) | 0.460 (50) | 0.700 (45) | 0.772 (42) | 0.688 (34) | 0.568 (52) | 0.000 (75) | 0.500 (62) | 0.981 (45) | 0.606 (66) | 0.872 (26) | 0.740 (41) | 1.000 (1) | 0.614 (49)
  Weiwei Sun, Daniel Rebain, Renjie Liao, Vladimir Tankovich, Soroosh Yazdani, Kwang Moo Yi, Andrea Tagliasacchi: NeuralBF: Neural Bilateral Filtering for Top-down Instance Segmentation on Point Clouds. WACV 2023
GICN | 0.788 (37) | 1.000 (1) | 0.978 (21) | 0.867 (18) | 0.781 (31) | 0.833 (46) | 0.527 (46) | 0.824 (21) | 0.806 (35) | 0.549 (59) | 0.596 (47) | 0.551 (41) | 0.700 (22) | 1.000 (1) | 0.853 (41) | 0.935 (2) | 0.733 (42) | 1.000 (1) | 0.651 (41)
MTML | 0.731 (50) | 1.000 (1) | 0.992 (12) | 0.779 (49) | 0.609 (62) | 0.746 (60) | 0.308 (61) | 0.867 (9) | 0.601 (62) | 0.607 (48) | 0.539 (57) | 0.519 (49) | 0.550 (55) | 1.000 (1) | 0.824 (46) | 0.869 (31) | 0.729 (43) | 1.000 (1) | 0.616 (48)
  Jean Lahoud, Bernard Ghanem, Marc Pollefeys, Martin R. Oswald: 3D Instance Segmentation via Multi-task Metric Learning. ICCV 2019 [oral]
RWSeg | 0.739 (48) | 1.000 (1) | 0.899 (55) | 0.759 (52) | 0.753 (41) | 0.823 (50) | 0.282 (62) | 0.691 (47) | 0.658 (56) | 0.582 (55) | 0.594 (48) | 0.547 (42) | 0.628 (46) | 1.000 (1) | 0.795 (52) | 0.868 (32) | 0.728 (44) | 1.000 (1) | 0.692 (37)
Mask-Group | 0.792 (34) | 1.000 (1) | 0.968 (29) | 0.812 (33) | 0.766 (35) | 0.864 (36) | 0.460 (50) | 0.815 (24) | 0.888 (19) | 0.598 (50) | 0.651 (37) | 0.639 (31) | 0.600 (48) | 0.918 (51) | 0.941 (10) | 0.896 (12) | 0.721 (45) | 1.000 (1) | 0.723 (32)
  Min Zhong, Xinghao Chen, Xiaokang Chen, Gang Zeng, Yunhe Wang: MaskGroup: Hierarchical Point Grouping and Masking for 3D Instance Segmentation. ICME 2022
PointGroup | 0.778 (41) | 1.000 (1) | 0.900 (53) | 0.798 (41) | 0.715 (51) | 0.863 (37) | 0.493 (48) | 0.706 (44) | 0.895 (16) | 0.569 (57) | 0.701 (27) | 0.576 (39) | 0.639 (43) | 1.000 (1) | 0.880 (36) | 0.851 (42) | 0.719 (46) | 0.997 (40) | 0.709 (35)
  Li Jiang, Hengshuang Zhao, Shaoshuai Shi, Shu Liu, Chi-Wing Fu, Jiaya Jia: PointGroup: Dual-Set Point Grouping for 3D Instance Segmentation. CVPR 2020 [oral]
SPG_WSIS | 0.678 (61) | 1.000 (1) | 0.880 (60) | 0.836 (29) | 0.701 (54) | 0.727 (64) | 0.273 (66) | 0.607 (55) | 0.706 (52) | 0.541 (62) | 0.515 (61) | 0.174 (67) | 0.600 (48) | 0.857 (54) | 0.716 (60) | 0.846 (46) | 0.711 (47) | 1.000 (1) | 0.506 (61)
Occipital-SCS | 0.688 (57) | 1.000 (1) | 0.913 (50) | 0.730 (57) | 0.737 (47) | 0.743 (62) | 0.442 (53) | 0.855 (15) | 0.655 (57) | 0.546 (60) | 0.546 (56) | 0.263 (63) | 0.508 (61) | 0.889 (52) | 0.568 (67) | 0.771 (66) | 0.705 (48) | 0.889 (68) | 0.625 (46)
SALoss-ResNet | 0.695 (55) | 1.000 (1) | 0.855 (61) | 0.579 (69) | 0.589 (64) | 0.735 (63) | 0.484 (49) | 0.588 (57) | 0.856 (26) | 0.634 (43) | 0.571 (51) | 0.298 (61) | 0.500 (62) | 1.000 (1) | 0.824 (46) | 0.818 (53) | 0.702 (49) | 0.935 (65) | 0.545 (58)
  Zhidong Liang, Ming Yang, Hao Li, Chunxiang Wang: 3D Instance Embedding Learning With a Structure-Aware Loss Function for Point Cloud Segmentation. IEEE Robotics and Automation Letters (IROS 2020)
DANCENET | 0.786 (38) | 1.000 (1) | 0.936 (37) | 0.783 (45) | 0.737 (46) | 0.852 (43) | 0.742 (20) | 0.647 (51) | 0.765 (46) | 0.811 (21) | 0.624 (43) | 0.579 (38) | 0.632 (45) | 1.000 (1) | 0.909 (31) | 0.898 (11) | 0.696 (50) | 0.944 (58) | 0.601 (53)
3D-MPA | 0.737 (49) | 1.000 (1) | 0.933 (39) | 0.785 (43) | 0.794 (28) | 0.831 (47) | 0.279 (64) | 0.588 (57) | 0.695 (54) | 0.616 (45) | 0.559 (54) | 0.556 (40) | 0.650 (31) | 1.000 (1) | 0.809 (50) | 0.875 (24) | 0.696 (51) | 1.000 (1) | 0.608 (52)
  Francis Engelmann, Martin Bokeloh, Alireza Fathi, Bastian Leibe, Matthias Nießner: 3D-MPA: Multi Proposal Aggregation for 3D Semantic Instance Segmentation. CVPR 2020
One_Thing_One_Click [permissive] | 0.675 (62) | 1.000 (1) | 0.823 (64) | 0.782 (46) | 0.621 (60) | 0.766 (57) | 0.211 (68) | 0.736 (41) | 0.560 (66) | 0.586 (53) | 0.522 (59) | 0.636 (34) | 0.453 (66) | 0.641 (66) | 0.853 (41) | 0.850 (44) | 0.694 (52) | 0.997 (40) | 0.411 (67)
  Zhengzhe Liu, Xiaojuan Qi, Chi-Wing Fu: One Thing One Click: A Self-Training Approach for Weakly Supervised 3D Semantic Segmentation. CVPR 2021
INS-Conv-instance | 0.762 (45) | 1.000 (1) | 0.923 (46) | 0.765 (50) | 0.785 (29) | 0.905 (25) | 0.600 (33) | 0.655 (50) | 0.646 (58) | 0.683 (35) | 0.647 (38) | 0.530 (47) | 0.650 (31) | 1.000 (1) | 0.824 (46) | 0.830 (50) | 0.693 (53) | 0.944 (58) | 0.644 (43)
PE | 0.776 (42) | 1.000 (1) | 0.900 (54) | 0.860 (20) | 0.728 (48) | 0.869 (33) | 0.400 (57) | 0.857 (13) | 0.774 (40) | 0.568 (58) | 0.701 (28) | 0.602 (37) | 0.646 (40) | 0.933 (50) | 0.843 (44) | 0.890 (14) | 0.691 (54) | 0.997 (40) | 0.709 (34)
  Biao Zhang, Peter Wonka: Point Cloud Instance Segmentation using Probabilistic Embeddings. CVPR 2021
DD-UNet+Group | 0.764 (44) | 1.000 (1) | 0.897 (56) | 0.837 (27) | 0.753 (40) | 0.830 (48) | 0.459 (52) | 0.824 (21) | 0.699 (53) | 0.629 (44) | 0.653 (35) | 0.438 (56) | 0.650 (31) | 1.000 (1) | 0.880 (36) | 0.858 (38) | 0.690 (55) | 1.000 (1) | 0.650 (42)
  H. Liu, R. Liu, K. Yang, J. Zhang, K. Peng, R. Stiefelhagen: HIDA: Towards Holistic Indoor Understanding for the Visually Impaired via Semantic Instance Segmentation with a Wearable Solid-State LiDAR Sensor. ICCVW 2021
Sparse R-CNN | 0.714 (54) | 1.000 (1) | 0.926 (45) | 0.694 (59) | 0.699 (55) | 0.890 (30) | 0.636 (31) | 0.516 (64) | 0.693 (55) | 0.743 (26) | 0.588 (49) | 0.369 (60) | 0.601 (47) | 0.594 (68) | 0.800 (51) | 0.886 (16) | 0.676 (56) | 0.986 (53) | 0.546 (57)
SegGroup_ins [permissive] | 0.637 (63) | 1.000 (1) | 0.923 (48) | 0.593 (68) | 0.561 (66) | 0.746 (61) | 0.143 (73) | 0.504 (65) | 0.766 (45) | 0.485 (67) | 0.442 (65) | 0.372 (59) | 0.530 (58) | 0.714 (63) | 0.815 (49) | 0.775 (65) | 0.673 (57) | 1.000 (1) | 0.431 (66)
  An Tao, Yueqi Duan, Yi Wei, Jiwen Lu, Jie Zhou: SegGroup: Seg-Level Supervision for 3D Instance and Semantic Segmentation. TIP 2022
3D-BoNet | 0.687 (58) | 1.000 (1) | 0.887 (58) | 0.836 (29) | 0.587 (65) | 0.643 (72) | 0.550 (43) | 0.620 (53) | 0.724 (50) | 0.522 (64) | 0.501 (62) | 0.243 (64) | 0.512 (60) | 1.000 (1) | 0.751 (58) | 0.807 (59) | 0.661 (58) | 0.909 (67) | 0.612 (51)
  Bo Yang, Jianan Wang, Ronald Clark, Qingyong Hu, Sen Wang, Andrew Markham, Niki Trigoni: Learning Object Bounding Boxes for 3D Instance Segmentation on Point Clouds. NeurIPS 2019 Spotlight
PanopticFusion-inst | 0.693 (56) | 1.000 (1) | 0.852 (62) | 0.655 (63) | 0.616 (61) | 0.788 (55) | 0.334 (59) | 0.763 (34) | 0.771 (43) | 0.457 (69) | 0.555 (55) | 0.652 (30) | 0.518 (59) | 0.857 (54) | 0.765 (56) | 0.732 (69) | 0.631 (59) | 0.944 (58) | 0.577 (55)
  Gaku Narita, Takashi Seno, Tomoya Ishikawa, Yohsuke Kaji: PanopticFusion: Online Volumetric Semantic Mapping at the Level of Stuff and Things. IROS 2019 (to appear)
PCJC | 0.684 (60) | 1.000 (1) | 0.895 (57) | 0.757 (53) | 0.659 (58) | 0.862 (39) | 0.189 (71) | 0.739 (40) | 0.606 (61) | 0.712 (32) | 0.581 (50) | 0.515 (50) | 0.650 (31) | 0.857 (54) | 0.357 (72) | 0.785 (64) | 0.631 (60) | 0.889 (68) | 0.635 (45)
OSIS | 0.725 (51) | 1.000 (1) | 0.885 (59) | 0.653 (64) | 0.657 (59) | 0.801 (53) | 0.576 (37) | 0.695 (46) | 0.828 (31) | 0.698 (33) | 0.534 (58) | 0.457 (55) | 0.500 (62) | 0.857 (54) | 0.831 (45) | 0.841 (47) | 0.627 (61) | 1.000 (1) | 0.619 (47)
ClickSeg_Instance | 0.685 (59) | 1.000 (1) | 0.818 (65) | 0.600 (67) | 0.715 (52) | 0.795 (54) | 0.557 (39) | 0.533 (62) | 0.591 (64) | 0.601 (49) | 0.519 (60) | 0.429 (58) | 0.638 (44) | 0.938 (49) | 0.706 (61) | 0.817 (55) | 0.624 (62) | 0.944 (58) | 0.502 (62)
OccuSeg+instance | 0.742 (47) | 1.000 (1) | 0.923 (46) | 0.785 (43) | 0.745 (44) | 0.867 (34) | 0.557 (39) | 0.578 (60) | 0.729 (49) | 0.670 (39) | 0.644 (39) | 0.488 (51) | 0.577 (54) | 1.000 (1) | 0.794 (53) | 0.830 (50) | 0.620 (63) | 1.000 (1) | 0.550 (56)
  Lei Han, Tian Zheng, Lan Xu, Lu Fang: OccuSeg: Occupancy-aware 3D Instance Segmentation. CVPR 2020
R-PointNet | 0.544 (67) | 0.500 (74) | 0.655 (73) | 0.661 (62) | 0.663 (56) | 0.765 (58) | 0.432 (54) | 0.214 (73) | 0.612 (60) | 0.584 (54) | 0.499 (63) | 0.204 (66) | 0.286 (72) | 0.429 (71) | 0.655 (63) | 0.650 (74) | 0.539 (64) | 0.950 (57) | 0.499 (63)
MASC [permissive] | 0.615 (64) | 0.711 (71) | 0.802 (66) | 0.540 (70) | 0.757 (38) | 0.777 (56) | 0.029 (74) | 0.577 (61) | 0.588 (65) | 0.521 (65) | 0.600 (46) | 0.436 (57) | 0.534 (57) | 0.697 (64) | 0.616 (65) | 0.838 (48) | 0.526 (65) | 0.980 (55) | 0.534 (59)
  Chen Liu, Yasutaka Furukawa: MASC: Multi-scale Affinity with Sparse Convolution for 3D Instance Segmentation.
Sem_Recon_ins | 0.484 (70) | 0.764 (70) | 0.608 (75) | 0.470 (72) | 0.521 (68) | 0.637 (73) | 0.311 (60) | 0.218 (72) | 0.348 (74) | 0.365 (73) | 0.223 (72) | 0.222 (65) | 0.258 (73) | 0.629 (67) | 0.734 (59) | 0.596 (75) | 0.509 (66) | 0.858 (71) | 0.444 (65)
UNet-backbone | 0.605 (65) | 1.000 (1) | 0.909 (51) | 0.764 (51) | 0.603 (63) | 0.704 (66) | 0.415 (55) | 0.301 (70) | 0.548 (67) | 0.461 (68) | 0.394 (66) | 0.267 (62) | 0.386 (68) | 0.857 (54) | 0.649 (64) | 0.817 (54) | 0.504 (67) | 0.959 (56) | 0.356 (70)
tmp | 0.474 (71) | 1.000 (1) | 0.727 (68) | 0.433 (74) | 0.481 (71) | 0.673 (70) | 0.022 (76) | 0.380 (68) | 0.517 (69) | 0.436 (71) | 0.338 (69) | 0.128 (69) | 0.343 (70) | 0.429 (71) | 0.291 (74) | 0.728 (70) | 0.473 (68) | 0.833 (72) | 0.300 (73)
Hier3D [copyleft] | 0.540 (68) | 1.000 (1) | 0.727 (68) | 0.626 (65) | 0.467 (72) | 0.693 (67) | 0.200 (69) | 0.412 (66) | 0.480 (71) | 0.528 (63) | 0.318 (70) | 0.077 (74) | 0.600 (48) | 0.688 (65) | 0.382 (69) | 0.768 (67) | 0.472 (69) | 0.941 (62) | 0.350 (71)
  Tan: HCFS3D: Hierarchical Coupled Feature Selection Network for 3D Semantic and Instance Segmentation.
SemRegionNet-20cls | 0.470 (72) | 1.000 (1) | 0.727 (68) | 0.447 (73) | 0.481 (70) | 0.678 (69) | 0.024 (75) | 0.380 (68) | 0.518 (68) | 0.440 (70) | 0.339 (68) | 0.128 (69) | 0.350 (69) | 0.429 (71) | 0.212 (75) | 0.711 (71) | 0.465 (70) | 0.833 (72) | 0.290 (74)
Region-18class | 0.497 (69) | 0.250 (76) | 0.902 (52) | 0.689 (60) | 0.540 (67) | 0.747 (59) | 0.276 (65) | 0.610 (54) | 0.268 (75) | 0.489 (66) | 0.348 (67) | 0.000 (75) | 0.243 (75) | 0.220 (74) | 0.663 (62) | 0.814 (56) | 0.459 (71) | 0.928 (66) | 0.496 (64)
ASIS | 0.422 (73) | 0.333 (75) | 0.707 (71) | 0.676 (61) | 0.401 (73) | 0.650 (71) | 0.350 (58) | 0.177 (74) | 0.594 (63) | 0.376 (72) | 0.202 (73) | 0.077 (73) | 0.404 (67) | 0.571 (69) | 0.197 (76) | 0.674 (73) | 0.447 (72) | 0.500 (75) | 0.260 (75)
3D-SIS [permissive] | 0.558 (66) | 1.000 (1) | 0.773 (67) | 0.614 (66) | 0.503 (69) | 0.691 (68) | 0.200 (69) | 0.412 (66) | 0.498 (70) | 0.546 (61) | 0.311 (71) | 0.103 (71) | 0.600 (48) | 0.857 (54) | 0.382 (69) | 0.799 (62) | 0.445 (73) | 0.938 (64) | 0.371 (68)
  Ji Hou, Angela Dai, Matthias Niessner: 3D-SIS: 3D Semantic Instance Segmentation of RGB-D Scans. CVPR 2019
Sgpn_scannet | 0.390 (75) | 0.556 (73) | 0.636 (74) | 0.493 (71) | 0.353 (74) | 0.539 (75) | 0.271 (67) | 0.160 (75) | 0.450 (72) | 0.359 (74) | 0.178 (74) | 0.146 (68) | 0.250 (74) | 0.143 (75) | 0.347 (73) | 0.698 (72) | 0.436 (74) | 0.667 (74) | 0.331 (72)
3D-BEVIS | 0.401 (74) | 0.667 (72) | 0.687 (72) | 0.419 (75) | 0.137 (76) | 0.587 (74) | 0.188 (72) | 0.235 (71) | 0.359 (73) | 0.211 (75) | 0.093 (76) | 0.080 (72) | 0.311 (71) | 0.571 (69) | 0.382 (69) | 0.754 (68) | 0.300 (75) | 0.874 (70) | 0.357 (69)
  Cathrin Elich, Francis Engelmann, Jonas Schult, Theodora Kontogianni, Bastian Leibe: 3D-BEVIS: Birds-Eye-View Instance Segmentation.
MaskRCNN 2d->3d Proj | 0.261 (76) | 0.903 (69) | 0.081 (76) | 0.008 (76) | 0.233 (75) | 0.175 (76) | 0.280 (63) | 0.106 (76) | 0.150 (76) | 0.203 (76) | 0.175 (75) | 0.480 (52) | 0.218 (76) | 0.143 (75) | 0.542 (68) | 0.404 (76) | 0.153 (76) | 0.393 (76) | 0.049 (76)
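
As a quick consistency check on the results above, the avg AP 25% column appears to be the unweighted mean of the 18 per-class scores. The source does not state this explicitly, so treat it as an assumption; the sketch below only verifies it for the UniPerception row, up to the three-decimal rounding of the listed values. The variable and function names are illustrative.

```python
from statistics import mean

# Per-class AP 25% scores from the UniPerception row, in column order:
# bathtub, bed, bookshelf, cabinet, chair, counter, curtain, desk, door,
# otherfurniture, picture, refrigerator, shower curtain, sink, sofa,
# table, toilet, window.
uniperception_class_ap = [
    1.000, 0.979, 0.872, 0.869, 0.892, 0.806, 0.890, 0.835, 0.892,
    0.755, 0.811, 0.779, 0.955, 0.951, 0.876, 0.914, 0.997, 0.840,
]
reported_avg = 0.884  # avg AP 25% listed for the same row


def class_average(scores):
    """Unweighted mean over classes (assumed definition of the avg column)."""
    return mean(scores)


computed_avg = class_average(uniperception_class_ap)
print(f"computed {computed_avg:.3f}, reported {reported_avg:.3f}")
```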