The 3D semantic instance prediction task involves detecting and segmenting the objects in a 3D scan mesh.

Evaluation and metrics

Our evaluation ranks all methods according to the average precision (AP) for each class. We report the mean AP at an overlap (IoU) threshold of 0.25 (AP 25%), at a threshold of 0.5 (AP 50%), and averaged over thresholds in the range [0.5:0.95:0.05] (AP). Note that multiple predictions of the same ground-truth instance are penalized as false positives.
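The matching rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the benchmark's evaluation script: instances are represented as sets of point indices, AP is computed as the non-interpolated mean of precision at each true positive, and the threshold range is assumed to include the 0.95 endpoint. The function names `iou` and `average_precision` and the toy data are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two instances given as sets of point ids."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def average_precision(preds, gts, thr):
    """AP for one class at one overlap threshold.

    preds: list of (confidence, set_of_point_ids); gts: list of sets.
    A ground-truth instance can be matched once; any further prediction
    that overlaps it is counted as a false positive.
    """
    if not gts:
        return float("nan")
    matched = set()
    tp_flags = []
    # Visit predictions from most to least confident.
    for _, mask in sorted(preds, key=lambda p: p[0], reverse=True):
        overlaps = [iou(mask, gt) for gt in gts]
        best = max(range(len(gts)), key=lambda i: overlaps[i])
        is_tp = overlaps[best] >= thr and best not in matched
        if is_tp:
            matched.add(best)
        tp_flags.append(is_tp)
    # Non-interpolated AP: mean precision at each true positive,
    # normalized by the number of ground-truth instances.
    ap, tp = 0.0, 0
    for rank, is_tp in enumerate(tp_flags, start=1):
        if is_tp:
            tp += 1
            ap += tp / rank
    return ap / len(gts)


# Assumed threshold grid: 0.50, 0.55, ..., 0.95 (endpoint included).
THRESHOLDS = [round(0.5 + 0.05 * i, 2) for i in range(10)]

# Toy scene: two ground-truth objects and three predictions.
gts = [set(range(0, 10)), set(range(10, 20))]
preds = [
    (0.9, set(range(0, 10))),            # exact match of the first object
    (0.8, set(range(0, 8))),             # duplicate of the first object
    (0.7, set(range(10, 18)) | {100}),   # IoU 8/11 with the second object
]

ap25 = average_precision(preds, gts, 0.25)
ap50 = average_precision(preds, gts, 0.5)
ap_mean = sum(average_precision(preds, gts, t) for t in THRESHOLDS) / len(THRESHOLDS)
ap50_no_dup = average_precision([preds[0], preds[2]], gts, 0.5)
```

In this toy scene the duplicate prediction lowers AP 50% from 1.0 (`ap50_no_dup`) to 5/6 (`ap50`), which is the penalty the note above refers to.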



This table lists the benchmark results for the 3D semantic instance scenario.




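Columns: Method, Info (license tag such as permissive or copyleft, where listed), avg AP 25%, bathtub, bed, bookshelf, cabinet, chair, counter, curtain, desk, door, otherfurniture, picture, refrigerator, shower curtain, sink, sofa, table, toilet, window. Each cell gives the score followed by the method's rank in that column.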
PointRel0.901 11.000 10.978 230.928 30.879 10.962 50.882 50.749 380.947 30.912 20.802 30.753 190.820 21.000 10.984 40.919 50.894 41.000 10.815 15
Relation3D (PointRel): Enhancing Relation Modeling for Point Cloud Instance Segmentation.
Competitor-MAFT0.896 21.000 11.000 10.872 160.847 110.967 30.955 10.778 330.901 150.919 10.784 50.812 10.770 131.000 10.949 90.865 350.868 181.000 10.840 5
OneFormer3Dcopyleft0.896 21.000 11.000 10.913 60.858 60.951 100.786 160.837 190.916 120.908 40.778 80.803 60.750 151.000 10.976 60.926 40.882 80.995 480.849 2
Maxim Kolodiazhnyi, Anna Vorontsova, Anton Konushin, Danila Rukhovich: OneFormer3D: One Transformer for Unified Point Cloud Segmentation.
MG-Former0.887 41.000 10.991 140.837 260.801 250.935 190.887 40.857 110.946 40.891 100.748 180.805 50.739 171.000 10.993 20.809 590.876 151.000 10.842 4
DCD0.885 51.000 10.933 410.856 220.832 150.959 70.930 20.858 100.802 380.859 180.767 90.796 100.709 211.000 10.971 70.871 290.904 21.000 10.874 1
UniPerception0.884 61.000 10.979 200.872 160.869 30.892 280.806 130.890 60.835 290.892 90.755 140.811 20.779 100.955 490.951 80.876 230.914 10.997 400.840 6
KmaxOneFormerNetpermissive0.883 71.000 11.000 10.798 410.848 100.971 10.853 70.903 30.827 320.910 30.748 170.809 40.724 191.000 10.980 50.855 410.844 241.000 10.832 7
InsSSM0.883 71.000 10.996 60.800 400.865 40.960 60.808 120.852 160.940 60.899 80.785 40.810 30.700 231.000 10.912 200.851 440.895 30.997 400.827 9
Lei Yao, Yi Wang, Moyun Liu, Lap-Pui Chau: SGIFormer: Semantic-guided and Geometric-enhanced Interleaving Transformer for 3D Instance Segmentation. TCSVT, 2024
Competitor-SPFormer0.881 91.000 11.000 10.845 240.854 70.962 40.714 230.857 120.904 140.902 60.782 70.789 130.662 291.000 10.988 30.874 260.886 70.997 400.847 3
TST3D0.879 101.000 10.994 90.921 50.807 240.939 160.771 170.887 70.923 100.862 170.722 230.768 160.756 141.000 10.910 310.904 70.836 270.999 390.824 11
Duc Tran Dang Trung, Byeongkeun Kang, Yeejin Lee: MSTA3D: Multi-scale Twin-attention for 3D Instance Segmentation. ACM Multimedia 2024
SIM3D0.878 111.000 10.972 250.863 190.817 220.952 90.821 100.783 300.890 180.902 70.735 210.797 80.799 91.000 10.931 170.893 130.853 221.000 10.792 18
EV3D0.877 121.000 10.996 80.873 140.854 80.950 110.691 270.783 310.926 70.889 130.754 150.794 120.820 21.000 10.912 200.900 90.860 201.000 10.779 21
Spherical Mask(CtoF)0.875 131.000 10.991 150.873 140.850 90.946 130.691 270.752 370.926 70.889 120.759 120.794 110.820 21.000 10.912 200.900 90.878 121.000 10.769 23
TD3Dpermissive0.875 131.000 10.976 240.877 120.783 310.970 20.889 30.828 200.945 50.803 240.713 250.720 260.709 201.000 10.936 150.934 30.873 161.000 10.791 19
Maksim Kolodiazhnyi, Anna Vorontsova, Anton Konushin, Danila Rukhovich: Top-Down Beats Bottom-Up in 3D Instance Segmentation. WACV 2024
SoftGroup++0.874 151.000 10.972 260.947 10.839 140.898 270.556 420.913 20.881 210.756 260.828 20.748 210.821 11.000 10.937 140.937 10.887 61.000 10.821 12
Queryformer0.874 151.000 10.978 220.809 380.876 20.936 180.702 240.716 430.920 110.875 160.766 100.772 150.818 61.000 10.995 10.916 60.892 51.000 10.767 24
Mask3D0.870 171.000 10.985 170.782 480.818 210.938 170.760 180.749 380.923 90.877 150.760 110.785 140.820 21.000 10.912 200.864 370.878 120.983 540.825 10
Jonas Schult, Francis Engelmann, Alexander Hermans, Or Litany, Siyu Tang, Bastian Leibe: Mask3D for 3D Semantic Instance Segmentation. ICRA 2023
ExtMask3D0.867 181.000 11.000 10.756 550.816 230.940 150.795 140.760 360.862 230.888 140.739 190.763 170.774 111.000 10.929 180.878 220.879 101.000 10.819 14
SoftGrouppermissive0.865 191.000 10.969 270.860 200.860 50.913 230.558 390.899 40.911 130.760 250.828 10.736 230.802 80.981 460.919 190.875 240.877 141.000 10.820 13
Thang Vu, Kookhoi Kim, Tung M. Luu, Xuan Thanh Nguyen, Chang D. Yoo: SoftGroup for 3D Instance Segmentation on Point Clouds. CVPR 2022 [Oral]
MAFT0.860 201.000 10.990 160.810 370.829 160.949 120.809 110.688 490.836 280.904 50.751 160.796 90.741 161.000 10.864 410.848 460.837 251.000 10.828 8
SPFormerpermissive0.851 211.000 10.994 100.806 390.774 330.942 140.637 310.849 170.859 250.889 110.720 240.730 240.665 281.000 10.911 280.868 340.873 171.000 10.796 17
Sun Jiahao, Qing Chunmei, Tan Junpeng, Xu Xiangmin: Superpoint Transformer for 3D Scene Instance Segmentation. AAAI 2023 [Oral]
IPCA-Inst0.851 211.000 10.968 280.884 110.842 130.862 410.693 260.812 250.888 200.677 380.783 60.698 270.807 71.000 10.911 280.865 360.865 191.000 10.757 27
ODIN - Inspermissive0.847 231.000 10.951 340.834 310.828 170.875 330.871 60.767 340.821 340.816 210.690 320.800 70.771 121.000 10.912 200.891 140.821 280.886 700.713 34
Ayush Jain, Pushkal Katara, Nikolaos Gkanatsios, Adam W. Harley, Gabriel Sarch, Kriti Aggarwal, Vishrav Chaudhary, Katerina Fragkiadaki: ODIN: A Single Model for 2D and 3D Segmentation. CVPR 2024
Mask3D_evaluation0.843 241.000 10.955 330.847 230.795 270.932 200.750 200.780 320.891 170.818 200.737 200.633 360.703 221.000 10.902 330.870 300.820 290.941 620.805 16
ISBNetpermissive0.835 251.000 10.950 350.731 570.819 190.918 210.790 150.740 400.851 270.831 190.661 340.742 220.650 321.000 10.937 130.814 580.836 261.000 10.765 25
Tuan Duc Ngo, Binh-Son Hua, Khoi Nguyen: ISBNet: a 3D Point Cloud Instance Segmentation Network with Instance-aware Sampling and Box-aware Dynamic Convolution. CVPR 2023
SphereSeg0.835 251.000 10.963 310.891 90.794 280.954 80.822 90.710 440.961 20.721 300.693 310.530 490.653 311.000 10.867 400.857 400.859 210.991 510.771 22
GraphCut0.832 271.000 10.922 500.724 590.798 260.902 260.701 250.856 140.859 240.715 310.706 260.748 200.640 431.000 10.934 160.862 380.880 91.000 10.729 30
TopoSeg0.832 271.000 10.981 190.933 20.819 200.826 500.524 480.841 180.811 350.681 370.759 130.687 280.727 180.981 460.911 280.883 180.853 231.000 10.756 28
PBNetpermissive0.825 291.000 10.963 300.837 280.843 120.865 360.822 80.647 520.878 220.733 280.639 410.683 290.650 321.000 10.853 420.870 310.820 301.000 10.744 29
Weiguang Zhao, Yuyao Yan, Chaolong Yang, Jianan Ye, Xi Yang, Kaizhu Huang: Divide and Conquer: 3D Instance Segmentation With Point-Wise Binarization. ICCV 2023
SSEC0.820 301.000 10.983 180.924 40.826 180.817 530.415 570.899 50.793 400.673 390.731 220.636 340.653 301.000 10.939 120.804 610.878 111.000 10.780 20
DKNet0.815 311.000 10.930 420.844 250.765 370.915 220.534 460.805 270.805 370.807 230.654 350.763 180.650 321.000 10.794 540.881 190.766 341.000 10.758 26
Yizheng Wu, Min Shi, Shuaiyuan Du, Hao Lu, Zhiguo Cao, Weicai Zhong: 3D Instances as 1D Kernels. ECCV 2022
RPGN0.806 321.000 10.992 120.789 430.723 500.891 290.650 300.810 260.832 300.665 410.699 290.658 300.700 231.000 10.881 350.832 500.774 320.997 400.613 51
Shichao Dong, Guosheng Lin, Tzu-Yi Hung: Learning Regional Purity for Instance Segmentation on 3D Point Clouds. ECCV 2022
HAISpermissive0.803 331.000 10.994 100.820 330.759 380.855 420.554 430.882 80.827 330.615 470.676 330.638 330.646 411.000 10.912 200.797 640.767 330.994 490.726 31
Shaoyu Chen, Jiemin Fang, Qian Zhang, Wenyu Liu, Xinggang Wang: Hierarchical Aggregation for 3D Instance Segmentation. ICCV 2021
Box2Mask0.803 331.000 10.962 320.874 130.707 540.887 320.686 290.598 570.961 10.715 320.694 300.469 540.700 231.000 10.912 200.902 80.753 390.997 400.637 45
Julian Chibane, Francis Engelmann, Tuan Anh Tran, Gerard Pons-Moll: Box2Mask: Weakly Supervised 3D Semantic Instance Segmentation Using Bounding Boxes. ECCV 2022
Mask-Group0.792 351.000 10.968 290.812 340.766 360.864 370.460 510.815 240.888 190.598 510.651 380.639 320.600 490.918 520.941 100.896 120.721 461.000 10.723 32
Min Zhong, Xinghao Chen, Xiaokang Chen, Gang Zeng, Yunhe Wang: MaskGroup: Hierarchical Point Grouping and Masking for 3D Instance Segmentation. ICME 2022
CSC-Pretrained0.791 361.000 10.996 60.829 320.767 350.889 310.600 340.819 230.770 450.594 520.620 450.541 460.700 231.000 10.941 100.889 160.763 351.000 10.526 61
SSTNetpermissive0.789 371.000 10.840 640.888 100.717 510.835 460.717 220.684 500.627 600.724 290.652 370.727 250.600 491.000 10.912 200.822 530.757 381.000 10.691 39
Zhihao Liang, Zhihao Li, Songcen Xu, Mingkui Tan, Kui Jia: Instance Segmentation in 3D Scenes using Semantic Superpoint Tree Networks. ICCV 2021
GICN0.788 381.000 10.978 210.867 180.781 320.833 470.527 470.824 210.806 360.549 600.596 480.551 420.700 231.000 10.853 420.935 20.733 431.000 10.651 42
DENet0.786 391.000 10.929 430.736 560.750 440.720 660.755 190.934 10.794 390.590 530.561 540.537 470.650 321.000 10.882 340.804 620.789 311.000 10.719 33
DANCENET0.786 391.000 10.936 380.783 460.737 470.852 440.742 210.647 520.765 470.811 220.624 440.579 390.632 461.000 10.909 320.898 110.696 510.944 580.601 54
DualGroup0.782 411.000 10.927 440.811 350.772 340.853 430.631 330.805 270.773 420.613 480.611 460.610 370.650 320.835 630.881 350.879 210.750 411.000 10.675 40
PointGroup0.778 421.000 10.900 540.798 420.715 520.863 380.493 490.706 450.895 160.569 580.701 270.576 400.639 441.000 10.880 370.851 430.719 470.997 400.709 36
Li Jiang, Hengshuang Zhao, Shaoshuai Shi, Shu Liu, Chi-Wing Fu, Jiaya Jia: PointGroup: Dual-Set Point Grouping for 3D Instance Segmentation. CVPR 2020 [Oral]
PE0.776 431.000 10.900 550.860 200.728 490.869 340.400 580.857 130.774 410.568 590.701 280.602 380.646 410.933 510.843 450.890 150.691 550.997 400.709 35
Biao Zhang, Peter Wonka: Point Cloud Instance Segmentation using Probabilistic Embeddings. CVPR 2021
AOIA0.767 441.000 10.937 370.810 360.740 460.906 240.550 440.800 290.706 520.577 570.624 430.544 450.596 540.857 550.879 390.880 200.750 400.992 500.658 41
DD-UNet+Group0.764 451.000 10.897 570.837 270.753 410.830 490.459 530.824 210.699 540.629 450.653 360.438 570.650 321.000 10.880 370.858 390.690 561.000 10.650 43
H. Liu, R. Liu, K. Yang, J. Zhang, K. Peng, R. Stiefelhagen: HIDA: Towards Holistic Indoor Understanding for the Visually Impaired via Semantic Instance Segmentation with a Wearable Solid-State LiDAR Sensor. ICCVW 2021
INS-Conv-instance0.762 461.000 10.923 470.765 510.785 300.905 250.600 340.655 510.646 590.683 360.647 390.530 480.650 321.000 10.824 470.830 510.693 540.944 580.644 44
Dyco3Dcopyleft0.761 471.000 10.935 390.893 80.752 430.863 390.600 340.588 580.742 490.641 430.633 420.546 440.550 560.857 550.789 560.853 420.762 360.987 520.699 37
Tong He, Chunhua Shen, Anton van den Hengel: DyCo3D: Robust Instance Segmentation of 3D Point Clouds through Dynamic Convolution. CVPR 2021
OccuSeg+instance0.742 481.000 10.923 470.785 440.745 450.867 350.557 400.578 610.729 500.670 400.644 400.488 520.577 551.000 10.794 540.830 510.620 641.000 10.550 57
Lei Han, Tian Zheng, Lan Xu, Lu Fang: OccuSeg: Occupancy-aware 3D Instance Segmentation. CVPR 2020
RWSeg0.739 491.000 10.899 560.759 530.753 420.823 510.282 630.691 480.658 570.582 560.594 490.547 430.628 471.000 10.795 530.868 330.728 451.000 10.692 38
3D-MPA0.737 501.000 10.933 400.785 440.794 290.831 480.279 650.588 580.695 550.616 460.559 550.556 410.650 321.000 10.809 510.875 250.696 521.000 10.608 53
Francis Engelmann, Martin Bokeloh, Alireza Fathi, Bastian Leibe, Matthias Nießner: 3D-MPA: Multi Proposal Aggregation for 3D Semantic Instance Segmentation. CVPR 2020
MTML0.731 511.000 10.992 120.779 500.609 630.746 610.308 620.867 90.601 630.607 490.539 580.519 500.550 561.000 10.824 470.869 320.729 441.000 10.616 49
Jean Lahoud, Bernard Ghanem, Marc Pollefeys, Martin R. Oswald: 3D Instance Segmentation via Multi-task Metric Learning. ICCV 2019 [Oral]
OSIS0.725 521.000 10.885 600.653 650.657 600.801 540.576 380.695 470.828 310.698 340.534 590.457 560.500 630.857 550.831 460.841 480.627 621.000 10.619 48
SSEN0.724 531.000 10.926 450.781 490.661 580.845 450.596 370.529 640.764 480.653 420.489 650.461 550.500 630.859 540.765 570.872 280.761 371.000 10.577 55
Dongsu Zhang, Junha Chun, Sang Kyun Cha, Young Min Kim: Spatial Semantic Embedding Network: Fast 3D Instance Segmentation with Deep Metric Learning. arXiv
NeuralBF0.718 541.000 10.945 360.901 70.754 400.817 520.460 510.700 460.772 430.688 350.568 530.000 760.500 630.981 460.606 670.872 270.740 421.000 10.614 50
Weiwei Sun, Daniel Rebain, Renjie Liao, Vladimir Tankovich, Soroosh Yazdani, Kwang Moo Yi, Andrea Tagliasacchi: NeuralBF: Neural Bilateral Filtering for Top-down Instance Segmentation on Point Clouds. WACV 2023
Sparse R-CNN0.714 551.000 10.926 460.694 600.699 560.890 300.636 320.516 650.693 560.743 270.588 500.369 610.601 480.594 690.800 520.886 170.676 570.986 530.546 58
SALoss-ResNet0.695 561.000 10.855 620.579 700.589 650.735 640.484 500.588 580.856 260.634 440.571 520.298 620.500 631.000 10.824 470.818 540.702 500.935 650.545 59
Zhidong Liang, Ming Yang, Hao Li, Chunxiang Wang: 3D Instance Embedding Learning With a Structure-Aware Loss Function for Point Cloud Segmentation. IEEE Robotics and Automation Letters (IROS 2020)
PanopticFusion-inst0.693 571.000 10.852 630.655 640.616 620.788 560.334 600.763 350.771 440.457 700.555 560.652 310.518 600.857 550.765 570.732 700.631 600.944 580.577 56
Gaku Narita, Takashi Seno, Tomoya Ishikawa, Yohsuke Kaji: PanopticFusion: Online Volumetric Semantic Mapping at the Level of Stuff and Things. IROS 2019 (to appear)
Occipital-SCS0.688 581.000 10.913 510.730 580.737 480.743 630.442 540.855 150.655 580.546 610.546 570.263 640.508 620.889 530.568 680.771 670.705 490.889 680.625 47
3D-BoNet0.687 591.000 10.887 590.836 290.587 660.643 730.550 440.620 540.724 510.522 650.501 630.243 650.512 611.000 10.751 590.807 600.661 590.909 670.612 52
Bo Yang, Jianan Wang, Ronald Clark, Qingyong Hu, Sen Wang, Andrew Markham, Niki Trigoni: Learning Object Bounding Boxes for 3D Instance Segmentation on Point Clouds. NeurIPS 2019 Spotlight
ClickSeg_Instance0.685 601.000 10.818 660.600 680.715 530.795 550.557 400.533 630.591 650.601 500.519 610.429 590.638 450.938 500.706 620.817 560.624 630.944 580.502 63
PCJC0.684 611.000 10.895 580.757 540.659 590.862 400.189 720.739 410.606 620.712 330.581 510.515 510.650 320.857 550.357 730.785 650.631 610.889 680.635 46
SPG_WSIS0.678 621.000 10.880 610.836 290.701 550.727 650.273 670.607 560.706 530.541 630.515 620.174 680.600 490.857 550.716 610.846 470.711 481.000 10.506 62
One_Thing_One_Clickpermissive0.675 631.000 10.823 650.782 470.621 610.766 580.211 690.736 420.560 670.586 540.522 600.636 350.453 670.641 670.853 420.850 450.694 530.997 400.411 68
Zhengzhe Liu, Xiaojuan Qi, Chi-Wing Fu: One Thing One Click: A Self-Training Approach for Weakly Supervised 3D Semantic Segmentation. CVPR 2021
SegGroup_inspermissive0.637 641.000 10.923 490.593 690.561 670.746 620.143 740.504 660.766 460.485 680.442 660.372 600.530 590.714 640.815 500.775 660.673 581.000 10.431 67
An Tao, Yueqi Duan, Yi Wei, Jiwen Lu, Jie Zhou: SegGroup: Seg-Level Supervision for 3D Instance and Semantic Segmentation. TIP 2022
MASCpermissive0.615 650.711 720.802 670.540 710.757 390.777 570.029 750.577 620.588 660.521 660.600 470.436 580.534 580.697 650.616 660.838 490.526 660.980 550.534 60
Chen Liu, Yasutaka Furukawa: MASC: Multi-scale Affinity with Sparse Convolution for 3D Instance Segmentation.
UNet-backbone0.605 661.000 10.909 520.764 520.603 640.704 670.415 560.301 710.548 680.461 690.394 670.267 630.386 690.857 550.649 650.817 550.504 680.959 560.356 71
3D-SISpermissive0.558 671.000 10.773 680.614 670.503 700.691 690.200 700.412 670.498 710.546 620.311 720.103 720.600 490.857 550.382 700.799 630.445 740.938 640.371 69
Ji Hou, Angela Dai, Matthias Niessner: 3D-SIS: 3D Semantic Instance Segmentation of RGB-D Scans. CVPR 2019
R-PointNet0.544 680.500 750.655 740.661 630.663 570.765 590.432 550.214 740.612 610.584 550.499 640.204 670.286 730.429 720.655 640.650 750.539 650.950 570.499 64
Hier3Dcopyleft0.540 691.000 10.727 690.626 660.467 730.693 680.200 700.412 670.480 720.528 640.318 710.077 750.600 490.688 660.382 700.768 680.472 700.941 620.350 72
Tan: HCFS3D: Hierarchical Coupled Feature Selection Network for 3D Semantic and Instance Segmentation.
Region-18class0.497 700.250 770.902 530.689 610.540 680.747 600.276 660.610 550.268 760.489 670.348 680.000 760.243 760.220 750.663 630.814 570.459 720.928 660.496 65
Sem_Recon_ins0.484 710.764 710.608 760.470 730.521 690.637 740.311 610.218 730.348 750.365 740.223 730.222 660.258 740.629 680.734 600.596 760.509 670.858 720.444 66
tmp0.474 721.000 10.727 690.433 750.481 720.673 710.022 770.380 690.517 700.436 720.338 700.128 700.343 710.429 720.291 750.728 710.473 690.833 730.300 74
SemRegionNet-20cls0.470 731.000 10.727 690.447 740.481 710.678 700.024 760.380 690.518 690.440 710.339 690.128 700.350 700.429 720.212 760.711 720.465 710.833 730.290 75
ASIS0.422 740.333 760.707 720.676 620.401 740.650 720.350 590.177 750.594 640.376 730.202 740.077 740.404 680.571 700.197 770.674 740.447 730.500 760.260 76
3D-BEVIS0.401 750.667 730.687 730.419 760.137 770.587 750.188 730.235 720.359 740.211 760.093 770.080 730.311 720.571 700.382 700.754 690.300 760.874 710.357 70
Cathrin Elich, Francis Engelmann, Jonas Schult, Theodora Kontogianni, Bastian Leibe: 3D-BEVIS: Birds-Eye-View Instance Segmentation.
Sgpn_scannet0.390 760.556 740.636 750.493 720.353 750.539 760.271 680.160 760.450 730.359 750.178 750.146 690.250 750.143 760.347 740.698 730.436 750.667 750.331 73
MaskRCNN 2d->3d Proj0.261 770.903 700.081 770.008 770.233 760.175 770.280 640.106 770.150 770.203 770.175 760.480 530.218 770.143 760.542 690.404 770.153 770.393 770.049 77
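
Each row of the table above runs the method name together with 19 score and rank pairs (the class average followed by the 18 classes). A minimal sketch for splitting such a row, assuming every score has exactly three decimals and every rank is an integer that runs directly into the next score; the names `CLASSES`, `CELL` and `parse_row` are hypothetical, and method names that end in a digit immediately before the first score are not handled:

```python
import re

# Class columns in the order given by the table header.
CLASSES = [
    "bathtub", "bed", "bookshelf", "cabinet", "chair", "counter", "curtain",
    "desk", "door", "otherfurniture", "picture", "refrigerator",
    "shower curtain", "sink", "sofa", "table", "toilet", "window",
]

# Assumed cell layout: score (d.ddd), a space, then the rank, which is
# followed either by the next score or by the end of the line.
CELL = re.compile(r"(\d\.\d{3}) (\d+?)(?=\d\.\d{3} |$)")


def parse_row(row):
    """Split one table row into (name, (avg, rank), {class: (score, rank)})."""
    row = row.strip()
    cells = list(CELL.finditer(row))
    if len(cells) != len(CLASSES) + 1:
        raise ValueError(f"expected {len(CLASSES) + 1} cells, found {len(cells)}")
    name = row[:cells[0].start()].strip()
    pairs = [(float(m.group(1)), int(m.group(2))) for m in cells]
    return name, pairs[0], dict(zip(CLASSES, pairs[1:]))


# First row of the table, copied as it appears above.
row = (
    "PointRel0.901 11.000 10.978 230.928 30.879 10.962 50.882 50.749 38"
    "0.947 30.912 20.802 30.753 190.820 21.000 10.984 40.919 50.894 4"
    "1.000 10.815 15"
)
name, avg, per_class = parse_row(row)

# Unweighted mean of the 18 class scores, for comparison with the avg column.
class_mean = sum(score for score, _ in per_class.values()) / len(CLASSES)
```

For this row the mean of the 18 class scores agrees with the listed average to three decimals, which suggests that the avg column is the unweighted mean over classes.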