The 3D semantic instance prediction task involves detecting and segmenting the objects in a 3D scan mesh.

Evaluation and metrics

Our evaluation ranks all methods according to the average precision for each class. We report the mean average precision (AP) at an overlap of 0.25 (AP 25%), at an overlap of 0.5 (AP 50%), and averaged over overlaps from 0.5 to 0.95 in steps of 0.05 (AP). Note that multiple predictions of the same ground truth instance are penalized as false positives.
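The metric described above can be illustrated with a minimal Python sketch for a single class. This is not the benchmark's official evaluation script. It assumes that overlap means intersection over union between sets of mesh vertex indices, that a prediction counts as correct when its overlap reaches the threshold, and that AP is the non-interpolated average of precision at each true positive. The function names and the toy data are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two sets of vertex indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def average_precision(preds, gts, thresh):
    """AP for one class at one overlap threshold (simplified, non-interpolated).

    preds: list of (confidence, set_of_vertex_indices)
    gts:   list of set_of_vertex_indices
    """
    matched = set()  # ground truth instances that already have a match
    hits = []        # True for a true positive, False for a false positive
    for _, mask in sorted(preds, key=lambda p: -p[0]):
        best_iou, best_gt = 0.0, None
        for i, gt in enumerate(gts):
            overlap = iou(mask, gt)
            if overlap > best_iou:
                best_iou, best_gt = overlap, i
        if best_gt is not None and best_iou >= thresh and best_gt not in matched:
            matched.add(best_gt)
            hits.append(True)
        else:
            # Too little overlap, or a duplicate of an already matched instance.
            hits.append(False)
    total, true_positives = 0.0, 0
    for rank, hit in enumerate(hits, start=1):
        if hit:
            true_positives += 1
            total += true_positives / rank  # precision at this recall step
    return total / len(gts) if gts else 0.0


def mean_ap_over_thresholds(preds, gts):
    """Average AP over overlaps 0.5, 0.55, ..., 0.95."""
    thresholds = [0.5 + 0.05 * i for i in range(10)]
    return sum(average_precision(preds, gts, t) for t in thresholds) / len(thresholds)


# Toy scene: two ground truth instances, three predictions.
gts = [{0, 1, 2, 3}, {4, 5, 6, 7}]
preds = [
    (0.9, {0, 1, 2, 3}),        # exact match of the first instance
    (0.8, {0, 1, 2}),           # duplicate of the first instance: false positive
    (0.7, {4, 5, 6, 7, 8, 9}),  # overlap 4/6 with the second instance
]

if __name__ == "__main__":
    print("AP 25%:", average_precision(preds, gts, 0.25))
    print("AP 50%:", average_precision(preds, gts, 0.5))
    print("AP:", mean_ap_over_thresholds(preds, gts))
```

In the toy scene the second prediction overlaps the first ground truth instance well, but that instance is already matched by a higher-confidence prediction, so it lowers precision instead of adding a second hit. The benchmark score would additionally average such per-class values over all classes.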



This table lists the benchmark results for the 3D semantic instance scenario.




Each cell shows the score followed by the method's rank in that column.

Method | Info | avg AP 25% | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | window
OneFormer3D | copyleft | 0.896 1 | 1.000 1 | 1.000 1 | 0.913 5 | 0.858 4 | 0.951 3 | 0.786 9 | 0.837 14 | 0.916 8 | 0.908 2 | 0.778 4 | 0.803 2 | 0.750 11 | 1.000 1 | 0.976 2 | 0.926 4 | 0.882 5 | 0.995 40 | 0.849 1
Maxim Kolodiazhnyi, Anna Vorontsova, Anton Konushin, Danila Rukhovich: OneFormer3D: One Transformer for Unified Point Cloud Segmentation.
UniPerception |  | 0.884 2 | 1.000 1 | 0.979 15 | 0.872 14 | 0.869 2 | 0.892 20 | 0.806 6 | 0.890 5 | 0.835 22 | 0.892 4 | 0.755 10 | 0.811 1 | 0.779 8 | 0.955 40 | 0.951 3 | 0.876 19 | 0.914 1 | 0.997 34 | 0.840 2
TST3D |  | 0.879 3 | 1.000 1 | 0.994 5 | 0.921 4 | 0.807 15 | 0.939 8 | 0.771 10 | 0.887 6 | 0.923 6 | 0.862 10 | 0.722 15 | 0.768 7 | 0.756 10 | 1.000 1 | 0.910 22 | 0.904 6 | 0.836 19 | 0.999 33 | 0.824 5
Spherical Mask(CtoF) |  | 0.875 4 | 1.000 1 | 0.991 10 | 0.873 13 | 0.850 5 | 0.946 5 | 0.691 19 | 0.752 28 | 0.926 4 | 0.889 6 | 0.759 8 | 0.794 4 | 0.820 2 | 1.000 1 | 0.912 13 | 0.900 8 | 0.878 9 | 1.000 1 | 0.769 15
TD3D | permissive | 0.875 4 | 1.000 1 | 0.976 18 | 0.877 11 | 0.783 21 | 0.970 1 | 0.889 1 | 0.828 15 | 0.945 3 | 0.803 15 | 0.713 17 | 0.720 17 | 0.709 14 | 1.000 1 | 0.936 9 | 0.934 3 | 0.873 12 | 1.000 1 | 0.791 12
Maksim Kolodiazhnyi, Anna Vorontsova, Anton Konushin, Danila Rukhovich: Top-Down Beats Bottom-Up in 3D Instance Segmentation. WACV 2024
Queryformer |  | 0.874 6 | 1.000 1 | 0.978 17 | 0.809 30 | 0.876 1 | 0.936 10 | 0.702 16 | 0.716 33 | 0.920 7 | 0.875 9 | 0.766 5 | 0.772 6 | 0.818 4 | 1.000 1 | 0.995 1 | 0.916 5 | 0.892 2 | 1.000 1 | 0.767 16
SoftGroup++ |  | 0.874 6 | 1.000 1 | 0.972 19 | 0.947 1 | 0.839 8 | 0.898 19 | 0.556 33 | 0.913 2 | 0.881 14 | 0.756 17 | 0.828 2 | 0.748 11 | 0.821 1 | 1.000 1 | 0.937 8 | 0.937 1 | 0.887 3 | 1.000 1 | 0.821 6
Mask3D |  | 0.870 8 | 1.000 1 | 0.985 12 | 0.782 38 | 0.818 13 | 0.938 9 | 0.760 11 | 0.749 29 | 0.923 5 | 0.877 8 | 0.760 7 | 0.785 5 | 0.820 2 | 1.000 1 | 0.912 13 | 0.864 30 | 0.878 9 | 0.983 46 | 0.825 4
Jonas Schult, Francis Engelmann, Alexander Hermans, Or Litany, Siyu Tang, Bastian Leibe: Mask3D for 3D Semantic Instance Segmentation. ICRA 2023
ExtMask3D |  | 0.867 9 | 1.000 1 | 1.000 1 | 0.756 45 | 0.816 14 | 0.940 7 | 0.795 7 | 0.760 27 | 0.862 16 | 0.888 7 | 0.739 12 | 0.763 8 | 0.774 9 | 1.000 1 | 0.929 11 | 0.878 18 | 0.879 7 | 1.000 1 | 0.819 8
SoftGroup | permissive | 0.865 10 | 1.000 1 | 0.969 20 | 0.860 16 | 0.860 3 | 0.913 14 | 0.558 30 | 0.899 3 | 0.911 9 | 0.760 16 | 0.828 1 | 0.736 13 | 0.802 6 | 0.981 37 | 0.919 12 | 0.875 20 | 0.877 11 | 1.000 1 | 0.820 7
Thang Vu, Kookhoi Kim, Tung M. Luu, Xuan Thanh Nguyen, Chang D. Yoo: SoftGroup for 3D Instance Segmentation on Point Clouds. CVPR 2022 [Oral]
MAFT |  | 0.860 11 | 1.000 1 | 0.990 11 | 0.810 29 | 0.829 9 | 0.949 4 | 0.809 5 | 0.688 40 | 0.836 21 | 0.904 3 | 0.751 11 | 0.796 3 | 0.741 12 | 1.000 1 | 0.864 32 | 0.848 37 | 0.837 17 | 1.000 1 | 0.828 3
IPCA-Inst |  | 0.851 12 | 1.000 1 | 0.968 21 | 0.884 10 | 0.842 7 | 0.862 32 | 0.693 18 | 0.812 20 | 0.888 13 | 0.677 29 | 0.783 3 | 0.698 18 | 0.807 5 | 1.000 1 | 0.911 19 | 0.865 29 | 0.865 14 | 1.000 1 | 0.757 19
SPFormer | permissive | 0.851 12 | 1.000 1 | 0.994 6 | 0.806 31 | 0.774 23 | 0.942 6 | 0.637 22 | 0.849 12 | 0.859 18 | 0.889 5 | 0.720 16 | 0.730 15 | 0.665 20 | 1.000 1 | 0.911 19 | 0.868 28 | 0.873 13 | 1.000 1 | 0.796 10
Sun Jiahao, Qing Chunmei, Tan Junpeng, Xu Xiangmin: Superpoint Transformer for 3D Scene Instance Segmentation. AAAI 2023 [Oral]
Mask3D_evaluation |  | 0.843 14 | 1.000 1 | 0.955 26 | 0.847 18 | 0.795 17 | 0.932 11 | 0.750 13 | 0.780 25 | 0.891 11 | 0.818 12 | 0.737 13 | 0.633 27 | 0.703 15 | 1.000 1 | 0.902 24 | 0.870 24 | 0.820 20 | 0.941 54 | 0.805 9
SIM3D |  | 0.842 15 | 1.000 1 | 0.998 3 | 0.608 58 | 0.717 42 | 0.908 15 | 0.818 4 | 0.699 37 | 0.798 29 | 0.908 1 | 0.760 6 | 0.733 14 | 0.793 7 | 1.000 1 | 0.912 13 | 0.831 42 | 0.883 4 | 1.000 1 | 0.792 11
ISBNet | permissive | 0.835 16 | 1.000 1 | 0.950 27 | 0.731 47 | 0.819 11 | 0.918 12 | 0.790 8 | 0.740 30 | 0.851 20 | 0.831 11 | 0.661 25 | 0.742 12 | 0.650 23 | 1.000 1 | 0.937 7 | 0.814 50 | 0.836 18 | 1.000 1 | 0.765 17
Tuan Duc Ngo, Binh-Son Hua, Khoi Nguyen: ISBNet: a 3D Point Cloud Instance Segmentation Network with Instance-aware Sampling and Box-aware Dynamic Convolution. CVPR 2023
SphereSeg |  | 0.835 16 | 1.000 1 | 0.963 24 | 0.891 8 | 0.794 18 | 0.954 2 | 0.822 3 | 0.710 34 | 0.961 2 | 0.721 21 | 0.693 23 | 0.530 40 | 0.653 22 | 1.000 1 | 0.867 31 | 0.857 33 | 0.859 15 | 0.991 43 | 0.771 14
GraphCut |  | 0.832 18 | 1.000 1 | 0.922 41 | 0.724 49 | 0.798 16 | 0.902 18 | 0.701 17 | 0.856 10 | 0.859 17 | 0.715 22 | 0.706 18 | 0.748 10 | 0.640 34 | 1.000 1 | 0.934 10 | 0.862 31 | 0.880 6 | 1.000 1 | 0.729 22
TopoSeg |  | 0.832 18 | 1.000 1 | 0.981 14 | 0.933 2 | 0.819 12 | 0.826 41 | 0.524 39 | 0.841 13 | 0.811 26 | 0.681 28 | 0.759 9 | 0.687 19 | 0.727 13 | 0.981 37 | 0.911 19 | 0.883 14 | 0.853 16 | 1.000 1 | 0.756 20
PBNet | permissive | 0.825 20 | 1.000 1 | 0.963 23 | 0.837 21 | 0.843 6 | 0.865 27 | 0.822 2 | 0.647 43 | 0.878 15 | 0.733 19 | 0.639 32 | 0.683 20 | 0.650 23 | 1.000 1 | 0.853 33 | 0.870 25 | 0.820 21 | 1.000 1 | 0.744 21
W. Zhao, Y. Yan, C. Yang, J. Ye, X. Yang, K. Huang: Divide and Conquer: 3D Instance Segmentation With Point-Wise Binarization. ICCV 2023
SSEC |  | 0.820 21 | 1.000 1 | 0.983 13 | 0.924 3 | 0.826 10 | 0.817 44 | 0.415 48 | 0.899 4 | 0.793 31 | 0.673 30 | 0.731 14 | 0.636 25 | 0.653 21 | 1.000 1 | 0.939 6 | 0.804 52 | 0.878 8 | 1.000 1 | 0.780 13
DKNet |  | 0.815 22 | 1.000 1 | 0.930 33 | 0.844 19 | 0.765 27 | 0.915 13 | 0.534 37 | 0.805 22 | 0.805 28 | 0.807 14 | 0.654 26 | 0.763 9 | 0.650 23 | 1.000 1 | 0.794 45 | 0.881 15 | 0.766 25 | 1.000 1 | 0.758 18
Yizheng Wu, Min Shi, Shuaiyuan Du, Hao Lu, Zhiguo Cao, Weicai Zhong: 3D Instances as 1D Kernels. ECCV 2022
RPGN |  | 0.806 23 | 1.000 1 | 0.992 8 | 0.789 33 | 0.723 40 | 0.891 21 | 0.650 21 | 0.810 21 | 0.832 23 | 0.665 32 | 0.699 21 | 0.658 21 | 0.700 16 | 1.000 1 | 0.881 26 | 0.832 41 | 0.774 23 | 0.997 34 | 0.613 42
Shichao Dong, Guosheng Lin, Tzu-Yi Hung: Learning Regional Purity for Instance Segmentation on 3D Point Clouds. ECCV 2022
HAIS | permissive | 0.803 24 | 1.000 1 | 0.994 6 | 0.820 25 | 0.759 28 | 0.855 33 | 0.554 34 | 0.882 7 | 0.827 25 | 0.615 38 | 0.676 24 | 0.638 24 | 0.646 32 | 1.000 1 | 0.912 13 | 0.797 55 | 0.767 24 | 0.994 41 | 0.726 23
Shaoyu Chen, Jiemin Fang, Qian Zhang, Wenyu Liu, Xinggang Wang: Hierarchical Aggregation for 3D Instance Segmentation. ICCV 2021
Box2Mask |  | 0.803 24 | 1.000 1 | 0.962 25 | 0.874 12 | 0.707 45 | 0.887 24 | 0.686 20 | 0.598 48 | 0.961 1 | 0.715 23 | 0.694 22 | 0.469 45 | 0.700 16 | 1.000 1 | 0.912 13 | 0.902 7 | 0.753 30 | 0.997 34 | 0.637 36
Julian Chibane, Francis Engelmann, Tuan Anh Tran, Gerard Pons-Moll: Box2Mask: Weakly Supervised 3D Semantic Instance Segmentation Using Bounding Boxes. ECCV 2022
Mask-Group |  | 0.792 26 | 1.000 1 | 0.968 22 | 0.812 26 | 0.766 26 | 0.864 28 | 0.460 42 | 0.815 19 | 0.888 12 | 0.598 42 | 0.651 29 | 0.639 23 | 0.600 40 | 0.918 43 | 0.941 4 | 0.896 10 | 0.721 37 | 1.000 1 | 0.723 24
Min Zhong, Xinghao Chen, Xiaokang Chen, Gang Zeng, Yunhe Wang: MaskGroup: Hierarchical Point Grouping and Masking for 3D Instance Segmentation. ICME 2022
CSC-Pretrained |  | 0.791 27 | 1.000 1 | 0.996 4 | 0.829 24 | 0.767 25 | 0.889 23 | 0.600 25 | 0.819 18 | 0.770 36 | 0.594 43 | 0.620 36 | 0.541 37 | 0.700 16 | 1.000 1 | 0.941 4 | 0.889 12 | 0.763 26 | 1.000 1 | 0.526 52
SSTNet | permissive | 0.789 28 | 1.000 1 | 0.840 55 | 0.888 9 | 0.717 41 | 0.835 37 | 0.717 15 | 0.684 41 | 0.627 51 | 0.724 20 | 0.652 28 | 0.727 16 | 0.600 40 | 1.000 1 | 0.912 13 | 0.822 45 | 0.757 29 | 1.000 1 | 0.691 30
Zhihao Liang, Zhihao Li, Songcen Xu, Mingkui Tan, Kui Jia: Instance Segmentation in 3D Scenes using Semantic Superpoint Tree Networks. ICCV 2021
GICN |  | 0.788 29 | 1.000 1 | 0.978 16 | 0.867 15 | 0.781 22 | 0.833 38 | 0.527 38 | 0.824 16 | 0.806 27 | 0.549 51 | 0.596 39 | 0.551 33 | 0.700 16 | 1.000 1 | 0.853 33 | 0.935 2 | 0.733 34 | 1.000 1 | 0.651 33
DANCENET |  | 0.786 30 | 1.000 1 | 0.936 30 | 0.783 36 | 0.737 37 | 0.852 35 | 0.742 14 | 0.647 43 | 0.765 38 | 0.811 13 | 0.624 35 | 0.579 30 | 0.632 37 | 1.000 1 | 0.909 23 | 0.898 9 | 0.696 42 | 0.944 50 | 0.601 45
DENet |  | 0.786 30 | 1.000 1 | 0.929 34 | 0.736 46 | 0.750 34 | 0.720 57 | 0.755 12 | 0.934 1 | 0.794 30 | 0.590 44 | 0.561 45 | 0.537 38 | 0.650 23 | 1.000 1 | 0.882 25 | 0.804 53 | 0.789 22 | 1.000 1 | 0.719 25
DualGroup |  | 0.782 32 | 1.000 1 | 0.927 35 | 0.811 27 | 0.772 24 | 0.853 34 | 0.631 24 | 0.805 22 | 0.773 33 | 0.613 39 | 0.611 37 | 0.610 28 | 0.650 23 | 0.835 54 | 0.881 26 | 0.879 17 | 0.750 32 | 1.000 1 | 0.675 31
PointGroup |  | 0.778 33 | 1.000 1 | 0.900 45 | 0.798 32 | 0.715 43 | 0.863 29 | 0.493 40 | 0.706 35 | 0.895 10 | 0.569 49 | 0.701 19 | 0.576 31 | 0.639 35 | 1.000 1 | 0.880 28 | 0.851 35 | 0.719 38 | 0.997 34 | 0.709 27
Li Jiang, Hengshuang Zhao, Shaoshuai Shi, Shu Liu, Chi-Wing Fu, Jiaya Jia: PointGroup: Dual-Set Point Grouping for 3D Instance Segmentation. CVPR 2020 [oral]
PE |  | 0.776 34 | 1.000 1 | 0.900 46 | 0.860 16 | 0.728 39 | 0.869 25 | 0.400 49 | 0.857 9 | 0.774 32 | 0.568 50 | 0.701 20 | 0.602 29 | 0.646 32 | 0.933 42 | 0.843 36 | 0.890 11 | 0.691 46 | 0.997 34 | 0.709 26
Biao Zhang, Peter Wonka: Point Cloud Instance Segmentation using Probabilistic Embeddings. CVPR 2021
AOIA |  | 0.767 35 | 1.000 1 | 0.937 29 | 0.810 28 | 0.740 36 | 0.906 16 | 0.550 35 | 0.800 24 | 0.706 43 | 0.577 48 | 0.624 34 | 0.544 36 | 0.596 45 | 0.857 46 | 0.879 30 | 0.880 16 | 0.750 31 | 0.992 42 | 0.658 32
DD-UNet+Group |  | 0.764 36 | 1.000 1 | 0.897 48 | 0.837 20 | 0.753 31 | 0.830 40 | 0.459 44 | 0.824 16 | 0.699 45 | 0.629 36 | 0.653 27 | 0.438 48 | 0.650 23 | 1.000 1 | 0.880 28 | 0.858 32 | 0.690 47 | 1.000 1 | 0.650 34
H. Liu, R. Liu, K. Yang, J. Zhang, K. Peng, R. Stiefelhagen: HIDA: Towards Holistic Indoor Understanding for the Visually Impaired via Semantic Instance Segmentation with a Wearable Solid-State LiDAR Sensor. ICCVW 2021
INS-Conv-instance |  | 0.762 37 | 1.000 1 | 0.923 38 | 0.765 41 | 0.785 20 | 0.905 17 | 0.600 25 | 0.655 42 | 0.646 50 | 0.683 27 | 0.647 30 | 0.530 39 | 0.650 23 | 1.000 1 | 0.824 38 | 0.830 43 | 0.693 45 | 0.944 50 | 0.644 35
Dyco3D | copyleft | 0.761 38 | 1.000 1 | 0.935 31 | 0.893 7 | 0.752 33 | 0.863 30 | 0.600 25 | 0.588 49 | 0.742 40 | 0.641 34 | 0.633 33 | 0.546 35 | 0.550 47 | 0.857 46 | 0.789 47 | 0.853 34 | 0.762 27 | 0.987 44 | 0.699 28
Tong He, Chunhua Shen, Anton van den Hengel: DyCo3D: Robust Instance Segmentation of 3D Point Clouds through Dynamic Convolution. CVPR 2021
OccuSeg+instance |  | 0.742 39 | 1.000 1 | 0.923 38 | 0.785 34 | 0.745 35 | 0.867 26 | 0.557 31 | 0.578 52 | 0.729 41 | 0.670 31 | 0.644 31 | 0.488 43 | 0.577 46 | 1.000 1 | 0.794 45 | 0.830 43 | 0.620 55 | 1.000 1 | 0.550 48
Lei Han, Tian Zheng, Lan Xu, Lu Fang: OccuSeg: Occupancy-aware 3D Instance Segmentation. CVPR 2020
RWSeg |  | 0.739 40 | 1.000 1 | 0.899 47 | 0.759 43 | 0.753 32 | 0.823 42 | 0.282 54 | 0.691 39 | 0.658 48 | 0.582 47 | 0.594 40 | 0.547 34 | 0.628 38 | 1.000 1 | 0.795 44 | 0.868 27 | 0.728 36 | 1.000 1 | 0.692 29
3D-MPA |  | 0.737 41 | 1.000 1 | 0.933 32 | 0.785 34 | 0.794 19 | 0.831 39 | 0.279 56 | 0.588 49 | 0.695 46 | 0.616 37 | 0.559 46 | 0.556 32 | 0.650 23 | 1.000 1 | 0.809 42 | 0.875 21 | 0.696 43 | 1.000 1 | 0.608 44
Francis Engelmann, Martin Bokeloh, Alireza Fathi, Bastian Leibe, Matthias Nießner: 3D-MPA: Multi Proposal Aggregation for 3D Semantic Instance Segmentation. CVPR 2020
MTML |  | 0.731 42 | 1.000 1 | 0.992 8 | 0.779 40 | 0.609 54 | 0.746 52 | 0.308 53 | 0.867 8 | 0.601 54 | 0.607 40 | 0.539 49 | 0.519 41 | 0.550 47 | 1.000 1 | 0.824 38 | 0.869 26 | 0.729 35 | 1.000 1 | 0.616 40
Jean Lahoud, Bernard Ghanem, Marc Pollefeys, Martin R. Oswald: 3D Instance Segmentation via Multi-task Metric Learning. ICCV 2019 [oral]
OSIS |  | 0.725 43 | 1.000 1 | 0.885 51 | 0.653 55 | 0.657 51 | 0.801 45 | 0.576 29 | 0.695 38 | 0.828 24 | 0.698 25 | 0.534 50 | 0.457 47 | 0.500 54 | 0.857 46 | 0.831 37 | 0.841 39 | 0.627 53 | 1.000 1 | 0.619 39
SSEN |  | 0.724 44 | 1.000 1 | 0.926 36 | 0.781 39 | 0.661 49 | 0.845 36 | 0.596 28 | 0.529 55 | 0.764 39 | 0.653 33 | 0.489 56 | 0.461 46 | 0.500 54 | 0.859 45 | 0.765 48 | 0.872 23 | 0.761 28 | 1.000 1 | 0.577 46
Dongsu Zhang, Junha Chun, Sang Kyun Cha, Young Min Kim: Spatial Semantic Embedding Network: Fast 3D Instance Segmentation with Deep Metric Learning. arXiv
NeuralBF |  | 0.718 45 | 1.000 1 | 0.945 28 | 0.901 6 | 0.754 30 | 0.817 43 | 0.460 42 | 0.700 36 | 0.772 34 | 0.688 26 | 0.568 44 | 0.000 67 | 0.500 54 | 0.981 37 | 0.606 58 | 0.872 22 | 0.740 33 | 1.000 1 | 0.614 41
Weiwei Sun, Daniel Rebain, Renjie Liao, Vladimir Tankovich, Soroosh Yazdani, Kwang Moo Yi, Andrea Tagliasacchi: NeuralBF: Neural Bilateral Filtering for Top-down Instance Segmentation on Point Clouds. WACV 2023
Sparse R-CNN |  | 0.714 46 | 1.000 1 | 0.926 37 | 0.694 50 | 0.699 47 | 0.890 22 | 0.636 23 | 0.516 56 | 0.693 47 | 0.743 18 | 0.588 41 | 0.369 52 | 0.601 39 | 0.594 60 | 0.800 43 | 0.886 13 | 0.676 48 | 0.986 45 | 0.546 49
SALoss-ResNet |  | 0.695 47 | 1.000 1 | 0.855 53 | 0.579 61 | 0.589 56 | 0.735 55 | 0.484 41 | 0.588 49 | 0.856 19 | 0.634 35 | 0.571 43 | 0.298 53 | 0.500 54 | 1.000 1 | 0.824 38 | 0.818 46 | 0.702 41 | 0.935 57 | 0.545 50
Zhidong Liang, Ming Yang, Hao Li, Chunxiang Wang: 3D Instance Embedding Learning With a Structure-Aware Loss Function for Point Cloud Segmentation. IEEE Robotics and Automation Letters (IROS 2020)
PanopticFusion-inst |  | 0.693 48 | 1.000 1 | 0.852 54 | 0.655 54 | 0.616 53 | 0.788 47 | 0.334 51 | 0.763 26 | 0.771 35 | 0.457 61 | 0.555 47 | 0.652 22 | 0.518 51 | 0.857 46 | 0.765 48 | 0.732 61 | 0.631 51 | 0.944 50 | 0.577 47
Gaku Narita, Takashi Seno, Tomoya Ishikawa, Yohsuke Kaji: PanopticFusion: Online Volumetric Semantic Mapping at the Level of Stuff and Things. IROS 2019 (to appear)
Occipital-SCS |  | 0.688 49 | 1.000 1 | 0.913 42 | 0.730 48 | 0.737 38 | 0.743 54 | 0.442 45 | 0.855 11 | 0.655 49 | 0.546 52 | 0.546 48 | 0.263 55 | 0.508 53 | 0.889 44 | 0.568 59 | 0.771 58 | 0.705 40 | 0.889 60 | 0.625 38
3D-BoNet |  | 0.687 50 | 1.000 1 | 0.887 50 | 0.836 22 | 0.587 57 | 0.643 64 | 0.550 35 | 0.620 45 | 0.724 42 | 0.522 56 | 0.501 54 | 0.243 56 | 0.512 52 | 1.000 1 | 0.751 50 | 0.807 51 | 0.661 50 | 0.909 59 | 0.612 43
Bo Yang, Jianan Wang, Ronald Clark, Qingyong Hu, Sen Wang, Andrew Markham, Niki Trigoni: Learning Object Bounding Boxes for 3D Instance Segmentation on Point Clouds. NeurIPS 2019 Spotlight
ClickSeg_Instance |  | 0.685 51 | 1.000 1 | 0.818 57 | 0.600 59 | 0.715 44 | 0.795 46 | 0.557 31 | 0.533 54 | 0.591 56 | 0.601 41 | 0.519 52 | 0.429 50 | 0.638 36 | 0.938 41 | 0.706 53 | 0.817 48 | 0.624 54 | 0.944 50 | 0.502 54
PCJC |  | 0.684 52 | 1.000 1 | 0.895 49 | 0.757 44 | 0.659 50 | 0.862 31 | 0.189 63 | 0.739 31 | 0.606 53 | 0.712 24 | 0.581 42 | 0.515 42 | 0.650 23 | 0.857 46 | 0.357 64 | 0.785 56 | 0.631 52 | 0.889 60 | 0.635 37
SPG_WSIS |  | 0.678 53 | 1.000 1 | 0.880 52 | 0.836 22 | 0.701 46 | 0.727 56 | 0.273 58 | 0.607 47 | 0.706 44 | 0.541 54 | 0.515 53 | 0.174 59 | 0.600 40 | 0.857 46 | 0.716 52 | 0.846 38 | 0.711 39 | 1.000 1 | 0.506 53
One_Thing_One_Click | permissive | 0.675 54 | 1.000 1 | 0.823 56 | 0.782 37 | 0.621 52 | 0.766 49 | 0.211 60 | 0.736 32 | 0.560 58 | 0.586 45 | 0.522 51 | 0.636 26 | 0.453 58 | 0.641 58 | 0.853 33 | 0.850 36 | 0.694 44 | 0.997 34 | 0.411 59
Zhengzhe Liu, Xiaojuan Qi, Chi-Wing Fu: One Thing One Click: A Self-Training Approach for Weakly Supervised 3D Semantic Segmentation. CVPR 2021
SegGroup_ins | permissive | 0.637 55 | 1.000 1 | 0.923 40 | 0.593 60 | 0.561 58 | 0.746 53 | 0.143 65 | 0.504 57 | 0.766 37 | 0.485 59 | 0.442 57 | 0.372 51 | 0.530 50 | 0.714 55 | 0.815 41 | 0.775 57 | 0.673 49 | 1.000 1 | 0.431 58
An Tao, Yueqi Duan, Yi Wei, Jiwen Lu, Jie Zhou: SegGroup: Seg-Level Supervision for 3D Instance and Semantic Segmentation. TIP 2022
MASC | permissive | 0.615 56 | 0.711 63 | 0.802 58 | 0.540 62 | 0.757 29 | 0.777 48 | 0.029 66 | 0.577 53 | 0.588 57 | 0.521 57 | 0.600 38 | 0.436 49 | 0.534 49 | 0.697 56 | 0.616 57 | 0.838 40 | 0.526 57 | 0.980 47 | 0.534 51
Chen Liu, Yasutaka Furukawa: MASC: Multi-scale Affinity with Sparse Convolution for 3D Instance Segmentation.
UNet-backbone |  | 0.605 57 | 1.000 1 | 0.909 43 | 0.764 42 | 0.603 55 | 0.704 58 | 0.415 47 | 0.301 62 | 0.548 59 | 0.461 60 | 0.394 58 | 0.267 54 | 0.386 60 | 0.857 46 | 0.649 56 | 0.817 47 | 0.504 59 | 0.959 48 | 0.356 62
3D-SIS | permissive | 0.558 58 | 1.000 1 | 0.773 59 | 0.614 57 | 0.503 61 | 0.691 60 | 0.200 61 | 0.412 58 | 0.498 62 | 0.546 53 | 0.311 63 | 0.103 63 | 0.600 40 | 0.857 46 | 0.382 61 | 0.799 54 | 0.445 65 | 0.938 56 | 0.371 60
Ji Hou, Angela Dai, Matthias Niessner: 3D-SIS: 3D Semantic Instance Segmentation of RGB-D Scans. CVPR 2019
R-PointNet |  | 0.544 59 | 0.500 66 | 0.655 65 | 0.661 53 | 0.663 48 | 0.765 50 | 0.432 46 | 0.214 65 | 0.612 52 | 0.584 46 | 0.499 55 | 0.204 58 | 0.286 64 | 0.429 63 | 0.655 55 | 0.650 66 | 0.539 56 | 0.950 49 | 0.499 55
Hier3D | copyleft | 0.540 60 | 1.000 1 | 0.727 60 | 0.626 56 | 0.467 64 | 0.693 59 | 0.200 61 | 0.412 58 | 0.480 63 | 0.528 55 | 0.318 62 | 0.077 66 | 0.600 40 | 0.688 57 | 0.382 61 | 0.768 59 | 0.472 61 | 0.941 54 | 0.350 63
Tan: HCFS3D: Hierarchical Coupled Feature Selection Network for 3D Semantic and Instance Segmentation.
Region-18class |  | 0.497 61 | 0.250 68 | 0.902 44 | 0.689 51 | 0.540 59 | 0.747 51 | 0.276 57 | 0.610 46 | 0.268 67 | 0.489 58 | 0.348 59 | 0.000 67 | 0.243 67 | 0.220 66 | 0.663 54 | 0.814 49 | 0.459 63 | 0.928 58 | 0.496 56
Sem_Recon_ins |  | 0.484 62 | 0.764 62 | 0.608 67 | 0.470 64 | 0.521 60 | 0.637 65 | 0.311 52 | 0.218 64 | 0.348 66 | 0.365 65 | 0.223 64 | 0.222 57 | 0.258 65 | 0.629 59 | 0.734 51 | 0.596 67 | 0.509 58 | 0.858 63 | 0.444 57
tmp |  | 0.474 63 | 1.000 1 | 0.727 60 | 0.433 66 | 0.481 63 | 0.673 62 | 0.022 68 | 0.380 60 | 0.517 61 | 0.436 63 | 0.338 61 | 0.128 61 | 0.343 62 | 0.429 63 | 0.291 66 | 0.728 62 | 0.473 60 | 0.833 64 | 0.300 65
SemRegionNet-20cls |  | 0.470 64 | 1.000 1 | 0.727 60 | 0.447 65 | 0.481 62 | 0.678 61 | 0.024 67 | 0.380 60 | 0.518 60 | 0.440 62 | 0.339 60 | 0.128 61 | 0.350 61 | 0.429 63 | 0.212 67 | 0.711 63 | 0.465 62 | 0.833 64 | 0.290 66
ASIS |  | 0.422 65 | 0.333 67 | 0.707 63 | 0.676 52 | 0.401 65 | 0.650 63 | 0.350 50 | 0.177 66 | 0.594 55 | 0.376 64 | 0.202 65 | 0.077 65 | 0.404 59 | 0.571 61 | 0.197 68 | 0.674 65 | 0.447 64 | 0.500 67 | 0.260 67
3D-BEVIS |  | 0.401 66 | 0.667 64 | 0.687 64 | 0.419 67 | 0.137 68 | 0.587 66 | 0.188 64 | 0.235 63 | 0.359 65 | 0.211 67 | 0.093 68 | 0.080 64 | 0.311 63 | 0.571 61 | 0.382 61 | 0.754 60 | 0.300 67 | 0.874 62 | 0.357 61
Cathrin Elich, Francis Engelmann, Jonas Schult, Theodora Kontogianni, Bastian Leibe: 3D-BEVIS: Birds-Eye-View Instance Segmentation.
Sgpn_scannet |  | 0.390 67 | 0.556 65 | 0.636 66 | 0.493 63 | 0.353 66 | 0.539 67 | 0.271 59 | 0.160 67 | 0.450 64 | 0.359 66 | 0.178 66 | 0.146 60 | 0.250 66 | 0.143 67 | 0.347 65 | 0.698 64 | 0.436 66 | 0.667 66 | 0.331 64
MaskRCNN 2d->3d Proj |  | 0.261 68 | 0.903 61 | 0.081 68 | 0.008 68 | 0.233 67 | 0.175 68 | 0.280 55 | 0.106 68 | 0.150 68 | 0.203 68 | 0.175 67 | 0.480 44 | 0.218 68 | 0.143 67 | 0.542 60 | 0.404 68 | 0.153 68 | 0.393 68 | 0.049 68