The 3D semantic instance prediction task involves detecting and segmenting the objects in a 3D scan mesh.

Evaluation and metrics

Our evaluation ranks all methods according to the average precision for each class. We report the mean average precision (AP) at an overlap of 0.25 (AP 25%), at an overlap of 0.5 (AP 50%), and averaged over overlaps from 0.5 to 0.95 in steps of 0.05 (AP). Note that multiple predictions of the same ground truth instance are penalized as false positives.
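The metric described above can be illustrated with a minimal Python sketch. It is not the benchmark's official evaluation script: the function names, the representation of an instance as a set of vertex indices, the use of intersection-over-union as the overlap measure, the `>=` threshold comparison, and the non-interpolated precision-recall sum are all assumptions made for illustration. It shows how a second prediction of an already matched ground truth instance is counted as a false positive, and how AP is averaged over the overlap thresholds 0.5 to 0.95.

```python
from typing import List, Set, Tuple

Prediction = Tuple[float, Set[int]]  # (confidence, vertex indices); hypothetical layout


def overlap(a: Set[int], b: Set[int]) -> float:
    """Intersection over union of two vertex-index sets (assumed overlap measure)."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def average_precision(preds: List[Prediction], gts: List[Set[int]], thr: float) -> float:
    """AP for one class at one overlap threshold (non-interpolated sketch)."""
    if not gts:
        return float("nan")
    matched = [False] * len(gts)
    hits = []  # 1 = true positive, 0 = false positive, in descending confidence order
    for _, mask in sorted(preds, key=lambda p: p[0], reverse=True):
        # Find the ground truth instance with the highest overlap.
        best_j, best_o = -1, 0.0
        for j, gt in enumerate(gts):
            o = overlap(mask, gt)
            if o > best_o:
                best_j, best_o = j, o
        if best_j >= 0 and best_o >= thr and not matched[best_j]:
            matched[best_j] = True
            hits.append(1)
        else:
            # Too little overlap, or a duplicate of an already matched instance.
            hits.append(0)
    # Sum precision at every true positive, weighted by the recall step 1/len(gts).
    ap, tp = 0.0, 0
    for k, hit in enumerate(hits, start=1):
        if hit:
            tp += 1
            ap += (tp / k) / len(gts)
    return ap


def ap_over_thresholds(preds: List[Prediction], gts: List[Set[int]]) -> float:
    """Mean AP over overlaps 0.5, 0.55, ..., 0.95."""
    thresholds = [round(0.5 + 0.05 * i, 2) for i in range(10)]
    return sum(average_precision(preds, gts, t) for t in thresholds) / len(thresholds)


# Toy scene: two ground truth instances of one class, three predictions.
gts = [{0, 1, 2, 3}, {10, 11, 12, 13}]
preds = [
    (0.9, {0, 1, 2, 3}),                # exact match of the first instance
    (0.8, {0, 1, 2}),                   # duplicate of the first instance
    (0.7, {10, 11, 12, 13, 14, 15}),    # overlap 4/6 with the second instance
]
ap25 = average_precision(preds, gts, 0.25)
ap50 = average_precision(preds, gts, 0.5)
ap = ap_over_thresholds(preds, gts)
```

In this toy scene the duplicate prediction lowers precision at the second true positive, so AP 25% and AP 50% both fall below 1.0 even though every ground truth instance is found. The per-class values would then be averaged over classes to obtain the reported mean.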



This table lists the benchmark results for the 3D semantic instance scenario.




Each score is followed by the method's rank in that column. The table is sorted by avg AP 25%.

Method | Info | avg AP 25% | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | window
OneFormer3D | copyleft | 0.896 (1) | 1.000 (1) | 1.000 (1) | 0.913 (4) | 0.858 (4) | 0.951 (3) | 0.786 (9) | 0.837 (13) | 0.916 (7) | 0.908 (2) | 0.778 (4) | 0.803 (2) | 0.750 (10) | 1.000 (1) | 0.976 (2) | 0.926 (4) | 0.882 (5) | 0.995 (39) | 0.849 (1)
Maxim Kolodiazhnyi, Anna Vorontsova, Anton Konushin, Danila Rukhovich: OneFormer3D: One Transformer for Unified Point Cloud Segmentation.
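UniPerception | - | 0.884 (2) | 1.000 (1) | 0.979 (14) | 0.872 (13) | 0.869 (2) | 0.892 (19) | 0.806 (6) | 0.890 (5) | 0.835 (21) | 0.892 (4) | 0.755 (10) | 0.811 (1) | 0.779 (8) | 0.955 (39) | 0.951 (3) | 0.876 (18) | 0.914 (1) | 0.997 (33) | 0.840 (2)
Spherical Mask(CtoF) | - | 0.875 (3) | 1.000 (1) | 0.991 (9) | 0.873 (12) | 0.850 (5) | 0.946 (5) | 0.691 (18) | 0.752 (27) | 0.926 (4) | 0.889 (6) | 0.759 (8) | 0.794 (4) | 0.820 (2) | 1.000 (1) | 0.912 (13) | 0.900 (7) | 0.878 (9) | 1.000 (1) | 0.769 (14)
TD3D | permissive | 0.875 (3) | 1.000 (1) | 0.976 (17) | 0.877 (10) | 0.783 (20) | 0.970 (1) | 0.889 (1) | 0.828 (14) | 0.945 (3) | 0.803 (14) | 0.713 (16) | 0.720 (16) | 0.709 (13) | 1.000 (1) | 0.936 (9) | 0.934 (3) | 0.873 (12) | 1.000 (1) | 0.791 (11)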
Maksim Kolodiazhnyi, Anna Vorontsova, Anton Konushin, Danila Rukhovich: Top-Down Beats Bottom-Up in 3D Instance Segmentation. WACV 2024
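Queryformer | - | 0.874 (5) | 1.000 (1) | 0.978 (16) | 0.809 (29) | 0.876 (1) | 0.936 (9) | 0.702 (15) | 0.716 (32) | 0.920 (6) | 0.875 (9) | 0.766 (5) | 0.772 (6) | 0.818 (4) | 1.000 (1) | 0.995 (1) | 0.916 (5) | 0.892 (2) | 1.000 (1) | 0.767 (15)
SoftGroup++ | - | 0.874 (5) | 1.000 (1) | 0.972 (18) | 0.947 (1) | 0.839 (8) | 0.898 (18) | 0.556 (32) | 0.913 (2) | 0.881 (13) | 0.756 (16) | 0.828 (2) | 0.748 (10) | 0.821 (1) | 1.000 (1) | 0.937 (8) | 0.937 (1) | 0.887 (3) | 1.000 (1) | 0.821 (5)
Mask3D | - | 0.870 (7) | 1.000 (1) | 0.985 (11) | 0.782 (37) | 0.818 (13) | 0.938 (8) | 0.760 (10) | 0.749 (28) | 0.923 (5) | 0.877 (8) | 0.760 (7) | 0.785 (5) | 0.820 (2) | 1.000 (1) | 0.912 (13) | 0.864 (29) | 0.878 (9) | 0.983 (45) | 0.825 (4)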
Jonas Schult, Francis Engelmann, Alexander Hermans, Or Litany, Siyu Tang, Bastian Leibe: Mask3D for 3D Semantic Instance Segmentation. ICRA 2023
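ExtMask3D | - | 0.867 (8) | 1.000 (1) | 1.000 (1) | 0.756 (44) | 0.816 (14) | 0.940 (7) | 0.795 (7) | 0.760 (26) | 0.862 (15) | 0.888 (7) | 0.739 (12) | 0.763 (7) | 0.774 (9) | 1.000 (1) | 0.929 (11) | 0.878 (17) | 0.879 (7) | 1.000 (1) | 0.819 (7)
SoftGroup | permissive | 0.865 (9) | 1.000 (1) | 0.969 (19) | 0.860 (15) | 0.860 (3) | 0.913 (13) | 0.558 (29) | 0.899 (3) | 0.911 (8) | 0.760 (15) | 0.828 (1) | 0.736 (12) | 0.802 (6) | 0.981 (36) | 0.919 (12) | 0.875 (19) | 0.877 (11) | 1.000 (1) | 0.820 (6)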
Thang Vu, Kookhoi Kim, Tung M. Luu, Xuan Thanh Nguyen, Chang D. Yoo: SoftGroup for 3D Instance Segmentation on Point Clouds. CVPR 2022 [Oral]
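MAFT | - | 0.860 (10) | 1.000 (1) | 0.990 (10) | 0.810 (28) | 0.829 (9) | 0.949 (4) | 0.809 (5) | 0.688 (39) | 0.836 (20) | 0.904 (3) | 0.751 (11) | 0.796 (3) | 0.741 (11) | 1.000 (1) | 0.864 (31) | 0.848 (36) | 0.837 (17) | 1.000 (1) | 0.828 (3)
IPCA-Inst | - | 0.851 (11) | 1.000 (1) | 0.968 (20) | 0.884 (9) | 0.842 (7) | 0.862 (31) | 0.693 (17) | 0.812 (19) | 0.888 (12) | 0.677 (28) | 0.783 (3) | 0.698 (17) | 0.807 (5) | 1.000 (1) | 0.911 (19) | 0.865 (28) | 0.865 (14) | 1.000 (1) | 0.757 (18)
SPFormer | permissive | 0.851 (11) | 1.000 (1) | 0.994 (5) | 0.806 (30) | 0.774 (22) | 0.942 (6) | 0.637 (21) | 0.849 (11) | 0.859 (17) | 0.889 (5) | 0.720 (15) | 0.730 (14) | 0.665 (19) | 1.000 (1) | 0.911 (19) | 0.868 (27) | 0.873 (13) | 1.000 (1) | 0.796 (9)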
Sun Jiahao, Qing Chunmei, Tan Junpeng, Xu Xiangmin: Superpoint Transformer for 3D Scene Instance Segmentation. AAAI 2023 [Oral]
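Mask3D_evaluation | - | 0.843 (13) | 1.000 (1) | 0.955 (25) | 0.847 (17) | 0.795 (16) | 0.932 (10) | 0.750 (12) | 0.780 (24) | 0.891 (10) | 0.818 (11) | 0.737 (13) | 0.633 (26) | 0.703 (14) | 1.000 (1) | 0.902 (23) | 0.870 (23) | 0.820 (19) | 0.941 (53) | 0.805 (8)
SIM3D | - | 0.842 (14) | 1.000 (1) | 0.998 (3) | 0.608 (57) | 0.717 (41) | 0.908 (14) | 0.818 (4) | 0.699 (36) | 0.798 (28) | 0.908 (1) | 0.760 (6) | 0.733 (13) | 0.793 (7) | 1.000 (1) | 0.912 (13) | 0.831 (41) | 0.883 (4) | 1.000 (1) | 0.792 (10)
ISBNet | permissive | 0.835 (15) | 1.000 (1) | 0.950 (26) | 0.731 (46) | 0.819 (11) | 0.918 (11) | 0.790 (8) | 0.740 (29) | 0.851 (19) | 0.831 (10) | 0.661 (24) | 0.742 (11) | 0.650 (22) | 1.000 (1) | 0.937 (7) | 0.814 (49) | 0.836 (18) | 1.000 (1) | 0.765 (16)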
Tuan Duc Ngo, Binh-Son Hua, Khoi Nguyen: ISBNet: a 3D Point Cloud Instance Segmentation Network with Instance-aware Sampling and Box-aware Dynamic Convolution. CVPR 2023
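SphereSeg | - | 0.835 (15) | 1.000 (1) | 0.963 (23) | 0.891 (7) | 0.794 (17) | 0.954 (2) | 0.822 (3) | 0.710 (33) | 0.961 (2) | 0.721 (20) | 0.693 (22) | 0.530 (39) | 0.653 (21) | 1.000 (1) | 0.867 (30) | 0.857 (32) | 0.859 (15) | 0.991 (42) | 0.771 (13)
GraphCut | - | 0.832 (17) | 1.000 (1) | 0.922 (40) | 0.724 (48) | 0.798 (15) | 0.902 (17) | 0.701 (16) | 0.856 (9) | 0.859 (16) | 0.715 (21) | 0.706 (17) | 0.748 (9) | 0.640 (33) | 1.000 (1) | 0.934 (10) | 0.862 (30) | 0.880 (6) | 1.000 (1) | 0.729 (21)
TopoSeg | - | 0.832 (17) | 1.000 (1) | 0.981 (13) | 0.933 (2) | 0.819 (12) | 0.826 (40) | 0.524 (38) | 0.841 (12) | 0.811 (25) | 0.681 (27) | 0.759 (9) | 0.687 (18) | 0.727 (12) | 0.981 (36) | 0.911 (19) | 0.883 (13) | 0.853 (16) | 1.000 (1) | 0.756 (19)
PBNet | permissive | 0.825 (19) | 1.000 (1) | 0.963 (22) | 0.837 (20) | 0.843 (6) | 0.865 (26) | 0.822 (2) | 0.647 (42) | 0.878 (14) | 0.733 (18) | 0.639 (31) | 0.683 (19) | 0.650 (22) | 1.000 (1) | 0.853 (32) | 0.870 (24) | 0.820 (20) | 1.000 (1) | 0.744 (20)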
W. Zhao, Y. Yan, C. Yang, J. Ye, X. Yang, K. Huang: Divide and Conquer: 3D Instance Segmentation With Point-Wise Binarization. ICCV 2023
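SSEC | - | 0.820 (20) | 1.000 (1) | 0.983 (12) | 0.924 (3) | 0.826 (10) | 0.817 (43) | 0.415 (47) | 0.899 (4) | 0.793 (30) | 0.673 (29) | 0.731 (14) | 0.636 (24) | 0.653 (20) | 1.000 (1) | 0.939 (6) | 0.804 (51) | 0.878 (8) | 1.000 (1) | 0.780 (12)
DKNet | - | 0.815 (21) | 1.000 (1) | 0.930 (32) | 0.844 (18) | 0.765 (26) | 0.915 (12) | 0.534 (36) | 0.805 (21) | 0.805 (27) | 0.807 (13) | 0.654 (25) | 0.763 (8) | 0.650 (22) | 1.000 (1) | 0.794 (44) | 0.881 (14) | 0.766 (24) | 1.000 (1) | 0.758 (17)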
Yizheng Wu, Min Shi, Shuaiyuan Du, Hao Lu, Zhiguo Cao, Weicai Zhong: 3D Instances as 1D Kernels. ECCV 2022
RPGN | - | 0.806 (22) | 1.000 (1) | 0.992 (7) | 0.789 (32) | 0.723 (39) | 0.891 (20) | 0.650 (20) | 0.810 (20) | 0.832 (22) | 0.665 (31) | 0.699 (20) | 0.658 (20) | 0.700 (15) | 1.000 (1) | 0.881 (25) | 0.832 (40) | 0.774 (22) | 0.997 (33) | 0.613 (41)
Shichao Dong, Guosheng Lin, Tzu-Yi Hung: Learning Regional Purity for Instance Segmentation on 3D Point Clouds. ECCV 2022
Box2Mask | - | 0.803 (23) | 1.000 (1) | 0.962 (24) | 0.874 (11) | 0.707 (44) | 0.887 (23) | 0.686 (19) | 0.598 (47) | 0.961 (1) | 0.715 (22) | 0.694 (21) | 0.469 (44) | 0.700 (15) | 1.000 (1) | 0.912 (13) | 0.902 (6) | 0.753 (29) | 0.997 (33) | 0.637 (35)
Julian Chibane, Francis Engelmann, Tuan Anh Tran, Gerard Pons-Moll: Box2Mask: Weakly Supervised 3D Semantic Instance Segmentation Using Bounding Boxes. ECCV 2022
HAIS | permissive | 0.803 (23) | 1.000 (1) | 0.994 (5) | 0.820 (24) | 0.759 (27) | 0.855 (32) | 0.554 (33) | 0.882 (6) | 0.827 (24) | 0.615 (37) | 0.676 (23) | 0.638 (23) | 0.646 (31) | 1.000 (1) | 0.912 (13) | 0.797 (54) | 0.767 (23) | 0.994 (40) | 0.726 (22)
Shaoyu Chen, Jiemin Fang, Qian Zhang, Wenyu Liu, Xinggang Wang: Hierarchical Aggregation for 3D Instance Segmentation. ICCV 2021
Mask-Group | - | 0.792 (25) | 1.000 (1) | 0.968 (21) | 0.812 (25) | 0.766 (25) | 0.864 (27) | 0.460 (41) | 0.815 (18) | 0.888 (11) | 0.598 (41) | 0.651 (28) | 0.639 (22) | 0.600 (39) | 0.918 (42) | 0.941 (4) | 0.896 (9) | 0.721 (36) | 1.000 (1) | 0.723 (23)
Min Zhong, Xinghao Chen, Xiaokang Chen, Gang Zeng, Yunhe Wang: MaskGroup: Hierarchical Point Grouping and Masking for 3D Instance Segmentation. ICME 2022
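CSC-Pretrained | - | 0.791 (26) | 1.000 (1) | 0.996 (4) | 0.829 (23) | 0.767 (24) | 0.889 (22) | 0.600 (24) | 0.819 (17) | 0.770 (35) | 0.594 (42) | 0.620 (35) | 0.541 (36) | 0.700 (15) | 1.000 (1) | 0.941 (4) | 0.889 (11) | 0.763 (25) | 1.000 (1) | 0.526 (51)
SSTNet | permissive | 0.789 (27) | 1.000 (1) | 0.840 (54) | 0.888 (8) | 0.717 (40) | 0.835 (36) | 0.717 (14) | 0.684 (40) | 0.627 (50) | 0.724 (19) | 0.652 (27) | 0.727 (15) | 0.600 (39) | 1.000 (1) | 0.912 (13) | 0.822 (44) | 0.757 (28) | 1.000 (1) | 0.691 (29)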
Zhihao Liang, Zhihao Li, Songcen Xu, Mingkui Tan, Kui Jia: Instance Segmentation in 3D Scenes using Semantic Superpoint Tree Networks. ICCV 2021
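GICN | - | 0.788 (28) | 1.000 (1) | 0.978 (15) | 0.867 (14) | 0.781 (21) | 0.833 (37) | 0.527 (37) | 0.824 (15) | 0.806 (26) | 0.549 (50) | 0.596 (38) | 0.551 (32) | 0.700 (15) | 1.000 (1) | 0.853 (32) | 0.935 (2) | 0.733 (33) | 1.000 (1) | 0.651 (32)
DANCENET | - | 0.786 (29) | 1.000 (1) | 0.936 (29) | 0.783 (35) | 0.737 (36) | 0.852 (34) | 0.742 (13) | 0.647 (42) | 0.765 (37) | 0.811 (12) | 0.624 (34) | 0.579 (29) | 0.632 (36) | 1.000 (1) | 0.909 (22) | 0.898 (8) | 0.696 (41) | 0.944 (49) | 0.601 (44)
DENet | - | 0.786 (29) | 1.000 (1) | 0.929 (33) | 0.736 (45) | 0.750 (33) | 0.720 (56) | 0.755 (11) | 0.934 (1) | 0.794 (29) | 0.590 (43) | 0.561 (44) | 0.537 (37) | 0.650 (22) | 1.000 (1) | 0.882 (24) | 0.804 (52) | 0.789 (21) | 1.000 (1) | 0.719 (24)
DualGroup | - | 0.782 (31) | 1.000 (1) | 0.927 (34) | 0.811 (26) | 0.772 (23) | 0.853 (33) | 0.631 (23) | 0.805 (21) | 0.773 (32) | 0.613 (38) | 0.611 (36) | 0.610 (27) | 0.650 (22) | 0.835 (53) | 0.881 (25) | 0.879 (16) | 0.750 (31) | 1.000 (1) | 0.675 (30)
PointGroup | - | 0.778 (32) | 1.000 (1) | 0.900 (44) | 0.798 (31) | 0.715 (42) | 0.863 (28) | 0.493 (39) | 0.706 (34) | 0.895 (9) | 0.569 (48) | 0.701 (18) | 0.576 (30) | 0.639 (34) | 1.000 (1) | 0.880 (27) | 0.851 (34) | 0.719 (37) | 0.997 (33) | 0.709 (26)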
Li Jiang, Hengshuang Zhao, Shaoshuai Shi, Shu Liu, Chi-Wing Fu, Jiaya Jia: PointGroup: Dual-Set Point Grouping for 3D Instance Segmentation. CVPR 2020 [oral]
PE | - | 0.776 (33) | 1.000 (1) | 0.900 (45) | 0.860 (15) | 0.728 (38) | 0.869 (24) | 0.400 (48) | 0.857 (8) | 0.774 (31) | 0.568 (49) | 0.701 (19) | 0.602 (28) | 0.646 (31) | 0.933 (41) | 0.843 (35) | 0.890 (10) | 0.691 (45) | 0.997 (33) | 0.709 (25)
Biao Zhang, Peter Wonka: Point Cloud Instance Segmentation using Probabilistic Embeddings. CVPR 2021
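AOIA | - | 0.767 (34) | 1.000 (1) | 0.937 (28) | 0.810 (27) | 0.740 (35) | 0.906 (15) | 0.550 (34) | 0.800 (23) | 0.706 (42) | 0.577 (47) | 0.624 (33) | 0.544 (35) | 0.596 (44) | 0.857 (45) | 0.879 (29) | 0.880 (15) | 0.750 (30) | 0.992 (41) | 0.658 (31)
DD-UNet+Group | - | 0.764 (35) | 1.000 (1) | 0.897 (47) | 0.837 (19) | 0.753 (30) | 0.830 (39) | 0.459 (43) | 0.824 (15) | 0.699 (44) | 0.629 (35) | 0.653 (26) | 0.438 (47) | 0.650 (22) | 1.000 (1) | 0.880 (27) | 0.858 (31) | 0.690 (46) | 1.000 (1) | 0.650 (33)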
H. Liu, R. Liu, K. Yang, J. Zhang, K. Peng, R. Stiefelhagen: HIDA: Towards Holistic Indoor Understanding for the Visually Impaired via Semantic Instance Segmentation with a Wearable Solid-State LiDAR Sensor. ICCVW 2021
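INS-Conv-instance | - | 0.762 (36) | 1.000 (1) | 0.923 (37) | 0.765 (40) | 0.785 (19) | 0.905 (16) | 0.600 (24) | 0.655 (41) | 0.646 (49) | 0.683 (26) | 0.647 (29) | 0.530 (38) | 0.650 (22) | 1.000 (1) | 0.824 (37) | 0.830 (42) | 0.693 (44) | 0.944 (49) | 0.644 (34)
Dyco3D | copyleft | 0.761 (37) | 1.000 (1) | 0.935 (30) | 0.893 (6) | 0.752 (32) | 0.863 (29) | 0.600 (24) | 0.588 (48) | 0.742 (39) | 0.641 (33) | 0.633 (32) | 0.546 (34) | 0.550 (46) | 0.857 (45) | 0.789 (46) | 0.853 (33) | 0.762 (26) | 0.987 (43) | 0.699 (27)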
Tong He, Chunhua Shen, Anton van den Hengel: DyCo3D: Robust Instance Segmentation of 3D Point Clouds through Dynamic Convolution. CVPR 2021
OccuSeg+instance | - | 0.742 (38) | 1.000 (1) | 0.923 (37) | 0.785 (33) | 0.745 (34) | 0.867 (25) | 0.557 (30) | 0.578 (51) | 0.729 (40) | 0.670 (30) | 0.644 (30) | 0.488 (42) | 0.577 (45) | 1.000 (1) | 0.794 (44) | 0.830 (42) | 0.620 (54) | 1.000 (1) | 0.550 (47)
Lei Han, Tian Zheng, Lan Xu, Lu Fang: OccuSeg: Occupancy-aware 3D Instance Segmentation. CVPR 2020
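RWSeg | - | 0.739 (39) | 1.000 (1) | 0.899 (46) | 0.759 (42) | 0.753 (31) | 0.823 (41) | 0.282 (53) | 0.691 (38) | 0.658 (47) | 0.582 (46) | 0.594 (39) | 0.547 (33) | 0.628 (37) | 1.000 (1) | 0.795 (43) | 0.868 (26) | 0.728 (35) | 1.000 (1) | 0.692 (28)
3D-MPA | - | 0.737 (40) | 1.000 (1) | 0.933 (31) | 0.785 (33) | 0.794 (18) | 0.831 (38) | 0.279 (55) | 0.588 (48) | 0.695 (45) | 0.616 (36) | 0.559 (45) | 0.556 (31) | 0.650 (22) | 1.000 (1) | 0.809 (41) | 0.875 (20) | 0.696 (42) | 1.000 (1) | 0.608 (43)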
Francis Engelmann, Martin Bokeloh, Alireza Fathi, Bastian Leibe, Matthias Nießner: 3D-MPA: Multi Proposal Aggregation for 3D Semantic Instance Segmentation. CVPR 2020
MTML | - | 0.731 (41) | 1.000 (1) | 0.992 (7) | 0.779 (39) | 0.609 (53) | 0.746 (51) | 0.308 (52) | 0.867 (7) | 0.601 (53) | 0.607 (39) | 0.539 (48) | 0.519 (40) | 0.550 (46) | 1.000 (1) | 0.824 (37) | 0.869 (25) | 0.729 (34) | 1.000 (1) | 0.616 (39)
Jean Lahoud, Bernard Ghanem, Marc Pollefeys, Martin R. Oswald: 3D Instance Segmentation via Multi-task Metric Learning. ICCV 2019 [oral]
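OSIS | - | 0.725 (42) | 1.000 (1) | 0.885 (50) | 0.653 (54) | 0.657 (50) | 0.801 (44) | 0.576 (28) | 0.695 (37) | 0.828 (23) | 0.698 (24) | 0.534 (49) | 0.457 (46) | 0.500 (53) | 0.857 (45) | 0.831 (36) | 0.841 (38) | 0.627 (52) | 1.000 (1) | 0.619 (38)
SSEN | - | 0.724 (43) | 1.000 (1) | 0.926 (35) | 0.781 (38) | 0.661 (48) | 0.845 (35) | 0.596 (27) | 0.529 (54) | 0.764 (38) | 0.653 (32) | 0.489 (55) | 0.461 (45) | 0.500 (53) | 0.859 (44) | 0.765 (47) | 0.872 (22) | 0.761 (27) | 1.000 (1) | 0.577 (45)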
Dongsu Zhang, Junha Chun, Sang Kyun Cha, Young Min Kim: Spatial Semantic Embedding Network: Fast 3D Instance Segmentation with Deep Metric Learning. arXiv
NeuralBF | - | 0.718 (44) | 1.000 (1) | 0.945 (27) | 0.901 (5) | 0.754 (29) | 0.817 (42) | 0.460 (41) | 0.700 (35) | 0.772 (33) | 0.688 (25) | 0.568 (43) | 0.000 (66) | 0.500 (53) | 0.981 (36) | 0.606 (57) | 0.872 (21) | 0.740 (32) | 1.000 (1) | 0.614 (40)
Weiwei Sun, Daniel Rebain, Renjie Liao, Vladimir Tankovich, Soroosh Yazdani, Kwang Moo Yi, Andrea Tagliasacchi: NeuralBF: Neural Bilateral Filtering for Top-down Instance Segmentation on Point Clouds. WACV 2023
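Sparse R-CNN | - | 0.714 (45) | 1.000 (1) | 0.926 (36) | 0.694 (49) | 0.699 (46) | 0.890 (21) | 0.636 (22) | 0.516 (55) | 0.693 (46) | 0.743 (17) | 0.588 (40) | 0.369 (51) | 0.601 (38) | 0.594 (59) | 0.800 (42) | 0.886 (12) | 0.676 (47) | 0.986 (44) | 0.546 (48)
SALoss-ResNet | - | 0.695 (46) | 1.000 (1) | 0.855 (52) | 0.579 (60) | 0.589 (55) | 0.735 (54) | 0.484 (40) | 0.588 (48) | 0.856 (18) | 0.634 (34) | 0.571 (42) | 0.298 (52) | 0.500 (53) | 1.000 (1) | 0.824 (37) | 0.818 (45) | 0.702 (40) | 0.935 (56) | 0.545 (49)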
Zhidong Liang, Ming Yang, Hao Li, Chunxiang Wang: 3D Instance Embedding Learning With a Structure-Aware Loss Function for Point Cloud Segmentation. IEEE Robotics and Automation Letters (IROS2020)
PanopticFusion-inst | - | 0.693 (47) | 1.000 (1) | 0.852 (53) | 0.655 (53) | 0.616 (52) | 0.788 (46) | 0.334 (50) | 0.763 (25) | 0.771 (34) | 0.457 (60) | 0.555 (46) | 0.652 (21) | 0.518 (50) | 0.857 (45) | 0.765 (47) | 0.732 (60) | 0.631 (50) | 0.944 (49) | 0.577 (46)
Gaku Narita, Takashi Seno, Tomoya Ishikawa, Yohsuke Kaji: PanopticFusion: Online Volumetric Semantic Mapping at the Level of Stuff and Things. IROS 2019 (to appear)
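Occipital-SCS | - | 0.688 (48) | 1.000 (1) | 0.913 (41) | 0.730 (47) | 0.737 (37) | 0.743 (53) | 0.442 (44) | 0.855 (10) | 0.655 (48) | 0.546 (51) | 0.546 (47) | 0.263 (54) | 0.508 (52) | 0.889 (43) | 0.568 (58) | 0.771 (57) | 0.705 (39) | 0.889 (59) | 0.625 (37)
3D-BoNet | - | 0.687 (49) | 1.000 (1) | 0.887 (49) | 0.836 (21) | 0.587 (56) | 0.643 (63) | 0.550 (34) | 0.620 (44) | 0.724 (41) | 0.522 (55) | 0.501 (53) | 0.243 (55) | 0.512 (51) | 1.000 (1) | 0.751 (49) | 0.807 (50) | 0.661 (49) | 0.909 (58) | 0.612 (42)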
Bo Yang, Jianan Wang, Ronald Clark, Qingyong Hu, Sen Wang, Andrew Markham, Niki Trigoni: Learning Object Bounding Boxes for 3D Instance Segmentation on Point Clouds. NeurIPS 2019 Spotlight
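ClickSeg_Instance | - | 0.685 (50) | 1.000 (1) | 0.818 (56) | 0.600 (58) | 0.715 (43) | 0.795 (45) | 0.557 (30) | 0.533 (53) | 0.591 (55) | 0.601 (40) | 0.519 (51) | 0.429 (49) | 0.638 (35) | 0.938 (40) | 0.706 (52) | 0.817 (47) | 0.624 (53) | 0.944 (49) | 0.502 (53)
PCJC | - | 0.684 (51) | 1.000 (1) | 0.895 (48) | 0.757 (43) | 0.659 (49) | 0.862 (30) | 0.189 (62) | 0.739 (30) | 0.606 (52) | 0.712 (23) | 0.581 (41) | 0.515 (41) | 0.650 (22) | 0.857 (45) | 0.357 (63) | 0.785 (55) | 0.631 (51) | 0.889 (59) | 0.635 (36)
SPG_WSIS | - | 0.678 (52) | 1.000 (1) | 0.880 (51) | 0.836 (21) | 0.701 (45) | 0.727 (55) | 0.273 (57) | 0.607 (46) | 0.706 (43) | 0.541 (53) | 0.515 (52) | 0.174 (58) | 0.600 (39) | 0.857 (45) | 0.716 (51) | 0.846 (37) | 0.711 (38) | 1.000 (1) | 0.506 (52)
One_Thing_One_Click | permissive | 0.675 (53) | 1.000 (1) | 0.823 (55) | 0.782 (36) | 0.621 (51) | 0.766 (48) | 0.211 (59) | 0.736 (31) | 0.560 (57) | 0.586 (44) | 0.522 (50) | 0.636 (25) | 0.453 (57) | 0.641 (57) | 0.853 (32) | 0.850 (35) | 0.694 (43) | 0.997 (33) | 0.411 (58)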
Zhengzhe Liu, Xiaojuan Qi, Chi-Wing Fu: One Thing One Click: A Self-Training Approach for Weakly Supervised 3D Semantic Segmentation. CVPR 2021
SegGroup_ins | permissive | 0.637 (54) | 1.000 (1) | 0.923 (39) | 0.593 (59) | 0.561 (57) | 0.746 (52) | 0.143 (64) | 0.504 (56) | 0.766 (36) | 0.485 (58) | 0.442 (56) | 0.372 (50) | 0.530 (49) | 0.714 (54) | 0.815 (40) | 0.775 (56) | 0.673 (48) | 1.000 (1) | 0.431 (57)
An Tao, Yueqi Duan, Yi Wei, Jiwen Lu, Jie Zhou: SegGroup: Seg-Level Supervision for 3D Instance and Semantic Segmentation. TIP 2022
MASC | permissive | 0.615 (55) | 0.711 (62) | 0.802 (57) | 0.540 (61) | 0.757 (28) | 0.777 (47) | 0.029 (65) | 0.577 (52) | 0.588 (56) | 0.521 (56) | 0.600 (37) | 0.436 (48) | 0.534 (48) | 0.697 (55) | 0.616 (56) | 0.838 (39) | 0.526 (56) | 0.980 (46) | 0.534 (50)
Chen Liu, Yasutaka Furukawa: MASC: Multi-scale Affinity with Sparse Convolution for 3D Instance Segmentation.
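UNet-backbone | - | 0.605 (56) | 1.000 (1) | 0.909 (42) | 0.764 (41) | 0.603 (54) | 0.704 (57) | 0.415 (46) | 0.301 (61) | 0.548 (58) | 0.461 (59) | 0.394 (57) | 0.267 (53) | 0.386 (59) | 0.857 (45) | 0.649 (55) | 0.817 (46) | 0.504 (58) | 0.959 (47) | 0.356 (61)
3D-SIS | permissive | 0.558 (57) | 1.000 (1) | 0.773 (58) | 0.614 (56) | 0.503 (60) | 0.691 (59) | 0.200 (60) | 0.412 (57) | 0.498 (61) | 0.546 (52) | 0.311 (62) | 0.103 (62) | 0.600 (39) | 0.857 (45) | 0.382 (60) | 0.799 (53) | 0.445 (64) | 0.938 (55) | 0.371 (59)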
Ji Hou, Angela Dai, Matthias Niessner: 3D-SIS: 3D Semantic Instance Segmentation of RGB-D Scans. CVPR 2019
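R-PointNet | - | 0.544 (58) | 0.500 (65) | 0.655 (64) | 0.661 (52) | 0.663 (47) | 0.765 (49) | 0.432 (45) | 0.214 (64) | 0.612 (51) | 0.584 (45) | 0.499 (54) | 0.204 (57) | 0.286 (63) | 0.429 (62) | 0.655 (54) | 0.650 (65) | 0.539 (55) | 0.950 (48) | 0.499 (54)
Hier3D | copyleft | 0.540 (59) | 1.000 (1) | 0.727 (59) | 0.626 (55) | 0.467 (63) | 0.693 (58) | 0.200 (60) | 0.412 (57) | 0.480 (62) | 0.528 (54) | 0.318 (61) | 0.077 (65) | 0.600 (39) | 0.688 (56) | 0.382 (60) | 0.768 (58) | 0.472 (60) | 0.941 (53) | 0.350 (62)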
Tan: HCFS3D: Hierarchical Coupled Feature Selection Network for 3D Semantic and Instance Segmentation.
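Region-18class | - | 0.497 (60) | 0.250 (67) | 0.902 (43) | 0.689 (50) | 0.540 (58) | 0.747 (50) | 0.276 (56) | 0.610 (45) | 0.268 (66) | 0.489 (57) | 0.348 (58) | 0.000 (66) | 0.243 (66) | 0.220 (65) | 0.663 (53) | 0.814 (48) | 0.459 (62) | 0.928 (57) | 0.496 (55)
Sem_Recon_ins | - | 0.484 (61) | 0.764 (61) | 0.608 (66) | 0.470 (63) | 0.521 (59) | 0.637 (64) | 0.311 (51) | 0.218 (63) | 0.348 (65) | 0.365 (64) | 0.223 (63) | 0.222 (56) | 0.258 (64) | 0.629 (58) | 0.734 (50) | 0.596 (66) | 0.509 (57) | 0.858 (62) | 0.444 (56)
tmp | - | 0.474 (62) | 1.000 (1) | 0.727 (59) | 0.433 (65) | 0.481 (62) | 0.673 (61) | 0.022 (67) | 0.380 (59) | 0.517 (60) | 0.436 (62) | 0.338 (60) | 0.128 (60) | 0.343 (61) | 0.429 (62) | 0.291 (65) | 0.728 (61) | 0.473 (59) | 0.833 (63) | 0.300 (64)
SemRegionNet-20cls | - | 0.470 (63) | 1.000 (1) | 0.727 (59) | 0.447 (64) | 0.481 (61) | 0.678 (60) | 0.024 (66) | 0.380 (59) | 0.518 (59) | 0.440 (61) | 0.339 (59) | 0.128 (60) | 0.350 (60) | 0.429 (62) | 0.212 (66) | 0.711 (62) | 0.465 (61) | 0.833 (63) | 0.290 (65)
ASIS | - | 0.422 (64) | 0.333 (66) | 0.707 (62) | 0.676 (51) | 0.401 (64) | 0.650 (62) | 0.350 (49) | 0.177 (65) | 0.594 (54) | 0.376 (63) | 0.202 (64) | 0.077 (64) | 0.404 (58) | 0.571 (60) | 0.197 (67) | 0.674 (64) | 0.447 (63) | 0.500 (66) | 0.260 (66)
3D-BEVIS | - | 0.401 (65) | 0.667 (63) | 0.687 (63) | 0.419 (66) | 0.137 (67) | 0.587 (65) | 0.188 (63) | 0.235 (62) | 0.359 (64) | 0.211 (66) | 0.093 (67) | 0.080 (63) | 0.311 (62) | 0.571 (60) | 0.382 (60) | 0.754 (59) | 0.300 (66) | 0.874 (61) | 0.357 (60)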
Cathrin Elich, Francis Engelmann, Jonas Schult, Theodora Kontogianni, Bastian Leibe: 3D-BEVIS: Birds-Eye-View Instance Segmentation.
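Sgpn_scannet | - | 0.390 (66) | 0.556 (64) | 0.636 (65) | 0.493 (62) | 0.353 (65) | 0.539 (66) | 0.271 (58) | 0.160 (66) | 0.450 (63) | 0.359 (65) | 0.178 (65) | 0.146 (59) | 0.250 (65) | 0.143 (66) | 0.347 (64) | 0.698 (63) | 0.436 (65) | 0.667 (65) | 0.331 (63)
MaskRCNN 2d->3d Proj | - | 0.261 (67) | 0.903 (60) | 0.081 (67) | 0.008 (67) | 0.233 (66) | 0.175 (67) | 0.280 (54) | 0.106 (67) | 0.150 (67) | 0.203 (67) | 0.175 (66) | 0.480 (43) | 0.218 (67) | 0.143 (66) | 0.542 (59) | 0.404 (67) | 0.153 (67) | 0.393 (67) | 0.049 (67)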