The 3D semantic instance prediction task involves detecting and segmenting the objects in a 3D scan mesh.

Evaluation and metrics

Our evaluation ranks all methods according to the average precision for each class. We report the mean average precision at an overlap of 0.25 (AP 25%), at an overlap of 0.5 (AP 50%), and averaged over overlaps in the range [0.5:0.95:0.05], i.e. from 0.5 to 0.95 in steps of 0.05 (AP). Note that multiple predictions of the same ground-truth instance are penalized as false positives.
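As a rough illustration of the metric described above, the following Python sketch computes a non-interpolated average precision for a single class at one overlap threshold, and then averages it over the thresholds 0.5 to 0.95. It is not the benchmark's official evaluation script. It assumes that "overlap" means intersection over union between sets of vertex ids, that each prediction carries a confidence score, and that each prediction is matched to the ground-truth instance it overlaps most. The names `iou`, `average_precision`, `ap_over_range`, and the `(confidence, vertex ids)` layout are hypothetical.

```python
from typing import List, Set, Tuple

# Hypothetical layout: (confidence score, set of vertex ids covered by the mask).
Prediction = Tuple[float, Set[int]]


def iou(a: Set[int], b: Set[int]) -> float:
    """Intersection over union of two vertex-id sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def average_precision(preds: List[Prediction], gts: List[Set[int]], overlap: float) -> float:
    """Non-interpolated AP for one class at one overlap threshold (a sketch)."""
    if not gts:
        return 0.0
    matched = [False] * len(gts)
    true_positives = 0
    precision_sum = 0.0
    # Visit predictions from most to least confident.
    ranked = sorted(preds, key=lambda p: p[0], reverse=True)
    for rank, (_, mask) in enumerate(ranked, start=1):
        # Ground-truth instance with the largest overlap for this prediction.
        best = max(range(len(gts)), key=lambda j: iou(mask, gts[j]))
        if iou(mask, gts[best]) >= overlap and not matched[best]:
            matched[best] = True
            true_positives += 1
            precision_sum += true_positives / rank
        # Otherwise the prediction is a false positive: either the overlap is
        # too small, or the instance was already claimed (a duplicate).
    return precision_sum / len(gts)


def ap_over_range(preds: List[Prediction], gts: List[Set[int]]) -> float:
    """Mean of AP over overlaps 0.50, 0.55, ..., 0.95."""
    thresholds = [t / 100 for t in range(50, 100, 5)]
    return sum(average_precision(preds, gts, t) for t in thresholds) / len(thresholds)


# Toy scene: two ground-truth instances, three predictions.
gts = [{0, 1, 2, 3}, {10, 11, 12, 13}]
preds = [
    (0.9, {0, 1, 2, 3}),                 # exact match of instance 0
    (0.8, {0, 1, 2}),                    # duplicate of instance 0, counted as FP
    (0.7, {10, 11, 12, 13, 14, 15}),     # IoU 4/6 with instance 1
]

print(round(average_precision(preds, gts, 0.5), 3))  # 0.833
print(round(ap_over_range(preds, gts), 3))           # 0.633
```

In this toy scene the duplicate prediction lowers AP at overlap 0.5 from 1.0 to about 0.833, which is the penalty the text refers to. The per-class values would then be averaged across classes to obtain the reported mean.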



This table lists the benchmark results for the 3D semantic instance scenario.




Each cell gives the AP 25% score followed by the method's rank in that column in parentheses. Rows are sorted by the otherfurniture column.

| Method | Info | avg AP 25% | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | window |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| SoftGroup | permissive | 0.865 (16) | 1.000 (1) | 0.969 (25) | 0.860 (19) | 0.860 (5) | 0.913 (20) | 0.558 (35) | 0.899 (3) | 0.911 (13) | 0.760 (21) | 0.828 (1) | 0.736 (19) | 0.802 (8) | 0.981 (42) | 0.919 (16) | 0.875 (23) | 0.877 (13) | 1.000 (1) | 0.820 (10) |
| SoftGroup++ | | 0.874 (12) | 1.000 (1) | 0.972 (24) | 0.947 (1) | 0.839 (12) | 0.898 (24) | 0.556 (38) | 0.913 (2) | 0.881 (20) | 0.756 (22) | 0.828 (2) | 0.748 (17) | 0.821 (1) | 1.000 (1) | 0.937 (11) | 0.937 (1) | 0.887 (5) | 1.000 (1) | 0.821 (9) |
| PointRel | | 0.901 (1) | 1.000 (1) | 0.978 (21) | 0.928 (3) | 0.879 (1) | 0.962 (3) | 0.882 (3) | 0.749 (34) | 0.947 (3) | 0.912 (1) | 0.802 (3) | 0.753 (15) | 0.820 (2) | 1.000 (1) | 0.984 (4) | 0.919 (5) | 0.894 (3) | 1.000 (1) | 0.815 (12) |
| InsSSM | | 0.883 (5) | 1.000 (1) | 0.996 (4) | 0.800 (37) | 0.865 (4) | 0.960 (4) | 0.808 (8) | 0.852 (14) | 0.940 (6) | 0.899 (6) | 0.785 (4) | 0.810 (2) | 0.700 (19) | 1.000 (1) | 0.912 (17) | 0.851 (40) | 0.895 (2) | 0.997 (37) | 0.827 (6) |
| IPCA-Inst | | 0.851 (18) | 1.000 (1) | 0.968 (26) | 0.884 (11) | 0.842 (11) | 0.862 (37) | 0.693 (22) | 0.812 (23) | 0.888 (19) | 0.677 (34) | 0.783 (5) | 0.698 (23) | 0.807 (7) | 1.000 (1) | 0.911 (24) | 0.865 (33) | 0.865 (17) | 1.000 (1) | 0.757 (24) |
| Competitor-SPFormer | | 0.881 (6) | 1.000 (1) | 1.000 (1) | 0.845 (22) | 0.854 (7) | 0.962 (2) | 0.714 (19) | 0.857 (10) | 0.904 (14) | 0.902 (4) | 0.782 (6) | 0.789 (9) | 0.662 (25) | 1.000 (1) | 0.988 (3) | 0.874 (25) | 0.886 (6) | 0.997 (37) | 0.847 (2) |
| OneFormer3D | copyleft | 0.896 (2) | 1.000 (1) | 1.000 (1) | 0.913 (6) | 0.858 (6) | 0.951 (7) | 0.786 (12) | 0.837 (17) | 0.916 (12) | 0.908 (2) | 0.778 (7) | 0.803 (4) | 0.750 (13) | 1.000 (1) | 0.976 (5) | 0.926 (4) | 0.882 (7) | 0.995 (45) | 0.849 (1) |
| Queryformer | | 0.874 (12) | 1.000 (1) | 0.978 (20) | 0.809 (35) | 0.876 (2) | 0.936 (15) | 0.702 (20) | 0.716 (39) | 0.920 (11) | 0.875 (14) | 0.766 (8) | 0.772 (11) | 0.818 (6) | 1.000 (1) | 0.995 (1) | 0.916 (6) | 0.892 (4) | 1.000 (1) | 0.767 (21) |
| Mask3D | | 0.870 (14) | 1.000 (1) | 0.985 (15) | 0.782 (44) | 0.818 (17) | 0.938 (14) | 0.760 (14) | 0.749 (34) | 0.923 (9) | 0.877 (13) | 0.760 (9) | 0.785 (10) | 0.820 (2) | 1.000 (1) | 0.912 (17) | 0.864 (34) | 0.878 (11) | 0.983 (51) | 0.825 (7) |
| Spherical Mask(CtoF) | | 0.875 (10) | 1.000 (1) | 0.991 (13) | 0.873 (14) | 0.850 (9) | 0.946 (10) | 0.691 (23) | 0.752 (33) | 0.926 (7) | 0.889 (10) | 0.759 (10) | 0.794 (7) | 0.820 (2) | 1.000 (1) | 0.912 (17) | 0.900 (9) | 0.878 (11) | 1.000 (1) | 0.769 (20) |
| TopoSeg | | 0.832 (23) | 1.000 (1) | 0.981 (17) | 0.933 (2) | 0.819 (16) | 0.826 (46) | 0.524 (44) | 0.841 (16) | 0.811 (32) | 0.681 (33) | 0.759 (11) | 0.687 (24) | 0.727 (16) | 0.981 (42) | 0.911 (24) | 0.883 (17) | 0.853 (21) | 1.000 (1) | 0.756 (25) |
| UniPerception | | 0.884 (4) | 1.000 (1) | 0.979 (18) | 0.872 (16) | 0.869 (3) | 0.892 (25) | 0.806 (9) | 0.890 (5) | 0.835 (28) | 0.892 (7) | 0.755 (12) | 0.811 (1) | 0.779 (10) | 0.955 (45) | 0.951 (6) | 0.876 (22) | 0.914 (1) | 0.997 (37) | 0.840 (4) |
| EV3D | | 0.877 (9) | 1.000 (1) | 0.996 (6) | 0.873 (14) | 0.854 (8) | 0.950 (8) | 0.691 (23) | 0.783 (29) | 0.926 (7) | 0.889 (11) | 0.754 (13) | 0.794 (8) | 0.820 (2) | 1.000 (1) | 0.912 (17) | 0.900 (9) | 0.860 (18) | 1.000 (1) | 0.779 (18) |
| MAFT | | 0.860 (17) | 1.000 (1) | 0.990 (14) | 0.810 (34) | 0.829 (13) | 0.949 (9) | 0.809 (7) | 0.688 (45) | 0.836 (27) | 0.904 (3) | 0.751 (14) | 0.796 (6) | 0.741 (14) | 1.000 (1) | 0.864 (37) | 0.848 (42) | 0.837 (22) | 1.000 (1) | 0.828 (5) |
| MG-Former | | 0.887 (3) | 1.000 (1) | 0.991 (12) | 0.837 (24) | 0.801 (21) | 0.935 (16) | 0.887 (2) | 0.857 (9) | 0.946 (4) | 0.891 (8) | 0.748 (15) | 0.805 (3) | 0.739 (15) | 1.000 (1) | 0.993 (2) | 0.809 (55) | 0.876 (14) | 1.000 (1) | 0.842 (3) |
| ExtMask3D | | 0.867 (15) | 1.000 (1) | 1.000 (1) | 0.756 (51) | 0.816 (19) | 0.940 (12) | 0.795 (10) | 0.760 (32) | 0.862 (22) | 0.888 (12) | 0.739 (16) | 0.763 (13) | 0.774 (11) | 1.000 (1) | 0.929 (15) | 0.878 (21) | 0.879 (9) | 1.000 (1) | 0.819 (11) |
| Mask3D_evaluation | | 0.843 (20) | 1.000 (1) | 0.955 (31) | 0.847 (21) | 0.795 (23) | 0.932 (17) | 0.750 (16) | 0.780 (30) | 0.891 (16) | 0.818 (17) | 0.737 (17) | 0.633 (32) | 0.703 (18) | 1.000 (1) | 0.902 (29) | 0.870 (28) | 0.820 (25) | 0.941 (59) | 0.805 (13) |
| SIM3D | | 0.878 (8) | 1.000 (1) | 0.972 (23) | 0.863 (18) | 0.817 (18) | 0.952 (6) | 0.821 (6) | 0.783 (28) | 0.890 (17) | 0.902 (5) | 0.735 (18) | 0.797 (5) | 0.799 (9) | 1.000 (1) | 0.931 (14) | 0.893 (13) | 0.853 (20) | 1.000 (1) | 0.792 (15) |
| SSEC | | 0.820 (26) | 1.000 (1) | 0.983 (16) | 0.924 (4) | 0.826 (14) | 0.817 (49) | 0.415 (53) | 0.899 (4) | 0.793 (36) | 0.673 (35) | 0.731 (19) | 0.636 (30) | 0.653 (26) | 1.000 (1) | 0.939 (9) | 0.804 (57) | 0.878 (10) | 1.000 (1) | 0.780 (17) |
| TST3D | | 0.879 (7) | 1.000 (1) | 0.994 (7) | 0.921 (5) | 0.807 (20) | 0.939 (13) | 0.771 (13) | 0.887 (6) | 0.923 (10) | 0.862 (15) | 0.722 (20) | 0.768 (12) | 0.756 (12) | 1.000 (1) | 0.910 (27) | 0.904 (7) | 0.836 (24) | 0.999 (36) | 0.824 (8) |
| SPFormer | permissive | 0.851 (18) | 1.000 (1) | 0.994 (8) | 0.806 (36) | 0.774 (29) | 0.942 (11) | 0.637 (27) | 0.849 (15) | 0.859 (24) | 0.889 (9) | 0.720 (21) | 0.730 (20) | 0.665 (24) | 1.000 (1) | 0.911 (24) | 0.868 (32) | 0.873 (16) | 1.000 (1) | 0.796 (14) |
| TD3D | permissive | 0.875 (10) | 1.000 (1) | 0.976 (22) | 0.877 (12) | 0.783 (27) | 0.970 (1) | 0.889 (1) | 0.828 (18) | 0.945 (5) | 0.803 (20) | 0.713 (22) | 0.720 (22) | 0.709 (17) | 1.000 (1) | 0.936 (12) | 0.934 (3) | 0.873 (15) | 1.000 (1) | 0.791 (16) |
| GraphCut | | 0.832 (23) | 1.000 (1) | 0.922 (46) | 0.724 (55) | 0.798 (22) | 0.902 (23) | 0.701 (21) | 0.856 (12) | 0.859 (23) | 0.715 (27) | 0.706 (23) | 0.748 (16) | 0.640 (39) | 1.000 (1) | 0.934 (13) | 0.862 (35) | 0.880 (8) | 1.000 (1) | 0.729 (27) |
| PointGroup | | 0.778 (38) | 1.000 (1) | 0.900 (50) | 0.798 (38) | 0.715 (48) | 0.863 (34) | 0.493 (45) | 0.706 (41) | 0.895 (15) | 0.569 (54) | 0.701 (24) | 0.576 (36) | 0.639 (40) | 1.000 (1) | 0.880 (33) | 0.851 (39) | 0.719 (43) | 0.997 (37) | 0.709 (32) |
| PE | | 0.776 (39) | 1.000 (1) | 0.900 (51) | 0.860 (19) | 0.728 (45) | 0.869 (30) | 0.400 (54) | 0.857 (11) | 0.774 (37) | 0.568 (55) | 0.701 (25) | 0.602 (34) | 0.646 (37) | 0.933 (47) | 0.843 (41) | 0.890 (14) | 0.691 (51) | 0.997 (37) | 0.709 (31) |
| RPGN | | 0.806 (28) | 1.000 (1) | 0.992 (10) | 0.789 (39) | 0.723 (46) | 0.891 (26) | 0.650 (26) | 0.810 (24) | 0.832 (29) | 0.665 (37) | 0.699 (26) | 0.658 (26) | 0.700 (19) | 1.000 (1) | 0.881 (31) | 0.832 (46) | 0.774 (28) | 0.997 (37) | 0.613 (47) |
| Box2Mask | | 0.803 (29) | 1.000 (1) | 0.962 (30) | 0.874 (13) | 0.707 (50) | 0.887 (29) | 0.686 (25) | 0.598 (53) | 0.961 (1) | 0.715 (28) | 0.694 (27) | 0.469 (50) | 0.700 (19) | 1.000 (1) | 0.912 (17) | 0.902 (8) | 0.753 (35) | 0.997 (37) | 0.637 (41) |
| SphereSeg | | 0.835 (21) | 1.000 (1) | 0.963 (29) | 0.891 (9) | 0.794 (24) | 0.954 (5) | 0.822 (5) | 0.710 (40) | 0.961 (2) | 0.721 (26) | 0.693 (28) | 0.530 (45) | 0.653 (27) | 1.000 (1) | 0.867 (36) | 0.857 (37) | 0.859 (19) | 0.991 (48) | 0.771 (19) |
| HAIS | permissive | 0.803 (29) | 1.000 (1) | 0.994 (8) | 0.820 (30) | 0.759 (34) | 0.855 (38) | 0.554 (39) | 0.882 (7) | 0.827 (31) | 0.615 (43) | 0.676 (29) | 0.638 (29) | 0.646 (37) | 1.000 (1) | 0.912 (17) | 0.797 (60) | 0.767 (29) | 0.994 (46) | 0.726 (28) |
| ISBNet | permissive | 0.835 (21) | 1.000 (1) | 0.950 (32) | 0.731 (53) | 0.819 (15) | 0.918 (18) | 0.790 (11) | 0.740 (36) | 0.851 (26) | 0.831 (16) | 0.661 (30) | 0.742 (18) | 0.650 (28) | 1.000 (1) | 0.937 (10) | 0.814 (54) | 0.836 (23) | 1.000 (1) | 0.765 (22) |
| DKNet | | 0.815 (27) | 1.000 (1) | 0.930 (38) | 0.844 (23) | 0.765 (33) | 0.915 (19) | 0.534 (42) | 0.805 (25) | 0.805 (34) | 0.807 (19) | 0.654 (31) | 0.763 (14) | 0.650 (28) | 1.000 (1) | 0.794 (50) | 0.881 (18) | 0.766 (30) | 1.000 (1) | 0.758 (23) |
| DD-UNet+Group | | 0.764 (41) | 1.000 (1) | 0.897 (53) | 0.837 (25) | 0.753 (37) | 0.830 (45) | 0.459 (49) | 0.824 (19) | 0.699 (50) | 0.629 (41) | 0.653 (32) | 0.438 (53) | 0.650 (28) | 1.000 (1) | 0.880 (33) | 0.858 (36) | 0.690 (52) | 1.000 (1) | 0.650 (39) |
| SSTNet | permissive | 0.789 (33) | 1.000 (1) | 0.840 (60) | 0.888 (10) | 0.717 (47) | 0.835 (42) | 0.717 (18) | 0.684 (46) | 0.627 (56) | 0.724 (25) | 0.652 (33) | 0.727 (21) | 0.600 (45) | 1.000 (1) | 0.912 (17) | 0.822 (49) | 0.757 (34) | 1.000 (1) | 0.691 (35) |
| Mask-Group | | 0.792 (31) | 1.000 (1) | 0.968 (27) | 0.812 (31) | 0.766 (32) | 0.864 (33) | 0.460 (47) | 0.815 (22) | 0.888 (18) | 0.598 (47) | 0.651 (34) | 0.639 (28) | 0.600 (45) | 0.918 (48) | 0.941 (7) | 0.896 (12) | 0.721 (42) | 1.000 (1) | 0.723 (29) |
| INS-Conv-instance | | 0.762 (42) | 1.000 (1) | 0.923 (43) | 0.765 (47) | 0.785 (26) | 0.905 (22) | 0.600 (30) | 0.655 (47) | 0.646 (55) | 0.683 (32) | 0.647 (35) | 0.530 (44) | 0.650 (28) | 1.000 (1) | 0.824 (43) | 0.830 (47) | 0.693 (50) | 0.944 (55) | 0.644 (40) |
| OccuSeg+instance | | 0.742 (44) | 1.000 (1) | 0.923 (43) | 0.785 (40) | 0.745 (41) | 0.867 (31) | 0.557 (36) | 0.578 (57) | 0.729 (46) | 0.670 (36) | 0.644 (36) | 0.488 (48) | 0.577 (51) | 1.000 (1) | 0.794 (50) | 0.830 (47) | 0.620 (60) | 1.000 (1) | 0.550 (53) |
| PBNet | permissive | 0.825 (25) | 1.000 (1) | 0.963 (28) | 0.837 (26) | 0.843 (10) | 0.865 (32) | 0.822 (4) | 0.647 (48) | 0.878 (21) | 0.733 (24) | 0.639 (37) | 0.683 (25) | 0.650 (28) | 1.000 (1) | 0.853 (38) | 0.870 (29) | 0.820 (26) | 1.000 (1) | 0.744 (26) |
| Dyco3D | copyleft | 0.761 (43) | 1.000 (1) | 0.935 (36) | 0.893 (8) | 0.752 (39) | 0.863 (35) | 0.600 (30) | 0.588 (54) | 0.742 (45) | 0.641 (39) | 0.633 (38) | 0.546 (40) | 0.550 (52) | 0.857 (51) | 0.789 (52) | 0.853 (38) | 0.762 (32) | 0.987 (49) | 0.699 (33) |
| AOIA | | 0.767 (40) | 1.000 (1) | 0.937 (34) | 0.810 (33) | 0.740 (42) | 0.906 (21) | 0.550 (40) | 0.800 (27) | 0.706 (48) | 0.577 (53) | 0.624 (39) | 0.544 (41) | 0.596 (50) | 0.857 (51) | 0.879 (35) | 0.880 (19) | 0.750 (36) | 0.992 (47) | 0.658 (37) |
| DANCENET | | 0.786 (35) | 1.000 (1) | 0.936 (35) | 0.783 (42) | 0.737 (43) | 0.852 (40) | 0.742 (17) | 0.647 (48) | 0.765 (43) | 0.811 (18) | 0.624 (40) | 0.579 (35) | 0.632 (42) | 1.000 (1) | 0.909 (28) | 0.898 (11) | 0.696 (47) | 0.944 (55) | 0.601 (50) |
| CSC-Pretrained | | 0.791 (32) | 1.000 (1) | 0.996 (4) | 0.829 (29) | 0.767 (31) | 0.889 (28) | 0.600 (30) | 0.819 (21) | 0.770 (41) | 0.594 (48) | 0.620 (41) | 0.541 (42) | 0.700 (19) | 1.000 (1) | 0.941 (7) | 0.889 (15) | 0.763 (31) | 1.000 (1) | 0.526 (57) |
| DualGroup | | 0.782 (37) | 1.000 (1) | 0.927 (40) | 0.811 (32) | 0.772 (30) | 0.853 (39) | 0.631 (29) | 0.805 (25) | 0.773 (38) | 0.613 (44) | 0.611 (42) | 0.610 (33) | 0.650 (28) | 0.835 (59) | 0.881 (31) | 0.879 (20) | 0.750 (37) | 1.000 (1) | 0.675 (36) |
| MASC | permissive | 0.615 (61) | 0.711 (68) | 0.802 (63) | 0.540 (67) | 0.757 (35) | 0.777 (53) | 0.029 (71) | 0.577 (58) | 0.588 (62) | 0.521 (62) | 0.600 (43) | 0.436 (54) | 0.534 (54) | 0.697 (61) | 0.616 (62) | 0.838 (45) | 0.526 (62) | 0.980 (52) | 0.534 (56) |
| GICN | | 0.788 (34) | 1.000 (1) | 0.978 (19) | 0.867 (17) | 0.781 (28) | 0.833 (43) | 0.527 (43) | 0.824 (19) | 0.806 (33) | 0.549 (56) | 0.596 (44) | 0.551 (38) | 0.700 (19) | 1.000 (1) | 0.853 (38) | 0.935 (2) | 0.733 (39) | 1.000 (1) | 0.651 (38) |
| RWSeg | | 0.739 (45) | 1.000 (1) | 0.899 (52) | 0.759 (49) | 0.753 (38) | 0.823 (47) | 0.282 (59) | 0.691 (44) | 0.658 (53) | 0.582 (52) | 0.594 (45) | 0.547 (39) | 0.628 (43) | 1.000 (1) | 0.795 (49) | 0.868 (31) | 0.728 (41) | 1.000 (1) | 0.692 (34) |
| Sparse R-CNN | | 0.714 (51) | 1.000 (1) | 0.926 (42) | 0.694 (56) | 0.699 (52) | 0.890 (27) | 0.636 (28) | 0.516 (61) | 0.693 (52) | 0.743 (23) | 0.588 (46) | 0.369 (57) | 0.601 (44) | 0.594 (65) | 0.800 (48) | 0.886 (16) | 0.676 (53) | 0.986 (50) | 0.546 (54) |
| PCJC | | 0.684 (57) | 1.000 (1) | 0.895 (54) | 0.757 (50) | 0.659 (55) | 0.862 (36) | 0.189 (68) | 0.739 (37) | 0.606 (58) | 0.712 (29) | 0.581 (47) | 0.515 (47) | 0.650 (28) | 0.857 (51) | 0.357 (69) | 0.785 (61) | 0.631 (57) | 0.889 (65) | 0.635 (42) |
| SALoss-ResNet | | 0.695 (52) | 1.000 (1) | 0.855 (58) | 0.579 (66) | 0.589 (61) | 0.735 (60) | 0.484 (46) | 0.588 (54) | 0.856 (25) | 0.634 (40) | 0.571 (48) | 0.298 (58) | 0.500 (59) | 1.000 (1) | 0.824 (43) | 0.818 (50) | 0.702 (46) | 0.935 (62) | 0.545 (55) |
| NeuralBF | | 0.718 (50) | 1.000 (1) | 0.945 (33) | 0.901 (7) | 0.754 (36) | 0.817 (48) | 0.460 (47) | 0.700 (42) | 0.772 (39) | 0.688 (31) | 0.568 (49) | 0.000 (72) | 0.500 (59) | 0.981 (42) | 0.606 (63) | 0.872 (26) | 0.740 (38) | 1.000 (1) | 0.614 (46) |
| DENet | | 0.786 (35) | 1.000 (1) | 0.929 (39) | 0.736 (52) | 0.750 (40) | 0.720 (62) | 0.755 (15) | 0.934 (1) | 0.794 (35) | 0.590 (49) | 0.561 (50) | 0.537 (43) | 0.650 (28) | 1.000 (1) | 0.882 (30) | 0.804 (58) | 0.789 (27) | 1.000 (1) | 0.719 (30) |
| 3D-MPA | | 0.737 (46) | 1.000 (1) | 0.933 (37) | 0.785 (40) | 0.794 (25) | 0.831 (44) | 0.279 (61) | 0.588 (54) | 0.695 (51) | 0.616 (42) | 0.559 (51) | 0.556 (37) | 0.650 (28) | 1.000 (1) | 0.809 (47) | 0.875 (24) | 0.696 (48) | 1.000 (1) | 0.608 (49) |
| PanopticFusion-inst | | 0.693 (53) | 1.000 (1) | 0.852 (59) | 0.655 (60) | 0.616 (58) | 0.788 (52) | 0.334 (56) | 0.763 (31) | 0.771 (40) | 0.457 (66) | 0.555 (52) | 0.652 (27) | 0.518 (56) | 0.857 (51) | 0.765 (53) | 0.732 (66) | 0.631 (56) | 0.944 (55) | 0.577 (52) |
| Occipital-SCS | | 0.688 (54) | 1.000 (1) | 0.913 (47) | 0.730 (54) | 0.737 (44) | 0.743 (59) | 0.442 (50) | 0.855 (13) | 0.655 (54) | 0.546 (57) | 0.546 (53) | 0.263 (60) | 0.508 (58) | 0.889 (49) | 0.568 (64) | 0.771 (63) | 0.705 (45) | 0.889 (65) | 0.625 (43) |
| MTML | | 0.731 (47) | 1.000 (1) | 0.992 (10) | 0.779 (46) | 0.609 (59) | 0.746 (57) | 0.308 (58) | 0.867 (8) | 0.601 (59) | 0.607 (45) | 0.539 (54) | 0.519 (46) | 0.550 (52) | 1.000 (1) | 0.824 (43) | 0.869 (30) | 0.729 (40) | 1.000 (1) | 0.616 (45) |
| OSIS | | 0.725 (48) | 1.000 (1) | 0.885 (56) | 0.653 (61) | 0.657 (56) | 0.801 (50) | 0.576 (34) | 0.695 (43) | 0.828 (30) | 0.698 (30) | 0.534 (55) | 0.457 (52) | 0.500 (59) | 0.857 (51) | 0.831 (42) | 0.841 (44) | 0.627 (58) | 1.000 (1) | 0.619 (44) |
| One_Thing_One_Click | permissive | 0.675 (59) | 1.000 (1) | 0.823 (61) | 0.782 (43) | 0.621 (57) | 0.766 (54) | 0.211 (65) | 0.736 (38) | 0.560 (63) | 0.586 (50) | 0.522 (56) | 0.636 (31) | 0.453 (63) | 0.641 (63) | 0.853 (38) | 0.850 (41) | 0.694 (49) | 0.997 (37) | 0.411 (64) |
| ClickSeg_Instance | | 0.685 (56) | 1.000 (1) | 0.818 (62) | 0.600 (64) | 0.715 (49) | 0.795 (51) | 0.557 (36) | 0.533 (59) | 0.591 (61) | 0.601 (46) | 0.519 (57) | 0.429 (55) | 0.638 (41) | 0.938 (46) | 0.706 (58) | 0.817 (52) | 0.624 (59) | 0.944 (55) | 0.502 (59) |
| SPG_WSIS | | 0.678 (58) | 1.000 (1) | 0.880 (57) | 0.836 (27) | 0.701 (51) | 0.727 (61) | 0.273 (63) | 0.607 (52) | 0.706 (49) | 0.541 (59) | 0.515 (58) | 0.174 (64) | 0.600 (45) | 0.857 (51) | 0.716 (57) | 0.846 (43) | 0.711 (44) | 1.000 (1) | 0.506 (58) |
| 3D-BoNet | | 0.687 (55) | 1.000 (1) | 0.887 (55) | 0.836 (27) | 0.587 (62) | 0.643 (69) | 0.550 (40) | 0.620 (50) | 0.724 (47) | 0.522 (61) | 0.501 (59) | 0.243 (61) | 0.512 (57) | 1.000 (1) | 0.751 (55) | 0.807 (56) | 0.661 (55) | 0.909 (64) | 0.612 (48) |
| R-PointNet | | 0.544 (64) | 0.500 (71) | 0.655 (70) | 0.661 (59) | 0.663 (53) | 0.765 (55) | 0.432 (51) | 0.214 (70) | 0.612 (57) | 0.584 (51) | 0.499 (60) | 0.204 (63) | 0.286 (69) | 0.429 (68) | 0.655 (60) | 0.650 (71) | 0.539 (61) | 0.950 (54) | 0.499 (60) |
| SSEN | | 0.724 (49) | 1.000 (1) | 0.926 (41) | 0.781 (45) | 0.661 (54) | 0.845 (41) | 0.596 (33) | 0.529 (60) | 0.764 (44) | 0.653 (38) | 0.489 (61) | 0.461 (51) | 0.500 (59) | 0.859 (50) | 0.765 (53) | 0.872 (27) | 0.761 (33) | 1.000 (1) | 0.577 (51) |
| SegGroup_ins | permissive | 0.637 (60) | 1.000 (1) | 0.923 (45) | 0.593 (65) | 0.561 (63) | 0.746 (58) | 0.143 (70) | 0.504 (62) | 0.766 (42) | 0.485 (64) | 0.442 (62) | 0.372 (56) | 0.530 (55) | 0.714 (60) | 0.815 (46) | 0.775 (62) | 0.673 (54) | 1.000 (1) | 0.431 (63) |
| UNet-backbone | | 0.605 (62) | 1.000 (1) | 0.909 (48) | 0.764 (48) | 0.603 (60) | 0.704 (63) | 0.415 (52) | 0.301 (67) | 0.548 (64) | 0.461 (65) | 0.394 (63) | 0.267 (59) | 0.386 (65) | 0.857 (51) | 0.649 (61) | 0.817 (51) | 0.504 (64) | 0.959 (53) | 0.356 (67) |
| Region-18class | | 0.497 (66) | 0.250 (73) | 0.902 (49) | 0.689 (57) | 0.540 (64) | 0.747 (56) | 0.276 (62) | 0.610 (51) | 0.268 (72) | 0.489 (63) | 0.348 (64) | 0.000 (72) | 0.243 (72) | 0.220 (71) | 0.663 (59) | 0.814 (53) | 0.459 (68) | 0.928 (63) | 0.496 (61) |
| SemRegionNet-20cls | | 0.470 (69) | 1.000 (1) | 0.727 (65) | 0.447 (70) | 0.481 (67) | 0.678 (66) | 0.024 (72) | 0.380 (65) | 0.518 (65) | 0.440 (67) | 0.339 (65) | 0.128 (66) | 0.350 (66) | 0.429 (68) | 0.212 (72) | 0.711 (68) | 0.465 (67) | 0.833 (69) | 0.290 (71) |
| tmp | | 0.474 (68) | 1.000 (1) | 0.727 (65) | 0.433 (71) | 0.481 (68) | 0.673 (67) | 0.022 (73) | 0.380 (65) | 0.517 (66) | 0.436 (68) | 0.338 (66) | 0.128 (66) | 0.343 (67) | 0.429 (68) | 0.291 (71) | 0.728 (67) | 0.473 (65) | 0.833 (69) | 0.300 (70) |
| Hier3D | copyleft | 0.540 (65) | 1.000 (1) | 0.727 (65) | 0.626 (62) | 0.467 (69) | 0.693 (64) | 0.200 (66) | 0.412 (63) | 0.480 (68) | 0.528 (60) | 0.318 (67) | 0.077 (71) | 0.600 (45) | 0.688 (62) | 0.382 (66) | 0.768 (64) | 0.472 (66) | 0.941 (59) | 0.350 (68) |
| 3D-SIS | permissive | 0.558 (63) | 1.000 (1) | 0.773 (64) | 0.614 (63) | 0.503 (66) | 0.691 (65) | 0.200 (66) | 0.412 (63) | 0.498 (67) | 0.546 (58) | 0.311 (68) | 0.103 (68) | 0.600 (45) | 0.857 (51) | 0.382 (66) | 0.799 (59) | 0.445 (70) | 0.938 (61) | 0.371 (65) |
| Sem_Recon_ins | | 0.484 (67) | 0.764 (67) | 0.608 (72) | 0.470 (69) | 0.521 (65) | 0.637 (70) | 0.311 (57) | 0.218 (69) | 0.348 (71) | 0.365 (70) | 0.223 (69) | 0.222 (62) | 0.258 (70) | 0.629 (64) | 0.734 (56) | 0.596 (72) | 0.509 (63) | 0.858 (68) | 0.444 (62) |
| ASIS | | 0.422 (70) | 0.333 (72) | 0.707 (68) | 0.676 (58) | 0.401 (70) | 0.650 (68) | 0.350 (55) | 0.177 (71) | 0.594 (60) | 0.376 (69) | 0.202 (70) | 0.077 (70) | 0.404 (64) | 0.571 (66) | 0.197 (73) | 0.674 (70) | 0.447 (69) | 0.500 (72) | 0.260 (72) |
| Sgpn_scannet | | 0.390 (72) | 0.556 (70) | 0.636 (71) | 0.493 (68) | 0.353 (71) | 0.539 (72) | 0.271 (64) | 0.160 (72) | 0.450 (69) | 0.359 (71) | 0.178 (71) | 0.146 (65) | 0.250 (71) | 0.143 (72) | 0.347 (70) | 0.698 (69) | 0.436 (71) | 0.667 (71) | 0.331 (69) |
| MaskRCNN 2d->3d Proj | | 0.261 (73) | 0.903 (66) | 0.081 (73) | 0.008 (73) | 0.233 (72) | 0.175 (73) | 0.280 (60) | 0.106 (73) | 0.150 (73) | 0.203 (73) | 0.175 (72) | 0.480 (49) | 0.218 (73) | 0.143 (72) | 0.542 (65) | 0.404 (73) | 0.153 (73) | 0.393 (73) | 0.049 (73) |
| 3D-BEVIS | | 0.401 (71) | 0.667 (69) | 0.687 (69) | 0.419 (72) | 0.137 (73) | 0.587 (71) | 0.188 (69) | 0.235 (68) | 0.359 (70) | 0.211 (72) | 0.093 (73) | 0.080 (69) | 0.311 (68) | 0.571 (66) | 0.382 (66) | 0.754 (65) | 0.300 (72) | 0.874 (67) | 0.357 (66) |

Publications listed for the methods above:

SoftGroup: Thang Vu, Kookhoi Kim, Tung M. Luu, Xuan Thanh Nguyen, Chang D. Yoo: SoftGroup for 3D Instance Segmentation on Point Clouds. CVPR 2022 [Oral]
PointRel: Relation3D (PointRel): Enhancing Relation Modeling for Point Cloud Instance Segmentation.
OneFormer3D: Maxim Kolodiazhnyi, Anna Vorontsova, Anton Konushin, Danila Rukhovich: OneFormer3D: One Transformer for Unified Point Cloud Segmentation.
Mask3D: Jonas Schult, Francis Engelmann, Alexander Hermans, Or Litany, Siyu Tang, Bastian Leibe: Mask3D for 3D Semantic Instance Segmentation. ICRA 2023
TST3D: Duc Tran Dang Trung, Byeongkeun Kang, Yeejin Lee: MSTA3D: Multi-scale Twin-attention for 3D Instance Segmentation. ACM Multimedia 2024
SPFormer: Sun Jiahao, Qing Chunmei, Tan Junpeng, Xu Xiangmin: Superpoint Transformer for 3D Scene Instance Segmentation. AAAI 2023 [Oral]
TD3D: Maksim Kolodiazhnyi, Anna Vorontsova, Anton Konushin, Danila Rukhovich: Top-Down Beats Bottom-Up in 3D Instance Segmentation. WACV 2024
PointGroup: Li Jiang, Hengshuang Zhao, Shaoshuai Shi, Shu Liu, Chi-Wing Fu, Jiaya Jia: PointGroup: Dual-Set Point Grouping for 3D Instance Segmentation. CVPR 2020 [Oral]
PE: Biao Zhang, Peter Wonka: Point Cloud Instance Segmentation using Probabilistic Embeddings. CVPR 2021
RPGN: Shichao Dong, Guosheng Lin, Tzu-Yi Hung: Learning Regional Purity for Instance Segmentation on 3D Point Clouds. ECCV 2022
Box2Mask: Julian Chibane, Francis Engelmann, Tuan Anh Tran, Gerard Pons-Moll: Box2Mask: Weakly Supervised 3D Semantic Instance Segmentation Using Bounding Boxes. ECCV 2022
HAIS: Shaoyu Chen, Jiemin Fang, Qian Zhang, Wenyu Liu, Xinggang Wang: Hierarchical Aggregation for 3D Instance Segmentation. ICCV 2021
ISBNet: Tuan Duc Ngo, Binh-Son Hua, Khoi Nguyen: ISBNet: a 3D Point Cloud Instance Segmentation Network with Instance-aware Sampling and Box-aware Dynamic Convolution. CVPR 2023
DKNet: Yizheng Wu, Min Shi, Shuaiyuan Du, Hao Lu, Zhiguo Cao, Weicai Zhong: 3D Instances as 1D Kernels. ECCV 2022
DD-UNet+Group: H. Liu, R. Liu, K. Yang, J. Zhang, K. Peng, R. Stiefelhagen: HIDA: Towards Holistic Indoor Understanding for the Visually Impaired via Semantic Instance Segmentation with a Wearable Solid-State LiDAR Sensor. ICCVW 2021
SSTNet: Zhihao Liang, Zhihao Li, Songcen Xu, Mingkui Tan, Kui Jia: Instance Segmentation in 3D Scenes using Semantic Superpoint Tree Networks. ICCV 2021
Mask-Group: Min Zhong, Xinghao Chen, Xiaokang Chen, Gang Zeng, Yunhe Wang: MaskGroup: Hierarchical Point Grouping and Masking for 3D Instance Segmentation. ICME 2022
OccuSeg+instance: Lei Han, Tian Zheng, Lan Xu, Lu Fang: OccuSeg: Occupancy-aware 3D Instance Segmentation. CVPR 2020
PBNet: W. Zhao, Y. Yan, C. Yang, J. Ye, X. Yang, K. Huang: Divide and Conquer: 3D Instance Segmentation With Point-Wise Binarization. ICCV 2023
Dyco3D: Tong He, Chunhua Shen, Anton van den Hengel: DyCo3D: Robust Instance Segmentation of 3D Point Clouds through Dynamic Convolution. CVPR 2021
MASC: Chen Liu, Yasutaka Furukawa: MASC: Multi-scale Affinity with Sparse Convolution for 3D Instance Segmentation.
SALoss-ResNet: Zhidong Liang, Ming Yang, Hao Li, Chunxiang Wang: 3D Instance Embedding Learning With a Structure-Aware Loss Function for Point Cloud Segmentation. IEEE Robotics and Automation Letters (IROS 2020)
NeuralBF: Weiwei Sun, Daniel Rebain, Renjie Liao, Vladimir Tankovich, Soroosh Yazdani, Kwang Moo Yi, Andrea Tagliasacchi: NeuralBF: Neural Bilateral Filtering for Top-down Instance Segmentation on Point Clouds. WACV 2023
3D-MPA: Francis Engelmann, Martin Bokeloh, Alireza Fathi, Bastian Leibe, Matthias Nießner: 3D-MPA: Multi Proposal Aggregation for 3D Semantic Instance Segmentation. CVPR 2020
PanopticFusion-inst: Gaku Narita, Takashi Seno, Tomoya Ishikawa, Yohsuke Kaji: PanopticFusion: Online Volumetric Semantic Mapping at the Level of Stuff and Things. IROS 2019 (to appear)
MTML: Jean Lahoud, Bernard Ghanem, Marc Pollefeys, Martin R. Oswald: 3D Instance Segmentation via Multi-task Metric Learning. ICCV 2019 [Oral]
One_Thing_One_Click: Zhengzhe Liu, Xiaojuan Qi, Chi-Wing Fu: One Thing One Click: A Self-Training Approach for Weakly Supervised 3D Semantic Segmentation. CVPR 2021
3D-BoNet: Bo Yang, Jianan Wang, Ronald Clark, Qingyong Hu, Sen Wang, Andrew Markham, Niki Trigoni: Learning Object Bounding Boxes for 3D Instance Segmentation on Point Clouds. NeurIPS 2019 Spotlight
SSEN: Dongsu Zhang, Junha Chun, Sang Kyun Cha, Young Min Kim: Spatial Semantic Embedding Network: Fast 3D Instance Segmentation with Deep Metric Learning. arXiv
SegGroup_ins: An Tao, Yueqi Duan, Yi Wei, Jiwen Lu, Jie Zhou: SegGroup: Seg-Level Supervision for 3D Instance and Semantic Segmentation. TIP 2022
Hier3D: Tan: HCFS3D: Hierarchical Coupled Feature Selection Network for 3D Semantic and Instance Segmentation.
3D-SIS: Ji Hou, Angela Dai, Matthias Niessner: 3D-SIS: 3D Semantic Instance Segmentation of RGB-D Scans. CVPR 2019
3D-BEVIS: Cathrin Elich, Francis Engelmann, Jonas Schult, Theodora Kontogianni, Bastian Leibe: 3D-BEVIS: Birds-Eye-View Instance Segmentation.