The 3D semantic instance prediction task involves detecting and segmenting the objects in a 3D scan mesh.

Evaluation and metrics

Our evaluation ranks all methods according to the average precision (AP) for each class. We report the mean average precision at an overlap threshold of 0.25 (AP 25%), at an overlap threshold of 0.5 (AP 50%), and averaged over overlap thresholds in the range [0.5:0.95:0.05] (AP). Note that multiple predictions of the same ground truth instance are penalized as false positives.
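As a rough illustration of the metric described above, the following Python sketch computes the AP of one class at a single overlap threshold and then averages it over a range of thresholds. It is not the benchmark's official evaluation script. The function names (`iou`, `average_precision`, `mean_ap`), the representation of a mask as a set of vertex indices, the strict `>` comparison against the threshold, and the inclusion of both 0.5 and 0.95 in the threshold list are all assumptions made for this example.

```python
from typing import List, Sequence, Set, Tuple

Prediction = Tuple[float, Set[int]]  # (confidence, predicted mask as vertex indices)


def iou(a: Set[int], b: Set[int]) -> float:
    """Overlap (intersection over union) of two masks given as vertex-index sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def average_precision(preds: Sequence[Prediction],
                      gts: Sequence[Set[int]],
                      threshold: float) -> float:
    """AP for one class at one overlap threshold (simplified sketch)."""
    if not gts:
        return 0.0
    matched = [False] * len(gts)
    hits: List[bool] = []  # True = true positive, False = false positive
    # Visit predictions from most to least confident.
    for _, mask in sorted(preds, key=lambda p: p[0], reverse=True):
        best_iou, best_idx = 0.0, -1
        for i, gt in enumerate(gts):
            overlap = iou(mask, gt)
            if overlap > best_iou:
                best_iou, best_idx = overlap, i
        if best_idx >= 0 and best_iou > threshold and not matched[best_idx]:
            matched[best_idx] = True
            hits.append(True)
        else:
            # Too little overlap, or a duplicate of an already matched
            # ground truth instance: counted as a false positive.
            hits.append(False)
    # Sum the precision at each true positive, normalised by the number of
    # ground truth instances (area under the precision-recall curve).
    true_positives, total = 0, 0.0
    for rank, hit in enumerate(hits, start=1):
        if hit:
            true_positives += 1
            total += true_positives / rank
    return total / len(gts)


def mean_ap(preds: Sequence[Prediction], gts: Sequence[Set[int]]) -> float:
    """Average of AP over thresholds 0.50, 0.55, ..., 0.95 (endpoints assumed inclusive)."""
    thresholds = [round(0.5 + 0.05 * i, 2) for i in range(10)]
    return sum(average_precision(preds, gts, t) for t in thresholds) / len(thresholds)


if __name__ == "__main__":
    ground_truth = [{0, 1, 2, 3}, {10, 11}]
    predictions = [
        (0.9, {0, 1, 2, 3}),  # exact match for the first instance
        (0.8, {0, 1, 2}),     # second prediction of the same instance
        (0.7, {10, 11}),      # exact match for the second instance
    ]
    print(round(average_precision(predictions, ground_truth, 0.5), 3))  # 0.833
```

In the toy example the duplicate prediction lowers AP at 0.5 from 1.0 to about 0.833, which is the penalty for multiple predictions of the same ground truth instance.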



This table lists the benchmark results for the 3D semantic instance scenario.




Each cell gives the AP 50% score followed in parentheses by the method's rank in that column. A bracketed tag after a method name is its Info entry (permissive or copyleft). Where available, the corresponding publication is listed on the line below the method.

Method [Info] | avg AP 50% | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | window
Spherical Mask(CtoF) | 0.812 (1) | 1.000 (1) | 0.973 (3) | 0.852 (12) | 0.718 (4) | 0.917 (5) | 0.574 (4) | 0.677 (25) | 0.748 (8) | 0.729 (8) | 0.715 (5) | 0.795 (2) | 0.809 (1) | 1.000 (1) | 0.831 (2) | 0.854 (8) | 0.787 (7) | 1.000 (1) | 0.638 (4)
SIM3D | 0.805 (2) | 1.000 (1) | 0.971 (4) | 0.863 (11) | 0.686 (13) | 0.924 (4) | 0.552 (7) | 0.739 (17) | 0.674 (15) | 0.740 (6) | 0.666 (11) | 0.807 (1) | 0.789 (7) | 1.000 (1) | 0.803 (5) | 0.866 (5) | 0.775 (13) | 1.000 (1) | 0.639 (3)
OneFormer3D [copyleft] | 0.801 (3) | 1.000 (1) | 0.973 (2) | 0.909 (5) | 0.698 (10) | 0.928 (2) | 0.582 (3) | 0.668 (29) | 0.685 (13) | 0.780 (2) | 0.687 (9) | 0.698 (13) | 0.702 (12) | 1.000 (1) | 0.794 (7) | 0.900 (2) | 0.784 (9) | 0.986 (46) | 0.635 (5)
    Maxim Kolodiazhnyi, Anna Vorontsova, Anton Konushin, Danila Rukhovich: OneFormer3D: One Transformer for Unified Point Cloud Segmentation.
UniPerception | 0.800 (4) | 1.000 (1) | 0.930 (6) | 0.872 (9) | 0.727 (3) | 0.862 (18) | 0.454 (13) | 0.764 (13) | 0.820 (1) | 0.746 (5) | 0.706 (7) | 0.750 (3) | 0.772 (8) | 0.926 (39) | 0.764 (12) | 0.818 (23) | 0.826 (1) | 0.997 (34) | 0.660 (2)
InsSSM | 0.799 (5) | 1.000 (1) | 0.915 (8) | 0.710 (35) | 0.729 (2) | 0.925 (3) | 0.664 (1) | 0.670 (27) | 0.770 (5) | 0.766 (3) | 0.739 (2) | 0.737 (4) | 0.700 (13) | 1.000 (1) | 0.792 (8) | 0.829 (17) | 0.815 (3) | 0.997 (34) | 0.625 (7)
TST3D | 0.795 (6) | 1.000 (1) | 0.929 (7) | 0.918 (4) | 0.709 (7) | 0.884 (13) | 0.596 (2) | 0.704 (22) | 0.769 (6) | 0.734 (7) | 0.644 (15) | 0.699 (12) | 0.751 (10) | 1.000 (1) | 0.794 (6) | 0.876 (4) | 0.757 (17) | 0.997 (34) | 0.550 (26)
ExtMask3D | 0.789 (7) | 1.000 (1) | 0.988 (1) | 0.756 (28) | 0.706 (8) | 0.912 (6) | 0.429 (14) | 0.647 (34) | 0.806 (4) | 0.755 (4) | 0.673 (10) | 0.689 (14) | 0.772 (9) | 1.000 (1) | 0.789 (9) | 0.852 (9) | 0.811 (4) | 1.000 (1) | 0.617 (10)
Queryformer | 0.787 (8) | 1.000 (1) | 0.933 (5) | 0.601 (44) | 0.754 (1) | 0.886 (11) | 0.558 (6) | 0.661 (31) | 0.767 (7) | 0.665 (13) | 0.716 (4) | 0.639 (19) | 0.808 (3) | 1.000 (1) | 0.844 (1) | 0.897 (3) | 0.804 (5) | 1.000 (1) | 0.624 (8)
MAFT | 0.786 (9) | 1.000 (1) | 0.894 (13) | 0.807 (18) | 0.694 (12) | 0.893 (9) | 0.486 (9) | 0.674 (26) | 0.740 (9) | 0.786 (1) | 0.704 (8) | 0.727 (6) | 0.739 (11) | 1.000 (1) | 0.707 (18) | 0.849 (11) | 0.756 (18) | 1.000 (1) | 0.685 (1)
Mask3D | 0.780 (10) | 1.000 (1) | 0.786 (37) | 0.716 (33) | 0.696 (11) | 0.885 (12) | 0.500 (8) | 0.714 (20) | 0.810 (3) | 0.672 (12) | 0.715 (5) | 0.679 (15) | 0.809 (1) | 1.000 (1) | 0.831 (2) | 0.833 (15) | 0.787 (7) | 1.000 (1) | 0.602 (14)
    Jonas Schult, Francis Engelmann, Alexander Hermans, Or Litany, Siyu Tang, Bastian Leibe: Mask3D for 3D Semantic Instance Segmentation. ICRA 2023
SPFormer [permissive] | 0.770 (11) | 0.903 (50) | 0.903 (10) | 0.806 (19) | 0.609 (26) | 0.886 (10) | 0.568 (5) | 0.815 (6) | 0.705 (12) | 0.711 (9) | 0.655 (12) | 0.652 (18) | 0.685 (18) | 1.000 (1) | 0.789 (10) | 0.809 (24) | 0.776 (12) | 1.000 (1) | 0.583 (19)
    Sun Jiahao, Qing Chunmei, Tan Junpeng, Xu Xiangmin: Superpoint Transformer for 3D Scene Instance Segmentation. AAAI 2023 [Oral]
SoftGroup++ | 0.769 (12) | 1.000 (1) | 0.803 (30) | 0.937 (1) | 0.684 (14) | 0.865 (15) | 0.213 (29) | 0.870 (2) | 0.664 (17) | 0.571 (19) | 0.758 (1) | 0.702 (10) | 0.807 (4) | 1.000 (1) | 0.653 (25) | 0.902 (1) | 0.792 (6) | 1.000 (1) | 0.626 (6)
SoftGroup [permissive] | 0.761 (13) | 1.000 (1) | 0.808 (26) | 0.845 (13) | 0.716 (5) | 0.862 (17) | 0.243 (26) | 0.824 (4) | 0.655 (19) | 0.620 (14) | 0.734 (3) | 0.699 (11) | 0.791 (6) | 0.981 (33) | 0.716 (16) | 0.844 (12) | 0.769 (14) | 1.000 (1) | 0.594 (17)
    Thang Vu, Kookhoi Kim, Tung M. Luu, Xuan Thanh Nguyen, Chang D. Yoo: SoftGroup for 3D Instance Segmentation on Point Clouds. CVPR 2022 [Oral]
ISBNet [permissive] | 0.757 (14) | 1.000 (1) | 0.904 (9) | 0.731 (31) | 0.678 (15) | 0.895 (7) | 0.458 (11) | 0.644 (36) | 0.670 (16) | 0.710 (10) | 0.620 (20) | 0.732 (5) | 0.650 (20) | 1.000 (1) | 0.756 (13) | 0.778 (27) | 0.779 (10) | 1.000 (1) | 0.614 (11)
    Tuan Duc Ngo, Binh-Son Hua, Khoi Nguyen: ISBNet: a 3D Point Cloud Instance Segmentation Network with Instance-aware Sampling and Box-aware Dynamic Convolution. CVPR 2023
TD3D [permissive] | 0.751 (15) | 1.000 (1) | 0.774 (38) | 0.867 (10) | 0.621 (22) | 0.934 (1) | 0.404 (15) | 0.706 (21) | 0.812 (2) | 0.605 (17) | 0.633 (18) | 0.626 (20) | 0.690 (17) | 1.000 (1) | 0.640 (27) | 0.820 (20) | 0.777 (11) | 1.000 (1) | 0.612 (12)
    Maksim Kolodiazhnyi, Anna Vorontsova, Anton Konushin, Danila Rukhovich: Top-Down Beats Bottom-Up in 3D Instance Segmentation. WACV 2024
PBNet [permissive] | 0.747 (16) | 1.000 (1) | 0.818 (22) | 0.837 (15) | 0.713 (6) | 0.844 (20) | 0.457 (12) | 0.647 (34) | 0.711 (11) | 0.614 (15) | 0.617 (22) | 0.657 (17) | 0.650 (20) | 1.000 (1) | 0.692 (19) | 0.822 (19) | 0.765 (16) | 1.000 (1) | 0.595 (16)
    W.Zhao, Y.Yan, C.Yang, J.Ye,X.Yang,K.Huang: Divide and Conquer: 3D Instance Segmentation With Point-Wise Binarization. ICCV 2023
GraphCut | 0.732 (17) | 1.000 (1) | 0.788 (35) | 0.724 (32) | 0.642 (20) | 0.859 (19) | 0.248 (25) | 0.787 (11) | 0.618 (22) | 0.596 (18) | 0.653 (14) | 0.722 (8) | 0.583 (41) | 1.000 (1) | 0.766 (11) | 0.861 (6) | 0.825 (2) | 1.000 (1) | 0.504 (32)
IPCA-Inst | 0.731 (18) | 1.000 (1) | 0.788 (36) | 0.884 (8) | 0.698 (9) | 0.788 (36) | 0.252 (24) | 0.760 (14) | 0.646 (20) | 0.511 (27) | 0.637 (17) | 0.665 (16) | 0.804 (5) | 1.000 (1) | 0.644 (26) | 0.778 (28) | 0.747 (20) | 1.000 (1) | 0.561 (23)
TopoSeg | 0.725 (19) | 1.000 (1) | 0.806 (29) | 0.933 (2) | 0.668 (17) | 0.758 (40) | 0.272 (23) | 0.734 (19) | 0.630 (21) | 0.549 (23) | 0.654 (13) | 0.606 (21) | 0.697 (16) | 0.966 (36) | 0.612 (31) | 0.839 (13) | 0.754 (19) | 1.000 (1) | 0.573 (20)
DKNet | 0.718 (20) | 1.000 (1) | 0.814 (23) | 0.782 (22) | 0.619 (23) | 0.872 (14) | 0.224 (27) | 0.751 (16) | 0.569 (26) | 0.677 (11) | 0.585 (26) | 0.724 (7) | 0.633 (31) | 0.981 (33) | 0.515 (41) | 0.819 (21) | 0.736 (21) | 1.000 (1) | 0.617 (9)
    Yizheng Wu, Min Shi, Shuaiyuan Du, Hao Lu, Zhiguo Cao, Weicai Zhong: 3D Instances as 1D Kernels. ECCV 2022
SSEC | 0.707 (21) | 1.000 (1) | 0.850 (15) | 0.924 (3) | 0.648 (18) | 0.747 (43) | 0.162 (31) | 0.862 (3) | 0.572 (25) | 0.520 (25) | 0.624 (19) | 0.549 (24) | 0.649 (29) | 1.000 (1) | 0.560 (36) | 0.706 (43) | 0.768 (15) | 1.000 (1) | 0.591 (18)
HAIS [permissive] | 0.699 (22) | 1.000 (1) | 0.849 (16) | 0.820 (16) | 0.675 (16) | 0.808 (30) | 0.279 (21) | 0.757 (15) | 0.465 (32) | 0.517 (26) | 0.596 (24) | 0.559 (23) | 0.600 (35) | 1.000 (1) | 0.654 (24) | 0.767 (30) | 0.676 (25) | 0.994 (42) | 0.560 (24)
    Shaoyu Chen, Jiemin Fang, Qian Zhang, Wenyu Liu, Xinggang Wang: Hierarchical Aggregation for 3D Instance Segmentation. ICCV 2021
SSTNet [permissive] | 0.698 (23) | 1.000 (1) | 0.697 (54) | 0.888 (7) | 0.556 (33) | 0.803 (31) | 0.387 (16) | 0.626 (38) | 0.417 (37) | 0.556 (22) | 0.585 (27) | 0.702 (9) | 0.600 (35) | 1.000 (1) | 0.824 (4) | 0.720 (42) | 0.692 (23) | 1.000 (1) | 0.509 (31)
    Zhihao Liang, Zhihao Li, Songcen Xu, Mingkui Tan, Kui Jia: Instance Segmentation in 3D Scenes using Semantic Superpoint Tree Networks. ICCV2021
DualGroup | 0.694 (24) | 1.000 (1) | 0.799 (32) | 0.811 (17) | 0.622 (21) | 0.817 (25) | 0.376 (17) | 0.805 (9) | 0.590 (24) | 0.487 (31) | 0.568 (30) | 0.525 (28) | 0.650 (20) | 0.835 (49) | 0.600 (32) | 0.829 (16) | 0.655 (28) | 1.000 (1) | 0.526 (28)
SphereSeg | 0.680 (25) | 1.000 (1) | 0.856 (14) | 0.744 (29) | 0.618 (24) | 0.893 (8) | 0.151 (32) | 0.651 (33) | 0.713 (10) | 0.537 (24) | 0.579 (29) | 0.430 (38) | 0.651 (19) | 1.000 (1) | 0.389 (52) | 0.744 (37) | 0.697 (22) | 0.991 (44) | 0.601 (15)
DANCENET | 0.680 (25) | 1.000 (1) | 0.807 (27) | 0.733 (30) | 0.600 (27) | 0.768 (39) | 0.375 (18) | 0.543 (46) | 0.538 (27) | 0.610 (16) | 0.599 (23) | 0.498 (29) | 0.632 (33) | 0.981 (33) | 0.739 (15) | 0.856 (7) | 0.633 (34) | 0.882 (57) | 0.454 (41)
Box2Mask | 0.677 (27) | 1.000 (1) | 0.847 (17) | 0.771 (24) | 0.509 (42) | 0.816 (26) | 0.277 (22) | 0.558 (45) | 0.482 (29) | 0.562 (21) | 0.640 (16) | 0.448 (34) | 0.700 (13) | 1.000 (1) | 0.666 (20) | 0.852 (10) | 0.578 (41) | 0.997 (34) | 0.488 (36)
    Julian Chibane, Francis Engelmann, Tuan Anh Tran, Gerard Pons-Moll: Box2Mask: Weakly Supervised 3D Semantic Instance Segmentation Using Bounding Boxes. ECCV 2022
OccuSeg+instance | 0.672 (28) | 1.000 (1) | 0.758 (46) | 0.682 (37) | 0.576 (31) | 0.842 (21) | 0.477 (10) | 0.504 (52) | 0.524 (28) | 0.567 (20) | 0.585 (28) | 0.451 (33) | 0.557 (43) | 1.000 (1) | 0.751 (14) | 0.797 (25) | 0.563 (44) | 1.000 (1) | 0.467 (40)
    Lei Han, Tian Zheng, Lan Xu, Lu Fang: OccuSeg: Occupancy-aware 3D Instance Segmentation. CVPR2020
Mask-Group | 0.664 (29) | 1.000 (1) | 0.822 (21) | 0.764 (27) | 0.616 (25) | 0.815 (27) | 0.139 (36) | 0.694 (24) | 0.597 (23) | 0.459 (35) | 0.566 (31) | 0.599 (22) | 0.600 (35) | 0.516 (59) | 0.715 (17) | 0.819 (22) | 0.635 (32) | 1.000 (1) | 0.603 (13)
    Min Zhong, Xinghao Chen, Xiaokang Chen, Gang Zeng, Yunhe Wang: MaskGroup: Hierarchical Point Grouping and Masking for 3D Instance Segmentation. ICME 2022
INS-Conv-instance | 0.657 (30) | 1.000 (1) | 0.760 (44) | 0.667 (39) | 0.581 (29) | 0.863 (16) | 0.323 (19) | 0.655 (32) | 0.477 (30) | 0.473 (33) | 0.549 (33) | 0.432 (37) | 0.650 (20) | 1.000 (1) | 0.655 (23) | 0.738 (38) | 0.585 (40) | 0.944 (49) | 0.472 (39)
CSC-Pretrained | 0.648 (31) | 1.000 (1) | 0.810 (24) | 0.768 (25) | 0.523 (40) | 0.813 (28) | 0.143 (35) | 0.819 (5) | 0.389 (40) | 0.422 (44) | 0.511 (37) | 0.443 (35) | 0.650 (20) | 1.000 (1) | 0.624 (29) | 0.732 (39) | 0.634 (33) | 1.000 (1) | 0.375 (48)
PE | 0.645 (32) | 1.000 (1) | 0.773 (40) | 0.798 (21) | 0.538 (35) | 0.786 (37) | 0.088 (44) | 0.799 (10) | 0.350 (44) | 0.435 (42) | 0.547 (34) | 0.545 (25) | 0.646 (30) | 0.933 (38) | 0.562 (35) | 0.761 (33) | 0.556 (49) | 0.997 (34) | 0.501 (34)
    Biao Zhang, Peter Wonka: Point Cloud Instance Segmentation using Probabilistic Embeddings. CVPR 2021
RPGN | 0.643 (33) | 1.000 (1) | 0.758 (45) | 0.582 (50) | 0.539 (34) | 0.826 (24) | 0.046 (49) | 0.765 (12) | 0.372 (42) | 0.436 (41) | 0.588 (25) | 0.539 (27) | 0.650 (20) | 1.000 (1) | 0.577 (33) | 0.750 (35) | 0.653 (30) | 0.997 (34) | 0.495 (35)
    Shichao Dong, Guosheng Lin, Tzu-Yi Hung: Learning Regional Purity for Instance Segmentation on 3D Point Clouds. ECCV 2022
Dyco3D [copyleft] | 0.641 (34) | 1.000 (1) | 0.841 (18) | 0.893 (6) | 0.531 (37) | 0.802 (32) | 0.115 (41) | 0.588 (43) | 0.448 (34) | 0.438 (39) | 0.537 (36) | 0.430 (39) | 0.550 (44) | 0.857 (41) | 0.534 (39) | 0.764 (32) | 0.657 (27) | 0.987 (45) | 0.568 (21)
    Tong He; Chunhua Shen; Anton van den Hengel: DyCo3D: Robust Instance Segmentation of 3D Point Clouds through Dynamic Convolution. CVPR2021
GICN | 0.638 (35) | 1.000 (1) | 0.895 (12) | 0.800 (20) | 0.480 (46) | 0.676 (48) | 0.144 (34) | 0.737 (18) | 0.354 (43) | 0.447 (36) | 0.400 (50) | 0.365 (45) | 0.700 (13) | 1.000 (1) | 0.569 (34) | 0.836 (14) | 0.599 (36) | 1.000 (1) | 0.473 (38)
PointGroup | 0.636 (36) | 1.000 (1) | 0.765 (41) | 0.624 (41) | 0.505 (44) | 0.797 (33) | 0.116 (40) | 0.696 (23) | 0.384 (41) | 0.441 (37) | 0.559 (32) | 0.476 (31) | 0.596 (38) | 1.000 (1) | 0.666 (20) | 0.756 (34) | 0.556 (48) | 0.997 (34) | 0.513 (30)
    Li Jiang, Hengshuang Zhao, Shaoshuai Shi, Shu Liu, Chi-Wing Fu, Jiaya Jia: PointGroup: Dual-Set Point Grouping for 3D Instance Segmentation. CVPR 2020 [oral]
DD-UNet+Group | 0.635 (37) | 0.667 (52) | 0.797 (34) | 0.714 (34) | 0.562 (32) | 0.774 (38) | 0.146 (33) | 0.810 (8) | 0.429 (36) | 0.476 (32) | 0.546 (35) | 0.399 (41) | 0.633 (31) | 1.000 (1) | 0.632 (28) | 0.722 (41) | 0.609 (35) | 1.000 (1) | 0.514 (29)
    H. Liu, R. Liu, K. Yang, J. Zhang, K. Peng, R. Stiefelhagen: HIDA: Towards Holistic Indoor Understanding for the Visually Impaired via Semantic Instance Segmentation with a Wearable Solid-State LiDAR Sensor. ICCVW 2021
Mask3D_evaluation | 0.631 (38) | 1.000 (1) | 0.829 (20) | 0.606 (43) | 0.646 (19) | 0.836 (22) | 0.068 (45) | 0.511 (50) | 0.462 (33) | 0.507 (28) | 0.619 (21) | 0.389 (43) | 0.610 (34) | 1.000 (1) | 0.432 (47) | 0.828 (18) | 0.673 (26) | 0.788 (61) | 0.552 (25)
DENet | 0.629 (39) | 1.000 (1) | 0.797 (33) | 0.608 (42) | 0.589 (28) | 0.627 (52) | 0.219 (28) | 0.882 (1) | 0.310 (46) | 0.402 (49) | 0.383 (52) | 0.396 (42) | 0.650 (20) | 1.000 (1) | 0.663 (22) | 0.543 (60) | 0.691 (24) | 1.000 (1) | 0.568 (22)
3D-MPA | 0.611 (40) | 1.000 (1) | 0.833 (19) | 0.765 (26) | 0.526 (39) | 0.756 (41) | 0.136 (38) | 0.588 (43) | 0.470 (31) | 0.438 (40) | 0.432 (46) | 0.358 (47) | 0.650 (20) | 0.857 (41) | 0.429 (48) | 0.765 (31) | 0.557 (47) | 1.000 (1) | 0.430 (43)
    Francis Engelmann, Martin Bokeloh, Alireza Fathi, Bastian Leibe, Matthias Nießner: 3D-MPA: Multi Proposal Aggregation for 3D Semantic Instance Segmentation. CVPR 2020
OSIS | 0.605 (41) | 1.000 (1) | 0.801 (31) | 0.599 (45) | 0.535 (36) | 0.728 (45) | 0.286 (20) | 0.436 (56) | 0.679 (14) | 0.491 (29) | 0.433 (44) | 0.256 (49) | 0.404 (56) | 0.857 (41) | 0.620 (30) | 0.724 (40) | 0.510 (54) | 1.000 (1) | 0.539 (27)
AOIA | 0.601 (42) | 1.000 (1) | 0.761 (43) | 0.687 (36) | 0.485 (45) | 0.828 (23) | 0.008 (56) | 0.663 (30) | 0.405 (39) | 0.405 (48) | 0.425 (47) | 0.490 (30) | 0.596 (38) | 0.714 (52) | 0.553 (38) | 0.779 (26) | 0.597 (37) | 0.992 (43) | 0.424 (45)
PCJC | 0.578 (43) | 1.000 (1) | 0.810 (25) | 0.583 (49) | 0.449 (49) | 0.813 (29) | 0.042 (50) | 0.603 (41) | 0.341 (45) | 0.490 (30) | 0.465 (41) | 0.410 (40) | 0.650 (20) | 0.835 (49) | 0.264 (58) | 0.694 (47) | 0.561 (45) | 0.889 (54) | 0.504 (33)
SSEN | 0.575 (44) | 1.000 (1) | 0.761 (42) | 0.473 (52) | 0.477 (47) | 0.795 (34) | 0.066 (46) | 0.529 (48) | 0.658 (18) | 0.460 (34) | 0.461 (42) | 0.380 (44) | 0.331 (58) | 0.859 (40) | 0.401 (51) | 0.692 (49) | 0.653 (29) | 1.000 (1) | 0.348 (50)
    Dongsu Zhang, Junha Chun, Sang Kyun Cha, Young Min Kim: Spatial Semantic Embedding Network: Fast 3D Instance Segmentation with Deep Metric Learning. Arxiv
RWSeg | 0.567 (45) | 0.528 (62) | 0.708 (53) | 0.626 (40) | 0.580 (30) | 0.745 (44) | 0.063 (47) | 0.627 (37) | 0.240 (50) | 0.400 (50) | 0.497 (38) | 0.464 (32) | 0.515 (45) | 1.000 (1) | 0.475 (43) | 0.745 (36) | 0.571 (42) | 1.000 (1) | 0.429 (44)
NeuralBF | 0.555 (46) | 0.667 (52) | 0.896 (11) | 0.843 (14) | 0.517 (41) | 0.751 (42) | 0.029 (51) | 0.519 (49) | 0.414 (38) | 0.439 (38) | 0.465 (40) | 0.000 (68) | 0.484 (47) | 0.857 (41) | 0.287 (56) | 0.693 (48) | 0.651 (31) | 1.000 (1) | 0.485 (37)
    Weiwei Sun, Daniel Rebain, Renjie Liao, Vladimir Tankovich, Soroosh Yazdani, Kwang Moo Yi, Andrea Tagliasacchi: NeuralBF: Neural Bilateral Filtering for Top-down Instance Segmentation on Point Clouds. WACV 2023
MTML | 0.549 (47) | 1.000 (1) | 0.807 (28) | 0.588 (48) | 0.327 (54) | 0.647 (50) | 0.004 (58) | 0.815 (7) | 0.180 (53) | 0.418 (45) | 0.364 (54) | 0.182 (52) | 0.445 (50) | 1.000 (1) | 0.442 (46) | 0.688 (50) | 0.571 (43) | 1.000 (1) | 0.396 (46)
    Jean Lahoud, Bernard Ghanem, Marc Pollefeys, Martin R. Oswald: 3D Instance Segmentation via Multi-task Metric Learning. ICCV 2019 [oral]
ClickSeg_Instance | 0.539 (48) | 1.000 (1) | 0.621 (57) | 0.300 (55) | 0.530 (38) | 0.698 (46) | 0.127 (39) | 0.533 (47) | 0.222 (51) | 0.430 (43) | 0.400 (49) | 0.365 (45) | 0.574 (42) | 0.938 (37) | 0.472 (44) | 0.659 (52) | 0.543 (50) | 0.944 (49) | 0.347 (51)
One_Thing_One_Click [permissive] | 0.529 (49) | 0.667 (52) | 0.718 (49) | 0.777 (23) | 0.399 (50) | 0.683 (47) | 0.000 (61) | 0.669 (28) | 0.138 (56) | 0.391 (51) | 0.374 (53) | 0.539 (26) | 0.360 (57) | 0.641 (56) | 0.556 (37) | 0.774 (29) | 0.593 (38) | 0.997 (34) | 0.251 (56)
    Zhengzhe Liu, Xiaojuan Qi, Chi-Wing Fu: One Thing One Click: A Self-Training Approach for Weakly Supervised 3D Semantic Segmentation. CVPR 2021
Sparse R-CNN | 0.515 (50) | 1.000 (1) | 0.538 (62) | 0.282 (56) | 0.468 (48) | 0.790 (35) | 0.173 (30) | 0.345 (58) | 0.429 (35) | 0.413 (47) | 0.484 (39) | 0.176 (53) | 0.595 (40) | 0.591 (57) | 0.522 (40) | 0.668 (51) | 0.476 (55) | 0.986 (47) | 0.327 (52)
Occipital-SCS | 0.512 (51) | 1.000 (1) | 0.716 (50) | 0.509 (51) | 0.506 (43) | 0.611 (53) | 0.092 (43) | 0.602 (42) | 0.177 (54) | 0.346 (54) | 0.383 (51) | 0.165 (54) | 0.442 (51) | 0.850 (48) | 0.386 (53) | 0.618 (56) | 0.543 (51) | 0.889 (54) | 0.389 (47)
3D-BoNet | 0.488 (52) | 1.000 (1) | 0.672 (56) | 0.590 (47) | 0.301 (56) | 0.484 (63) | 0.098 (42) | 0.620 (39) | 0.306 (47) | 0.341 (55) | 0.259 (58) | 0.125 (56) | 0.434 (53) | 0.796 (51) | 0.402 (50) | 0.499 (62) | 0.513 (53) | 0.909 (53) | 0.439 (42)
    Bo Yang, Jianan Wang, Ronald Clark, Qingyong Hu, Sen Wang, Andrew Markham, Niki Trigoni: Learning Object Bounding Boxes for 3D Instance Segmentation on Point Clouds. NeurIPS 2019 Spotlight
PanopticFusion-inst | 0.478 (53) | 0.667 (52) | 0.712 (52) | 0.595 (46) | 0.259 (59) | 0.550 (59) | 0.000 (61) | 0.613 (40) | 0.175 (55) | 0.250 (60) | 0.434 (43) | 0.437 (36) | 0.411 (55) | 0.857 (41) | 0.485 (42) | 0.591 (59) | 0.267 (65) | 0.944 (49) | 0.359 (49)
    Gaku Narita, Takashi Seno, Tomoya Ishikawa, Yohsuke Kaji: PanopticFusion: Online Volumetric Semantic Mapping at the Level of Stuff and Things. IROS 2019 (to appear)
SPG_WSIS | 0.470 (54) | 0.667 (52) | 0.685 (55) | 0.677 (38) | 0.372 (52) | 0.562 (57) | 0.000 (61) | 0.482 (53) | 0.244 (49) | 0.316 (57) | 0.298 (55) | 0.052 (63) | 0.442 (52) | 0.857 (41) | 0.267 (57) | 0.702 (44) | 0.559 (46) | 1.000 (1) | 0.287 (54)
SALoss-ResNet | 0.459 (55) | 1.000 (1) | 0.737 (48) | 0.159 (66) | 0.259 (58) | 0.587 (55) | 0.138 (37) | 0.475 (54) | 0.217 (52) | 0.416 (46) | 0.408 (48) | 0.128 (55) | 0.315 (59) | 0.714 (52) | 0.411 (49) | 0.536 (61) | 0.590 (39) | 0.873 (58) | 0.304 (53)
    Zhidong Liang, Ming Yang, Hao Li, Chunxiang Wang: 3D Instance Embedding Learning With a Structure-Aware Loss Function for Point Cloud Segmentation. IEEE Robotics and Automation Letters (IROS2020)
MASC [permissive] | 0.447 (56) | 0.528 (62) | 0.555 (60) | 0.381 (53) | 0.382 (51) | 0.633 (51) | 0.002 (59) | 0.509 (51) | 0.260 (48) | 0.361 (53) | 0.432 (45) | 0.327 (48) | 0.451 (49) | 0.571 (58) | 0.367 (54) | 0.639 (54) | 0.386 (56) | 0.980 (48) | 0.276 (55)
    Chen Liu, Yasutaka Furukawa: MASC: Multi-scale Affinity with Sparse Convolution for 3D Instance Segmentation.
SegGroup_ins [permissive] | 0.445 (57) | 0.667 (52) | 0.773 (39) | 0.185 (63) | 0.317 (55) | 0.656 (49) | 0.000 (61) | 0.407 (57) | 0.134 (57) | 0.381 (52) | 0.267 (57) | 0.217 (51) | 0.476 (48) | 0.714 (52) | 0.452 (45) | 0.629 (55) | 0.514 (52) | 1.000 (1) | 0.222 (59)
    An Tao, Yueqi Duan, Yi Wei, Jiwen Lu, Jie Zhou: SegGroup: Seg-Level Supervision for 3D Instance and Semantic Segmentation. TIP 2022
3D-SIS [permissive] | 0.382 (58) | 1.000 (1) | 0.432 (65) | 0.245 (58) | 0.190 (60) | 0.577 (56) | 0.013 (55) | 0.263 (60) | 0.033 (63) | 0.320 (56) | 0.240 (59) | 0.075 (59) | 0.422 (54) | 0.857 (41) | 0.117 (63) | 0.699 (45) | 0.271 (64) | 0.883 (56) | 0.235 (58)
    Ji Hou, Angela Dai, Matthias Niessner: 3D-SIS: 3D Semantic Instance Segmentation of RGB-D Scans. CVPR 2019
Hier3D [copyleft] | 0.323 (59) | 0.667 (52) | 0.542 (61) | 0.264 (57) | 0.157 (63) | 0.550 (58) | 0.000 (61) | 0.205 (63) | 0.009 (65) | 0.270 (59) | 0.218 (60) | 0.075 (59) | 0.500 (46) | 0.688 (55) | 0.007 (69) | 0.698 (46) | 0.301 (61) | 0.459 (66) | 0.200 (60)
    Tan: HCFS3D: Hierarchical Coupled Feature Selection Network for 3D Semantic and Instance Segmentation.
UNet-backbone | 0.319 (60) | 0.667 (52) | 0.715 (51) | 0.233 (59) | 0.189 (61) | 0.479 (64) | 0.008 (56) | 0.218 (61) | 0.067 (62) | 0.201 (62) | 0.173 (61) | 0.107 (57) | 0.123 (64) | 0.438 (60) | 0.150 (60) | 0.615 (57) | 0.355 (57) | 0.916 (52) | 0.093 (68)
R-PointNet | 0.306 (61) | 0.500 (64) | 0.405 (66) | 0.311 (54) | 0.348 (53) | 0.589 (54) | 0.054 (48) | 0.068 (66) | 0.126 (58) | 0.283 (58) | 0.290 (56) | 0.028 (64) | 0.219 (62) | 0.214 (63) | 0.331 (55) | 0.396 (66) | 0.275 (62) | 0.821 (60) | 0.245 (57)
Region-18class | 0.284 (62) | 0.250 (68) | 0.751 (47) | 0.228 (61) | 0.270 (57) | 0.521 (60) | 0.000 (61) | 0.468 (55) | 0.008 (67) | 0.205 (61) | 0.127 (62) | 0.000 (68) | 0.068 (66) | 0.070 (67) | 0.262 (59) | 0.652 (53) | 0.323 (59) | 0.740 (62) | 0.173 (61)
SemRegionNet-20cls | 0.250 (63) | 0.333 (65) | 0.613 (58) | 0.229 (60) | 0.163 (62) | 0.493 (61) | 0.000 (61) | 0.304 (59) | 0.107 (59) | 0.147 (65) | 0.100 (64) | 0.052 (62) | 0.231 (60) | 0.119 (65) | 0.039 (65) | 0.445 (64) | 0.325 (58) | 0.654 (63) | 0.141 (64)
tmp | 0.248 (64) | 0.667 (52) | 0.437 (64) | 0.188 (62) | 0.153 (64) | 0.491 (62) | 0.000 (61) | 0.208 (62) | 0.094 (61) | 0.153 (64) | 0.099 (65) | 0.057 (61) | 0.217 (63) | 0.119 (65) | 0.039 (65) | 0.466 (63) | 0.302 (60) | 0.640 (64) | 0.140 (65)
3D-BEVIS | 0.248 (64) | 0.667 (52) | 0.566 (59) | 0.076 (67) | 0.035 (69) | 0.394 (67) | 0.027 (53) | 0.035 (68) | 0.098 (60) | 0.099 (67) | 0.030 (68) | 0.025 (65) | 0.098 (65) | 0.375 (62) | 0.126 (62) | 0.604 (58) | 0.181 (67) | 0.854 (59) | 0.171 (62)
    Cathrin Elich, Francis Engelmann, Jonas Schult, Theodora Kontogianni, Bastian Leibe: 3D-BEVIS: Birds-Eye-View Instance Segmentation.
Sem_Recon_ins | 0.227 (66) | 0.764 (51) | 0.486 (63) | 0.069 (68) | 0.098 (66) | 0.426 (66) | 0.017 (54) | 0.067 (67) | 0.015 (64) | 0.172 (63) | 0.100 (63) | 0.096 (58) | 0.054 (68) | 0.183 (64) | 0.135 (61) | 0.366 (67) | 0.260 (66) | 0.614 (65) | 0.168 (63)
ASIS | 0.199 (67) | 0.333 (65) | 0.253 (68) | 0.167 (65) | 0.140 (65) | 0.438 (65) | 0.000 (61) | 0.177 (64) | 0.008 (66) | 0.121 (66) | 0.069 (66) | 0.004 (67) | 0.231 (61) | 0.429 (61) | 0.036 (67) | 0.445 (65) | 0.273 (63) | 0.333 (68) | 0.119 (67)
Sgpn_scannet | 0.143 (68) | 0.208 (69) | 0.390 (67) | 0.169 (64) | 0.065 (67) | 0.275 (68) | 0.029 (52) | 0.069 (65) | 0.000 (68) | 0.087 (68) | 0.043 (67) | 0.014 (66) | 0.027 (69) | 0.000 (68) | 0.112 (64) | 0.351 (68) | 0.168 (68) | 0.438 (67) | 0.138 (66)
MaskRCNN 2d->3d Proj | 0.058 (69) | 0.333 (65) | 0.002 (69) | 0.000 (69) | 0.053 (68) | 0.002 (69) | 0.002 (60) | 0.021 (69) | 0.000 (68) | 0.045 (69) | 0.024 (69) | 0.238 (50) | 0.065 (67) | 0.000 (68) | 0.014 (68) | 0.107 (69) | 0.020 (69) | 0.110 (69) | 0.006 (69)