The 3D semantic instance prediction task involves detecting and segmenting the objects in a 3D scan mesh.

Evaluation and metrics

Our evaluation ranks all methods by the average precision (AP) for each class. We report the mean average precision at an overlap (IoU) threshold of 0.25 (AP 25%), at an overlap threshold of 0.5 (AP 50%), and averaged over overlap thresholds in the range 0.5 to 0.95 with a step of 0.05 (AP). Note that multiple predictions of the same ground-truth instance are penalized as false positives.
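The metric described above can be illustrated with a minimal sketch for a single class. This is not the benchmark's official evaluation script. The function names, the representation of an instance as a set of vertex indices, the use of `>=` for the overlap test, and the inclusion of both endpoints 0.5 and 0.95 in the threshold list are all assumptions made for illustration.

```python
# Hypothetical helper names; instances are modeled as sets of vertex indices.

# Assumed threshold grid: 0.5, 0.55, ..., 0.95 (both endpoints included).
AP_THRESHOLDS = [round(0.5 + 0.05 * i, 2) for i in range(10)]


def iou(a, b):
    """Intersection over union of two instances (sets of vertex indices)."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def average_precision(preds, gts, threshold):
    """AP for one class at one overlap threshold.

    preds: list of (confidence, set_of_vertex_indices)
    gts:   list of set_of_vertex_indices
    """
    if not gts:
        return float("nan")
    matched = set()  # ground-truth instances already claimed by a prediction
    hits = []        # 1 = true positive, 0 = false positive, by confidence
    for _, mask in sorted(preds, key=lambda p: p[0], reverse=True):
        best, best_iou = None, 0.0
        for i, gt in enumerate(gts):
            overlap = iou(mask, gt)
            if i not in matched and overlap >= threshold and overlap > best_iou:
                best, best_iou = i, overlap
        if best is None:
            # Too little overlap, or the instance was already matched:
            # a duplicate prediction counts as a false positive.
            hits.append(0)
        else:
            matched.add(best)
            hits.append(1)

    # Precision and recall after each prediction.
    precisions, recalls, tp = [], [], 0
    for k, hit in enumerate(hits, start=1):
        tp += hit
        precisions.append(tp / k)
        recalls.append(tp / len(gts))
    # Make precision non-increasing, then integrate over recall.
    for k in range(len(precisions) - 2, -1, -1):
        precisions[k] = max(precisions[k], precisions[k + 1])
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def summarize(preds, gts):
    """Return the three reported numbers for one class."""
    per_threshold = [average_precision(preds, gts, t) for t in AP_THRESHOLDS]
    return {
        "AP": sum(per_threshold) / len(per_threshold),
        "AP 50%": average_precision(preds, gts, 0.5),
        "AP 25%": average_precision(preds, gts, 0.25),
    }
```

Averaging the per-class values returned by `summarize` over all classes would give the mean figures reported per method. Note that in this sketch a duplicate prediction lowers AP only when it is ranked above a later true positive, because it reduces precision at the recall levels reached afterwards.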



This table lists the benchmark results for the 3D semantic instance scenario. Each entry gives the AP 50% score followed by the method's rank in that column.




Method | Info | avg AP 50% | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | window
Spherical Mask(CtoF) |  | 0.812 (1) | 1.000 (1) | 0.973 (3) | 0.852 (11) | 0.718 (3) | 0.917 (3) | 0.574 (3) | 0.677 (25) | 0.748 (7) | 0.729 (7) | 0.715 (4) | 0.795 (1) | 0.809 (1) | 1.000 (1) | 0.831 (3) | 0.854 (7) | 0.787 (7) | 1.000 (1) | 0.638 (3)
OneFormer3D | copyleft | 0.801 (2) | 1.000 (1) | 0.973 (2) | 0.909 (5) | 0.698 (9) | 0.928 (2) | 0.582 (2) | 0.668 (28) | 0.685 (12) | 0.780 (2) | 0.687 (8) | 0.698 (11) | 0.702 (12) | 1.000 (1) | 0.794 (7) | 0.900 (2) | 0.784 (9) | 0.986 (45) | 0.635 (4)
Maxim Kolodiazhnyi, Anna Vorontsova, Anton Konushin, Danila Rukhovich: OneFormer3D: One Transformer for Unified Point Cloud Segmentation.
UniPerception |  | 0.800 (3) | 1.000 (1) | 0.930 (6) | 0.872 (9) | 0.727 (2) | 0.862 (17) | 0.454 (12) | 0.764 (13) | 0.820 (1) | 0.746 (5) | 0.706 (6) | 0.750 (2) | 0.772 (8) | 0.926 (38) | 0.764 (11) | 0.818 (21) | 0.826 (1) | 0.997 (34) | 0.660 (2)
TST3D |  | 0.795 (4) | 1.000 (1) | 0.929 (7) | 0.918 (4) | 0.709 (6) | 0.884 (11) | 0.596 (1) | 0.704 (21) | 0.769 (5) | 0.734 (6) | 0.644 (14) | 0.699 (10) | 0.751 (10) | 1.000 (1) | 0.794 (6) | 0.876 (4) | 0.757 (16) | 0.997 (34) | 0.550 (25)
ExtMask3D |  | 0.789 (5) | 1.000 (1) | 0.988 (1) | 0.756 (27) | 0.706 (7) | 0.912 (4) | 0.429 (13) | 0.647 (33) | 0.806 (4) | 0.755 (4) | 0.673 (10) | 0.689 (12) | 0.772 (9) | 1.000 (1) | 0.789 (8) | 0.852 (8) | 0.811 (3) | 1.000 (1) | 0.617 (9)
Queryformer |  | 0.787 (6) | 1.000 (1) | 0.933 (5) | 0.601 (42) | 0.754 (1) | 0.886 (9) | 0.558 (5) | 0.661 (30) | 0.767 (6) | 0.665 (12) | 0.716 (3) | 0.639 (18) | 0.808 (3) | 1.000 (1) | 0.844 (1) | 0.897 (3) | 0.804 (4) | 1.000 (1) | 0.624 (6)
MAFT |  | 0.786 (7) | 1.000 (1) | 0.894 (12) | 0.807 (17) | 0.694 (11) | 0.893 (7) | 0.486 (8) | 0.674 (26) | 0.740 (8) | 0.786 (1) | 0.704 (7) | 0.727 (4) | 0.739 (11) | 1.000 (1) | 0.707 (17) | 0.849 (10) | 0.756 (17) | 1.000 (1) | 0.685 (1)
Mask3D |  | 0.780 (8) | 1.000 (1) | 0.786 (36) | 0.716 (32) | 0.696 (10) | 0.885 (10) | 0.500 (7) | 0.714 (19) | 0.810 (3) | 0.672 (11) | 0.715 (4) | 0.679 (14) | 0.809 (1) | 1.000 (1) | 0.831 (3) | 0.833 (14) | 0.787 (7) | 1.000 (1) | 0.602 (13)
Jonas Schult, Francis Engelmann, Alexander Hermans, Or Litany, Siyu Tang, Bastian Leibe: Mask3D for 3D Semantic Instance Segmentation. ICRA 2023
SPFormer | permissive | 0.770 (9) | 0.903 (49) | 0.903 (9) | 0.806 (18) | 0.609 (24) | 0.886 (8) | 0.568 (4) | 0.815 (6) | 0.705 (11) | 0.711 (8) | 0.655 (11) | 0.652 (17) | 0.685 (17) | 1.000 (1) | 0.789 (9) | 0.809 (22) | 0.776 (12) | 1.000 (1) | 0.583 (18)
Sun Jiahao, Qing Chunmei, Tan Junpeng, Xu Xiangmin: Superpoint Transformer for 3D Scene Instance Segmentation. AAAI 2023 [Oral]
SoftGroup++ |  | 0.769 (10) | 1.000 (1) | 0.803 (29) | 0.937 (1) | 0.684 (12) | 0.865 (14) | 0.213 (28) | 0.870 (2) | 0.664 (15) | 0.571 (18) | 0.758 (1) | 0.702 (8) | 0.807 (4) | 1.000 (1) | 0.653 (24) | 0.902 (1) | 0.792 (6) | 1.000 (1) | 0.626 (5)
SIM3D |  | 0.766 (11) | 1.000 (1) | 0.948 (4) | 0.582 (48) | 0.599 (26) | 0.882 (12) | 0.510 (6) | 0.701 (22) | 0.632 (19) | 0.772 (3) | 0.685 (9) | 0.687 (13) | 0.782 (7) | 1.000 (1) | 0.833 (2) | 0.756 (32) | 0.798 (5) | 1.000 (1) | 0.622 (7)
SoftGroup | permissive | 0.761 (12) | 1.000 (1) | 0.808 (25) | 0.845 (12) | 0.716 (4) | 0.862 (16) | 0.243 (25) | 0.824 (4) | 0.655 (17) | 0.620 (13) | 0.734 (2) | 0.699 (9) | 0.791 (6) | 0.981 (32) | 0.716 (15) | 0.844 (11) | 0.769 (13) | 1.000 (1) | 0.594 (16)
Thang Vu, Kookhoi Kim, Tung M. Luu, Xuan Thanh Nguyen, Chang D. Yoo: SoftGroup for 3D Instance Segmentation on Point Clouds. CVPR 2022 [Oral]
ISBNet | permissive | 0.757 (13) | 1.000 (1) | 0.904 (8) | 0.731 (30) | 0.678 (13) | 0.895 (5) | 0.458 (10) | 0.644 (35) | 0.670 (14) | 0.710 (9) | 0.620 (19) | 0.732 (3) | 0.650 (19) | 1.000 (1) | 0.756 (12) | 0.778 (25) | 0.779 (10) | 1.000 (1) | 0.614 (10)
Tuan Duc Ngo, Binh-Son Hua, Khoi Nguyen: ISBNet: a 3D Point Cloud Instance Segmentation Network with Instance-aware Sampling and Box-aware Dynamic Convolution. CVPR 2023
TD3D | permissive | 0.751 (14) | 1.000 (1) | 0.774 (37) | 0.867 (10) | 0.621 (20) | 0.934 (1) | 0.404 (14) | 0.706 (20) | 0.812 (2) | 0.605 (16) | 0.633 (17) | 0.626 (19) | 0.690 (16) | 1.000 (1) | 0.640 (26) | 0.820 (18) | 0.777 (11) | 1.000 (1) | 0.612 (11)
Maksim Kolodiazhnyi, Anna Vorontsova, Anton Konushin, Danila Rukhovich: Top-Down Beats Bottom-Up in 3D Instance Segmentation. WACV 2024
PBNet | permissive | 0.747 (15) | 1.000 (1) | 0.818 (21) | 0.837 (14) | 0.713 (5) | 0.844 (19) | 0.457 (11) | 0.647 (33) | 0.711 (10) | 0.614 (14) | 0.617 (21) | 0.657 (16) | 0.650 (19) | 1.000 (1) | 0.692 (18) | 0.822 (17) | 0.765 (15) | 1.000 (1) | 0.595 (15)
W. Zhao, Y. Yan, C. Yang, J. Ye, X. Yang, K. Huang: Divide and Conquer: 3D Instance Segmentation With Point-Wise Binarization. ICCV 2023
GraphCut |  | 0.732 (16) | 1.000 (1) | 0.788 (34) | 0.724 (31) | 0.642 (18) | 0.859 (18) | 0.248 (24) | 0.787 (11) | 0.618 (21) | 0.596 (17) | 0.653 (13) | 0.722 (6) | 0.583 (40) | 1.000 (1) | 0.766 (10) | 0.861 (5) | 0.825 (2) | 1.000 (1) | 0.504 (31)
IPCA-Inst |  | 0.731 (17) | 1.000 (1) | 0.788 (35) | 0.884 (8) | 0.698 (8) | 0.788 (35) | 0.252 (23) | 0.760 (14) | 0.646 (18) | 0.511 (26) | 0.637 (16) | 0.665 (15) | 0.804 (5) | 1.000 (1) | 0.644 (25) | 0.778 (26) | 0.747 (19) | 1.000 (1) | 0.561 (22)
TopoSeg |  | 0.725 (18) | 1.000 (1) | 0.806 (28) | 0.933 (2) | 0.668 (15) | 0.758 (39) | 0.272 (22) | 0.734 (18) | 0.630 (20) | 0.549 (22) | 0.654 (12) | 0.606 (20) | 0.697 (15) | 0.966 (35) | 0.612 (30) | 0.839 (12) | 0.754 (18) | 1.000 (1) | 0.573 (19)
DKNet |  | 0.718 (19) | 1.000 (1) | 0.814 (22) | 0.782 (21) | 0.619 (21) | 0.872 (13) | 0.224 (26) | 0.751 (16) | 0.569 (25) | 0.677 (10) | 0.585 (25) | 0.724 (5) | 0.633 (30) | 0.981 (32) | 0.515 (40) | 0.819 (19) | 0.736 (20) | 1.000 (1) | 0.617 (8)
Yizheng Wu, Min Shi, Shuaiyuan Du, Hao Lu, Zhiguo Cao, Weicai Zhong: 3D Instances as 1D Kernels. ECCV 2022
SSEC |  | 0.707 (20) | 1.000 (1) | 0.850 (14) | 0.924 (3) | 0.648 (16) | 0.747 (42) | 0.162 (30) | 0.862 (3) | 0.572 (24) | 0.520 (24) | 0.624 (18) | 0.549 (23) | 0.649 (28) | 1.000 (1) | 0.560 (35) | 0.706 (42) | 0.768 (14) | 1.000 (1) | 0.591 (17)
HAIS | permissive | 0.699 (21) | 1.000 (1) | 0.849 (15) | 0.820 (15) | 0.675 (14) | 0.808 (29) | 0.279 (20) | 0.757 (15) | 0.465 (31) | 0.517 (25) | 0.596 (23) | 0.559 (22) | 0.600 (34) | 1.000 (1) | 0.654 (23) | 0.767 (28) | 0.676 (24) | 0.994 (41) | 0.560 (23)
Shaoyu Chen, Jiemin Fang, Qian Zhang, Wenyu Liu, Xinggang Wang: Hierarchical Aggregation for 3D Instance Segmentation. ICCV 2021
SSTNet | permissive | 0.698 (22) | 1.000 (1) | 0.697 (53) | 0.888 (7) | 0.556 (32) | 0.803 (30) | 0.387 (15) | 0.626 (37) | 0.417 (36) | 0.556 (21) | 0.585 (26) | 0.702 (7) | 0.600 (34) | 1.000 (1) | 0.824 (5) | 0.720 (41) | 0.692 (22) | 1.000 (1) | 0.509 (30)
Zhihao Liang, Zhihao Li, Songcen Xu, Mingkui Tan, Kui Jia: Instance Segmentation in 3D Scenes using Semantic Superpoint Tree Networks. ICCV 2021
DualGroup |  | 0.694 (23) | 1.000 (1) | 0.799 (31) | 0.811 (16) | 0.622 (19) | 0.817 (24) | 0.376 (16) | 0.805 (9) | 0.590 (23) | 0.487 (30) | 0.568 (29) | 0.525 (27) | 0.650 (19) | 0.835 (48) | 0.600 (31) | 0.829 (15) | 0.655 (27) | 1.000 (1) | 0.526 (27)
DANCENET |  | 0.680 (24) | 1.000 (1) | 0.807 (26) | 0.733 (29) | 0.600 (25) | 0.768 (38) | 0.375 (17) | 0.543 (45) | 0.538 (26) | 0.610 (15) | 0.599 (22) | 0.498 (28) | 0.632 (32) | 0.981 (32) | 0.739 (14) | 0.856 (6) | 0.633 (33) | 0.882 (56) | 0.454 (40)
SphereSeg |  | 0.680 (24) | 1.000 (1) | 0.856 (13) | 0.744 (28) | 0.618 (22) | 0.893 (6) | 0.151 (31) | 0.651 (32) | 0.713 (9) | 0.537 (23) | 0.579 (28) | 0.430 (37) | 0.651 (18) | 1.000 (1) | 0.389 (51) | 0.744 (36) | 0.697 (21) | 0.991 (43) | 0.601 (14)
Box2Mask |  | 0.677 (26) | 1.000 (1) | 0.847 (16) | 0.771 (23) | 0.509 (41) | 0.816 (25) | 0.277 (21) | 0.558 (44) | 0.482 (28) | 0.562 (20) | 0.640 (15) | 0.448 (33) | 0.700 (13) | 1.000 (1) | 0.666 (19) | 0.852 (9) | 0.578 (40) | 0.997 (34) | 0.488 (35)
Julian Chibane, Francis Engelmann, Tuan Anh Tran, Gerard Pons-Moll: Box2Mask: Weakly Supervised 3D Semantic Instance Segmentation Using Bounding Boxes. ECCV 2022
OccuSeg+instance |  | 0.672 (27) | 1.000 (1) | 0.758 (45) | 0.682 (35) | 0.576 (30) | 0.842 (20) | 0.477 (9) | 0.504 (51) | 0.524 (27) | 0.567 (19) | 0.585 (27) | 0.451 (32) | 0.557 (42) | 1.000 (1) | 0.751 (13) | 0.797 (23) | 0.563 (43) | 1.000 (1) | 0.467 (39)
Lei Han, Tian Zheng, Lan Xu, Lu Fang: OccuSeg: Occupancy-aware 3D Instance Segmentation. CVPR 2020
Mask-Group |  | 0.664 (28) | 1.000 (1) | 0.822 (20) | 0.764 (26) | 0.616 (23) | 0.815 (26) | 0.139 (35) | 0.694 (24) | 0.597 (22) | 0.459 (34) | 0.566 (30) | 0.599 (21) | 0.600 (34) | 0.516 (58) | 0.715 (16) | 0.819 (20) | 0.635 (31) | 1.000 (1) | 0.603 (12)
Min Zhong, Xinghao Chen, Xiaokang Chen, Gang Zeng, Yunhe Wang: MaskGroup: Hierarchical Point Grouping and Masking for 3D Instance Segmentation. ICME 2022
INS-Conv-instance |  | 0.657 (29) | 1.000 (1) | 0.760 (43) | 0.667 (37) | 0.581 (28) | 0.863 (15) | 0.323 (18) | 0.655 (31) | 0.477 (29) | 0.473 (32) | 0.549 (32) | 0.432 (36) | 0.650 (19) | 1.000 (1) | 0.655 (22) | 0.738 (37) | 0.585 (39) | 0.944 (48) | 0.472 (38)
CSC-Pretrained |  | 0.648 (30) | 1.000 (1) | 0.810 (23) | 0.768 (24) | 0.523 (39) | 0.813 (27) | 0.143 (34) | 0.819 (5) | 0.389 (39) | 0.422 (43) | 0.511 (36) | 0.443 (34) | 0.650 (19) | 1.000 (1) | 0.624 (28) | 0.732 (38) | 0.634 (32) | 1.000 (1) | 0.375 (47)
PE |  | 0.645 (31) | 1.000 (1) | 0.773 (39) | 0.798 (20) | 0.538 (34) | 0.786 (36) | 0.088 (43) | 0.799 (10) | 0.350 (43) | 0.435 (41) | 0.547 (33) | 0.545 (24) | 0.646 (29) | 0.933 (37) | 0.562 (34) | 0.761 (31) | 0.556 (48) | 0.997 (34) | 0.501 (33)
Biao Zhang, Peter Wonka: Point Cloud Instance Segmentation using Probabilistic Embeddings. CVPR 2021
RPGN |  | 0.643 (32) | 1.000 (1) | 0.758 (44) | 0.582 (49) | 0.539 (33) | 0.826 (23) | 0.046 (48) | 0.765 (12) | 0.372 (41) | 0.436 (40) | 0.588 (24) | 0.539 (26) | 0.650 (19) | 1.000 (1) | 0.577 (32) | 0.750 (34) | 0.653 (29) | 0.997 (34) | 0.495 (34)
Shichao Dong, Guosheng Lin, Tzu-Yi Hung: Learning Regional Purity for Instance Segmentation on 3D Point Clouds. ECCV 2022
Dyco3D | copyleft | 0.641 (33) | 1.000 (1) | 0.841 (17) | 0.893 (6) | 0.531 (36) | 0.802 (31) | 0.115 (40) | 0.588 (42) | 0.448 (33) | 0.438 (38) | 0.537 (35) | 0.430 (38) | 0.550 (43) | 0.857 (40) | 0.534 (38) | 0.764 (30) | 0.657 (26) | 0.987 (44) | 0.568 (20)
Tong He, Chunhua Shen, Anton van den Hengel: DyCo3D: Robust Instance Segmentation of 3D Point Clouds through Dynamic Convolution. CVPR 2021
GICN |  | 0.638 (34) | 1.000 (1) | 0.895 (11) | 0.800 (19) | 0.480 (45) | 0.676 (47) | 0.144 (33) | 0.737 (17) | 0.354 (42) | 0.447 (35) | 0.400 (49) | 0.365 (44) | 0.700 (13) | 1.000 (1) | 0.569 (33) | 0.836 (13) | 0.599 (35) | 1.000 (1) | 0.473 (37)
PointGroup |  | 0.636 (35) | 1.000 (1) | 0.765 (40) | 0.624 (39) | 0.505 (43) | 0.797 (32) | 0.116 (39) | 0.696 (23) | 0.384 (40) | 0.441 (36) | 0.559 (31) | 0.476 (30) | 0.596 (37) | 1.000 (1) | 0.666 (19) | 0.756 (33) | 0.556 (47) | 0.997 (34) | 0.513 (29)
Li Jiang, Hengshuang Zhao, Shaoshuai Shi, Shu Liu, Chi-Wing Fu, Jiaya Jia: PointGroup: Dual-Set Point Grouping for 3D Instance Segmentation. CVPR 2020 [oral]
DD-UNet+Group |  | 0.635 (36) | 0.667 (51) | 0.797 (33) | 0.714 (33) | 0.562 (31) | 0.774 (37) | 0.146 (32) | 0.810 (8) | 0.429 (35) | 0.476 (31) | 0.546 (34) | 0.399 (40) | 0.633 (30) | 1.000 (1) | 0.632 (27) | 0.722 (40) | 0.609 (34) | 1.000 (1) | 0.514 (28)
H. Liu, R. Liu, K. Yang, J. Zhang, K. Peng, R. Stiefelhagen: HIDA: Towards Holistic Indoor Understanding for the Visually Impaired via Semantic Instance Segmentation with a Wearable Solid-State LiDAR Sensor. ICCVW 2021
Mask3D_evaluation |  | 0.631 (37) | 1.000 (1) | 0.829 (19) | 0.606 (41) | 0.646 (17) | 0.836 (21) | 0.068 (44) | 0.511 (49) | 0.462 (32) | 0.507 (27) | 0.619 (20) | 0.389 (42) | 0.610 (33) | 1.000 (1) | 0.432 (46) | 0.828 (16) | 0.673 (25) | 0.788 (60) | 0.552 (24)
DENet |  | 0.629 (38) | 1.000 (1) | 0.797 (32) | 0.608 (40) | 0.589 (27) | 0.627 (51) | 0.219 (27) | 0.882 (1) | 0.310 (45) | 0.402 (48) | 0.383 (51) | 0.396 (41) | 0.650 (19) | 1.000 (1) | 0.663 (21) | 0.543 (59) | 0.691 (23) | 1.000 (1) | 0.568 (21)
3D-MPA |  | 0.611 (39) | 1.000 (1) | 0.833 (18) | 0.765 (25) | 0.526 (38) | 0.756 (40) | 0.136 (37) | 0.588 (42) | 0.470 (30) | 0.438 (39) | 0.432 (45) | 0.358 (46) | 0.650 (19) | 0.857 (40) | 0.429 (47) | 0.765 (29) | 0.557 (46) | 1.000 (1) | 0.430 (42)
Francis Engelmann, Martin Bokeloh, Alireza Fathi, Bastian Leibe, Matthias Nießner: 3D-MPA: Multi Proposal Aggregation for 3D Semantic Instance Segmentation. CVPR 2020
OSIS |  | 0.605 (40) | 1.000 (1) | 0.801 (30) | 0.599 (43) | 0.535 (35) | 0.728 (44) | 0.286 (19) | 0.436 (55) | 0.679 (13) | 0.491 (28) | 0.433 (43) | 0.256 (48) | 0.404 (55) | 0.857 (40) | 0.620 (29) | 0.724 (39) | 0.510 (53) | 1.000 (1) | 0.539 (26)
AOIA |  | 0.601 (41) | 1.000 (1) | 0.761 (42) | 0.687 (34) | 0.485 (44) | 0.828 (22) | 0.008 (55) | 0.663 (29) | 0.405 (38) | 0.405 (47) | 0.425 (46) | 0.490 (29) | 0.596 (37) | 0.714 (51) | 0.553 (37) | 0.779 (24) | 0.597 (36) | 0.992 (42) | 0.424 (44)
PCJC |  | 0.578 (42) | 1.000 (1) | 0.810 (24) | 0.583 (47) | 0.449 (48) | 0.813 (28) | 0.042 (49) | 0.603 (40) | 0.341 (44) | 0.490 (29) | 0.465 (40) | 0.410 (39) | 0.650 (19) | 0.835 (48) | 0.264 (57) | 0.694 (46) | 0.561 (44) | 0.889 (53) | 0.504 (32)
SSEN |  | 0.575 (43) | 1.000 (1) | 0.761 (41) | 0.473 (51) | 0.477 (46) | 0.795 (33) | 0.066 (45) | 0.529 (47) | 0.658 (16) | 0.460 (33) | 0.461 (41) | 0.380 (43) | 0.331 (57) | 0.859 (39) | 0.401 (50) | 0.692 (48) | 0.653 (28) | 1.000 (1) | 0.348 (49)
Dongsu Zhang, Junha Chun, Sang Kyun Cha, Young Min Kim: Spatial Semantic Embedding Network: Fast 3D Instance Segmentation with Deep Metric Learning. arXiv
RWSeg |  | 0.567 (44) | 0.528 (61) | 0.708 (52) | 0.626 (38) | 0.580 (29) | 0.745 (43) | 0.063 (46) | 0.627 (36) | 0.240 (49) | 0.400 (49) | 0.497 (37) | 0.464 (31) | 0.515 (44) | 1.000 (1) | 0.475 (42) | 0.745 (35) | 0.571 (41) | 1.000 (1) | 0.429 (43)
NeuralBF |  | 0.555 (45) | 0.667 (51) | 0.896 (10) | 0.843 (13) | 0.517 (40) | 0.751 (41) | 0.029 (50) | 0.519 (48) | 0.414 (37) | 0.439 (37) | 0.465 (39) | 0.000 (67) | 0.484 (46) | 0.857 (40) | 0.287 (55) | 0.693 (47) | 0.651 (30) | 1.000 (1) | 0.485 (36)
Weiwei Sun, Daniel Rebain, Renjie Liao, Vladimir Tankovich, Soroosh Yazdani, Kwang Moo Yi, Andrea Tagliasacchi: NeuralBF: Neural Bilateral Filtering for Top-down Instance Segmentation on Point Clouds. WACV 2023
MTML |  | 0.549 (46) | 1.000 (1) | 0.807 (27) | 0.588 (46) | 0.327 (53) | 0.647 (49) | 0.004 (57) | 0.815 (7) | 0.180 (52) | 0.418 (44) | 0.364 (53) | 0.182 (51) | 0.445 (49) | 1.000 (1) | 0.442 (45) | 0.688 (49) | 0.571 (42) | 1.000 (1) | 0.396 (45)
Jean Lahoud, Bernard Ghanem, Marc Pollefeys, Martin R. Oswald: 3D Instance Segmentation via Multi-task Metric Learning. ICCV 2019 [oral]
ClickSeg_Instance |  | 0.539 (47) | 1.000 (1) | 0.621 (56) | 0.300 (54) | 0.530 (37) | 0.698 (45) | 0.127 (38) | 0.533 (46) | 0.222 (50) | 0.430 (42) | 0.400 (48) | 0.365 (44) | 0.574 (41) | 0.938 (36) | 0.472 (43) | 0.659 (51) | 0.543 (49) | 0.944 (48) | 0.347 (50)
One_Thing_One_Click | permissive | 0.529 (48) | 0.667 (51) | 0.718 (48) | 0.777 (22) | 0.399 (49) | 0.683 (46) | 0.000 (60) | 0.669 (27) | 0.138 (55) | 0.391 (50) | 0.374 (52) | 0.539 (25) | 0.360 (56) | 0.641 (55) | 0.556 (36) | 0.774 (27) | 0.593 (37) | 0.997 (34) | 0.251 (55)
Zhengzhe Liu, Xiaojuan Qi, Chi-Wing Fu: One Thing One Click: A Self-Training Approach for Weakly Supervised 3D Semantic Segmentation. CVPR 2021
Sparse R-CNN |  | 0.515 (49) | 1.000 (1) | 0.538 (61) | 0.282 (55) | 0.468 (47) | 0.790 (34) | 0.173 (29) | 0.345 (57) | 0.429 (34) | 0.413 (46) | 0.484 (38) | 0.176 (52) | 0.595 (39) | 0.591 (56) | 0.522 (39) | 0.668 (50) | 0.476 (54) | 0.986 (46) | 0.327 (51)
Occipital-SCS |  | 0.512 (50) | 1.000 (1) | 0.716 (49) | 0.509 (50) | 0.506 (42) | 0.611 (52) | 0.092 (42) | 0.602 (41) | 0.177 (53) | 0.346 (53) | 0.383 (50) | 0.165 (53) | 0.442 (50) | 0.850 (47) | 0.386 (52) | 0.618 (55) | 0.543 (50) | 0.889 (53) | 0.389 (46)
3D-BoNet |  | 0.488 (51) | 1.000 (1) | 0.672 (55) | 0.590 (45) | 0.301 (55) | 0.484 (62) | 0.098 (41) | 0.620 (38) | 0.306 (46) | 0.341 (54) | 0.259 (57) | 0.125 (55) | 0.434 (52) | 0.796 (50) | 0.402 (49) | 0.499 (61) | 0.513 (52) | 0.909 (52) | 0.439 (41)
Bo Yang, Jianan Wang, Ronald Clark, Qingyong Hu, Sen Wang, Andrew Markham, Niki Trigoni: Learning Object Bounding Boxes for 3D Instance Segmentation on Point Clouds. NeurIPS 2019 Spotlight
PanopticFusion-inst |  | 0.478 (52) | 0.667 (51) | 0.712 (51) | 0.595 (44) | 0.259 (58) | 0.550 (58) | 0.000 (60) | 0.613 (39) | 0.175 (54) | 0.250 (59) | 0.434 (42) | 0.437 (35) | 0.411 (54) | 0.857 (40) | 0.485 (41) | 0.591 (58) | 0.267 (64) | 0.944 (48) | 0.359 (48)
Gaku Narita, Takashi Seno, Tomoya Ishikawa, Yohsuke Kaji: PanopticFusion: Online Volumetric Semantic Mapping at the Level of Stuff and Things. IROS 2019 (to appear)
SPG_WSIS |  | 0.470 (53) | 0.667 (51) | 0.685 (54) | 0.677 (36) | 0.372 (51) | 0.562 (56) | 0.000 (60) | 0.482 (52) | 0.244 (48) | 0.316 (56) | 0.298 (54) | 0.052 (62) | 0.442 (51) | 0.857 (40) | 0.267 (56) | 0.702 (43) | 0.559 (45) | 1.000 (1) | 0.287 (53)
SALoss-ResNet |  | 0.459 (54) | 1.000 (1) | 0.737 (47) | 0.159 (65) | 0.259 (57) | 0.587 (54) | 0.138 (36) | 0.475 (53) | 0.217 (51) | 0.416 (45) | 0.408 (47) | 0.128 (54) | 0.315 (58) | 0.714 (51) | 0.411 (48) | 0.536 (60) | 0.590 (38) | 0.873 (57) | 0.304 (52)
Zhidong Liang, Ming Yang, Hao Li, Chunxiang Wang: 3D Instance Embedding Learning With a Structure-Aware Loss Function for Point Cloud Segmentation. IEEE Robotics and Automation Letters (IROS2020)
MASC | permissive | 0.447 (55) | 0.528 (61) | 0.555 (59) | 0.381 (52) | 0.382 (50) | 0.633 (50) | 0.002 (58) | 0.509 (50) | 0.260 (47) | 0.361 (52) | 0.432 (44) | 0.327 (47) | 0.451 (48) | 0.571 (57) | 0.367 (53) | 0.639 (53) | 0.386 (55) | 0.980 (47) | 0.276 (54)
Chen Liu, Yasutaka Furukawa: MASC: Multi-scale Affinity with Sparse Convolution for 3D Instance Segmentation.
SegGroup_ins | permissive | 0.445 (56) | 0.667 (51) | 0.773 (38) | 0.185 (62) | 0.317 (54) | 0.656 (48) | 0.000 (60) | 0.407 (56) | 0.134 (56) | 0.381 (51) | 0.267 (56) | 0.217 (50) | 0.476 (47) | 0.714 (51) | 0.452 (44) | 0.629 (54) | 0.514 (51) | 1.000 (1) | 0.222 (58)
An Tao, Yueqi Duan, Yi Wei, Jiwen Lu, Jie Zhou: SegGroup: Seg-Level Supervision for 3D Instance and Semantic Segmentation. TIP 2022
3D-SIS | permissive | 0.382 (57) | 1.000 (1) | 0.432 (64) | 0.245 (57) | 0.190 (59) | 0.577 (55) | 0.013 (54) | 0.263 (59) | 0.033 (62) | 0.320 (55) | 0.240 (58) | 0.075 (58) | 0.422 (53) | 0.857 (40) | 0.117 (62) | 0.699 (44) | 0.271 (63) | 0.883 (55) | 0.235 (57)
Ji Hou, Angela Dai, Matthias Niessner: 3D-SIS: 3D Semantic Instance Segmentation of RGB-D Scans. CVPR 2019
Hier3D | copyleft | 0.323 (58) | 0.667 (51) | 0.542 (60) | 0.264 (56) | 0.157 (62) | 0.550 (57) | 0.000 (60) | 0.205 (62) | 0.009 (64) | 0.270 (58) | 0.218 (59) | 0.075 (58) | 0.500 (45) | 0.688 (54) | 0.007 (68) | 0.698 (45) | 0.301 (60) | 0.459 (65) | 0.200 (59)
Tan: HCFS3D: Hierarchical Coupled Feature Selection Network for 3D Semantic and Instance Segmentation.
UNet-backbone |  | 0.319 (59) | 0.667 (51) | 0.715 (50) | 0.233 (58) | 0.189 (60) | 0.479 (63) | 0.008 (55) | 0.218 (60) | 0.067 (61) | 0.201 (61) | 0.173 (60) | 0.107 (56) | 0.123 (63) | 0.438 (59) | 0.150 (59) | 0.615 (56) | 0.355 (56) | 0.916 (51) | 0.093 (67)
R-PointNet |  | 0.306 (60) | 0.500 (63) | 0.405 (65) | 0.311 (53) | 0.348 (52) | 0.589 (53) | 0.054 (47) | 0.068 (65) | 0.126 (57) | 0.283 (57) | 0.290 (55) | 0.028 (63) | 0.219 (61) | 0.214 (62) | 0.331 (54) | 0.396 (65) | 0.275 (61) | 0.821 (59) | 0.245 (56)
Region-18class |  | 0.284 (61) | 0.250 (67) | 0.751 (46) | 0.228 (60) | 0.270 (56) | 0.521 (59) | 0.000 (60) | 0.468 (54) | 0.008 (66) | 0.205 (60) | 0.127 (61) | 0.000 (67) | 0.068 (65) | 0.070 (66) | 0.262 (58) | 0.652 (52) | 0.323 (58) | 0.740 (61) | 0.173 (60)
SemRegionNet-20cls |  | 0.250 (62) | 0.333 (64) | 0.613 (57) | 0.229 (59) | 0.163 (61) | 0.493 (60) | 0.000 (60) | 0.304 (58) | 0.107 (58) | 0.147 (64) | 0.100 (63) | 0.052 (61) | 0.231 (59) | 0.119 (64) | 0.039 (64) | 0.445 (63) | 0.325 (57) | 0.654 (62) | 0.141 (63)
tmp |  | 0.248 (63) | 0.667 (51) | 0.437 (63) | 0.188 (61) | 0.153 (63) | 0.491 (61) | 0.000 (60) | 0.208 (61) | 0.094 (60) | 0.153 (63) | 0.099 (64) | 0.057 (60) | 0.217 (62) | 0.119 (64) | 0.039 (64) | 0.466 (62) | 0.302 (59) | 0.640 (63) | 0.140 (64)
3D-BEVIS |  | 0.248 (63) | 0.667 (51) | 0.566 (58) | 0.076 (66) | 0.035 (68) | 0.394 (66) | 0.027 (52) | 0.035 (67) | 0.098 (59) | 0.099 (66) | 0.030 (67) | 0.025 (64) | 0.098 (64) | 0.375 (61) | 0.126 (61) | 0.604 (57) | 0.181 (66) | 0.854 (58) | 0.171 (61)
Cathrin Elich, Francis Engelmann, Jonas Schult, Theodora Kontogianni, Bastian Leibe: 3D-BEVIS: Birds-Eye-View Instance Segmentation.
Sem_Recon_ins |  | 0.227 (65) | 0.764 (50) | 0.486 (62) | 0.069 (67) | 0.098 (65) | 0.426 (65) | 0.017 (53) | 0.067 (66) | 0.015 (63) | 0.172 (62) | 0.100 (62) | 0.096 (57) | 0.054 (67) | 0.183 (63) | 0.135 (60) | 0.366 (66) | 0.260 (65) | 0.614 (64) | 0.168 (62)
ASIS |  | 0.199 (66) | 0.333 (64) | 0.253 (67) | 0.167 (64) | 0.140 (64) | 0.438 (64) | 0.000 (60) | 0.177 (63) | 0.008 (65) | 0.121 (65) | 0.069 (65) | 0.004 (66) | 0.231 (60) | 0.429 (60) | 0.036 (66) | 0.445 (64) | 0.273 (62) | 0.333 (67) | 0.119 (66)
Sgpn_scannet |  | 0.143 (67) | 0.208 (68) | 0.390 (66) | 0.169 (63) | 0.065 (66) | 0.275 (67) | 0.029 (51) | 0.069 (64) | 0.000 (67) | 0.087 (67) | 0.043 (66) | 0.014 (65) | 0.027 (68) | 0.000 (67) | 0.112 (63) | 0.351 (67) | 0.168 (67) | 0.438 (66) | 0.138 (65)
MaskRCNN 2d->3d Proj |  | 0.058 (68) | 0.333 (64) | 0.002 (68) | 0.000 (68) | 0.053 (67) | 0.002 (68) | 0.002 (59) | 0.021 (68) | 0.000 (67) | 0.045 (68) | 0.024 (68) | 0.238 (49) | 0.065 (66) | 0.000 (67) | 0.014 (67) | 0.107 (68) | 0.020 (68) | 0.110 (68) | 0.006 (68)