The 3D semantic instance prediction task involves detecting and segmenting the objects in a 3D scan mesh.

Evaluation and metrics

Our evaluation ranks all methods according to the average precision for each class. We report the mean average precision (AP) at an overlap of 0.25 (AP 25%), at an overlap of 0.5 (AP 50%), and averaged over overlaps in the range [0.5:0.95:0.05] (AP). Note that multiple predictions of the same ground truth instance are penalized as false positives.
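The scoring described above can be illustrated with a minimal sketch. It is not the benchmark's own evaluation script. It assumes each prediction of a class has already been associated with the ground truth instance it overlaps most, that a prediction counts as correct when its overlap is at least the threshold, and that the range [0.5:0.95:0.05] includes both endpoints; the official script may differ on each of these points. The names `Prediction`, `average_precision`, `mean_ap_over_thresholds` and `RANGE_THRESHOLDS` are hypothetical.

```python
from typing import List, Optional, Tuple

# Hypothetical record for one predicted instance of a single class:
# (confidence, id of the best-overlapping ground truth instance or None, overlap).
Prediction = Tuple[float, Optional[int], float]


def average_precision(preds: List[Prediction], num_gt: int,
                      overlap_threshold: float) -> float:
    """AP for one class at one overlap threshold (all-point interpolation)."""
    if num_gt == 0:
        return 0.0
    matched = set()
    hits = []
    # Visit predictions from most to least confident.
    for _conf, gt_id, overlap in sorted(preds, key=lambda p: p[0], reverse=True):
        is_tp = (gt_id is not None
                 and overlap >= overlap_threshold  # assumption: inclusive test
                 and gt_id not in matched)
        if is_tp:
            matched.add(gt_id)
        # A second prediction of an already matched instance is a false positive.
        hits.append(is_tp)

    precisions, recalls = [], []
    true_positives = 0
    for rank, hit in enumerate(hits, start=1):
        true_positives += hit
        precisions.append(true_positives / rank)
        recalls.append(true_positives / num_gt)

    # Make precision non-increasing along the curve before integrating.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    ap, previous_recall = 0.0, 0.0
    for precision, recall in zip(precisions, recalls):
        ap += (recall - previous_recall) * precision
        previous_recall = recall
    return ap


def mean_ap_over_thresholds(preds: List[Prediction], num_gt: int,
                            thresholds: List[float]) -> float:
    """Average the per-threshold AP values."""
    return sum(average_precision(preds, num_gt, t) for t in thresholds) / len(thresholds)


# Assumption: 0.50, 0.55, ..., 0.95 (both endpoints included).
RANGE_THRESHOLDS = [t / 100 for t in range(50, 100, 5)]

# Toy input: two ground truth instances (ids 0 and 1), three predictions,
# the second of which is a duplicate detection of instance 0.
example = [(0.9, 0, 0.8), (0.8, 0, 0.7), (0.7, 1, 0.6)]
ap_25 = average_precision(example, num_gt=2, overlap_threshold=0.25)
ap_50 = average_precision(example, num_gt=2, overlap_threshold=0.5)
ap = mean_ap_over_thresholds(example, 2, RANGE_THRESHOLDS)
print(f"AP 25%: {ap_25:.3f}  AP 50%: {ap_50:.3f}  AP: {ap:.3f}")
```

In this toy input the duplicate prediction of instance 0 is counted as a false positive, which lowers AP 50% from 1.0 (the value without the duplicate) to about 0.833.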



This table lists the benchmark results for the 3D semantic instance scenario.




Each cell shows the score followed by the method's rank in that column in parentheses.

Method | Info | avg ap 50% | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | window
Spherical Mask(CtoF) | | 0.812 (1) | 1.000 (1) | 0.973 (3) | 0.852 (10) | 0.718 (3) | 0.917 (3) | 0.574 (2) | 0.677 (24) | 0.748 (6) | 0.729 (6) | 0.715 (4) | 0.795 (1) | 0.809 (1) | 1.000 (1) | 0.831 (3) | 0.854 (6) | 0.787 (7) | 1.000 (1) | 0.638 (3)
OneFormer3D | copyleft | 0.801 (2) | 1.000 (1) | 0.973 (2) | 0.909 (4) | 0.698 (8) | 0.928 (2) | 0.582 (1) | 0.668 (27) | 0.685 (11) | 0.780 (2) | 0.687 (8) | 0.698 (10) | 0.702 (11) | 1.000 (1) | 0.794 (6) | 0.900 (2) | 0.784 (9) | 0.986 (44) | 0.635 (4)
Maxim Kolodiazhnyi, Anna Vorontsova, Anton Konushin, Danila Rukhovich: OneFormer3D: One Transformer for Unified Point Cloud Segmentation.
UniPerception | | 0.800 (3) | 1.000 (1) | 0.930 (6) | 0.872 (8) | 0.727 (2) | 0.862 (16) | 0.454 (11) | 0.764 (13) | 0.820 (1) | 0.746 (5) | 0.706 (6) | 0.750 (2) | 0.772 (8) | 0.926 (37) | 0.764 (10) | 0.818 (20) | 0.826 (1) | 0.997 (34) | 0.660 (2)
ExtMask3D | | 0.789 (4) | 1.000 (1) | 0.988 (1) | 0.756 (26) | 0.706 (6) | 0.912 (4) | 0.429 (12) | 0.647 (32) | 0.806 (4) | 0.755 (4) | 0.673 (10) | 0.689 (11) | 0.772 (9) | 1.000 (1) | 0.789 (7) | 0.852 (7) | 0.811 (3) | 1.000 (1) | 0.617 (9)
Queryformer | | 0.787 (5) | 1.000 (1) | 0.933 (5) | 0.601 (41) | 0.754 (1) | 0.886 (9) | 0.558 (4) | 0.661 (29) | 0.767 (5) | 0.665 (11) | 0.716 (3) | 0.639 (17) | 0.808 (3) | 1.000 (1) | 0.844 (1) | 0.897 (3) | 0.804 (4) | 1.000 (1) | 0.624 (6)
MAFT | | 0.786 (6) | 1.000 (1) | 0.894 (11) | 0.807 (16) | 0.694 (10) | 0.893 (7) | 0.486 (7) | 0.674 (25) | 0.740 (7) | 0.786 (1) | 0.704 (7) | 0.727 (4) | 0.739 (10) | 1.000 (1) | 0.707 (16) | 0.849 (9) | 0.756 (16) | 1.000 (1) | 0.685 (1)
Mask3D | | 0.780 (7) | 1.000 (1) | 0.786 (35) | 0.716 (31) | 0.696 (9) | 0.885 (10) | 0.500 (6) | 0.714 (19) | 0.810 (3) | 0.672 (10) | 0.715 (4) | 0.679 (13) | 0.809 (1) | 1.000 (1) | 0.831 (3) | 0.833 (13) | 0.787 (7) | 1.000 (1) | 0.602 (13)
Jonas Schult, Francis Engelmann, Alexander Hermans, Or Litany, Siyu Tang, Bastian Leibe: Mask3D for 3D Semantic Instance Segmentation. ICRA 2023
SPFormer | permissive | 0.770 (8) | 0.903 (48) | 0.903 (8) | 0.806 (17) | 0.609 (23) | 0.886 (8) | 0.568 (3) | 0.815 (6) | 0.705 (10) | 0.711 (7) | 0.655 (11) | 0.652 (16) | 0.685 (16) | 1.000 (1) | 0.789 (8) | 0.809 (21) | 0.776 (12) | 1.000 (1) | 0.583 (18)
Sun Jiahao, Qing Chunmei, Tan Junpeng, Xu Xiangmin: Superpoint Transformer for 3D Scene Instance Segmentation. AAAI 2023 [Oral]
SoftGroup++ | | 0.769 (9) | 1.000 (1) | 0.803 (28) | 0.937 (1) | 0.684 (11) | 0.865 (13) | 0.213 (27) | 0.870 (2) | 0.664 (14) | 0.571 (17) | 0.758 (1) | 0.702 (8) | 0.807 (4) | 1.000 (1) | 0.653 (23) | 0.902 (1) | 0.792 (6) | 1.000 (1) | 0.626 (5)
SIM3D | | 0.766 (10) | 1.000 (1) | 0.948 (4) | 0.582 (47) | 0.599 (25) | 0.882 (11) | 0.510 (5) | 0.701 (21) | 0.632 (18) | 0.772 (3) | 0.685 (9) | 0.687 (12) | 0.782 (7) | 1.000 (1) | 0.833 (2) | 0.756 (31) | 0.798 (5) | 1.000 (1) | 0.622 (7)
SoftGroup | permissive | 0.761 (11) | 1.000 (1) | 0.808 (24) | 0.845 (11) | 0.716 (4) | 0.862 (15) | 0.243 (24) | 0.824 (4) | 0.655 (16) | 0.620 (12) | 0.734 (2) | 0.699 (9) | 0.791 (6) | 0.981 (31) | 0.716 (14) | 0.844 (10) | 0.769 (13) | 1.000 (1) | 0.594 (16)
Thang Vu, Kookhoi Kim, Tung M. Luu, Xuan Thanh Nguyen, Chang D. Yoo: SoftGroup for 3D Instance Segmentation on Point Clouds. CVPR 2022 [Oral]
ISBNet | permissive | 0.757 (12) | 1.000 (1) | 0.904 (7) | 0.731 (29) | 0.678 (12) | 0.895 (5) | 0.458 (9) | 0.644 (34) | 0.670 (13) | 0.710 (8) | 0.620 (18) | 0.732 (3) | 0.650 (18) | 1.000 (1) | 0.756 (11) | 0.778 (24) | 0.779 (10) | 1.000 (1) | 0.614 (10)
Tuan Duc Ngo, Binh-Son Hua, Khoi Nguyen: ISBNet: a 3D Point Cloud Instance Segmentation Network with Instance-aware Sampling and Box-aware Dynamic Convolution. CVPR 2023
TD3D | permissive | 0.751 (13) | 1.000 (1) | 0.774 (36) | 0.867 (9) | 0.621 (19) | 0.934 (1) | 0.404 (13) | 0.706 (20) | 0.812 (2) | 0.605 (15) | 0.633 (16) | 0.626 (18) | 0.690 (15) | 1.000 (1) | 0.640 (25) | 0.820 (17) | 0.777 (11) | 1.000 (1) | 0.612 (11)
Maksim Kolodiazhnyi, Anna Vorontsova, Anton Konushin, Danila Rukhovich: Top-Down Beats Bottom-Up in 3D Instance Segmentation. WACV 2024
PBNet | permissive | 0.747 (14) | 1.000 (1) | 0.818 (20) | 0.837 (13) | 0.713 (5) | 0.844 (18) | 0.457 (10) | 0.647 (32) | 0.711 (9) | 0.614 (13) | 0.617 (20) | 0.657 (15) | 0.650 (18) | 1.000 (1) | 0.692 (17) | 0.822 (16) | 0.765 (15) | 1.000 (1) | 0.595 (15)
W. Zhao, Y. Yan, C. Yang, J. Ye, X. Yang, K. Huang: Divide and Conquer: 3D Instance Segmentation With Point-Wise Binarization. ICCV 2023
GraphCut | | 0.732 (15) | 1.000 (1) | 0.788 (33) | 0.724 (30) | 0.642 (17) | 0.859 (17) | 0.248 (23) | 0.787 (11) | 0.618 (20) | 0.596 (16) | 0.653 (13) | 0.722 (6) | 0.583 (39) | 1.000 (1) | 0.766 (9) | 0.861 (4) | 0.825 (2) | 1.000 (1) | 0.504 (30)
IPCA-Inst | | 0.731 (16) | 1.000 (1) | 0.788 (34) | 0.884 (7) | 0.698 (7) | 0.788 (34) | 0.252 (22) | 0.760 (14) | 0.646 (17) | 0.511 (25) | 0.637 (15) | 0.665 (14) | 0.804 (5) | 1.000 (1) | 0.644 (24) | 0.778 (25) | 0.747 (18) | 1.000 (1) | 0.561 (22)
TopoSeg | | 0.725 (17) | 1.000 (1) | 0.806 (27) | 0.933 (2) | 0.668 (14) | 0.758 (38) | 0.272 (21) | 0.734 (18) | 0.630 (19) | 0.549 (21) | 0.654 (12) | 0.606 (19) | 0.697 (14) | 0.966 (34) | 0.612 (29) | 0.839 (11) | 0.754 (17) | 1.000 (1) | 0.573 (19)
DKNet | | 0.718 (18) | 1.000 (1) | 0.814 (21) | 0.782 (20) | 0.619 (20) | 0.872 (12) | 0.224 (25) | 0.751 (16) | 0.569 (24) | 0.677 (9) | 0.585 (24) | 0.724 (5) | 0.633 (29) | 0.981 (31) | 0.515 (39) | 0.819 (18) | 0.736 (19) | 1.000 (1) | 0.617 (8)
Yizheng Wu, Min Shi, Shuaiyuan Du, Hao Lu, Zhiguo Cao, Weicai Zhong: 3D Instances as 1D Kernels. ECCV 2022
SSEC | | 0.707 (19) | 1.000 (1) | 0.850 (13) | 0.924 (3) | 0.648 (15) | 0.747 (41) | 0.162 (29) | 0.862 (3) | 0.572 (23) | 0.520 (23) | 0.624 (17) | 0.549 (22) | 0.649 (27) | 1.000 (1) | 0.560 (34) | 0.706 (41) | 0.768 (14) | 1.000 (1) | 0.591 (17)
HAIS | permissive | 0.699 (20) | 1.000 (1) | 0.849 (14) | 0.820 (14) | 0.675 (13) | 0.808 (28) | 0.279 (19) | 0.757 (15) | 0.465 (30) | 0.517 (24) | 0.596 (22) | 0.559 (21) | 0.600 (33) | 1.000 (1) | 0.654 (22) | 0.767 (27) | 0.676 (23) | 0.994 (40) | 0.560 (23)
Shaoyu Chen, Jiemin Fang, Qian Zhang, Wenyu Liu, Xinggang Wang: Hierarchical Aggregation for 3D Instance Segmentation. ICCV 2021
SSTNet | permissive | 0.698 (21) | 1.000 (1) | 0.697 (52) | 0.888 (6) | 0.556 (31) | 0.803 (29) | 0.387 (14) | 0.626 (36) | 0.417 (35) | 0.556 (20) | 0.585 (25) | 0.702 (7) | 0.600 (33) | 1.000 (1) | 0.824 (5) | 0.720 (40) | 0.692 (21) | 1.000 (1) | 0.509 (29)
Zhihao Liang, Zhihao Li, Songcen Xu, Mingkui Tan, Kui Jia: Instance Segmentation in 3D Scenes using Semantic Superpoint Tree Networks. ICCV 2021
DualGroup | | 0.694 (22) | 1.000 (1) | 0.799 (30) | 0.811 (15) | 0.622 (18) | 0.817 (23) | 0.376 (15) | 0.805 (9) | 0.590 (22) | 0.487 (29) | 0.568 (28) | 0.525 (26) | 0.650 (18) | 0.835 (47) | 0.600 (30) | 0.829 (14) | 0.655 (26) | 1.000 (1) | 0.526 (26)
SphereSeg | | 0.680 (23) | 1.000 (1) | 0.856 (12) | 0.744 (27) | 0.618 (21) | 0.893 (6) | 0.151 (30) | 0.651 (31) | 0.713 (8) | 0.537 (22) | 0.579 (27) | 0.430 (36) | 0.651 (17) | 1.000 (1) | 0.389 (50) | 0.744 (35) | 0.697 (20) | 0.991 (42) | 0.601 (14)
DANCENET | | 0.680 (23) | 1.000 (1) | 0.807 (25) | 0.733 (28) | 0.600 (24) | 0.768 (37) | 0.375 (16) | 0.543 (44) | 0.538 (25) | 0.610 (14) | 0.599 (21) | 0.498 (27) | 0.632 (31) | 0.981 (31) | 0.739 (13) | 0.856 (5) | 0.633 (32) | 0.882 (55) | 0.454 (39)
Box2Mask | | 0.677 (25) | 1.000 (1) | 0.847 (15) | 0.771 (22) | 0.509 (40) | 0.816 (24) | 0.277 (20) | 0.558 (43) | 0.482 (27) | 0.562 (19) | 0.640 (14) | 0.448 (32) | 0.700 (12) | 1.000 (1) | 0.666 (18) | 0.852 (8) | 0.578 (39) | 0.997 (34) | 0.488 (34)
Julian Chibane, Francis Engelmann, Tuan Anh Tran, Gerard Pons-Moll: Box2Mask: Weakly Supervised 3D Semantic Instance Segmentation Using Bounding Boxes. ECCV 2022
OccuSeg+instance | | 0.672 (26) | 1.000 (1) | 0.758 (44) | 0.682 (34) | 0.576 (29) | 0.842 (19) | 0.477 (8) | 0.504 (50) | 0.524 (26) | 0.567 (18) | 0.585 (26) | 0.451 (31) | 0.557 (41) | 1.000 (1) | 0.751 (12) | 0.797 (22) | 0.563 (42) | 1.000 (1) | 0.467 (38)
Lei Han, Tian Zheng, Lan Xu, Lu Fang: OccuSeg: Occupancy-aware 3D Instance Segmentation. CVPR 2020
Mask-Group | | 0.664 (27) | 1.000 (1) | 0.822 (19) | 0.764 (25) | 0.616 (22) | 0.815 (25) | 0.139 (34) | 0.694 (23) | 0.597 (21) | 0.459 (33) | 0.566 (29) | 0.599 (20) | 0.600 (33) | 0.516 (57) | 0.715 (15) | 0.819 (19) | 0.635 (30) | 1.000 (1) | 0.603 (12)
Min Zhong, Xinghao Chen, Xiaokang Chen, Gang Zeng, Yunhe Wang: MaskGroup: Hierarchical Point Grouping and Masking for 3D Instance Segmentation. ICME 2022
INS-Conv-instance | | 0.657 (28) | 1.000 (1) | 0.760 (42) | 0.667 (36) | 0.581 (27) | 0.863 (14) | 0.323 (17) | 0.655 (30) | 0.477 (28) | 0.473 (31) | 0.549 (31) | 0.432 (35) | 0.650 (18) | 1.000 (1) | 0.655 (21) | 0.738 (36) | 0.585 (38) | 0.944 (47) | 0.472 (37)
CSC-Pretrained | | 0.648 (29) | 1.000 (1) | 0.810 (22) | 0.768 (23) | 0.523 (38) | 0.813 (26) | 0.143 (33) | 0.819 (5) | 0.389 (38) | 0.422 (42) | 0.511 (35) | 0.443 (33) | 0.650 (18) | 1.000 (1) | 0.624 (27) | 0.732 (37) | 0.634 (31) | 1.000 (1) | 0.375 (46)
PE | | 0.645 (30) | 1.000 (1) | 0.773 (38) | 0.798 (19) | 0.538 (33) | 0.786 (35) | 0.088 (42) | 0.799 (10) | 0.350 (42) | 0.435 (40) | 0.547 (32) | 0.545 (23) | 0.646 (28) | 0.933 (36) | 0.562 (33) | 0.761 (30) | 0.556 (47) | 0.997 (34) | 0.501 (32)
Biao Zhang, Peter Wonka: Point Cloud Instance Segmentation using Probabilistic Embeddings. CVPR 2021
RPGN | | 0.643 (31) | 1.000 (1) | 0.758 (43) | 0.582 (48) | 0.539 (32) | 0.826 (22) | 0.046 (47) | 0.765 (12) | 0.372 (40) | 0.436 (39) | 0.588 (23) | 0.539 (25) | 0.650 (18) | 1.000 (1) | 0.577 (31) | 0.750 (33) | 0.653 (28) | 0.997 (34) | 0.495 (33)
Shichao Dong, Guosheng Lin, Tzu-Yi Hung: Learning Regional Purity for Instance Segmentation on 3D Point Clouds. ECCV 2022
Dyco3D | copyleft | 0.641 (32) | 1.000 (1) | 0.841 (16) | 0.893 (5) | 0.531 (35) | 0.802 (30) | 0.115 (39) | 0.588 (41) | 0.448 (32) | 0.438 (37) | 0.537 (34) | 0.430 (37) | 0.550 (42) | 0.857 (39) | 0.534 (37) | 0.764 (29) | 0.657 (25) | 0.987 (43) | 0.568 (20)
Tong He, Chunhua Shen, Anton van den Hengel: DyCo3D: Robust Instance Segmentation of 3D Point Clouds through Dynamic Convolution. CVPR 2021
GICN | | 0.638 (33) | 1.000 (1) | 0.895 (10) | 0.800 (18) | 0.480 (44) | 0.676 (46) | 0.144 (32) | 0.737 (17) | 0.354 (41) | 0.447 (34) | 0.400 (48) | 0.365 (43) | 0.700 (12) | 1.000 (1) | 0.569 (32) | 0.836 (12) | 0.599 (34) | 1.000 (1) | 0.473 (36)
PointGroup | | 0.636 (34) | 1.000 (1) | 0.765 (39) | 0.624 (38) | 0.505 (42) | 0.797 (31) | 0.116 (38) | 0.696 (22) | 0.384 (39) | 0.441 (35) | 0.559 (30) | 0.476 (29) | 0.596 (36) | 1.000 (1) | 0.666 (18) | 0.756 (32) | 0.556 (46) | 0.997 (34) | 0.513 (28)
Li Jiang, Hengshuang Zhao, Shaoshuai Shi, Shu Liu, Chi-Wing Fu, Jiaya Jia: PointGroup: Dual-Set Point Grouping for 3D Instance Segmentation. CVPR 2020 [Oral]
DD-UNet+Group | | 0.635 (35) | 0.667 (50) | 0.797 (32) | 0.714 (32) | 0.562 (30) | 0.774 (36) | 0.146 (31) | 0.810 (8) | 0.429 (34) | 0.476 (30) | 0.546 (33) | 0.399 (39) | 0.633 (29) | 1.000 (1) | 0.632 (26) | 0.722 (39) | 0.609 (33) | 1.000 (1) | 0.514 (27)
H. Liu, R. Liu, K. Yang, J. Zhang, K. Peng, R. Stiefelhagen: HIDA: Towards Holistic Indoor Understanding for the Visually Impaired via Semantic Instance Segmentation with a Wearable Solid-State LiDAR Sensor. ICCVW 2021
Mask3D_evaluation | | 0.631 (36) | 1.000 (1) | 0.829 (18) | 0.606 (40) | 0.646 (16) | 0.836 (20) | 0.068 (43) | 0.511 (48) | 0.462 (31) | 0.507 (26) | 0.619 (19) | 0.389 (41) | 0.610 (32) | 1.000 (1) | 0.432 (45) | 0.828 (15) | 0.673 (24) | 0.788 (59) | 0.552 (24)
DENet | | 0.629 (37) | 1.000 (1) | 0.797 (31) | 0.608 (39) | 0.589 (26) | 0.627 (50) | 0.219 (26) | 0.882 (1) | 0.310 (44) | 0.402 (47) | 0.383 (50) | 0.396 (40) | 0.650 (18) | 1.000 (1) | 0.663 (20) | 0.543 (58) | 0.691 (22) | 1.000 (1) | 0.568 (21)
3D-MPA | | 0.611 (38) | 1.000 (1) | 0.833 (17) | 0.765 (24) | 0.526 (37) | 0.756 (39) | 0.136 (36) | 0.588 (41) | 0.470 (29) | 0.438 (38) | 0.432 (44) | 0.358 (45) | 0.650 (18) | 0.857 (39) | 0.429 (46) | 0.765 (28) | 0.557 (45) | 1.000 (1) | 0.430 (41)
Francis Engelmann, Martin Bokeloh, Alireza Fathi, Bastian Leibe, Matthias Nießner: 3D-MPA: Multi Proposal Aggregation for 3D Semantic Instance Segmentation. CVPR 2020
OSIS | | 0.605 (39) | 1.000 (1) | 0.801 (29) | 0.599 (42) | 0.535 (34) | 0.728 (43) | 0.286 (18) | 0.436 (54) | 0.679 (12) | 0.491 (27) | 0.433 (42) | 0.256 (47) | 0.404 (54) | 0.857 (39) | 0.620 (28) | 0.724 (38) | 0.510 (52) | 1.000 (1) | 0.539 (25)
AOIA | | 0.601 (40) | 1.000 (1) | 0.761 (41) | 0.687 (33) | 0.485 (43) | 0.828 (21) | 0.008 (54) | 0.663 (28) | 0.405 (37) | 0.405 (46) | 0.425 (45) | 0.490 (28) | 0.596 (36) | 0.714 (50) | 0.553 (36) | 0.779 (23) | 0.597 (35) | 0.992 (41) | 0.424 (43)
PCJC | | 0.578 (41) | 1.000 (1) | 0.810 (23) | 0.583 (46) | 0.449 (47) | 0.813 (27) | 0.042 (48) | 0.603 (39) | 0.341 (43) | 0.490 (28) | 0.465 (39) | 0.410 (38) | 0.650 (18) | 0.835 (47) | 0.264 (56) | 0.694 (45) | 0.561 (43) | 0.889 (52) | 0.504 (31)
SSEN | | 0.575 (42) | 1.000 (1) | 0.761 (40) | 0.473 (50) | 0.477 (45) | 0.795 (32) | 0.066 (44) | 0.529 (46) | 0.658 (15) | 0.460 (32) | 0.461 (40) | 0.380 (42) | 0.331 (56) | 0.859 (38) | 0.401 (49) | 0.692 (47) | 0.653 (27) | 1.000 (1) | 0.348 (48)
Dongsu Zhang, Junha Chun, Sang Kyun Cha, Young Min Kim: Spatial Semantic Embedding Network: Fast 3D Instance Segmentation with Deep Metric Learning. arXiv
RWSeg | | 0.567 (43) | 0.528 (60) | 0.708 (51) | 0.626 (37) | 0.580 (28) | 0.745 (42) | 0.063 (45) | 0.627 (35) | 0.240 (48) | 0.400 (48) | 0.497 (36) | 0.464 (30) | 0.515 (43) | 1.000 (1) | 0.475 (41) | 0.745 (34) | 0.571 (40) | 1.000 (1) | 0.429 (42)
NeuralBF | | 0.555 (44) | 0.667 (50) | 0.896 (9) | 0.843 (12) | 0.517 (39) | 0.751 (40) | 0.029 (49) | 0.519 (47) | 0.414 (36) | 0.439 (36) | 0.465 (38) | 0.000 (66) | 0.484 (45) | 0.857 (39) | 0.287 (54) | 0.693 (46) | 0.651 (29) | 1.000 (1) | 0.485 (35)
Weiwei Sun, Daniel Rebain, Renjie Liao, Vladimir Tankovich, Soroosh Yazdani, Kwang Moo Yi, Andrea Tagliasacchi: NeuralBF: Neural Bilateral Filtering for Top-down Instance Segmentation on Point Clouds. WACV 2023
MTML | | 0.549 (45) | 1.000 (1) | 0.807 (26) | 0.588 (45) | 0.327 (52) | 0.647 (48) | 0.004 (56) | 0.815 (7) | 0.180 (51) | 0.418 (43) | 0.364 (52) | 0.182 (50) | 0.445 (48) | 1.000 (1) | 0.442 (44) | 0.688 (48) | 0.571 (41) | 1.000 (1) | 0.396 (44)
Jean Lahoud, Bernard Ghanem, Marc Pollefeys, Martin R. Oswald: 3D Instance Segmentation via Multi-task Metric Learning. ICCV 2019 [Oral]
ClickSeg_Instance | | 0.539 (46) | 1.000 (1) | 0.621 (55) | 0.300 (53) | 0.530 (36) | 0.698 (44) | 0.127 (37) | 0.533 (45) | 0.222 (49) | 0.430 (41) | 0.400 (47) | 0.365 (43) | 0.574 (40) | 0.938 (35) | 0.472 (42) | 0.659 (50) | 0.543 (48) | 0.944 (47) | 0.347 (49)
One_Thing_One_Click | permissive | 0.529 (47) | 0.667 (50) | 0.718 (47) | 0.777 (21) | 0.399 (48) | 0.683 (45) | 0.000 (59) | 0.669 (26) | 0.138 (54) | 0.391 (49) | 0.374 (51) | 0.539 (24) | 0.360 (55) | 0.641 (54) | 0.556 (35) | 0.774 (26) | 0.593 (36) | 0.997 (34) | 0.251 (54)
Zhengzhe Liu, Xiaojuan Qi, Chi-Wing Fu: One Thing One Click: A Self-Training Approach for Weakly Supervised 3D Semantic Segmentation. CVPR 2021
Sparse R-CNN | | 0.515 (48) | 1.000 (1) | 0.538 (60) | 0.282 (54) | 0.468 (46) | 0.790 (33) | 0.173 (28) | 0.345 (56) | 0.429 (33) | 0.413 (45) | 0.484 (37) | 0.176 (51) | 0.595 (38) | 0.591 (55) | 0.522 (38) | 0.668 (49) | 0.476 (53) | 0.986 (45) | 0.327 (50)
Occipital-SCS | | 0.512 (49) | 1.000 (1) | 0.716 (48) | 0.509 (49) | 0.506 (41) | 0.611 (51) | 0.092 (41) | 0.602 (40) | 0.177 (52) | 0.346 (52) | 0.383 (49) | 0.165 (52) | 0.442 (49) | 0.850 (46) | 0.386 (51) | 0.618 (54) | 0.543 (49) | 0.889 (52) | 0.389 (45)
3D-BoNet | | 0.488 (50) | 1.000 (1) | 0.672 (54) | 0.590 (44) | 0.301 (54) | 0.484 (61) | 0.098 (40) | 0.620 (37) | 0.306 (45) | 0.341 (53) | 0.259 (56) | 0.125 (54) | 0.434 (51) | 0.796 (49) | 0.402 (48) | 0.499 (60) | 0.513 (51) | 0.909 (51) | 0.439 (40)
Bo Yang, Jianan Wang, Ronald Clark, Qingyong Hu, Sen Wang, Andrew Markham, Niki Trigoni: Learning Object Bounding Boxes for 3D Instance Segmentation on Point Clouds. NeurIPS 2019 Spotlight
PanopticFusion-inst | | 0.478 (51) | 0.667 (50) | 0.712 (50) | 0.595 (43) | 0.259 (57) | 0.550 (57) | 0.000 (59) | 0.613 (38) | 0.175 (53) | 0.250 (58) | 0.434 (41) | 0.437 (34) | 0.411 (53) | 0.857 (39) | 0.485 (40) | 0.591 (57) | 0.267 (63) | 0.944 (47) | 0.359 (47)
Gaku Narita, Takashi Seno, Tomoya Ishikawa, Yohsuke Kaji: PanopticFusion: Online Volumetric Semantic Mapping at the Level of Stuff and Things. IROS 2019 (to appear)
SPG_WSIS | | 0.470 (52) | 0.667 (50) | 0.685 (53) | 0.677 (35) | 0.372 (50) | 0.562 (55) | 0.000 (59) | 0.482 (51) | 0.244 (47) | 0.316 (55) | 0.298 (53) | 0.052 (61) | 0.442 (50) | 0.857 (39) | 0.267 (55) | 0.702 (42) | 0.559 (44) | 1.000 (1) | 0.287 (52)
SALoss-ResNet | | 0.459 (53) | 1.000 (1) | 0.737 (46) | 0.159 (64) | 0.259 (56) | 0.587 (53) | 0.138 (35) | 0.475 (52) | 0.217 (50) | 0.416 (44) | 0.408 (46) | 0.128 (53) | 0.315 (57) | 0.714 (50) | 0.411 (47) | 0.536 (59) | 0.590 (37) | 0.873 (56) | 0.304 (51)
Zhidong Liang, Ming Yang, Hao Li, Chunxiang Wang: 3D Instance Embedding Learning With a Structure-Aware Loss Function for Point Cloud Segmentation. IEEE Robotics and Automation Letters (IROS 2020)
MASC | permissive | 0.447 (54) | 0.528 (60) | 0.555 (58) | 0.381 (51) | 0.382 (49) | 0.633 (49) | 0.002 (57) | 0.509 (49) | 0.260 (46) | 0.361 (51) | 0.432 (43) | 0.327 (46) | 0.451 (47) | 0.571 (56) | 0.367 (52) | 0.639 (52) | 0.386 (54) | 0.980 (46) | 0.276 (53)
Chen Liu, Yasutaka Furukawa: MASC: Multi-scale Affinity with Sparse Convolution for 3D Instance Segmentation.
SegGroup_ins | permissive | 0.445 (55) | 0.667 (50) | 0.773 (37) | 0.185 (61) | 0.317 (53) | 0.656 (47) | 0.000 (59) | 0.407 (55) | 0.134 (55) | 0.381 (50) | 0.267 (55) | 0.217 (49) | 0.476 (46) | 0.714 (50) | 0.452 (43) | 0.629 (53) | 0.514 (50) | 1.000 (1) | 0.222 (57)
An Tao, Yueqi Duan, Yi Wei, Jiwen Lu, Jie Zhou: SegGroup: Seg-Level Supervision for 3D Instance and Semantic Segmentation. TIP 2022
3D-SIS | permissive | 0.382 (56) | 1.000 (1) | 0.432 (63) | 0.245 (56) | 0.190 (58) | 0.577 (54) | 0.013 (53) | 0.263 (58) | 0.033 (61) | 0.320 (54) | 0.240 (57) | 0.075 (57) | 0.422 (52) | 0.857 (39) | 0.117 (61) | 0.699 (43) | 0.271 (62) | 0.883 (54) | 0.235 (56)
Ji Hou, Angela Dai, Matthias Niessner: 3D-SIS: 3D Semantic Instance Segmentation of RGB-D Scans. CVPR 2019
Hier3D | copyleft | 0.323 (57) | 0.667 (50) | 0.542 (59) | 0.264 (55) | 0.157 (61) | 0.550 (56) | 0.000 (59) | 0.205 (61) | 0.009 (63) | 0.270 (57) | 0.218 (58) | 0.075 (57) | 0.500 (44) | 0.688 (53) | 0.007 (67) | 0.698 (44) | 0.301 (59) | 0.459 (64) | 0.200 (58)
Tan: HCFS3D: Hierarchical Coupled Feature Selection Network for 3D Semantic and Instance Segmentation.
UNet-backbone | | 0.319 (58) | 0.667 (50) | 0.715 (49) | 0.233 (57) | 0.189 (59) | 0.479 (62) | 0.008 (54) | 0.218 (59) | 0.067 (60) | 0.201 (60) | 0.173 (59) | 0.107 (55) | 0.123 (62) | 0.438 (58) | 0.150 (58) | 0.615 (55) | 0.355 (55) | 0.916 (50) | 0.093 (66)
R-PointNet | | 0.306 (59) | 0.500 (62) | 0.405 (64) | 0.311 (52) | 0.348 (51) | 0.589 (52) | 0.054 (46) | 0.068 (64) | 0.126 (56) | 0.283 (56) | 0.290 (54) | 0.028 (62) | 0.219 (60) | 0.214 (61) | 0.331 (53) | 0.396 (64) | 0.275 (60) | 0.821 (58) | 0.245 (55)
Region-18class | | 0.284 (60) | 0.250 (66) | 0.751 (45) | 0.228 (59) | 0.270 (55) | 0.521 (58) | 0.000 (59) | 0.468 (53) | 0.008 (65) | 0.205 (59) | 0.127 (60) | 0.000 (66) | 0.068 (64) | 0.070 (65) | 0.262 (57) | 0.652 (51) | 0.323 (57) | 0.740 (60) | 0.173 (59)
SemRegionNet-20cls | | 0.250 (61) | 0.333 (63) | 0.613 (56) | 0.229 (58) | 0.163 (60) | 0.493 (59) | 0.000 (59) | 0.304 (57) | 0.107 (57) | 0.147 (63) | 0.100 (62) | 0.052 (60) | 0.231 (58) | 0.119 (63) | 0.039 (63) | 0.445 (62) | 0.325 (56) | 0.654 (61) | 0.141 (62)
tmp | | 0.248 (62) | 0.667 (50) | 0.437 (62) | 0.188 (60) | 0.153 (62) | 0.491 (60) | 0.000 (59) | 0.208 (60) | 0.094 (59) | 0.153 (62) | 0.099 (63) | 0.057 (59) | 0.217 (61) | 0.119 (63) | 0.039 (63) | 0.466 (61) | 0.302 (58) | 0.640 (62) | 0.140 (63)
3D-BEVIS | | 0.248 (62) | 0.667 (50) | 0.566 (57) | 0.076 (65) | 0.035 (67) | 0.394 (65) | 0.027 (51) | 0.035 (66) | 0.098 (58) | 0.099 (65) | 0.030 (66) | 0.025 (63) | 0.098 (63) | 0.375 (60) | 0.126 (60) | 0.604 (56) | 0.181 (65) | 0.854 (57) | 0.171 (60)
Cathrin Elich, Francis Engelmann, Jonas Schult, Theodora Kontogianni, Bastian Leibe: 3D-BEVIS: Birds-Eye-View Instance Segmentation.
Sem_Recon_ins | | 0.227 (64) | 0.764 (49) | 0.486 (61) | 0.069 (66) | 0.098 (64) | 0.426 (64) | 0.017 (52) | 0.067 (65) | 0.015 (62) | 0.172 (61) | 0.100 (61) | 0.096 (56) | 0.054 (66) | 0.183 (62) | 0.135 (59) | 0.366 (65) | 0.260 (64) | 0.614 (63) | 0.168 (61)
ASIS | | 0.199 (65) | 0.333 (63) | 0.253 (66) | 0.167 (63) | 0.140 (63) | 0.438 (63) | 0.000 (59) | 0.177 (62) | 0.008 (64) | 0.121 (64) | 0.069 (64) | 0.004 (65) | 0.231 (59) | 0.429 (59) | 0.036 (65) | 0.445 (63) | 0.273 (61) | 0.333 (66) | 0.119 (65)
Sgpn_scannet | | 0.143 (66) | 0.208 (67) | 0.390 (65) | 0.169 (62) | 0.065 (65) | 0.275 (66) | 0.029 (50) | 0.069 (63) | 0.000 (66) | 0.087 (66) | 0.043 (65) | 0.014 (64) | 0.027 (67) | 0.000 (66) | 0.112 (62) | 0.351 (66) | 0.168 (66) | 0.438 (65) | 0.138 (64)
MaskRCNN 2d->3d Proj | | 0.058 (67) | 0.333 (63) | 0.002 (67) | 0.000 (67) | 0.053 (66) | 0.002 (67) | 0.002 (58) | 0.021 (67) | 0.000 (66) | 0.045 (67) | 0.024 (67) | 0.238 (48) | 0.065 (65) | 0.000 (66) | 0.014 (66) | 0.107 (67) | 0.020 (67) | 0.110 (67) | 0.006 (67)