The 3D semantic instance prediction task involves detecting and segmenting the objects in a 3D scan mesh.

Evaluation and metrics

Our evaluation ranks all methods according to the average precision for each class. We report the mean average precision at an overlap of 0.25 (AP 25%), at an overlap of 0.5 (AP 50%), and averaged over overlaps in the range [0.5:0.95:0.05], that is, from 0.5 to 0.95 in steps of 0.05 (AP). Note that multiple predictions of the same ground truth instance are penalized as false positives.
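The matching rule described above can be illustrated with a minimal Python sketch. It is not the benchmark's official evaluation script. It assumes that each instance is a set of vertex indices, that overlap is measured as intersection over union, that a prediction counts as correct when its overlap is at or above the threshold, and that the range includes both endpoints. The names `iou`, `average_precision`, `mean_over_thresholds` and `AP_THRESHOLDS` are hypothetical, and the AP here is a simple non-interpolated sum for a single class.

```python
from typing import List, Sequence, Set, Tuple

# Hypothetical prediction format: (confidence, set of mesh vertex indices).
Prediction = Tuple[float, Set[int]]


def iou(a: Set[int], b: Set[int]) -> float:
    """Intersection over union of two vertex-index sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def average_precision(preds: Sequence[Prediction],
                      gts: Sequence[Set[int]],
                      threshold: float) -> float:
    """Non-interpolated AP for one class at one overlap threshold."""
    if not gts:
        return 0.0
    matched: Set[int] = set()  # ground-truth instances already claimed
    true_positives = 0
    ap = 0.0
    # Visit predictions from most to least confident.
    ranked = sorted(preds, key=lambda p: p[0], reverse=True)
    for rank, (_, mask) in enumerate(ranked, start=1):
        overlaps = [iou(mask, gt) for gt in gts]
        best = max(range(len(gts)), key=overlaps.__getitem__)
        if overlaps[best] >= threshold and best not in matched:
            matched.add(best)
            true_positives += 1
            # Precision at this rank, weighted by the recall step 1/len(gts).
            ap += (true_positives / rank) / len(gts)
        # Otherwise the prediction is a false positive: either it overlaps
        # too little, or it duplicates an already matched instance.
    return ap


def mean_over_thresholds(preds: Sequence[Prediction],
                         gts: Sequence[Set[int]],
                         thresholds: Sequence[float]) -> float:
    """Average the per-threshold AP values."""
    return sum(average_precision(preds, gts, t) for t in thresholds) / len(thresholds)


# Assumed reading of [0.5:0.95:0.05]: 0.50, 0.55, ..., 0.95 (endpoint included).
AP_THRESHOLDS: List[float] = [round(0.5 + 0.05 * i, 2) for i in range(10)]

# Toy scene with two ground-truth instances and three predictions.
gts = [{0, 1, 2, 3}, {10, 11, 12, 13}]
preds = [
    (0.9, {0, 1, 2, 3}),                # exact match of the first instance
    (0.8, {0, 1, 2}),                   # duplicate of the first instance
    (0.7, {10, 11, 12, 13, 14, 15}),    # loose match of the second instance
]

print("AP 25%:", average_precision(preds, gts, 0.25))
print("AP 50%:", average_precision(preds, gts, 0.5))
print("AP    :", mean_over_thresholds(preds, gts, AP_THRESHOLDS))
```

In this toy scene the duplicate prediction lowers AP 50% from 1.0 to about 0.83, which is the penalty the note above refers to. A full evaluation would repeat this per class and average the class scores.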



This table lists the benchmark results for the 3D semantic instance task, sorted by average AP 50%. Each cell gives the score followed by the method's rank in that column.




Method | Info | avg AP 50% | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | window
PointRel | - | 0.816 (1) | 1.000 (1) | 0.971 (7) | 0.908 (6) | 0.743 (2) | 0.923 (6) | 0.573 (6) | 0.714 (22) | 0.695 (17) | 0.734 (8) | 0.747 (2) | 0.725 (10) | 0.809 (1) | 1.000 (1) | 0.814 (8) | 0.899 (3) | 0.820 (3) | 1.000 (1) | 0.610 (16)
Relation3D (PointRel): Enhancing Relation Modeling for Point Cloud Instance Segmentation.
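Spherical Mask(CtoF) | - | 0.812 (2) | 1.000 (1) | 0.973 (6) | 0.852 (13) | 0.718 (5) | 0.917 (8) | 0.574 (4) | 0.677 (29) | 0.748 (10) | 0.729 (12) | 0.715 (6) | 0.795 (2) | 0.809 (1) | 1.000 (1) | 0.831 (4) | 0.854 (9) | 0.787 (10) | 1.000 (1) | 0.638 (5)
EV3D | - | 0.811 (3) | 1.000 (1) | 0.968 (8) | 0.852 (13) | 0.717 (6) | 0.921 (7) | 0.574 (5) | 0.677 (29) | 0.748 (10) | 0.730 (11) | 0.703 (11) | 0.795 (2) | 0.809 (1) | 1.000 (1) | 0.831 (4) | 0.854 (9) | 0.778 (14) | 1.000 (1) | 0.638 (6)
SIM3D | - | 0.803 (4) | 1.000 (1) | 0.967 (9) | 0.863 (12) | 0.692 (17) | 0.924 (5) | 0.552 (10) | 0.732 (21) | 0.667 (21) | 0.732 (10) | 0.662 (15) | 0.796 (1) | 0.789 (9) | 1.000 (1) | 0.803 (9) | 0.864 (6) | 0.766 (19) | 1.000 (1) | 0.643 (4)
OneFormer3D | copyleft | 0.801 (5) | 1.000 (1) | 0.973 (5) | 0.909 (5) | 0.698 (14) | 0.928 (3) | 0.582 (3) | 0.668 (34) | 0.685 (18) | 0.780 (2) | 0.687 (13) | 0.698 (18) | 0.702 (14) | 1.000 (1) | 0.794 (11) | 0.900 (2) | 0.784 (12) | 0.986 (51) | 0.635 (7)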
Maxim Kolodiazhnyi, Anna Vorontsova, Anton Konushin, Danila Rukhovich: OneFormer3D: One Transformer for Unified Point Cloud Segmentation.
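Competitor-SPFormer | - | 0.800 (6) | 1.000 (1) | 0.986 (2) | 0.845 (15) | 0.705 (12) | 0.915 (9) | 0.532 (12) | 0.733 (20) | 0.757 (9) | 0.733 (9) | 0.708 (8) | 0.698 (17) | 0.648 (34) | 0.981 (37) | 0.890 (1) | 0.830 (18) | 0.796 (7) | 0.997 (38) | 0.644 (3)
UniPerception | - | 0.800 (6) | 1.000 (1) | 0.930 (11) | 0.872 (10) | 0.727 (4) | 0.862 (23) | 0.454 (18) | 0.764 (13) | 0.820 (1) | 0.746 (6) | 0.706 (9) | 0.750 (5) | 0.772 (10) | 0.926 (44) | 0.764 (17) | 0.818 (26) | 0.826 (1) | 0.997 (38) | 0.660 (2)
InsSSM | - | 0.799 (8) | 1.000 (1) | 0.915 (13) | 0.710 (40) | 0.729 (3) | 0.925 (4) | 0.664 (1) | 0.670 (32) | 0.770 (6) | 0.766 (3) | 0.739 (3) | 0.737 (6) | 0.700 (15) | 1.000 (1) | 0.792 (12) | 0.829 (20) | 0.815 (4) | 0.997 (38) | 0.625 (9)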
Lei Yao, Yi Wang, Moyun Liu, Lap-Pui Chau: SGIFormer: Semantic-guided and Geometric-enhanced Interleaving Transformer for 3D Instance Segmentation. TCSVT, 2024
TST3D | - | 0.795 (9) | 1.000 (1) | 0.929 (12) | 0.918 (4) | 0.709 (9) | 0.884 (18) | 0.596 (2) | 0.704 (25) | 0.769 (7) | 0.734 (7) | 0.644 (20) | 0.699 (16) | 0.751 (12) | 1.000 (1) | 0.794 (10) | 0.876 (5) | 0.757 (22) | 0.997 (38) | 0.550 (31)
Duc Tran Dang Trung, Byeongkeun Kang, Yeejin Lee: MSTA3D: Multi-scale Twin-attention for 3D Instance Segmentation. ACM Multimedia 2024
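MG-Former | - | 0.791 (10) | 1.000 (1) | 0.980 (4) | 0.837 (18) | 0.626 (25) | 0.897 (11) | 0.543 (11) | 0.759 (15) | 0.800 (5) | 0.766 (4) | 0.659 (16) | 0.769 (4) | 0.697 (18) | 1.000 (1) | 0.791 (13) | 0.707 (47) | 0.791 (9) | 1.000 (1) | 0.610 (15)
ExtMask3D | - | 0.789 (11) | 1.000 (1) | 0.988 (1) | 0.756 (33) | 0.706 (11) | 0.912 (10) | 0.429 (19) | 0.647 (39) | 0.806 (4) | 0.755 (5) | 0.673 (14) | 0.689 (19) | 0.772 (11) | 1.000 (1) | 0.789 (14) | 0.852 (11) | 0.811 (5) | 1.000 (1) | 0.617 (12)
Queryformer | - | 0.787 (12) | 1.000 (1) | 0.933 (10) | 0.601 (49) | 0.754 (1) | 0.886 (16) | 0.558 (9) | 0.661 (36) | 0.767 (8) | 0.665 (18) | 0.716 (5) | 0.639 (24) | 0.808 (5) | 1.000 (1) | 0.844 (3) | 0.897 (4) | 0.804 (6) | 1.000 (1) | 0.624 (10)
MAFT | - | 0.786 (13) | 1.000 (1) | 0.894 (18) | 0.807 (22) | 0.694 (16) | 0.893 (14) | 0.486 (14) | 0.674 (31) | 0.740 (12) | 0.786 (1) | 0.704 (10) | 0.727 (9) | 0.739 (13) | 1.000 (1) | 0.707 (23) | 0.849 (13) | 0.756 (23) | 1.000 (1) | 0.685 (1)
KmaxOneFormerNet | permissive | 0.783 (14) | 0.903 (54) | 0.981 (3) | 0.794 (26) | 0.706 (10) | 0.931 (2) | 0.561 (8) | 0.701 (26) | 0.706 (15) | 0.727 (13) | 0.697 (12) | 0.731 (8) | 0.689 (21) | 1.000 (1) | 0.856 (2) | 0.750 (38) | 0.761 (21) | 1.000 (1) | 0.599 (20)
Mask3D | - | 0.780 (15) | 1.000 (1) | 0.786 (42) | 0.716 (38) | 0.696 (15) | 0.885 (17) | 0.500 (13) | 0.714 (22) | 0.810 (3) | 0.672 (17) | 0.715 (6) | 0.679 (20) | 0.809 (1) | 1.000 (1) | 0.831 (4) | 0.833 (17) | 0.787 (10) | 1.000 (1) | 0.602 (18)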
Jonas Schult, Francis Engelmann, Alexander Hermans, Or Litany, Siyu Tang, Bastian Leibe: Mask3D for 3D Semantic Instance Segmentation. ICRA 2023
SPFormer | permissive | 0.770 (16) | 0.903 (54) | 0.903 (15) | 0.806 (23) | 0.609 (31) | 0.886 (15) | 0.568 (7) | 0.815 (6) | 0.705 (16) | 0.711 (14) | 0.655 (17) | 0.652 (23) | 0.685 (22) | 1.000 (1) | 0.789 (15) | 0.809 (27) | 0.776 (16) | 1.000 (1) | 0.583 (24)
Sun Jiahao, Qing Chunmei, Tan Junpeng, Xu Xiangmin: Superpoint Transformer for 3D Scene Instance Segmentation. AAAI 2023 [Oral]
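SoftGroup++ | - | 0.769 (17) | 1.000 (1) | 0.803 (35) | 0.937 (1) | 0.684 (18) | 0.865 (20) | 0.213 (34) | 0.870 (2) | 0.664 (22) | 0.571 (24) | 0.758 (1) | 0.702 (14) | 0.807 (6) | 1.000 (1) | 0.653 (30) | 0.902 (1) | 0.792 (8) | 1.000 (1) | 0.626 (8)
SoftGroup | permissive | 0.761 (18) | 1.000 (1) | 0.808 (31) | 0.845 (15) | 0.716 (7) | 0.862 (22) | 0.243 (31) | 0.824 (4) | 0.655 (24) | 0.620 (19) | 0.734 (4) | 0.699 (15) | 0.791 (8) | 0.981 (37) | 0.716 (21) | 0.844 (14) | 0.769 (17) | 1.000 (1) | 0.594 (22)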
Thang Vu, Kookhoi Kim, Tung M. Luu, Xuan Thanh Nguyen, Chang D. Yoo: SoftGroup for 3D Instance Segmentation on Point Clouds. CVPR 2022 [Oral]
ISBNet | permissive | 0.757 (19) | 1.000 (1) | 0.904 (14) | 0.731 (36) | 0.678 (19) | 0.895 (12) | 0.458 (16) | 0.644 (41) | 0.670 (20) | 0.710 (15) | 0.620 (25) | 0.732 (7) | 0.650 (24) | 1.000 (1) | 0.756 (18) | 0.778 (30) | 0.779 (13) | 1.000 (1) | 0.614 (13)
Tuan Duc Ngo, Binh-Son Hua, Khoi Nguyen: ISBNet: a 3D Point Cloud Instance Segmentation Network with Instance-aware Sampling and Box-aware Dynamic Convolution. CVPR 2023
TD3D | permissive | 0.751 (20) | 1.000 (1) | 0.774 (43) | 0.867 (11) | 0.621 (27) | 0.934 (1) | 0.404 (20) | 0.706 (24) | 0.812 (2) | 0.605 (22) | 0.633 (23) | 0.626 (25) | 0.690 (20) | 1.000 (1) | 0.640 (32) | 0.820 (23) | 0.777 (15) | 1.000 (1) | 0.612 (14)
Maksim Kolodiazhnyi, Anna Vorontsova, Anton Konushin, Danila Rukhovich: Top-Down Beats Bottom-Up in 3D Instance Segmentation. WACV 2024
PBNet | permissive | 0.747 (21) | 1.000 (1) | 0.818 (27) | 0.837 (19) | 0.713 (8) | 0.844 (25) | 0.457 (17) | 0.647 (39) | 0.711 (14) | 0.614 (20) | 0.617 (27) | 0.657 (22) | 0.650 (24) | 1.000 (1) | 0.692 (24) | 0.822 (22) | 0.765 (20) | 1.000 (1) | 0.595 (21)
W. Zhao, Y. Yan, C. Yang, J. Ye, X. Yang, K. Huang: Divide and Conquer: 3D Instance Segmentation With Point-Wise Binarization. ICCV 2023
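GraphCut | - | 0.732 (22) | 1.000 (1) | 0.788 (40) | 0.724 (37) | 0.642 (24) | 0.859 (24) | 0.248 (30) | 0.787 (11) | 0.618 (27) | 0.596 (23) | 0.653 (19) | 0.722 (12) | 0.583 (46) | 1.000 (1) | 0.766 (16) | 0.861 (7) | 0.825 (2) | 1.000 (1) | 0.504 (37)
IPCA-Inst | - | 0.731 (23) | 1.000 (1) | 0.788 (41) | 0.884 (9) | 0.698 (13) | 0.788 (41) | 0.252 (29) | 0.760 (14) | 0.646 (25) | 0.511 (32) | 0.637 (22) | 0.665 (21) | 0.804 (7) | 1.000 (1) | 0.644 (31) | 0.778 (31) | 0.747 (25) | 1.000 (1) | 0.561 (28)
TopoSeg | - | 0.725 (24) | 1.000 (1) | 0.806 (34) | 0.933 (2) | 0.668 (21) | 0.758 (45) | 0.272 (28) | 0.734 (19) | 0.630 (26) | 0.549 (28) | 0.654 (18) | 0.606 (26) | 0.697 (19) | 0.966 (41) | 0.612 (36) | 0.839 (15) | 0.754 (24) | 1.000 (1) | 0.573 (25)
DKNet | - | 0.718 (25) | 1.000 (1) | 0.814 (28) | 0.782 (27) | 0.619 (28) | 0.872 (19) | 0.224 (32) | 0.751 (17) | 0.569 (31) | 0.677 (16) | 0.585 (31) | 0.724 (11) | 0.633 (36) | 0.981 (37) | 0.515 (46) | 0.819 (24) | 0.736 (26) | 1.000 (1) | 0.617 (11)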
Yizheng Wu, Min Shi, Shuaiyuan Du, Hao Lu, Zhiguo Cao, Weicai Zhong: 3D Instances as 1D Kernels. ECCV 2022
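SSEC | - | 0.707 (26) | 1.000 (1) | 0.850 (20) | 0.924 (3) | 0.648 (22) | 0.747 (48) | 0.162 (36) | 0.862 (3) | 0.572 (30) | 0.520 (30) | 0.624 (24) | 0.549 (29) | 0.649 (33) | 1.000 (1) | 0.560 (41) | 0.706 (48) | 0.768 (18) | 1.000 (1) | 0.591 (23)
HAIS | permissive | 0.699 (27) | 1.000 (1) | 0.849 (21) | 0.820 (20) | 0.675 (20) | 0.808 (35) | 0.279 (26) | 0.757 (16) | 0.465 (37) | 0.517 (31) | 0.596 (29) | 0.559 (28) | 0.600 (40) | 1.000 (1) | 0.654 (29) | 0.767 (33) | 0.676 (30) | 0.994 (47) | 0.560 (29)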
Shaoyu Chen, Jiemin Fang, Qian Zhang, Wenyu Liu, Xinggang Wang: Hierarchical Aggregation for 3D Instance Segmentation. ICCV 2021
SSTNet | permissive | 0.698 (28) | 1.000 (1) | 0.697 (59) | 0.888 (8) | 0.556 (38) | 0.803 (36) | 0.387 (21) | 0.626 (43) | 0.417 (42) | 0.556 (27) | 0.585 (32) | 0.702 (13) | 0.600 (40) | 1.000 (1) | 0.824 (7) | 0.720 (46) | 0.692 (28) | 1.000 (1) | 0.509 (36)
Zhihao Liang, Zhihao Li, Songcen Xu, Mingkui Tan, Kui Jia: Instance Segmentation in 3D Scenes using Semantic Superpoint Tree Networks. ICCV 2021
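DualGroup | - | 0.694 (29) | 1.000 (1) | 0.799 (37) | 0.811 (21) | 0.622 (26) | 0.817 (30) | 0.376 (22) | 0.805 (9) | 0.590 (29) | 0.487 (36) | 0.568 (35) | 0.525 (33) | 0.650 (24) | 0.835 (54) | 0.600 (37) | 0.829 (19) | 0.655 (33) | 1.000 (1) | 0.526 (33)
SphereSeg | - | 0.680 (30) | 1.000 (1) | 0.856 (19) | 0.744 (34) | 0.618 (29) | 0.893 (13) | 0.151 (37) | 0.651 (38) | 0.713 (13) | 0.537 (29) | 0.579 (34) | 0.430 (43) | 0.651 (23) | 1.000 (1) | 0.389 (57) | 0.744 (41) | 0.697 (27) | 0.991 (49) | 0.601 (19)
DANCENET | - | 0.680 (30) | 1.000 (1) | 0.807 (32) | 0.733 (35) | 0.600 (32) | 0.768 (44) | 0.375 (23) | 0.543 (51) | 0.538 (32) | 0.610 (21) | 0.599 (28) | 0.498 (34) | 0.632 (38) | 0.981 (37) | 0.739 (20) | 0.856 (8) | 0.633 (39) | 0.882 (62) | 0.454 (46)
Box2Mask | - | 0.677 (32) | 1.000 (1) | 0.847 (22) | 0.771 (29) | 0.509 (47) | 0.816 (31) | 0.277 (27) | 0.558 (50) | 0.482 (34) | 0.562 (26) | 0.640 (21) | 0.448 (39) | 0.700 (15) | 1.000 (1) | 0.666 (25) | 0.852 (12) | 0.578 (46) | 0.997 (38) | 0.488 (41)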
Julian Chibane, Francis Engelmann, Tuan Anh Tran, Gerard Pons-Moll: Box2Mask: Weakly Supervised 3D Semantic Instance Segmentation Using Bounding Boxes. ECCV 2022
OccuSeg+instance | - | 0.672 (33) | 1.000 (1) | 0.758 (51) | 0.682 (42) | 0.576 (36) | 0.842 (26) | 0.477 (15) | 0.504 (57) | 0.524 (33) | 0.567 (25) | 0.585 (33) | 0.451 (38) | 0.557 (48) | 1.000 (1) | 0.751 (19) | 0.797 (28) | 0.563 (49) | 1.000 (1) | 0.467 (45)
Lei Han, Tian Zheng, Lan Xu, Lu Fang: OccuSeg: Occupancy-aware 3D Instance Segmentation. CVPR 2020
Mask-Group | - | 0.664 (34) | 1.000 (1) | 0.822 (26) | 0.764 (32) | 0.616 (30) | 0.815 (32) | 0.139 (41) | 0.694 (28) | 0.597 (28) | 0.459 (40) | 0.566 (36) | 0.599 (27) | 0.600 (40) | 0.516 (64) | 0.715 (22) | 0.819 (25) | 0.635 (37) | 1.000 (1) | 0.603 (17)
Min Zhong, Xinghao Chen, Xiaokang Chen, Gang Zeng, Yunhe Wang: MaskGroup: Hierarchical Point Grouping and Masking for 3D Instance Segmentation. ICME 2022
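INS-Conv-instance | - | 0.657 (35) | 1.000 (1) | 0.760 (49) | 0.667 (44) | 0.581 (34) | 0.863 (21) | 0.323 (24) | 0.655 (37) | 0.477 (35) | 0.473 (38) | 0.549 (38) | 0.432 (42) | 0.650 (24) | 1.000 (1) | 0.655 (28) | 0.738 (42) | 0.585 (45) | 0.944 (54) | 0.472 (44)
CSC-Pretrained | - | 0.648 (36) | 1.000 (1) | 0.810 (29) | 0.768 (30) | 0.523 (45) | 0.813 (33) | 0.143 (40) | 0.819 (5) | 0.389 (45) | 0.422 (49) | 0.511 (42) | 0.443 (40) | 0.650 (24) | 1.000 (1) | 0.624 (34) | 0.732 (43) | 0.634 (38) | 1.000 (1) | 0.375 (53)
PE | - | 0.645 (37) | 1.000 (1) | 0.773 (45) | 0.798 (25) | 0.538 (40) | 0.786 (42) | 0.088 (49) | 0.799 (10) | 0.350 (49) | 0.435 (47) | 0.547 (39) | 0.545 (30) | 0.646 (35) | 0.933 (43) | 0.562 (40) | 0.761 (36) | 0.556 (54) | 0.997 (38) | 0.501 (39)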
Biao Zhang, Peter Wonka: Point Cloud Instance Segmentation using Probabilistic Embeddings. CVPR 2021
RPGN | - | 0.643 (38) | 1.000 (1) | 0.758 (50) | 0.582 (55) | 0.539 (39) | 0.826 (29) | 0.046 (54) | 0.765 (12) | 0.372 (47) | 0.436 (46) | 0.588 (30) | 0.539 (32) | 0.650 (24) | 1.000 (1) | 0.577 (38) | 0.750 (39) | 0.653 (35) | 0.997 (38) | 0.495 (40)
Shichao Dong, Guosheng Lin, Tzu-Yi Hung: Learning Regional Purity for Instance Segmentation on 3D Point Clouds. ECCV 2022
Dyco3D | copyleft | 0.641 (39) | 1.000 (1) | 0.841 (23) | 0.893 (7) | 0.531 (42) | 0.802 (37) | 0.115 (46) | 0.588 (48) | 0.448 (39) | 0.438 (44) | 0.537 (41) | 0.430 (44) | 0.550 (49) | 0.857 (46) | 0.534 (44) | 0.764 (35) | 0.657 (32) | 0.987 (50) | 0.568 (26)
Tong He, Chunhua Shen, Anton van den Hengel: DyCo3D: Robust Instance Segmentation of 3D Point Clouds through Dynamic Convolution. CVPR 2021
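GICN | - | 0.638 (40) | 1.000 (1) | 0.895 (17) | 0.800 (24) | 0.480 (51) | 0.676 (53) | 0.144 (39) | 0.737 (18) | 0.354 (48) | 0.447 (41) | 0.400 (55) | 0.365 (50) | 0.700 (15) | 1.000 (1) | 0.569 (39) | 0.836 (16) | 0.599 (41) | 1.000 (1) | 0.473 (43)
PointGroup | - | 0.636 (41) | 1.000 (1) | 0.765 (46) | 0.624 (46) | 0.505 (49) | 0.797 (38) | 0.116 (45) | 0.696 (27) | 0.384 (46) | 0.441 (42) | 0.559 (37) | 0.476 (36) | 0.596 (43) | 1.000 (1) | 0.666 (25) | 0.756 (37) | 0.556 (53) | 0.997 (38) | 0.513 (35)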
Li Jiang, Hengshuang Zhao, Shaoshuai Shi, Shu Liu, Chi-Wing Fu, Jiaya Jia: PointGroup: Dual-Set Point Grouping for 3D Instance Segmentation. CVPR 2020 [oral]
DD-UNet+Group | - | 0.635 (42) | 0.667 (57) | 0.797 (39) | 0.714 (39) | 0.562 (37) | 0.774 (43) | 0.146 (38) | 0.810 (8) | 0.429 (41) | 0.476 (37) | 0.546 (40) | 0.399 (46) | 0.633 (36) | 1.000 (1) | 0.632 (33) | 0.722 (45) | 0.609 (40) | 1.000 (1) | 0.514 (34)
H. Liu, R. Liu, K. Yang, J. Zhang, K. Peng, R. Stiefelhagen: HIDA: Towards Holistic Indoor Understanding for the Visually Impaired via Semantic Instance Segmentation with a Wearable Solid-State LiDAR Sensor. ICCVW 2021
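Mask3D_evaluation | - | 0.631 (43) | 1.000 (1) | 0.829 (25) | 0.606 (48) | 0.646 (23) | 0.836 (27) | 0.068 (50) | 0.511 (55) | 0.462 (38) | 0.507 (33) | 0.619 (26) | 0.389 (48) | 0.610 (39) | 1.000 (1) | 0.432 (52) | 0.828 (21) | 0.673 (31) | 0.788 (66) | 0.552 (30)
DENet | - | 0.629 (44) | 1.000 (1) | 0.797 (38) | 0.608 (47) | 0.589 (33) | 0.627 (57) | 0.219 (33) | 0.882 (1) | 0.310 (51) | 0.402 (54) | 0.383 (57) | 0.396 (47) | 0.650 (24) | 1.000 (1) | 0.663 (27) | 0.543 (65) | 0.691 (29) | 1.000 (1) | 0.568 (27)
3D-MPA | - | 0.611 (45) | 1.000 (1) | 0.833 (24) | 0.765 (31) | 0.526 (44) | 0.756 (46) | 0.136 (43) | 0.588 (48) | 0.470 (36) | 0.438 (45) | 0.432 (51) | 0.358 (52) | 0.650 (24) | 0.857 (46) | 0.429 (53) | 0.765 (34) | 0.557 (52) | 1.000 (1) | 0.430 (48)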
Francis Engelmann, Martin Bokeloh, Alireza Fathi, Bastian Leibe, Matthias Nießner: 3D-MPA: Multi Proposal Aggregation for 3D Semantic Instance Segmentation. CVPR 2020
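OSIS | - | 0.605 (46) | 1.000 (1) | 0.801 (36) | 0.599 (50) | 0.535 (41) | 0.728 (50) | 0.286 (25) | 0.436 (61) | 0.679 (19) | 0.491 (34) | 0.433 (49) | 0.256 (54) | 0.404 (61) | 0.857 (46) | 0.620 (35) | 0.724 (44) | 0.510 (59) | 1.000 (1) | 0.539 (32)
AOIA | - | 0.601 (47) | 1.000 (1) | 0.761 (48) | 0.687 (41) | 0.485 (50) | 0.828 (28) | 0.008 (61) | 0.663 (35) | 0.405 (44) | 0.405 (53) | 0.425 (52) | 0.490 (35) | 0.596 (43) | 0.714 (57) | 0.553 (43) | 0.779 (29) | 0.597 (42) | 0.992 (48) | 0.424 (50)
PCJC | - | 0.578 (48) | 1.000 (1) | 0.810 (30) | 0.583 (54) | 0.449 (54) | 0.813 (34) | 0.042 (55) | 0.603 (46) | 0.341 (50) | 0.490 (35) | 0.465 (46) | 0.410 (45) | 0.650 (24) | 0.835 (54) | 0.264 (63) | 0.694 (52) | 0.561 (50) | 0.889 (59) | 0.504 (38)
SSEN | - | 0.575 (49) | 1.000 (1) | 0.761 (47) | 0.473 (57) | 0.477 (52) | 0.795 (39) | 0.066 (51) | 0.529 (53) | 0.658 (23) | 0.460 (39) | 0.461 (47) | 0.380 (49) | 0.331 (63) | 0.859 (45) | 0.401 (56) | 0.692 (54) | 0.653 (34) | 1.000 (1) | 0.348 (55)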
Dongsu Zhang, Junha Chun, Sang Kyun Cha, Young Min Kim: Spatial Semantic Embedding Network: Fast 3D Instance Segmentation with Deep Metric Learning. arXiv
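RWSeg | - | 0.567 (50) | 0.528 (67) | 0.708 (58) | 0.626 (45) | 0.580 (35) | 0.745 (49) | 0.063 (52) | 0.627 (42) | 0.240 (55) | 0.400 (55) | 0.497 (43) | 0.464 (37) | 0.515 (50) | 1.000 (1) | 0.475 (48) | 0.745 (40) | 0.571 (47) | 1.000 (1) | 0.429 (49)
NeuralBF | - | 0.555 (51) | 0.667 (57) | 0.896 (16) | 0.843 (17) | 0.517 (46) | 0.751 (47) | 0.029 (56) | 0.519 (54) | 0.414 (43) | 0.439 (43) | 0.465 (45) | 0.000 (73) | 0.484 (52) | 0.857 (46) | 0.287 (61) | 0.693 (53) | 0.651 (36) | 1.000 (1) | 0.485 (42)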
Weiwei Sun, Daniel Rebain, Renjie Liao, Vladimir Tankovich, Soroosh Yazdani, Kwang Moo Yi, Andrea Tagliasacchi: NeuralBF: Neural Bilateral Filtering for Top-down Instance Segmentation on Point Clouds. WACV 2023
MTML | - | 0.549 (52) | 1.000 (1) | 0.807 (33) | 0.588 (53) | 0.327 (59) | 0.647 (55) | 0.004 (63) | 0.815 (7) | 0.180 (58) | 0.418 (50) | 0.364 (59) | 0.182 (57) | 0.445 (55) | 1.000 (1) | 0.442 (51) | 0.688 (55) | 0.571 (48) | 1.000 (1) | 0.396 (51)
Jean Lahoud, Bernard Ghanem, Marc Pollefeys, Martin R. Oswald: 3D Instance Segmentation via Multi-task Metric Learning. ICCV 2019 [oral]
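ClickSeg_Instance | - | 0.539 (53) | 1.000 (1) | 0.621 (62) | 0.300 (60) | 0.530 (43) | 0.698 (51) | 0.127 (44) | 0.533 (52) | 0.222 (56) | 0.430 (48) | 0.400 (54) | 0.365 (50) | 0.574 (47) | 0.938 (42) | 0.472 (49) | 0.659 (57) | 0.543 (55) | 0.944 (54) | 0.347 (56)
One_Thing_One_Click | permissive | 0.529 (54) | 0.667 (57) | 0.718 (54) | 0.777 (28) | 0.399 (55) | 0.683 (52) | 0.000 (66) | 0.669 (33) | 0.138 (61) | 0.391 (56) | 0.374 (58) | 0.539 (31) | 0.360 (62) | 0.641 (61) | 0.556 (42) | 0.774 (32) | 0.593 (43) | 0.997 (38) | 0.251 (61)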
Zhengzhe Liu, Xiaojuan Qi, Chi-Wing Fu: One Thing One Click: A Self-Training Approach for Weakly Supervised 3D Semantic Segmentation. CVPR 2021
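Sparse R-CNN | - | 0.515 (55) | 1.000 (1) | 0.538 (67) | 0.282 (61) | 0.468 (53) | 0.790 (40) | 0.173 (35) | 0.345 (63) | 0.429 (40) | 0.413 (52) | 0.484 (44) | 0.176 (58) | 0.595 (45) | 0.591 (62) | 0.522 (45) | 0.668 (56) | 0.476 (60) | 0.986 (52) | 0.327 (57)
Occipital-SCS | - | 0.512 (56) | 1.000 (1) | 0.716 (55) | 0.509 (56) | 0.506 (48) | 0.611 (58) | 0.092 (48) | 0.602 (47) | 0.177 (59) | 0.346 (59) | 0.383 (56) | 0.165 (59) | 0.442 (56) | 0.850 (53) | 0.386 (58) | 0.618 (61) | 0.543 (56) | 0.889 (59) | 0.389 (52)
3D-BoNet | - | 0.488 (57) | 1.000 (1) | 0.672 (61) | 0.590 (52) | 0.301 (61) | 0.484 (68) | 0.098 (47) | 0.620 (44) | 0.306 (52) | 0.341 (60) | 0.259 (63) | 0.125 (61) | 0.434 (58) | 0.796 (56) | 0.402 (55) | 0.499 (67) | 0.513 (58) | 0.909 (58) | 0.439 (47)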
Bo Yang, Jianan Wang, Ronald Clark, Qingyong Hu, Sen Wang, Andrew Markham, Niki Trigoni: Learning Object Bounding Boxes for 3D Instance Segmentation on Point Clouds. NeurIPS 2019 Spotlight
PanopticFusion-inst | - | 0.478 (58) | 0.667 (57) | 0.712 (57) | 0.595 (51) | 0.259 (64) | 0.550 (64) | 0.000 (66) | 0.613 (45) | 0.175 (60) | 0.250 (65) | 0.434 (48) | 0.437 (41) | 0.411 (60) | 0.857 (46) | 0.485 (47) | 0.591 (64) | 0.267 (70) | 0.944 (54) | 0.359 (54)
Gaku Narita, Takashi Seno, Tomoya Ishikawa, Yohsuke Kaji: PanopticFusion: Online Volumetric Semantic Mapping at the Level of Stuff and Things. IROS 2019 (to appear)
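SPG_WSIS | - | 0.470 (59) | 0.667 (57) | 0.685 (60) | 0.677 (43) | 0.372 (57) | 0.562 (62) | 0.000 (66) | 0.482 (58) | 0.244 (54) | 0.316 (62) | 0.298 (60) | 0.052 (68) | 0.442 (57) | 0.857 (46) | 0.267 (62) | 0.702 (49) | 0.559 (51) | 1.000 (1) | 0.287 (59)
SALoss-ResNet | - | 0.459 (60) | 1.000 (1) | 0.737 (53) | 0.159 (71) | 0.259 (63) | 0.587 (60) | 0.138 (42) | 0.475 (59) | 0.217 (57) | 0.416 (51) | 0.408 (53) | 0.128 (60) | 0.315 (64) | 0.714 (57) | 0.411 (54) | 0.536 (66) | 0.590 (44) | 0.873 (63) | 0.304 (58)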
Zhidong Liang, Ming Yang, Hao Li, Chunxiang Wang: 3D Instance Embedding Learning With a Structure-Aware Loss Function for Point Cloud Segmentation. IEEE Robotics and Automation Letters (IROS2020)
MASC | permissive | 0.447 (61) | 0.528 (67) | 0.555 (65) | 0.381 (58) | 0.382 (56) | 0.633 (56) | 0.002 (64) | 0.509 (56) | 0.260 (53) | 0.361 (58) | 0.432 (50) | 0.327 (53) | 0.451 (54) | 0.571 (63) | 0.367 (59) | 0.639 (59) | 0.386 (61) | 0.980 (53) | 0.276 (60)
Chen Liu, Yasutaka Furukawa: MASC: Multi-scale Affinity with Sparse Convolution for 3D Instance Segmentation.
SegGroup_ins | permissive | 0.445 (62) | 0.667 (57) | 0.773 (44) | 0.185 (68) | 0.317 (60) | 0.656 (54) | 0.000 (66) | 0.407 (62) | 0.134 (62) | 0.381 (57) | 0.267 (62) | 0.217 (56) | 0.476 (53) | 0.714 (57) | 0.452 (50) | 0.629 (60) | 0.514 (57) | 1.000 (1) | 0.222 (64)
An Tao, Yueqi Duan, Yi Wei, Jiwen Lu, Jie Zhou: SegGroup: Seg-Level Supervision for 3D Instance and Semantic Segmentation. TIP 2022
3D-SIS | permissive | 0.382 (63) | 1.000 (1) | 0.432 (70) | 0.245 (63) | 0.190 (65) | 0.577 (61) | 0.013 (60) | 0.263 (65) | 0.033 (68) | 0.320 (61) | 0.240 (64) | 0.075 (64) | 0.422 (59) | 0.857 (46) | 0.117 (68) | 0.699 (50) | 0.271 (69) | 0.883 (61) | 0.235 (63)
Ji Hou, Angela Dai, Matthias Niessner: 3D-SIS: 3D Semantic Instance Segmentation of RGB-D Scans. CVPR 2019
Hier3D | copyleft | 0.323 (64) | 0.667 (57) | 0.542 (66) | 0.264 (62) | 0.157 (68) | 0.550 (63) | 0.000 (66) | 0.205 (68) | 0.009 (70) | 0.270 (64) | 0.218 (65) | 0.075 (64) | 0.500 (51) | 0.688 (60) | 0.007 (74) | 0.698 (51) | 0.301 (66) | 0.459 (71) | 0.200 (65)
Tan: HCFS3D: Hierarchical Coupled Feature Selection Network for 3D Semantic and Instance Segmentation.
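UNet-backbone | - | 0.319 (65) | 0.667 (57) | 0.715 (56) | 0.233 (64) | 0.189 (66) | 0.479 (69) | 0.008 (61) | 0.218 (66) | 0.067 (67) | 0.201 (67) | 0.173 (66) | 0.107 (62) | 0.123 (69) | 0.438 (65) | 0.150 (65) | 0.615 (62) | 0.355 (62) | 0.916 (57) | 0.093 (73)
R-PointNet | - | 0.306 (66) | 0.500 (69) | 0.405 (71) | 0.311 (59) | 0.348 (58) | 0.589 (59) | 0.054 (53) | 0.068 (71) | 0.126 (63) | 0.283 (63) | 0.290 (61) | 0.028 (69) | 0.219 (67) | 0.214 (68) | 0.331 (60) | 0.396 (71) | 0.275 (67) | 0.821 (65) | 0.245 (62)
Region-18class | - | 0.284 (67) | 0.250 (73) | 0.751 (52) | 0.228 (66) | 0.270 (62) | 0.521 (65) | 0.000 (66) | 0.468 (60) | 0.008 (72) | 0.205 (66) | 0.127 (67) | 0.000 (73) | 0.068 (71) | 0.070 (72) | 0.262 (64) | 0.652 (58) | 0.323 (64) | 0.740 (67) | 0.173 (66)
SemRegionNet-20cls | - | 0.250 (68) | 0.333 (70) | 0.613 (63) | 0.229 (65) | 0.163 (67) | 0.493 (66) | 0.000 (66) | 0.304 (64) | 0.107 (64) | 0.147 (70) | 0.100 (69) | 0.052 (67) | 0.231 (65) | 0.119 (70) | 0.039 (70) | 0.445 (69) | 0.325 (63) | 0.654 (68) | 0.141 (69)
3D-BEVIS | - | 0.248 (69) | 0.667 (57) | 0.566 (64) | 0.076 (72) | 0.035 (74) | 0.394 (72) | 0.027 (58) | 0.035 (73) | 0.098 (65) | 0.099 (72) | 0.030 (73) | 0.025 (70) | 0.098 (70) | 0.375 (67) | 0.126 (67) | 0.604 (63) | 0.181 (72) | 0.854 (64) | 0.171 (67)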
Cathrin Elich, Francis Engelmann, Jonas Schult, Theodora Kontogianni, Bastian Leibe: 3D-BEVIS: Birds-Eye-View Instance Segmentation.
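tmp | - | 0.248 (69) | 0.667 (57) | 0.437 (69) | 0.188 (67) | 0.153 (69) | 0.491 (67) | 0.000 (66) | 0.208 (67) | 0.094 (66) | 0.153 (69) | 0.099 (70) | 0.057 (66) | 0.217 (68) | 0.119 (70) | 0.039 (70) | 0.466 (68) | 0.302 (65) | 0.640 (69) | 0.140 (70)
Sem_Recon_ins | - | 0.227 (71) | 0.764 (56) | 0.486 (68) | 0.069 (73) | 0.098 (71) | 0.426 (71) | 0.017 (59) | 0.067 (72) | 0.015 (69) | 0.172 (68) | 0.100 (68) | 0.096 (63) | 0.054 (73) | 0.183 (69) | 0.135 (66) | 0.366 (72) | 0.260 (71) | 0.614 (70) | 0.168 (68)
ASIS | - | 0.199 (72) | 0.333 (70) | 0.253 (73) | 0.167 (70) | 0.140 (70) | 0.438 (70) | 0.000 (66) | 0.177 (69) | 0.008 (71) | 0.121 (71) | 0.069 (71) | 0.004 (72) | 0.231 (66) | 0.429 (66) | 0.036 (72) | 0.445 (70) | 0.273 (68) | 0.333 (73) | 0.119 (72)
Sgpn_scannet | - | 0.143 (73) | 0.208 (74) | 0.390 (72) | 0.169 (69) | 0.065 (72) | 0.275 (73) | 0.029 (57) | 0.069 (70) | 0.000 (73) | 0.087 (73) | 0.043 (72) | 0.014 (71) | 0.027 (74) | 0.000 (73) | 0.112 (69) | 0.351 (73) | 0.168 (73) | 0.438 (72) | 0.138 (71)
MaskRCNN 2d->3d Proj | - | 0.058 (74) | 0.333 (70) | 0.002 (74) | 0.000 (74) | 0.053 (73) | 0.002 (74) | 0.002 (65) | 0.021 (74) | 0.000 (73) | 0.045 (74) | 0.024 (74) | 0.238 (55) | 0.065 (72) | 0.000 (73) | 0.014 (73) | 0.107 (74) | 0.020 (74) | 0.110 (74) | 0.006 (74)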