The 3D semantic instance prediction task involves detecting and segmenting the objects in a 3D scan mesh.

Evaluation and metrics

Our evaluation ranks all methods according to the average precision for each class. We report the mean average precision (AP) at an overlap of 0.25 (AP 25%), at an overlap of 0.5 (AP 50%), and averaged over overlaps from 0.5 to 0.95 in steps of 0.05 (AP). Note that multiple predictions of the same ground truth instance are penalized as false positives.
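The metric described above can be sketched as follows. This is a simplified illustration, not the benchmark's official evaluation script. It assumes that "overlap" means intersection over union (IoU) between boolean per-vertex masks, that a prediction matches when its IoU is at least the threshold, that AP is the plain (non-interpolated) average of precision at each true positive, and that the overlap range includes both endpoints. The function names `iou`, `average_precision`, and `mean_ap_over_overlaps` are hypothetical.

```python
import numpy as np


def iou(a, b):
    """Intersection over union of two boolean vertex masks (assumed overlap measure)."""
    union = np.logical_or(a, b).sum()
    return np.logical_and(a, b).sum() / union if union else 0.0


def average_precision(pred_masks, scores, gt_masks, overlap=0.5):
    """AP for one class; pred_masks and gt_masks are lists of boolean arrays."""
    if not gt_masks or not pred_masks:
        return 0.0
    # Visit predictions from most to least confident.
    order = np.argsort(-np.asarray(scores, dtype=float))
    matched = set()  # ground truth instances that are already claimed
    hits = []
    for i in order:
        ious = [iou(pred_masks[i], gt) for gt in gt_masks]
        j = int(np.argmax(ious))
        if ious[j] >= overlap and j not in matched:
            matched.add(j)
            hits.append(1.0)  # true positive
        else:
            hits.append(0.0)  # miss, or duplicate of a claimed instance
    hits = np.array(hits)
    precision = np.cumsum(hits) / np.arange(1, len(hits) + 1)
    # Average the precision values observed at each true positive.
    return float((precision * hits).sum() / len(gt_masks))


def mean_ap_over_overlaps(pred_masks, scores, gt_masks):
    """Mean AP over overlaps 0.5, 0.55, ..., 0.95 (endpoints assumed inclusive)."""
    overlaps = np.linspace(0.5, 0.95, 10)
    return float(np.mean([average_precision(pred_masks, scores, gt_masks, t)
                          for t in overlaps]))


# Two ground truth instances; the second prediction duplicates the first.
gt = [np.array([1, 1, 0, 0], dtype=bool), np.array([0, 0, 1, 1], dtype=bool)]
preds = [gt[0].copy(), gt[0].copy(), gt[1].copy()]
print(round(average_precision(preds, [0.9, 0.8, 0.7], gt), 3))  # 0.833
```

In the toy example the duplicate prediction lowers the precision at the point where the second instance is found, so the AP drops from 1.0 to about 0.833.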



This table lists the benchmark results for the 3D semantic instance scenario.




Each cell gives the score followed by the method's rank for that column in parentheses. Rows are sorted by the chair column.

Method | Info | avg ap 50% | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | window
TD3D | permissive | 0.751 (19) | 1.000 (1) | 0.774 (42) | 0.867 (11) | 0.621 (26) | 0.934 (1) | 0.404 (19) | 0.706 (24) | 0.812 (2) | 0.605 (21) | 0.633 (22) | 0.626 (24) | 0.690 (20) | 1.000 (1) | 0.640 (31) | 0.820 (23) | 0.777 (15) | 1.000 (1) | 0.612 (14)
Maksim Kolodiazhnyi, Anna Vorontsova, Anton Konushin, Danila Rukhovich: Top-Down Beats Bottom-Up in 3D Instance Segmentation. WACV 2024
OneFormer3D | copyleft | 0.801 (5) | 1.000 (1) | 0.973 (4) | 0.909 (5) | 0.698 (13) | 0.928 (2) | 0.582 (3) | 0.668 (33) | 0.685 (17) | 0.780 (2) | 0.687 (12) | 0.698 (17) | 0.702 (14) | 1.000 (1) | 0.794 (10) | 0.900 (2) | 0.784 (12) | 0.986 (50) | 0.635 (7)
Maxim Kolodiazhnyi, Anna Vorontsova, Anton Konushin, Danila Rukhovich: OneFormer3D: One Transformer for Unified Point Cloud Segmentation.
InsSSM |  | 0.799 (8) | 1.000 (1) | 0.915 (12) | 0.710 (39) | 0.729 (3) | 0.925 (3) | 0.664 (1) | 0.670 (31) | 0.770 (6) | 0.766 (3) | 0.739 (3) | 0.737 (6) | 0.700 (15) | 1.000 (1) | 0.792 (11) | 0.829 (20) | 0.815 (4) | 0.997 (37) | 0.625 (9)
SIM3D |  | 0.803 (4) | 1.000 (1) | 0.967 (8) | 0.863 (12) | 0.692 (16) | 0.924 (4) | 0.552 (9) | 0.732 (21) | 0.667 (20) | 0.732 (10) | 0.662 (14) | 0.796 (1) | 0.789 (9) | 1.000 (1) | 0.803 (8) | 0.864 (6) | 0.766 (19) | 1.000 (1) | 0.643 (4)
PointRel |  | 0.816 (1) | 1.000 (1) | 0.971 (6) | 0.908 (6) | 0.743 (2) | 0.923 (5) | 0.573 (6) | 0.714 (22) | 0.695 (16) | 0.734 (8) | 0.747 (2) | 0.725 (9) | 0.809 (1) | 1.000 (1) | 0.814 (7) | 0.899 (3) | 0.820 (3) | 1.000 (1) | 0.610 (16)
Relation3D (PointRel): Enhancing Relation Modeling for Point Cloud Instance Segmentation.
EV3D |  | 0.811 (3) | 1.000 (1) | 0.968 (7) | 0.852 (13) | 0.717 (6) | 0.921 (6) | 0.574 (5) | 0.677 (28) | 0.748 (10) | 0.730 (11) | 0.703 (11) | 0.795 (2) | 0.809 (1) | 1.000 (1) | 0.831 (3) | 0.854 (9) | 0.778 (14) | 1.000 (1) | 0.638 (6)
Spherical Mask(CtoF) |  | 0.812 (2) | 1.000 (1) | 0.973 (5) | 0.852 (13) | 0.718 (5) | 0.917 (7) | 0.574 (4) | 0.677 (28) | 0.748 (10) | 0.729 (12) | 0.715 (6) | 0.795 (2) | 0.809 (1) | 1.000 (1) | 0.831 (3) | 0.854 (9) | 0.787 (10) | 1.000 (1) | 0.638 (5)
Competitor-SPFormer |  | 0.800 (6) | 1.000 (1) | 0.986 (2) | 0.845 (15) | 0.705 (11) | 0.915 (8) | 0.532 (11) | 0.733 (20) | 0.757 (9) | 0.733 (9) | 0.708 (8) | 0.698 (16) | 0.648 (33) | 0.981 (36) | 0.890 (1) | 0.830 (18) | 0.796 (7) | 0.997 (37) | 0.644 (3)
ExtMask3D |  | 0.789 (11) | 1.000 (1) | 0.988 (1) | 0.756 (32) | 0.706 (10) | 0.912 (9) | 0.429 (18) | 0.647 (38) | 0.806 (4) | 0.755 (5) | 0.673 (13) | 0.689 (18) | 0.772 (11) | 1.000 (1) | 0.789 (13) | 0.852 (11) | 0.811 (5) | 1.000 (1) | 0.617 (12)
MG-Former |  | 0.791 (10) | 1.000 (1) | 0.980 (3) | 0.837 (18) | 0.626 (24) | 0.897 (10) | 0.543 (10) | 0.759 (15) | 0.800 (5) | 0.766 (4) | 0.659 (15) | 0.769 (4) | 0.697 (18) | 1.000 (1) | 0.791 (12) | 0.707 (46) | 0.791 (9) | 1.000 (1) | 0.610 (15)
ISBNet | permissive | 0.757 (18) | 1.000 (1) | 0.904 (13) | 0.731 (35) | 0.678 (18) | 0.895 (11) | 0.458 (15) | 0.644 (40) | 0.670 (19) | 0.710 (14) | 0.620 (24) | 0.732 (7) | 0.650 (23) | 1.000 (1) | 0.756 (17) | 0.778 (30) | 0.779 (13) | 1.000 (1) | 0.614 (13)
Tuan Duc Ngo, Binh-Son Hua, Khoi Nguyen: ISBNet: a 3D Point Cloud Instance Segmentation Network with Instance-aware Sampling and Box-aware Dynamic Convolution. CVPR 2023
SphereSeg |  | 0.680 (29) | 1.000 (1) | 0.856 (18) | 0.744 (33) | 0.618 (28) | 0.893 (12) | 0.151 (36) | 0.651 (37) | 0.713 (13) | 0.537 (28) | 0.579 (33) | 0.430 (42) | 0.651 (22) | 1.000 (1) | 0.389 (56) | 0.744 (40) | 0.697 (26) | 0.991 (48) | 0.601 (19)
MAFT |  | 0.786 (13) | 1.000 (1) | 0.894 (17) | 0.807 (22) | 0.694 (15) | 0.893 (13) | 0.486 (13) | 0.674 (30) | 0.740 (12) | 0.786 (1) | 0.704 (10) | 0.727 (8) | 0.739 (13) | 1.000 (1) | 0.707 (22) | 0.849 (13) | 0.756 (22) | 1.000 (1) | 0.685 (1)
SPFormer | permissive | 0.770 (15) | 0.903 (54) | 0.903 (14) | 0.806 (23) | 0.609 (30) | 0.886 (14) | 0.568 (7) | 0.815 (6) | 0.705 (15) | 0.711 (13) | 0.655 (16) | 0.652 (22) | 0.685 (21) | 1.000 (1) | 0.789 (14) | 0.809 (27) | 0.776 (16) | 1.000 (1) | 0.583 (23)
Sun Jiahao, Qing Chunmei, Tan Junpeng, Xu Xiangmin: Superpoint Transformer for 3D Scene Instance Segmentation. AAAI 2023 [Oral]
Queryformer |  | 0.787 (12) | 1.000 (1) | 0.933 (9) | 0.601 (48) | 0.754 (1) | 0.886 (15) | 0.558 (8) | 0.661 (35) | 0.767 (8) | 0.665 (17) | 0.716 (5) | 0.639 (23) | 0.808 (5) | 1.000 (1) | 0.844 (2) | 0.897 (4) | 0.804 (6) | 1.000 (1) | 0.624 (10)
Mask3D |  | 0.780 (14) | 1.000 (1) | 0.786 (41) | 0.716 (37) | 0.696 (14) | 0.885 (16) | 0.500 (12) | 0.714 (22) | 0.810 (3) | 0.672 (16) | 0.715 (6) | 0.679 (19) | 0.809 (1) | 1.000 (1) | 0.831 (3) | 0.833 (17) | 0.787 (10) | 1.000 (1) | 0.602 (18)
Jonas Schult, Francis Engelmann, Alexander Hermans, Or Litany, Siyu Tang, Bastian Leibe: Mask3D for 3D Semantic Instance Segmentation. ICRA 2023
TST3D |  | 0.795 (9) | 1.000 (1) | 0.929 (11) | 0.918 (4) | 0.709 (9) | 0.884 (17) | 0.596 (2) | 0.704 (25) | 0.769 (7) | 0.734 (7) | 0.644 (19) | 0.699 (15) | 0.751 (12) | 1.000 (1) | 0.794 (9) | 0.876 (5) | 0.757 (21) | 0.997 (37) | 0.550 (30)
Duc Tran Dang Trung, Byeongkeun Kang, Yeejin Lee: MSTA3D: Multi-scale Twin-attention for 3D Instance Segmentation. ACM Multimedia 2024
DKNet |  | 0.718 (24) | 1.000 (1) | 0.814 (27) | 0.782 (26) | 0.619 (27) | 0.872 (18) | 0.224 (31) | 0.751 (17) | 0.569 (30) | 0.677 (15) | 0.585 (30) | 0.724 (10) | 0.633 (35) | 0.981 (36) | 0.515 (45) | 0.819 (24) | 0.736 (25) | 1.000 (1) | 0.617 (11)
Yizheng Wu, Min Shi, Shuaiyuan Du, Hao Lu, Zhiguo Cao, Weicai Zhong: 3D Instances as 1D Kernels. ECCV 2022
SoftGroup++ |  | 0.769 (16) | 1.000 (1) | 0.803 (34) | 0.937 (1) | 0.684 (17) | 0.865 (19) | 0.213 (33) | 0.870 (2) | 0.664 (21) | 0.571 (23) | 0.758 (1) | 0.702 (13) | 0.807 (6) | 1.000 (1) | 0.653 (29) | 0.902 (1) | 0.792 (8) | 1.000 (1) | 0.626 (8)
INS-Conv-instance |  | 0.657 (34) | 1.000 (1) | 0.760 (48) | 0.667 (43) | 0.581 (33) | 0.863 (20) | 0.323 (23) | 0.655 (36) | 0.477 (34) | 0.473 (37) | 0.549 (37) | 0.432 (41) | 0.650 (23) | 1.000 (1) | 0.655 (27) | 0.738 (41) | 0.585 (44) | 0.944 (53) | 0.472 (43)
SoftGroup | permissive | 0.761 (17) | 1.000 (1) | 0.808 (30) | 0.845 (15) | 0.716 (7) | 0.862 (21) | 0.243 (30) | 0.824 (4) | 0.655 (23) | 0.620 (18) | 0.734 (4) | 0.699 (14) | 0.791 (8) | 0.981 (36) | 0.716 (20) | 0.844 (14) | 0.769 (17) | 1.000 (1) | 0.594 (21)
Thang Vu, Kookhoi Kim, Tung M. Luu, Xuan Thanh Nguyen, Chang D. Yoo: SoftGroup for 3D Instance Segmentation on Point Clouds. CVPR 2022 [Oral]
UniPerception |  | 0.800 (6) | 1.000 (1) | 0.930 (10) | 0.872 (10) | 0.727 (4) | 0.862 (22) | 0.454 (17) | 0.764 (13) | 0.820 (1) | 0.746 (6) | 0.706 (9) | 0.750 (5) | 0.772 (10) | 0.926 (43) | 0.764 (16) | 0.818 (26) | 0.826 (1) | 0.997 (37) | 0.660 (2)
GraphCut |  | 0.732 (21) | 1.000 (1) | 0.788 (39) | 0.724 (36) | 0.642 (23) | 0.859 (23) | 0.248 (29) | 0.787 (11) | 0.618 (26) | 0.596 (22) | 0.653 (18) | 0.722 (11) | 0.583 (45) | 1.000 (1) | 0.766 (15) | 0.861 (7) | 0.825 (2) | 1.000 (1) | 0.504 (36)
PBNet | permissive | 0.747 (20) | 1.000 (1) | 0.818 (26) | 0.837 (19) | 0.713 (8) | 0.844 (24) | 0.457 (16) | 0.647 (38) | 0.711 (14) | 0.614 (19) | 0.617 (26) | 0.657 (21) | 0.650 (23) | 1.000 (1) | 0.692 (23) | 0.822 (22) | 0.765 (20) | 1.000 (1) | 0.595 (20)
W. Zhao, Y. Yan, C. Yang, J. Ye, X. Yang, K. Huang: Divide and Conquer: 3D Instance Segmentation With Point-Wise Binarization. ICCV 2023
OccuSeg+instance |  | 0.672 (32) | 1.000 (1) | 0.758 (50) | 0.682 (41) | 0.576 (35) | 0.842 (25) | 0.477 (14) | 0.504 (56) | 0.524 (32) | 0.567 (24) | 0.585 (32) | 0.451 (37) | 0.557 (47) | 1.000 (1) | 0.751 (18) | 0.797 (28) | 0.563 (48) | 1.000 (1) | 0.467 (44)
Lei Han, Tian Zheng, Lan Xu, Lu Fang: OccuSeg: Occupancy-aware 3D Instance Segmentation. CVPR2020
Mask3D_evaluation |  | 0.631 (42) | 1.000 (1) | 0.829 (24) | 0.606 (47) | 0.646 (22) | 0.836 (26) | 0.068 (49) | 0.511 (54) | 0.462 (37) | 0.507 (32) | 0.619 (25) | 0.389 (47) | 0.610 (38) | 1.000 (1) | 0.432 (51) | 0.828 (21) | 0.673 (30) | 0.788 (65) | 0.552 (29)
AOIA |  | 0.601 (46) | 1.000 (1) | 0.761 (47) | 0.687 (40) | 0.485 (49) | 0.828 (27) | 0.008 (60) | 0.663 (34) | 0.405 (43) | 0.405 (52) | 0.425 (51) | 0.490 (34) | 0.596 (42) | 0.714 (56) | 0.553 (42) | 0.779 (29) | 0.597 (41) | 0.992 (47) | 0.424 (49)
RPGN |  | 0.643 (37) | 1.000 (1) | 0.758 (49) | 0.582 (54) | 0.539 (38) | 0.826 (28) | 0.046 (53) | 0.765 (12) | 0.372 (46) | 0.436 (45) | 0.588 (29) | 0.539 (31) | 0.650 (23) | 1.000 (1) | 0.577 (37) | 0.750 (38) | 0.653 (34) | 0.997 (37) | 0.495 (39)
Shichao Dong, Guosheng Lin, Tzu-Yi Hung: Learning Regional Purity for Instance Segmentation on 3D Point Clouds. ECCV 2022
DualGroup |  | 0.694 (28) | 1.000 (1) | 0.799 (36) | 0.811 (21) | 0.622 (25) | 0.817 (29) | 0.376 (21) | 0.805 (9) | 0.590 (28) | 0.487 (35) | 0.568 (34) | 0.525 (32) | 0.650 (23) | 0.835 (53) | 0.600 (36) | 0.829 (19) | 0.655 (32) | 1.000 (1) | 0.526 (32)
Box2Mask |  | 0.677 (31) | 1.000 (1) | 0.847 (21) | 0.771 (28) | 0.509 (46) | 0.816 (30) | 0.277 (26) | 0.558 (49) | 0.482 (33) | 0.562 (25) | 0.640 (20) | 0.448 (38) | 0.700 (15) | 1.000 (1) | 0.666 (24) | 0.852 (12) | 0.578 (45) | 0.997 (37) | 0.488 (40)
Julian Chibane, Francis Engelmann, Tuan Anh Tran, Gerard Pons-Moll: Box2Mask: Weakly Supervised 3D Semantic Instance Segmentation Using Bounding Boxes. ECCV 2022
Mask-Group |  | 0.664 (33) | 1.000 (1) | 0.822 (25) | 0.764 (31) | 0.616 (29) | 0.815 (31) | 0.139 (40) | 0.694 (27) | 0.597 (27) | 0.459 (39) | 0.566 (35) | 0.599 (26) | 0.600 (39) | 0.516 (63) | 0.715 (21) | 0.819 (25) | 0.635 (36) | 1.000 (1) | 0.603 (17)
Min Zhong, Xinghao Chen, Xiaokang Chen, Gang Zeng, Yunhe Wang: MaskGroup: Hierarchical Point Grouping and Masking for 3D Instance Segmentation. ICME 2022
CSC-Pretrained |  | 0.648 (35) | 1.000 (1) | 0.810 (28) | 0.768 (29) | 0.523 (44) | 0.813 (32) | 0.143 (39) | 0.819 (5) | 0.389 (44) | 0.422 (48) | 0.511 (41) | 0.443 (39) | 0.650 (23) | 1.000 (1) | 0.624 (33) | 0.732 (42) | 0.634 (37) | 1.000 (1) | 0.375 (52)
PCJC |  | 0.578 (47) | 1.000 (1) | 0.810 (29) | 0.583 (53) | 0.449 (53) | 0.813 (33) | 0.042 (54) | 0.603 (45) | 0.341 (49) | 0.490 (34) | 0.465 (45) | 0.410 (44) | 0.650 (23) | 0.835 (53) | 0.264 (62) | 0.694 (51) | 0.561 (49) | 0.889 (58) | 0.504 (37)
HAIS | permissive | 0.699 (26) | 1.000 (1) | 0.849 (20) | 0.820 (20) | 0.675 (19) | 0.808 (34) | 0.279 (25) | 0.757 (16) | 0.465 (36) | 0.517 (30) | 0.596 (28) | 0.559 (27) | 0.600 (39) | 1.000 (1) | 0.654 (28) | 0.767 (33) | 0.676 (29) | 0.994 (46) | 0.560 (28)
Shaoyu Chen, Jiemin Fang, Qian Zhang, Wenyu Liu, Xinggang Wang: Hierarchical Aggregation for 3D Instance Segmentation. ICCV 2021
SSTNet | permissive | 0.698 (27) | 1.000 (1) | 0.697 (58) | 0.888 (8) | 0.556 (37) | 0.803 (35) | 0.387 (20) | 0.626 (42) | 0.417 (41) | 0.556 (26) | 0.585 (31) | 0.702 (12) | 0.600 (39) | 1.000 (1) | 0.824 (6) | 0.720 (45) | 0.692 (27) | 1.000 (1) | 0.509 (35)
Zhihao Liang, Zhihao Li, Songcen Xu, Mingkui Tan, Kui Jia: Instance Segmentation in 3D Scenes using Semantic Superpoint Tree Networks. ICCV2021
Dyco3D | copyleft | 0.641 (38) | 1.000 (1) | 0.841 (22) | 0.893 (7) | 0.531 (41) | 0.802 (36) | 0.115 (45) | 0.588 (47) | 0.448 (38) | 0.438 (43) | 0.537 (40) | 0.430 (43) | 0.550 (48) | 0.857 (45) | 0.534 (43) | 0.764 (35) | 0.657 (31) | 0.987 (49) | 0.568 (25)
Tong He; Chunhua Shen; Anton van den Hengel: DyCo3D: Robust Instance Segmentation of 3D Point Clouds through Dynamic Convolution. CVPR2021
PointGroup |  | 0.636 (40) | 1.000 (1) | 0.765 (45) | 0.624 (45) | 0.505 (48) | 0.797 (37) | 0.116 (44) | 0.696 (26) | 0.384 (45) | 0.441 (41) | 0.559 (36) | 0.476 (35) | 0.596 (42) | 1.000 (1) | 0.666 (24) | 0.756 (37) | 0.556 (52) | 0.997 (37) | 0.513 (34)
Li Jiang, Hengshuang Zhao, Shaoshuai Shi, Shu Liu, Chi-Wing Fu, Jiaya Jia: PointGroup: Dual-Set Point Grouping for 3D Instance Segmentation. CVPR 2020 [oral]
SSEN |  | 0.575 (48) | 1.000 (1) | 0.761 (46) | 0.473 (56) | 0.477 (51) | 0.795 (38) | 0.066 (50) | 0.529 (52) | 0.658 (22) | 0.460 (38) | 0.461 (46) | 0.380 (48) | 0.331 (62) | 0.859 (44) | 0.401 (55) | 0.692 (53) | 0.653 (33) | 1.000 (1) | 0.348 (54)
Dongsu Zhang, Junha Chun, Sang Kyun Cha, Young Min Kim: Spatial Semantic Embedding Network: Fast 3D Instance Segmentation with Deep Metric Learning. Arxiv
Sparse R-CNN |  | 0.515 (54) | 1.000 (1) | 0.538 (66) | 0.282 (60) | 0.468 (52) | 0.790 (39) | 0.173 (34) | 0.345 (62) | 0.429 (39) | 0.413 (51) | 0.484 (43) | 0.176 (57) | 0.595 (44) | 0.591 (61) | 0.522 (44) | 0.668 (55) | 0.476 (59) | 0.986 (51) | 0.327 (56)
IPCA-Inst |  | 0.731 (22) | 1.000 (1) | 0.788 (40) | 0.884 (9) | 0.698 (12) | 0.788 (40) | 0.252 (28) | 0.760 (14) | 0.646 (24) | 0.511 (31) | 0.637 (21) | 0.665 (20) | 0.804 (7) | 1.000 (1) | 0.644 (30) | 0.778 (31) | 0.747 (24) | 1.000 (1) | 0.561 (27)
PE |  | 0.645 (36) | 1.000 (1) | 0.773 (44) | 0.798 (25) | 0.538 (39) | 0.786 (41) | 0.088 (48) | 0.799 (10) | 0.350 (48) | 0.435 (46) | 0.547 (38) | 0.545 (29) | 0.646 (34) | 0.933 (42) | 0.562 (39) | 0.761 (36) | 0.556 (53) | 0.997 (37) | 0.501 (38)
Biao Zhang, Peter Wonka: Point Cloud Instance Segmentation using Probabilistic Embeddings. CVPR 2021
DD-UNet+Group |  | 0.635 (41) | 0.667 (56) | 0.797 (38) | 0.714 (38) | 0.562 (36) | 0.774 (42) | 0.146 (37) | 0.810 (8) | 0.429 (40) | 0.476 (36) | 0.546 (39) | 0.399 (45) | 0.633 (35) | 1.000 (1) | 0.632 (32) | 0.722 (44) | 0.609 (39) | 1.000 (1) | 0.514 (33)
H. Liu, R. Liu, K. Yang, J. Zhang, K. Peng, R. Stiefelhagen: HIDA: Towards Holistic Indoor Understanding for the Visually Impaired via Semantic Instance Segmentation with a Wearable Solid-State LiDAR Sensor. ICCVW 2021
DANCENET |  | 0.680 (29) | 1.000 (1) | 0.807 (31) | 0.733 (34) | 0.600 (31) | 0.768 (43) | 0.375 (22) | 0.543 (50) | 0.538 (31) | 0.610 (20) | 0.599 (27) | 0.498 (33) | 0.632 (37) | 0.981 (36) | 0.739 (19) | 0.856 (8) | 0.633 (38) | 0.882 (61) | 0.454 (45)
TopoSeg |  | 0.725 (23) | 1.000 (1) | 0.806 (33) | 0.933 (2) | 0.668 (20) | 0.758 (44) | 0.272 (27) | 0.734 (19) | 0.630 (25) | 0.549 (27) | 0.654 (17) | 0.606 (25) | 0.697 (19) | 0.966 (40) | 0.612 (35) | 0.839 (15) | 0.754 (23) | 1.000 (1) | 0.573 (24)
3D-MPA |  | 0.611 (44) | 1.000 (1) | 0.833 (23) | 0.765 (30) | 0.526 (43) | 0.756 (45) | 0.136 (42) | 0.588 (47) | 0.470 (35) | 0.438 (44) | 0.432 (50) | 0.358 (51) | 0.650 (23) | 0.857 (45) | 0.429 (52) | 0.765 (34) | 0.557 (51) | 1.000 (1) | 0.430 (47)
Francis Engelmann, Martin Bokeloh, Alireza Fathi, Bastian Leibe, Matthias Nießner: 3D-MPA: Multi Proposal Aggregation for 3D Semantic Instance Segmentation. CVPR 2020
NeuralBF |  | 0.555 (50) | 0.667 (56) | 0.896 (15) | 0.843 (17) | 0.517 (45) | 0.751 (46) | 0.029 (55) | 0.519 (53) | 0.414 (42) | 0.439 (42) | 0.465 (44) | 0.000 (72) | 0.484 (51) | 0.857 (45) | 0.287 (60) | 0.693 (52) | 0.651 (35) | 1.000 (1) | 0.485 (41)
Weiwei Sun, Daniel Rebain, Renjie Liao, Vladimir Tankovich, Soroosh Yazdani, Kwang Moo Yi, Andrea Tagliasacchi: NeuralBF: Neural Bilateral Filtering for Top-down Instance Segmentation on Point Clouds. WACV 2023
SSEC |  | 0.707 (25) | 1.000 (1) | 0.850 (19) | 0.924 (3) | 0.648 (21) | 0.747 (47) | 0.162 (35) | 0.862 (3) | 0.572 (29) | 0.520 (29) | 0.624 (23) | 0.549 (28) | 0.649 (32) | 1.000 (1) | 0.560 (40) | 0.706 (47) | 0.768 (18) | 1.000 (1) | 0.591 (22)
RWSeg |  | 0.567 (49) | 0.528 (66) | 0.708 (57) | 0.626 (44) | 0.580 (34) | 0.745 (48) | 0.063 (51) | 0.627 (41) | 0.240 (54) | 0.400 (54) | 0.497 (42) | 0.464 (36) | 0.515 (49) | 1.000 (1) | 0.475 (47) | 0.745 (39) | 0.571 (46) | 1.000 (1) | 0.429 (48)
OSIS |  | 0.605 (45) | 1.000 (1) | 0.801 (35) | 0.599 (49) | 0.535 (40) | 0.728 (49) | 0.286 (24) | 0.436 (60) | 0.679 (18) | 0.491 (33) | 0.433 (48) | 0.256 (53) | 0.404 (60) | 0.857 (45) | 0.620 (34) | 0.724 (43) | 0.510 (58) | 1.000 (1) | 0.539 (31)
ClickSeg_Instance |  | 0.539 (52) | 1.000 (1) | 0.621 (61) | 0.300 (59) | 0.530 (42) | 0.698 (50) | 0.127 (43) | 0.533 (51) | 0.222 (55) | 0.430 (47) | 0.400 (53) | 0.365 (49) | 0.574 (46) | 0.938 (41) | 0.472 (48) | 0.659 (56) | 0.543 (54) | 0.944 (53) | 0.347 (55)
One_Thing_One_Click | permissive | 0.529 (53) | 0.667 (56) | 0.718 (53) | 0.777 (27) | 0.399 (54) | 0.683 (51) | 0.000 (65) | 0.669 (32) | 0.138 (60) | 0.391 (55) | 0.374 (57) | 0.539 (30) | 0.360 (61) | 0.641 (60) | 0.556 (41) | 0.774 (32) | 0.593 (42) | 0.997 (37) | 0.251 (60)
Zhengzhe Liu, Xiaojuan Qi, Chi-Wing Fu: One Thing One Click: A Self-Training Approach for Weakly Supervised 3D Semantic Segmentation. CVPR 2021
GICN |  | 0.638 (39) | 1.000 (1) | 0.895 (16) | 0.800 (24) | 0.480 (50) | 0.676 (52) | 0.144 (38) | 0.737 (18) | 0.354 (47) | 0.447 (40) | 0.400 (54) | 0.365 (49) | 0.700 (15) | 1.000 (1) | 0.569 (38) | 0.836 (16) | 0.599 (40) | 1.000 (1) | 0.473 (42)
SegGroup_ins | permissive | 0.445 (61) | 0.667 (56) | 0.773 (43) | 0.185 (67) | 0.317 (59) | 0.656 (53) | 0.000 (65) | 0.407 (61) | 0.134 (61) | 0.381 (56) | 0.267 (61) | 0.217 (55) | 0.476 (52) | 0.714 (56) | 0.452 (49) | 0.629 (59) | 0.514 (56) | 1.000 (1) | 0.222 (63)
An Tao, Yueqi Duan, Yi Wei, Jiwen Lu, Jie Zhou: SegGroup: Seg-Level Supervision for 3D Instance and Semantic Segmentation. TIP 2022
MTML |  | 0.549 (51) | 1.000 (1) | 0.807 (32) | 0.588 (52) | 0.327 (58) | 0.647 (54) | 0.004 (62) | 0.815 (7) | 0.180 (57) | 0.418 (49) | 0.364 (58) | 0.182 (56) | 0.445 (54) | 1.000 (1) | 0.442 (50) | 0.688 (54) | 0.571 (47) | 1.000 (1) | 0.396 (50)
Jean Lahoud, Bernard Ghanem, Marc Pollefeys, Martin R. Oswald: 3D Instance Segmentation via Multi-task Metric Learning. ICCV 2019 [oral]
MASC | permissive | 0.447 (60) | 0.528 (66) | 0.555 (64) | 0.381 (57) | 0.382 (55) | 0.633 (55) | 0.002 (63) | 0.509 (55) | 0.260 (52) | 0.361 (57) | 0.432 (49) | 0.327 (52) | 0.451 (53) | 0.571 (62) | 0.367 (58) | 0.639 (58) | 0.386 (60) | 0.980 (52) | 0.276 (59)
Chen Liu, Yasutaka Furukawa: MASC: Multi-scale Affinity with Sparse Convolution for 3D Instance Segmentation.
DENet |  | 0.629 (43) | 1.000 (1) | 0.797 (37) | 0.608 (46) | 0.589 (32) | 0.627 (56) | 0.219 (32) | 0.882 (1) | 0.310 (50) | 0.402 (53) | 0.383 (56) | 0.396 (46) | 0.650 (23) | 1.000 (1) | 0.663 (26) | 0.543 (64) | 0.691 (28) | 1.000 (1) | 0.568 (26)
Occipital-SCS |  | 0.512 (55) | 1.000 (1) | 0.716 (54) | 0.509 (55) | 0.506 (47) | 0.611 (57) | 0.092 (47) | 0.602 (46) | 0.177 (58) | 0.346 (58) | 0.383 (55) | 0.165 (58) | 0.442 (55) | 0.850 (52) | 0.386 (57) | 0.618 (60) | 0.543 (55) | 0.889 (58) | 0.389 (51)
R-PointNet |  | 0.306 (65) | 0.500 (68) | 0.405 (70) | 0.311 (58) | 0.348 (57) | 0.589 (58) | 0.054 (52) | 0.068 (70) | 0.126 (62) | 0.283 (62) | 0.290 (60) | 0.028 (68) | 0.219 (66) | 0.214 (67) | 0.331 (59) | 0.396 (70) | 0.275 (66) | 0.821 (64) | 0.245 (61)
SALoss-ResNet |  | 0.459 (59) | 1.000 (1) | 0.737 (52) | 0.159 (70) | 0.259 (62) | 0.587 (59) | 0.138 (41) | 0.475 (58) | 0.217 (56) | 0.416 (50) | 0.408 (52) | 0.128 (59) | 0.315 (63) | 0.714 (56) | 0.411 (53) | 0.536 (65) | 0.590 (43) | 0.873 (62) | 0.304 (57)
Zhidong Liang, Ming Yang, Hao Li, Chunxiang Wang: 3D Instance Embedding Learning With a Structure-Aware Loss Function for Point Cloud Segmentation. IEEE Robotics and Automation Letters (IROS2020)
3D-SIS | permissive | 0.382 (62) | 1.000 (1) | 0.432 (69) | 0.245 (62) | 0.190 (64) | 0.577 (60) | 0.013 (59) | 0.263 (64) | 0.033 (67) | 0.320 (60) | 0.240 (63) | 0.075 (63) | 0.422 (58) | 0.857 (45) | 0.117 (67) | 0.699 (49) | 0.271 (68) | 0.883 (60) | 0.235 (62)
Ji Hou, Angela Dai, Matthias Niessner: 3D-SIS: 3D Semantic Instance Segmentation of RGB-D Scans. CVPR 2019
SPG_WSIS |  | 0.470 (58) | 0.667 (56) | 0.685 (59) | 0.677 (42) | 0.372 (56) | 0.562 (61) | 0.000 (65) | 0.482 (57) | 0.244 (53) | 0.316 (61) | 0.298 (59) | 0.052 (67) | 0.442 (56) | 0.857 (45) | 0.267 (61) | 0.702 (48) | 0.559 (50) | 1.000 (1) | 0.287 (58)
Hier3D | copyleft | 0.323 (63) | 0.667 (56) | 0.542 (65) | 0.264 (61) | 0.157 (67) | 0.550 (62) | 0.000 (65) | 0.205 (67) | 0.009 (69) | 0.270 (63) | 0.218 (64) | 0.075 (63) | 0.500 (50) | 0.688 (59) | 0.007 (73) | 0.698 (50) | 0.301 (65) | 0.459 (70) | 0.200 (64)
Tan: HCFS3D: Hierarchical Coupled Feature Selection Network for 3D Semantic and Instance Segmentation.
PanopticFusion-inst |  | 0.478 (57) | 0.667 (56) | 0.712 (56) | 0.595 (50) | 0.259 (63) | 0.550 (63) | 0.000 (65) | 0.613 (44) | 0.175 (59) | 0.250 (64) | 0.434 (47) | 0.437 (40) | 0.411 (59) | 0.857 (45) | 0.485 (46) | 0.591 (63) | 0.267 (69) | 0.944 (53) | 0.359 (53)
Gaku Narita, Takashi Seno, Tomoya Ishikawa, Yohsuke Kaji: PanopticFusion: Online Volumetric Semantic Mapping at the Level of Stuff and Things. IROS 2019 (to appear)
Region-18class |  | 0.284 (66) | 0.250 (72) | 0.751 (51) | 0.228 (65) | 0.270 (61) | 0.521 (64) | 0.000 (65) | 0.468 (59) | 0.008 (71) | 0.205 (65) | 0.127 (66) | 0.000 (72) | 0.068 (70) | 0.070 (71) | 0.262 (63) | 0.652 (57) | 0.323 (63) | 0.740 (66) | 0.173 (65)
SemRegionNet-20cls |  | 0.250 (67) | 0.333 (69) | 0.613 (62) | 0.229 (64) | 0.163 (66) | 0.493 (65) | 0.000 (65) | 0.304 (63) | 0.107 (63) | 0.147 (69) | 0.100 (68) | 0.052 (66) | 0.231 (64) | 0.119 (69) | 0.039 (69) | 0.445 (68) | 0.325 (62) | 0.654 (67) | 0.141 (68)
tmp |  | 0.248 (68) | 0.667 (56) | 0.437 (68) | 0.188 (66) | 0.153 (68) | 0.491 (66) | 0.000 (65) | 0.208 (66) | 0.094 (65) | 0.153 (68) | 0.099 (69) | 0.057 (65) | 0.217 (67) | 0.119 (69) | 0.039 (69) | 0.466 (67) | 0.302 (64) | 0.640 (68) | 0.140 (69)
3D-BoNet |  | 0.488 (56) | 1.000 (1) | 0.672 (60) | 0.590 (51) | 0.301 (60) | 0.484 (67) | 0.098 (46) | 0.620 (43) | 0.306 (51) | 0.341 (59) | 0.259 (62) | 0.125 (60) | 0.434 (57) | 0.796 (55) | 0.402 (54) | 0.499 (66) | 0.513 (57) | 0.909 (57) | 0.439 (46)
Bo Yang, Jianan Wang, Ronald Clark, Qingyong Hu, Sen Wang, Andrew Markham, Niki Trigoni: Learning Object Bounding Boxes for 3D Instance Segmentation on Point Clouds. NeurIPS 2019 Spotlight
UNet-backbone |  | 0.319 (64) | 0.667 (56) | 0.715 (55) | 0.233 (63) | 0.189 (65) | 0.479 (68) | 0.008 (60) | 0.218 (65) | 0.067 (66) | 0.201 (66) | 0.173 (65) | 0.107 (61) | 0.123 (68) | 0.438 (64) | 0.150 (64) | 0.615 (61) | 0.355 (61) | 0.916 (56) | 0.093 (72)
ASIS |  | 0.199 (71) | 0.333 (69) | 0.253 (72) | 0.167 (69) | 0.140 (69) | 0.438 (69) | 0.000 (65) | 0.177 (68) | 0.008 (70) | 0.121 (70) | 0.069 (70) | 0.004 (71) | 0.231 (65) | 0.429 (65) | 0.036 (71) | 0.445 (69) | 0.273 (67) | 0.333 (72) | 0.119 (71)
Sem_Recon_ins |  | 0.227 (70) | 0.764 (55) | 0.486 (67) | 0.069 (72) | 0.098 (70) | 0.426 (70) | 0.017 (58) | 0.067 (71) | 0.015 (68) | 0.172 (67) | 0.100 (67) | 0.096 (62) | 0.054 (72) | 0.183 (68) | 0.135 (65) | 0.366 (71) | 0.260 (70) | 0.614 (69) | 0.168 (67)
3D-BEVIS |  | 0.248 (68) | 0.667 (56) | 0.566 (63) | 0.076 (71) | 0.035 (73) | 0.394 (71) | 0.027 (57) | 0.035 (72) | 0.098 (64) | 0.099 (71) | 0.030 (72) | 0.025 (69) | 0.098 (69) | 0.375 (66) | 0.126 (66) | 0.604 (62) | 0.181 (71) | 0.854 (63) | 0.171 (66)
Cathrin Elich, Francis Engelmann, Jonas Schult, Theodora Kontogianni, Bastian Leibe: 3D-BEVIS: Birds-Eye-View Instance Segmentation.
Sgpn_scannet |  | 0.143 (72) | 0.208 (73) | 0.390 (71) | 0.169 (68) | 0.065 (71) | 0.275 (72) | 0.029 (56) | 0.069 (69) | 0.000 (72) | 0.087 (72) | 0.043 (71) | 0.014 (70) | 0.027 (73) | 0.000 (72) | 0.112 (68) | 0.351 (72) | 0.168 (72) | 0.438 (71) | 0.138 (70)
MaskRCNN 2d->3d Proj |  | 0.058 (73) | 0.333 (69) | 0.002 (73) | 0.000 (73) | 0.053 (72) | 0.002 (73) | 0.002 (64) | 0.021 (73) | 0.000 (72) | 0.045 (73) | 0.024 (73) | 0.238 (54) | 0.065 (71) | 0.000 (72) | 0.014 (72) | 0.107 (73) | 0.020 (73) | 0.110 (73) | 0.006 (73)
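
The "avg ap 50%" column appears to be the unweighted mean of the 18 per-class values in the same row. The short check below illustrates this with the TD3D row; the per-class numbers are copied from the table, while the variable names `td3d_ap50` and `avg_ap50` are hypothetical and the equal weighting of classes is an assumption inferred from the numbers rather than stated on the page.

```python
# Per-class AP 50% values for TD3D, copied from the table above.
td3d_ap50 = {
    "bathtub": 1.000, "bed": 0.774, "bookshelf": 0.867, "cabinet": 0.621,
    "chair": 0.934, "counter": 0.404, "curtain": 0.706, "desk": 0.812,
    "door": 0.605, "otherfurniture": 0.633, "picture": 0.626,
    "refrigerator": 0.690, "shower curtain": 1.000, "sink": 0.640,
    "sofa": 0.820, "table": 0.777, "toilet": 1.000, "window": 0.612,
}

# Assumption: every class carries equal weight in the average.
avg_ap50 = sum(td3d_ap50.values()) / len(td3d_ap50)
print(round(avg_ap50, 3))  # 0.751, matching the reported avg ap 50% for TD3D
```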