The 3D semantic instance prediction task involves detecting and segmenting the objects in a 3D scan mesh.

Evaluation and metrics

Our evaluation ranks all methods according to the average precision for each class. We report the mean average precision (AP) at overlap 0.25 (AP 25%), at overlap 0.5 (AP 50%), and averaged over overlaps in the range [0.5:0.95:0.05] (AP). Note that multiple predictions of the same ground truth instance are penalized as false positives.
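The matching rule described above can be illustrated with a minimal sketch. It is not the benchmark's official evaluation script: the function names, the non-interpolated AP formula, the strict `>` comparison against the threshold, and the choice to treat the end of the range [0.5:0.95:0.05] as exclusive are all assumptions made for illustration. Each prediction is assigned to its best-overlapping ground truth instance, and any later prediction of an already matched instance counts as a false positive.

```python
import numpy as np

def average_precision(scores, ious, iou_threshold):
    """AP for one class at one overlap threshold (hypothetical helper).

    scores: (P,) confidence of each predicted instance.
    ious:   (P, G) 2D array of overlaps between predicted and ground truth masks.
    """
    scores = np.asarray(scores, dtype=float)
    ious = np.asarray(ious, dtype=float)
    num_pred, num_gt = ious.shape
    if num_gt == 0:
        return float("nan")  # class absent from the ground truth

    matched = np.zeros(num_gt, dtype=bool)
    is_tp = np.zeros(num_pred, dtype=bool)
    order = np.argsort(-scores, kind="stable")  # most confident first
    for rank, pred in enumerate(order):
        gt = int(np.argmax(ious[pred]))
        if ious[pred, gt] > iou_threshold and not matched[gt]:
            matched[gt] = True
            is_tp[rank] = True
        # otherwise: low overlap or duplicate detection -> false positive

    # Non-interpolated AP: precision at each true positive, averaged over GT.
    precision = np.cumsum(is_tp) / np.arange(1, num_pred + 1)
    return float(precision[is_tp].sum() / num_gt)

def mean_ap_over_thresholds(scores, ious, thresholds):
    """Average the per-threshold AP values (hypothetical helper)."""
    return float(np.mean([average_precision(scores, ious, t) for t in thresholds]))

# Toy data: two ground truth instances, three predictions.
# The second prediction re-detects the first instance, so it is a false positive.
scores = [0.9, 0.8, 0.7]
ious = [[0.83, 0.00],
        [0.62, 0.10],
        [0.00, 0.57]]

ap25 = average_precision(scores, ious, 0.25)
ap50 = average_precision(scores, ious, 0.5)
thresholds = [0.5 + 0.05 * i for i in range(9)]  # 0.50 ... 0.90, end exclusive (assumption)
ap = mean_ap_over_thresholds(scores, ious, thresholds)
print(round(ap25, 3), round(ap50, 3), round(ap, 3))  # 0.833 0.833 0.463
```

In the toy data, removing the duplicate second prediction would raise AP 50% from 0.833 to 1.0, which shows how duplicate detections lower the score.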



This table lists the benchmark results for the 3D semantic instance scenario.




Each cell gives the AP 50% score followed by the method's rank in that column. Rows are sorted by the desk column.

| Method | Info | avg ap 50% | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | window |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| UniPerception | | 0.800 7 | 1.000 1 | 0.930 12 | 0.872 10 | 0.727 4 | 0.862 25 | 0.454 20 | 0.764 13 | 0.820 1 | 0.746 7 | 0.706 11 | 0.750 7 | 0.772 10 | 0.926 46 | 0.764 19 | 0.818 27 | 0.826 2 | 0.997 40 | 0.660 3 |
| Competitor-MAFT | | 0.816 1 | 1.000 1 | 0.983 3 | 0.872 10 | 0.718 5 | 0.941 1 | 0.588 4 | 0.652 39 | 0.819 2 | 0.776 3 | 0.720 5 | 0.780 5 | 0.769 12 | 1.000 1 | 0.797 11 | 0.813 28 | 0.798 8 | 1.000 1 | 0.659 4 |
| TD3D | permissive | 0.751 22 | 1.000 1 | 0.774 45 | 0.867 12 | 0.621 29 | 0.934 3 | 0.404 22 | 0.706 24 | 0.812 3 | 0.605 24 | 0.633 25 | 0.626 27 | 0.690 22 | 1.000 1 | 0.640 34 | 0.820 24 | 0.777 17 | 1.000 1 | 0.612 16 |
| Mask3D | | 0.780 17 | 1.000 1 | 0.786 44 | 0.716 40 | 0.696 16 | 0.885 19 | 0.500 15 | 0.714 22 | 0.810 4 | 0.672 19 | 0.715 8 | 0.679 22 | 0.809 1 | 1.000 1 | 0.831 4 | 0.833 18 | 0.787 12 | 1.000 1 | 0.602 20 |
| ExtMask3D | | 0.789 13 | 1.000 1 | 0.988 1 | 0.756 35 | 0.706 12 | 0.912 12 | 0.429 21 | 0.647 41 | 0.806 5 | 0.755 6 | 0.673 16 | 0.689 21 | 0.772 11 | 1.000 1 | 0.789 16 | 0.852 11 | 0.811 6 | 1.000 1 | 0.617 14 |
| MG-Former | | 0.791 12 | 1.000 1 | 0.980 5 | 0.837 19 | 0.626 27 | 0.897 13 | 0.543 13 | 0.759 15 | 0.800 6 | 0.766 5 | 0.659 18 | 0.769 6 | 0.697 19 | 1.000 1 | 0.791 15 | 0.707 49 | 0.791 11 | 1.000 1 | 0.610 17 |
| InsSSM | | 0.799 9 | 1.000 1 | 0.915 14 | 0.710 42 | 0.729 3 | 0.925 6 | 0.664 1 | 0.670 33 | 0.770 7 | 0.766 4 | 0.739 3 | 0.737 8 | 0.700 16 | 1.000 1 | 0.792 14 | 0.829 21 | 0.815 5 | 0.997 40 | 0.625 11 |
| TST3D | | 0.795 11 | 1.000 1 | 0.929 13 | 0.918 4 | 0.709 10 | 0.884 20 | 0.596 3 | 0.704 25 | 0.769 8 | 0.734 9 | 0.644 22 | 0.699 18 | 0.751 13 | 1.000 1 | 0.794 12 | 0.876 5 | 0.757 24 | 0.997 40 | 0.550 33 |
| Queryformer | | 0.787 14 | 1.000 1 | 0.933 11 | 0.601 51 | 0.754 1 | 0.886 18 | 0.558 11 | 0.661 37 | 0.767 9 | 0.665 20 | 0.716 7 | 0.639 26 | 0.808 5 | 1.000 1 | 0.844 3 | 0.897 4 | 0.804 7 | 1.000 1 | 0.624 12 |
| Competitor-SPFormer | | 0.800 7 | 1.000 1 | 0.986 2 | 0.845 16 | 0.705 13 | 0.915 11 | 0.532 14 | 0.733 20 | 0.757 10 | 0.733 11 | 0.708 10 | 0.698 19 | 0.648 36 | 0.981 39 | 0.890 1 | 0.830 19 | 0.796 9 | 0.997 40 | 0.644 5 |
| Spherical Mask(CtoF) | | 0.812 3 | 1.000 1 | 0.973 7 | 0.852 14 | 0.718 6 | 0.917 10 | 0.574 6 | 0.677 30 | 0.748 11 | 0.729 14 | 0.715 8 | 0.795 2 | 0.809 1 | 1.000 1 | 0.831 4 | 0.854 9 | 0.787 12 | 1.000 1 | 0.638 7 |
| EV3D | | 0.811 4 | 1.000 1 | 0.968 9 | 0.852 14 | 0.717 7 | 0.921 9 | 0.574 7 | 0.677 30 | 0.748 11 | 0.730 13 | 0.703 13 | 0.795 2 | 0.809 1 | 1.000 1 | 0.831 4 | 0.854 9 | 0.778 16 | 1.000 1 | 0.638 8 |
| MAFT | | 0.786 15 | 1.000 1 | 0.894 19 | 0.807 23 | 0.694 17 | 0.893 16 | 0.486 16 | 0.674 32 | 0.740 13 | 0.786 1 | 0.704 12 | 0.727 11 | 0.739 14 | 1.000 1 | 0.707 25 | 0.849 13 | 0.756 25 | 1.000 1 | 0.685 2 |
| SphereSeg | | 0.680 32 | 1.000 1 | 0.856 21 | 0.744 36 | 0.618 31 | 0.893 15 | 0.151 39 | 0.651 40 | 0.713 14 | 0.537 31 | 0.579 36 | 0.430 45 | 0.651 25 | 1.000 1 | 0.389 59 | 0.744 43 | 0.697 29 | 0.991 51 | 0.601 21 |
| PBNet | permissive | 0.747 23 | 1.000 1 | 0.818 29 | 0.837 20 | 0.713 9 | 0.844 27 | 0.457 19 | 0.647 41 | 0.711 15 | 0.614 22 | 0.617 29 | 0.657 24 | 0.650 26 | 1.000 1 | 0.692 26 | 0.822 23 | 0.765 22 | 1.000 1 | 0.595 23 |
| KmaxOneFormerNet | permissive | 0.783 16 | 0.903 56 | 0.981 4 | 0.794 27 | 0.706 11 | 0.931 4 | 0.561 10 | 0.701 26 | 0.706 16 | 0.727 15 | 0.697 14 | 0.731 10 | 0.689 23 | 1.000 1 | 0.856 2 | 0.750 40 | 0.761 23 | 1.000 1 | 0.599 22 |
| SPFormer | permissive | 0.770 18 | 0.903 56 | 0.903 16 | 0.806 24 | 0.609 33 | 0.886 17 | 0.568 9 | 0.815 6 | 0.705 17 | 0.711 16 | 0.655 19 | 0.652 25 | 0.685 24 | 1.000 1 | 0.789 17 | 0.809 29 | 0.776 18 | 1.000 1 | 0.583 26 |
| PointRel | | 0.816 1 | 1.000 1 | 0.971 8 | 0.908 6 | 0.743 2 | 0.923 8 | 0.573 8 | 0.714 22 | 0.695 18 | 0.734 10 | 0.747 2 | 0.725 12 | 0.809 1 | 1.000 1 | 0.814 9 | 0.899 3 | 0.820 4 | 1.000 1 | 0.610 18 |
| OneFormer3D | copyleft | 0.801 6 | 1.000 1 | 0.973 6 | 0.909 5 | 0.698 15 | 0.928 5 | 0.582 5 | 0.668 35 | 0.685 19 | 0.780 2 | 0.687 15 | 0.698 20 | 0.702 15 | 1.000 1 | 0.794 13 | 0.900 2 | 0.784 14 | 0.986 53 | 0.635 9 |
| OSIS | | 0.605 48 | 1.000 1 | 0.801 38 | 0.599 52 | 0.535 43 | 0.728 52 | 0.286 27 | 0.436 63 | 0.679 20 | 0.491 36 | 0.433 51 | 0.256 56 | 0.404 63 | 0.857 48 | 0.620 37 | 0.724 46 | 0.510 61 | 1.000 1 | 0.539 34 |
| ISBNet | permissive | 0.757 21 | 1.000 1 | 0.904 15 | 0.731 38 | 0.678 21 | 0.895 14 | 0.458 18 | 0.644 43 | 0.670 21 | 0.710 17 | 0.620 27 | 0.732 9 | 0.650 26 | 1.000 1 | 0.756 20 | 0.778 32 | 0.779 15 | 1.000 1 | 0.614 15 |
| SIM3D | | 0.803 5 | 1.000 1 | 0.967 10 | 0.863 13 | 0.692 19 | 0.924 7 | 0.552 12 | 0.732 21 | 0.667 22 | 0.732 12 | 0.662 17 | 0.796 1 | 0.789 9 | 1.000 1 | 0.803 10 | 0.864 6 | 0.766 21 | 1.000 1 | 0.643 6 |
| SoftGroup++ | | 0.769 19 | 1.000 1 | 0.803 37 | 0.937 1 | 0.684 20 | 0.865 22 | 0.213 36 | 0.870 2 | 0.664 23 | 0.571 26 | 0.758 1 | 0.702 16 | 0.807 6 | 1.000 1 | 0.653 32 | 0.902 1 | 0.792 10 | 1.000 1 | 0.626 10 |
| DCD | | 0.798 10 | 1.000 1 | 0.878 20 | 0.792 28 | 0.693 18 | 0.936 2 | 0.596 2 | 0.685 29 | 0.663 24 | 0.736 8 | 0.717 6 | 0.788 4 | 0.693 21 | 1.000 1 | 0.825 7 | 0.840 15 | 0.837 1 | 1.000 1 | 0.689 1 |
| SSEN | | 0.575 51 | 1.000 1 | 0.761 49 | 0.473 59 | 0.477 54 | 0.795 41 | 0.066 53 | 0.529 55 | 0.658 25 | 0.460 41 | 0.461 49 | 0.380 51 | 0.331 65 | 0.859 47 | 0.401 58 | 0.692 56 | 0.653 36 | 1.000 1 | 0.348 57 |
| SoftGroup | permissive | 0.761 20 | 1.000 1 | 0.808 33 | 0.845 16 | 0.716 8 | 0.862 24 | 0.243 33 | 0.824 4 | 0.655 26 | 0.620 21 | 0.734 4 | 0.699 17 | 0.791 8 | 0.981 39 | 0.716 23 | 0.844 14 | 0.769 19 | 1.000 1 | 0.594 24 |
| IPCA-Inst | | 0.731 25 | 1.000 1 | 0.788 43 | 0.884 9 | 0.698 14 | 0.788 43 | 0.252 31 | 0.760 14 | 0.646 27 | 0.511 34 | 0.637 24 | 0.665 23 | 0.804 7 | 1.000 1 | 0.644 33 | 0.778 33 | 0.747 27 | 1.000 1 | 0.561 30 |
| TopoSeg | | 0.725 26 | 1.000 1 | 0.806 36 | 0.933 2 | 0.668 23 | 0.758 47 | 0.272 30 | 0.734 19 | 0.630 28 | 0.549 30 | 0.654 20 | 0.606 28 | 0.697 20 | 0.966 43 | 0.612 38 | 0.839 16 | 0.754 26 | 1.000 1 | 0.573 27 |
| GraphCut | | 0.732 24 | 1.000 1 | 0.788 42 | 0.724 39 | 0.642 26 | 0.859 26 | 0.248 32 | 0.787 11 | 0.618 29 | 0.596 25 | 0.653 21 | 0.722 14 | 0.583 48 | 1.000 1 | 0.766 18 | 0.861 7 | 0.825 3 | 1.000 1 | 0.504 39 |
| Mask-Group | | 0.664 36 | 1.000 1 | 0.822 28 | 0.764 34 | 0.616 32 | 0.815 34 | 0.139 43 | 0.694 28 | 0.597 30 | 0.459 42 | 0.566 38 | 0.599 29 | 0.600 42 | 0.516 66 | 0.715 24 | 0.819 26 | 0.635 39 | 1.000 1 | 0.603 19 |
| DualGroup | | 0.694 31 | 1.000 1 | 0.799 39 | 0.811 22 | 0.622 28 | 0.817 32 | 0.376 24 | 0.805 9 | 0.590 31 | 0.487 38 | 0.568 37 | 0.525 35 | 0.650 26 | 0.835 56 | 0.600 39 | 0.829 20 | 0.655 35 | 1.000 1 | 0.526 35 |
| SSEC | | 0.707 28 | 1.000 1 | 0.850 22 | 0.924 3 | 0.648 24 | 0.747 50 | 0.162 38 | 0.862 3 | 0.572 32 | 0.520 32 | 0.624 26 | 0.549 31 | 0.649 35 | 1.000 1 | 0.560 43 | 0.706 50 | 0.768 20 | 1.000 1 | 0.591 25 |
| DKNet | | 0.718 27 | 1.000 1 | 0.814 30 | 0.782 29 | 0.619 30 | 0.872 21 | 0.224 34 | 0.751 17 | 0.569 33 | 0.677 18 | 0.585 33 | 0.724 13 | 0.633 38 | 0.981 39 | 0.515 48 | 0.819 25 | 0.736 28 | 1.000 1 | 0.617 13 |
| DANCENET | | 0.680 32 | 1.000 1 | 0.807 34 | 0.733 37 | 0.600 34 | 0.768 46 | 0.375 25 | 0.543 53 | 0.538 34 | 0.610 23 | 0.599 30 | 0.498 36 | 0.632 40 | 0.981 39 | 0.739 22 | 0.856 8 | 0.633 41 | 0.882 64 | 0.454 48 |
| OccuSeg+instance | | 0.672 35 | 1.000 1 | 0.758 53 | 0.682 44 | 0.576 38 | 0.842 28 | 0.477 17 | 0.504 59 | 0.524 35 | 0.567 27 | 0.585 35 | 0.451 40 | 0.557 50 | 1.000 1 | 0.751 21 | 0.797 30 | 0.563 51 | 1.000 1 | 0.467 47 |
| Box2Mask | | 0.677 34 | 1.000 1 | 0.847 24 | 0.771 31 | 0.509 49 | 0.816 33 | 0.277 29 | 0.558 52 | 0.482 36 | 0.562 28 | 0.640 23 | 0.448 41 | 0.700 16 | 1.000 1 | 0.666 27 | 0.852 12 | 0.578 48 | 0.997 40 | 0.488 43 |
| INS-Conv-instance | | 0.657 37 | 1.000 1 | 0.760 51 | 0.667 46 | 0.581 36 | 0.863 23 | 0.323 26 | 0.655 38 | 0.477 37 | 0.473 40 | 0.549 40 | 0.432 44 | 0.650 26 | 1.000 1 | 0.655 30 | 0.738 44 | 0.585 47 | 0.944 56 | 0.472 46 |
| 3D-MPA | | 0.611 47 | 1.000 1 | 0.833 26 | 0.765 33 | 0.526 46 | 0.756 48 | 0.136 45 | 0.588 50 | 0.470 38 | 0.438 47 | 0.432 53 | 0.358 54 | 0.650 26 | 0.857 48 | 0.429 55 | 0.765 36 | 0.557 54 | 1.000 1 | 0.430 50 |
| HAIS | permissive | 0.699 29 | 1.000 1 | 0.849 23 | 0.820 21 | 0.675 22 | 0.808 37 | 0.279 28 | 0.757 16 | 0.465 39 | 0.517 33 | 0.596 31 | 0.559 30 | 0.600 42 | 1.000 1 | 0.654 31 | 0.767 35 | 0.676 32 | 0.994 49 | 0.560 31 |
| Mask3D_evaluation | | 0.631 45 | 1.000 1 | 0.829 27 | 0.606 50 | 0.646 25 | 0.836 29 | 0.068 52 | 0.511 57 | 0.462 40 | 0.507 35 | 0.619 28 | 0.389 50 | 0.610 41 | 1.000 1 | 0.432 54 | 0.828 22 | 0.673 33 | 0.788 68 | 0.552 32 |
| Dyco3D | copyleft | 0.641 41 | 1.000 1 | 0.841 25 | 0.893 7 | 0.531 44 | 0.802 39 | 0.115 48 | 0.588 50 | 0.448 41 | 0.438 46 | 0.537 43 | 0.430 46 | 0.550 51 | 0.857 48 | 0.534 46 | 0.764 37 | 0.657 34 | 0.987 52 | 0.568 28 |
| Sparse R-CNN | | 0.515 57 | 1.000 1 | 0.538 69 | 0.282 63 | 0.468 55 | 0.790 42 | 0.173 37 | 0.345 65 | 0.429 42 | 0.413 54 | 0.484 46 | 0.176 60 | 0.595 47 | 0.591 64 | 0.522 47 | 0.668 58 | 0.476 62 | 0.986 54 | 0.327 59 |
| DD-UNet+Group | | 0.635 44 | 0.667 59 | 0.797 41 | 0.714 41 | 0.562 39 | 0.774 45 | 0.146 40 | 0.810 8 | 0.429 43 | 0.476 39 | 0.546 42 | 0.399 48 | 0.633 38 | 1.000 1 | 0.632 35 | 0.722 47 | 0.609 42 | 1.000 1 | 0.514 36 |
| SSTNet | permissive | 0.698 30 | 1.000 1 | 0.697 61 | 0.888 8 | 0.556 40 | 0.803 38 | 0.387 23 | 0.626 45 | 0.417 44 | 0.556 29 | 0.585 34 | 0.702 15 | 0.600 42 | 1.000 1 | 0.824 8 | 0.720 48 | 0.692 30 | 1.000 1 | 0.509 38 |
| NeuralBF | | 0.555 53 | 0.667 59 | 0.896 17 | 0.843 18 | 0.517 48 | 0.751 49 | 0.029 58 | 0.519 56 | 0.414 45 | 0.439 45 | 0.465 47 | 0.000 75 | 0.484 54 | 0.857 48 | 0.287 63 | 0.693 55 | 0.651 38 | 1.000 1 | 0.485 44 |
| AOIA | | 0.601 49 | 1.000 1 | 0.761 50 | 0.687 43 | 0.485 52 | 0.828 30 | 0.008 63 | 0.663 36 | 0.405 46 | 0.405 55 | 0.425 54 | 0.490 37 | 0.596 45 | 0.714 59 | 0.553 45 | 0.779 31 | 0.597 44 | 0.992 50 | 0.424 52 |
| CSC-Pretrained | | 0.648 38 | 1.000 1 | 0.810 31 | 0.768 32 | 0.523 47 | 0.813 35 | 0.143 42 | 0.819 5 | 0.389 47 | 0.422 51 | 0.511 44 | 0.443 42 | 0.650 26 | 1.000 1 | 0.624 36 | 0.732 45 | 0.634 40 | 1.000 1 | 0.375 55 |
| PointGroup | | 0.636 43 | 1.000 1 | 0.765 48 | 0.624 48 | 0.505 51 | 0.797 40 | 0.116 47 | 0.696 27 | 0.384 48 | 0.441 44 | 0.559 39 | 0.476 38 | 0.596 45 | 1.000 1 | 0.666 27 | 0.756 39 | 0.556 55 | 0.997 40 | 0.513 37 |
| RPGN | | 0.643 40 | 1.000 1 | 0.758 52 | 0.582 57 | 0.539 41 | 0.826 31 | 0.046 56 | 0.765 12 | 0.372 49 | 0.436 48 | 0.588 32 | 0.539 34 | 0.650 26 | 1.000 1 | 0.577 40 | 0.750 41 | 0.653 37 | 0.997 40 | 0.495 42 |
| GICN | | 0.638 42 | 1.000 1 | 0.895 18 | 0.800 25 | 0.480 53 | 0.676 55 | 0.144 41 | 0.737 18 | 0.354 50 | 0.447 43 | 0.400 57 | 0.365 52 | 0.700 16 | 1.000 1 | 0.569 41 | 0.836 17 | 0.599 43 | 1.000 1 | 0.473 45 |
| PE | | 0.645 39 | 1.000 1 | 0.773 47 | 0.798 26 | 0.538 42 | 0.786 44 | 0.088 51 | 0.799 10 | 0.350 51 | 0.435 49 | 0.547 41 | 0.545 32 | 0.646 37 | 0.933 45 | 0.562 42 | 0.761 38 | 0.556 56 | 0.997 40 | 0.501 41 |
| PCJC | | 0.578 50 | 1.000 1 | 0.810 32 | 0.583 56 | 0.449 56 | 0.813 36 | 0.042 57 | 0.603 48 | 0.341 52 | 0.490 37 | 0.465 48 | 0.410 47 | 0.650 26 | 0.835 56 | 0.264 65 | 0.694 54 | 0.561 52 | 0.889 61 | 0.504 40 |
| DENet | | 0.629 46 | 1.000 1 | 0.797 40 | 0.608 49 | 0.589 35 | 0.627 59 | 0.219 35 | 0.882 1 | 0.310 53 | 0.402 56 | 0.383 59 | 0.396 49 | 0.650 26 | 1.000 1 | 0.663 29 | 0.543 67 | 0.691 31 | 1.000 1 | 0.568 29 |
| 3D-BoNet | | 0.488 59 | 1.000 1 | 0.672 63 | 0.590 54 | 0.301 63 | 0.484 70 | 0.098 49 | 0.620 46 | 0.306 54 | 0.341 62 | 0.259 65 | 0.125 63 | 0.434 60 | 0.796 58 | 0.402 57 | 0.499 69 | 0.513 60 | 0.909 60 | 0.439 49 |
| MASC | permissive | 0.447 63 | 0.528 69 | 0.555 67 | 0.381 60 | 0.382 58 | 0.633 58 | 0.002 66 | 0.509 58 | 0.260 55 | 0.361 60 | 0.432 52 | 0.327 55 | 0.451 56 | 0.571 65 | 0.367 61 | 0.639 61 | 0.386 63 | 0.980 55 | 0.276 62 |
| SPG_WSIS | | 0.470 61 | 0.667 59 | 0.685 62 | 0.677 45 | 0.372 59 | 0.562 64 | 0.000 68 | 0.482 60 | 0.244 56 | 0.316 64 | 0.298 62 | 0.052 70 | 0.442 59 | 0.857 48 | 0.267 64 | 0.702 51 | 0.559 53 | 1.000 1 | 0.287 61 |
| RWSeg | | 0.567 52 | 0.528 69 | 0.708 60 | 0.626 47 | 0.580 37 | 0.745 51 | 0.063 54 | 0.627 44 | 0.240 57 | 0.400 57 | 0.497 45 | 0.464 39 | 0.515 52 | 1.000 1 | 0.475 50 | 0.745 42 | 0.571 49 | 1.000 1 | 0.429 51 |
| ClickSeg_Instance | | 0.539 55 | 1.000 1 | 0.621 64 | 0.300 62 | 0.530 45 | 0.698 53 | 0.127 46 | 0.533 54 | 0.222 58 | 0.430 50 | 0.400 56 | 0.365 52 | 0.574 49 | 0.938 44 | 0.472 51 | 0.659 59 | 0.543 57 | 0.944 56 | 0.347 58 |
| SALoss-ResNet | | 0.459 62 | 1.000 1 | 0.737 55 | 0.159 73 | 0.259 65 | 0.587 62 | 0.138 44 | 0.475 61 | 0.217 59 | 0.416 53 | 0.408 55 | 0.128 62 | 0.315 66 | 0.714 59 | 0.411 56 | 0.536 68 | 0.590 46 | 0.873 65 | 0.304 60 |
| MTML | | 0.549 54 | 1.000 1 | 0.807 35 | 0.588 55 | 0.327 61 | 0.647 57 | 0.004 65 | 0.815 7 | 0.180 60 | 0.418 52 | 0.364 61 | 0.182 59 | 0.445 57 | 1.000 1 | 0.442 53 | 0.688 57 | 0.571 50 | 1.000 1 | 0.396 53 |
| Occipital-SCS | | 0.512 58 | 1.000 1 | 0.716 57 | 0.509 58 | 0.506 50 | 0.611 60 | 0.092 50 | 0.602 49 | 0.177 61 | 0.346 61 | 0.383 58 | 0.165 61 | 0.442 58 | 0.850 55 | 0.386 60 | 0.618 63 | 0.543 58 | 0.889 61 | 0.389 54 |
| PanopticFusion-inst | | 0.478 60 | 0.667 59 | 0.712 59 | 0.595 53 | 0.259 66 | 0.550 66 | 0.000 68 | 0.613 47 | 0.175 62 | 0.250 67 | 0.434 50 | 0.437 43 | 0.411 62 | 0.857 48 | 0.485 49 | 0.591 66 | 0.267 72 | 0.944 56 | 0.359 56 |
| One_Thing_One_Click | permissive | 0.529 56 | 0.667 59 | 0.718 56 | 0.777 30 | 0.399 57 | 0.683 54 | 0.000 68 | 0.669 34 | 0.138 63 | 0.391 58 | 0.374 60 | 0.539 33 | 0.360 64 | 0.641 63 | 0.556 44 | 0.774 34 | 0.593 45 | 0.997 40 | 0.251 63 |
| SegGroup_ins | permissive | 0.445 64 | 0.667 59 | 0.773 46 | 0.185 70 | 0.317 62 | 0.656 56 | 0.000 68 | 0.407 64 | 0.134 64 | 0.381 59 | 0.267 64 | 0.217 58 | 0.476 55 | 0.714 59 | 0.452 52 | 0.629 62 | 0.514 59 | 1.000 1 | 0.222 66 |
| R-PointNet | | 0.306 68 | 0.500 71 | 0.405 73 | 0.311 61 | 0.348 60 | 0.589 61 | 0.054 55 | 0.068 73 | 0.126 65 | 0.283 65 | 0.290 63 | 0.028 71 | 0.219 69 | 0.214 70 | 0.331 62 | 0.396 73 | 0.275 69 | 0.821 67 | 0.245 64 |
| SemRegionNet-20cls | | 0.250 70 | 0.333 72 | 0.613 65 | 0.229 67 | 0.163 69 | 0.493 68 | 0.000 68 | 0.304 66 | 0.107 66 | 0.147 72 | 0.100 71 | 0.052 69 | 0.231 67 | 0.119 72 | 0.039 72 | 0.445 71 | 0.325 65 | 0.654 70 | 0.141 71 |
| 3D-BEVIS | | 0.248 71 | 0.667 59 | 0.566 66 | 0.076 74 | 0.035 76 | 0.394 74 | 0.027 60 | 0.035 75 | 0.098 67 | 0.099 74 | 0.030 75 | 0.025 72 | 0.098 72 | 0.375 69 | 0.126 69 | 0.604 65 | 0.181 74 | 0.854 66 | 0.171 69 |
| tmp | | 0.248 71 | 0.667 59 | 0.437 71 | 0.188 69 | 0.153 71 | 0.491 69 | 0.000 68 | 0.208 69 | 0.094 68 | 0.153 71 | 0.099 72 | 0.057 68 | 0.217 70 | 0.119 72 | 0.039 72 | 0.466 70 | 0.302 67 | 0.640 71 | 0.140 72 |
| UNet-backbone | | 0.319 67 | 0.667 59 | 0.715 58 | 0.233 66 | 0.189 68 | 0.479 71 | 0.008 63 | 0.218 68 | 0.067 69 | 0.201 69 | 0.173 68 | 0.107 64 | 0.123 71 | 0.438 67 | 0.150 67 | 0.615 64 | 0.355 64 | 0.916 59 | 0.093 75 |
| 3D-SIS | permissive | 0.382 65 | 1.000 1 | 0.432 72 | 0.245 65 | 0.190 67 | 0.577 63 | 0.013 62 | 0.263 67 | 0.033 70 | 0.320 63 | 0.240 66 | 0.075 66 | 0.422 61 | 0.857 48 | 0.117 70 | 0.699 52 | 0.271 71 | 0.883 63 | 0.235 65 |
| Sem_Recon_ins | | 0.227 73 | 0.764 58 | 0.486 70 | 0.069 75 | 0.098 73 | 0.426 73 | 0.017 61 | 0.067 74 | 0.015 71 | 0.172 70 | 0.100 70 | 0.096 65 | 0.054 75 | 0.183 71 | 0.135 68 | 0.366 74 | 0.260 73 | 0.614 72 | 0.168 70 |
| Hier3D | copyleft | 0.323 66 | 0.667 59 | 0.542 68 | 0.264 64 | 0.157 70 | 0.550 65 | 0.000 68 | 0.205 70 | 0.009 72 | 0.270 66 | 0.218 67 | 0.075 66 | 0.500 53 | 0.688 62 | 0.007 76 | 0.698 53 | 0.301 68 | 0.459 73 | 0.200 67 |
| ASIS | | 0.199 74 | 0.333 72 | 0.253 75 | 0.167 72 | 0.140 72 | 0.438 72 | 0.000 68 | 0.177 71 | 0.008 73 | 0.121 73 | 0.069 73 | 0.004 74 | 0.231 68 | 0.429 68 | 0.036 74 | 0.445 72 | 0.273 70 | 0.333 75 | 0.119 74 |
| Region-18class | | 0.284 69 | 0.250 75 | 0.751 54 | 0.228 68 | 0.270 64 | 0.521 67 | 0.000 68 | 0.468 62 | 0.008 74 | 0.205 68 | 0.127 69 | 0.000 75 | 0.068 73 | 0.070 74 | 0.262 66 | 0.652 60 | 0.323 66 | 0.740 69 | 0.173 68 |
| Sgpn_scannet | | 0.143 75 | 0.208 76 | 0.390 74 | 0.169 71 | 0.065 74 | 0.275 75 | 0.029 59 | 0.069 72 | 0.000 75 | 0.087 75 | 0.043 74 | 0.014 73 | 0.027 76 | 0.000 75 | 0.112 71 | 0.351 75 | 0.168 75 | 0.438 74 | 0.138 73 |
| MaskRCNN 2d->3d Proj | | 0.058 76 | 0.333 72 | 0.002 76 | 0.000 76 | 0.053 75 | 0.002 76 | 0.002 67 | 0.021 76 | 0.000 75 | 0.045 76 | 0.024 76 | 0.238 57 | 0.065 74 | 0.000 75 | 0.014 75 | 0.107 76 | 0.020 76 | 0.110 76 | 0.006 76 |

Publications listed for the methods above:

[TD3D] Maksim Kolodiazhnyi, Anna Vorontsova, Anton Konushin, Danila Rukhovich: Top-Down Beats Bottom-Up in 3D Instance Segmentation. WACV 2024
[Mask3D] Jonas Schult, Francis Engelmann, Alexander Hermans, Or Litany, Siyu Tang, Bastian Leibe: Mask3D for 3D Semantic Instance Segmentation. ICRA 2023
[InsSSM] Lei Yao, Yi Wang, Moyun Liu, Lap-Pui Chau: SGIFormer: Semantic-guided and Geometric-enhanced Interleaving Transformer for 3D Instance Segmentation. TCSVT 2024
[TST3D] Duc Tran Dang Trung, Byeongkeun Kang, Yeejin Lee: MSTA3D: Multi-scale Twin-attention for 3D Instance Segmentation. ACM Multimedia 2024
[PBNet] Weiguang Zhao, Yuyao Yan, Chaolong Yang, Jianan Ye, Xi Yang, Kaizhu Huang: Divide and Conquer: 3D Instance Segmentation With Point-Wise Binarization. ICCV 2023
[SPFormer] Sun Jiahao, Qing Chunmei, Tan Junpeng, Xu Xiangmin: Superpoint Transformer for 3D Scene Instance Segmentation. AAAI 2023 [Oral]
[PointRel] Relation3D (PointRel): Enhancing Relation Modeling for Point Cloud Instance Segmentation.
[OneFormer3D] Maxim Kolodiazhnyi, Anna Vorontsova, Anton Konushin, Danila Rukhovich: OneFormer3D: One Transformer for Unified Point Cloud Segmentation.
[ISBNet] Tuan Duc Ngo, Binh-Son Hua, Khoi Nguyen: ISBNet: a 3D Point Cloud Instance Segmentation Network with Instance-aware Sampling and Box-aware Dynamic Convolution. CVPR 2023
[SSEN] Dongsu Zhang, Junha Chun, Sang Kyun Cha, Young Min Kim: Spatial Semantic Embedding Network: Fast 3D Instance Segmentation with Deep Metric Learning. arXiv
[SoftGroup] Thang Vu, Kookhoi Kim, Tung M. Luu, Xuan Thanh Nguyen, Chang D. Yoo: SoftGroup for 3D Instance Segmentation on Point Clouds. CVPR 2022 [Oral]
[Mask-Group] Min Zhong, Xinghao Chen, Xiaokang Chen, Gang Zeng, Yunhe Wang: MaskGroup: Hierarchical Point Grouping and Masking for 3D Instance Segmentation. ICME 2022
[DKNet] Yizheng Wu, Min Shi, Shuaiyuan Du, Hao Lu, Zhiguo Cao, Weicai Zhong: 3D Instances as 1D Kernels. ECCV 2022
[OccuSeg+instance] Lei Han, Tian Zheng, Lan Xu, Lu Fang: OccuSeg: Occupancy-aware 3D Instance Segmentation. CVPR 2020
[Box2Mask] Julian Chibane, Francis Engelmann, Tuan Anh Tran, Gerard Pons-Moll: Box2Mask: Weakly Supervised 3D Semantic Instance Segmentation Using Bounding Boxes. ECCV 2022
[3D-MPA] Francis Engelmann, Martin Bokeloh, Alireza Fathi, Bastian Leibe, Matthias Nießner: 3D-MPA: Multi Proposal Aggregation for 3D Semantic Instance Segmentation. CVPR 2020
[HAIS] Shaoyu Chen, Jiemin Fang, Qian Zhang, Wenyu Liu, Xinggang Wang: Hierarchical Aggregation for 3D Instance Segmentation. ICCV 2021
[Dyco3D] Tong He, Chunhua Shen, Anton van den Hengel: DyCo3D: Robust Instance Segmentation of 3D Point Clouds through Dynamic Convolution. CVPR 2021
[DD-UNet+Group] H. Liu, R. Liu, K. Yang, J. Zhang, K. Peng, R. Stiefelhagen: HIDA: Towards Holistic Indoor Understanding for the Visually Impaired via Semantic Instance Segmentation with a Wearable Solid-State LiDAR Sensor. ICCVW 2021
[SSTNet] Zhihao Liang, Zhihao Li, Songcen Xu, Mingkui Tan, Kui Jia: Instance Segmentation in 3D Scenes using Semantic Superpoint Tree Networks. ICCV 2021
[NeuralBF] Weiwei Sun, Daniel Rebain, Renjie Liao, Vladimir Tankovich, Soroosh Yazdani, Kwang Moo Yi, Andrea Tagliasacchi: NeuralBF: Neural Bilateral Filtering for Top-down Instance Segmentation on Point Clouds. WACV 2023
[PointGroup] Li Jiang, Hengshuang Zhao, Shaoshuai Shi, Shu Liu, Chi-Wing Fu, Jiaya Jia: PointGroup: Dual-Set Point Grouping for 3D Instance Segmentation. CVPR 2020 [Oral]
[RPGN] Shichao Dong, Guosheng Lin, Tzu-Yi Hung: Learning Regional Purity for Instance Segmentation on 3D Point Clouds. ECCV 2022
[PE] Biao Zhang, Peter Wonka: Point Cloud Instance Segmentation using Probabilistic Embeddings. CVPR 2021
[3D-BoNet] Bo Yang, Jianan Wang, Ronald Clark, Qingyong Hu, Sen Wang, Andrew Markham, Niki Trigoni: Learning Object Bounding Boxes for 3D Instance Segmentation on Point Clouds. NeurIPS 2019 Spotlight
[MASC] Chen Liu, Yasutaka Furukawa: MASC: Multi-scale Affinity with Sparse Convolution for 3D Instance Segmentation.
[SALoss-ResNet] Zhidong Liang, Ming Yang, Hao Li, Chunxiang Wang: 3D Instance Embedding Learning With a Structure-Aware Loss Function for Point Cloud Segmentation. IEEE Robotics and Automation Letters (IROS 2020)
[MTML] Jean Lahoud, Bernard Ghanem, Marc Pollefeys, Martin R. Oswald: 3D Instance Segmentation via Multi-task Metric Learning. ICCV 2019 [Oral]
[PanopticFusion-inst] Gaku Narita, Takashi Seno, Tomoya Ishikawa, Yohsuke Kaji: PanopticFusion: Online Volumetric Semantic Mapping at the Level of Stuff and Things. IROS 2019 (to appear)
[One_Thing_One_Click] Zhengzhe Liu, Xiaojuan Qi, Chi-Wing Fu: One Thing One Click: A Self-Training Approach for Weakly Supervised 3D Semantic Segmentation. CVPR 2021
[SegGroup_ins] An Tao, Yueqi Duan, Yi Wei, Jiwen Lu, Jie Zhou: SegGroup: Seg-Level Supervision for 3D Instance and Semantic Segmentation. TIP 2022
[3D-BEVIS] Cathrin Elich, Francis Engelmann, Jonas Schult, Theodora Kontogianni, Bastian Leibe: 3D-BEVIS: Birds-Eye-View Instance Segmentation.
[3D-SIS] Ji Hou, Angela Dai, Matthias Niessner: 3D-SIS: 3D Semantic Instance Segmentation of RGB-D Scans. CVPR 2019
[Hier3D] Tan: HCFS3D: Hierarchical Coupled Feature Selection Network for 3D Semantic and Instance Segmentation.
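
As a small worked check on how the columns relate, the avg ap 50% value appears to be the unweighted mean of the 18 per-class AP 50% values in the same row. This is an inference from the listed numbers, not something the page states. The sketch below uses the UniPerception row, and the variable names are only illustrative.

```python
# Per-class AP 50% values copied from the UniPerception row, in column order
# (bathtub, bed, bookshelf, ..., toilet, window).
uniperception_ap50 = [
    1.000, 0.930, 0.872, 0.727, 0.862, 0.454, 0.764, 0.820, 0.746,
    0.706, 0.750, 0.772, 0.926, 0.764, 0.818, 0.826, 0.997, 0.660,
]

# Unweighted mean over classes (assumed to be how the avg column is formed).
mean_ap50 = sum(uniperception_ap50) / len(uniperception_ap50)
print(len(uniperception_ap50), f"{mean_ap50:.3f}")  # 18 0.800
```

The result matches the 0.800 reported in the avg ap 50% column for that row.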