The 3D semantic instance prediction task involves detecting and segmenting the objects in a 3D scan mesh.

Evaluation and metrics

Our evaluation ranks all methods according to the average precision for each class. We report the mean average precision (AP) at an overlap threshold of 0.25 (AP 25%), at an overlap threshold of 0.5 (AP 50%), and averaged over overlap thresholds in the range [0.5:0.95:0.05], that is, from 0.5 to 0.95 in steps of 0.05 (AP). Note that multiple predictions of the same ground truth instance are penalized as false positives.
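The metric described above can be illustrated with a minimal sketch. It is not the benchmark's official evaluation script: the function names `iou`, `average_precision`, and `mask` are hypothetical, instances are assumed to be boolean masks over the same mesh vertices, overlap is assumed to be intersection over union with a strict comparison against the threshold, and AP is computed in its simple non-interpolated form. What it does show is how a second prediction of an already matched ground truth instance ends up counted as a false positive.

```python
import numpy as np


def iou(a, b):
    """Intersection over union of two boolean vertex masks."""
    union = np.logical_or(a, b).sum()
    return np.logical_and(a, b).sum() / union if union else 0.0


def average_precision(pred_masks, scores, gt_masks, thr):
    """AP for one class at one overlap threshold (simplified, non-interpolated)."""
    if len(gt_masks) == 0:
        return float("nan")  # undefined without ground truth instances
    order = np.argsort(-np.asarray(scores, dtype=float))  # most confident first
    matched = [False] * len(gt_masks)
    tp = np.zeros(len(order))
    for rank, p in enumerate(order):
        overlaps = [iou(pred_masks[p], g) for g in gt_masks]
        best = int(np.argmax(overlaps))
        # Assumption: a match needs overlap strictly above the threshold.
        if overlaps[best] > thr and not matched[best]:
            matched[best] = True
            tp[rank] = 1.0
        # Otherwise the prediction is a false positive; this includes a second
        # prediction of an instance that is already matched.
    precision = np.cumsum(tp) / np.arange(1, len(order) + 1)
    return float((precision * tp).sum() / len(gt_masks))


def mask(n, idx):
    """Boolean mask of length n that is True at the given vertex indices."""
    m = np.zeros(n, dtype=bool)
    m[list(idx)] = True
    return m


# Toy scene with 10 vertices and two ground truth instances of one class.
gt = [mask(10, range(0, 4)), mask(10, range(5, 10))]
# The second prediction is a lower-scored duplicate of the first instance.
preds = [mask(10, range(0, 4)), mask(10, range(0, 3)), mask(10, range(5, 10))]
scores = [0.9, 0.8, 0.7]

ap50 = average_precision(preds, scores, gt, 0.5)
# Thresholds 0.5, 0.55, ..., 0.95, i.e. the range [0.5:0.95:0.05].
ap = float(np.mean([average_precision(preds, scores, gt, t)
                    for t in np.linspace(0.5, 0.95, 10)]))
print(round(ap50, 3))  # 0.833
```

In this toy scene both instances are found, yet the duplicate lowers the precision at the point where the second instance is recovered, so AP 50% drops from 1.0 to about 0.833. A per-class leaderboard value would additionally average such scores over all scenes, and the avg column over all classes.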



This table lists the benchmark results for the 3D semantic instance scenario.




Each cell gives the score followed by the method's rank in that column. Rows are sorted by the door column.

Method (Info) | avg AP 50% | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | window
MAFT | 0.786 14 | 1.000 1 | 0.894 19 | 0.807 23 | 0.694 17 | 0.893 15 | 0.486 15 | 0.674 31 | 0.740 13 | 0.786 1 | 0.704 11 | 0.727 10 | 0.739 14 | 1.000 1 | 0.707 24 | 0.849 13 | 0.756 24 | 1.000 1 | 0.685 1
OneFormer3D (copyleft) | 0.801 6 | 1.000 1 | 0.973 6 | 0.909 5 | 0.698 15 | 0.928 4 | 0.582 4 | 0.668 34 | 0.685 19 | 0.780 2 | 0.687 14 | 0.698 19 | 0.702 15 | 1.000 1 | 0.794 12 | 0.900 2 | 0.784 13 | 0.986 52 | 0.635 8
Maxim Kolodiazhnyi, Anna Vorontsova, Anton Konushin, Danila Rukhovich: OneFormer3D: One Transformer for Unified Point Cloud Segmentation.
Competitor-MAFT | 0.816 1 | 1.000 1 | 0.983 3 | 0.872 10 | 0.718 5 | 0.941 1 | 0.588 3 | 0.652 38 | 0.819 2 | 0.776 3 | 0.720 5 | 0.780 4 | 0.769 12 | 1.000 1 | 0.797 10 | 0.813 27 | 0.798 7 | 1.000 1 | 0.659 3
InsSSM | 0.799 9 | 1.000 1 | 0.915 14 | 0.710 41 | 0.729 3 | 0.925 5 | 0.664 1 | 0.670 32 | 0.770 7 | 0.766 4 | 0.739 3 | 0.737 7 | 0.700 16 | 1.000 1 | 0.792 13 | 0.829 20 | 0.815 4 | 0.997 39 | 0.625 10
Lei Yao, Yi Wang, Moyun Liu, Lap-Pui Chau: SGIFormer: Semantic-guided and Geometric-enhanced Interleaving Transformer for 3D Instance Segmentation. TCSVT, 2024
MG-Former | 0.791 11 | 1.000 1 | 0.980 5 | 0.837 19 | 0.626 26 | 0.897 12 | 0.543 12 | 0.759 15 | 0.800 6 | 0.766 5 | 0.659 17 | 0.769 5 | 0.697 19 | 1.000 1 | 0.791 14 | 0.707 48 | 0.791 10 | 1.000 1 | 0.610 16
ExtMask3D | 0.789 12 | 1.000 1 | 0.988 1 | 0.756 34 | 0.706 12 | 0.912 11 | 0.429 20 | 0.647 40 | 0.806 5 | 0.755 6 | 0.673 15 | 0.689 20 | 0.772 11 | 1.000 1 | 0.789 15 | 0.852 11 | 0.811 5 | 1.000 1 | 0.617 13
UniPerception | 0.800 7 | 1.000 1 | 0.930 12 | 0.872 10 | 0.727 4 | 0.862 24 | 0.454 19 | 0.764 13 | 0.820 1 | 0.746 7 | 0.706 10 | 0.750 6 | 0.772 10 | 0.926 45 | 0.764 18 | 0.818 26 | 0.826 1 | 0.997 39 | 0.660 2
TST3D | 0.795 10 | 1.000 1 | 0.929 13 | 0.918 4 | 0.709 10 | 0.884 19 | 0.596 2 | 0.704 25 | 0.769 8 | 0.734 8 | 0.644 21 | 0.699 17 | 0.751 13 | 1.000 1 | 0.794 11 | 0.876 5 | 0.757 23 | 0.997 39 | 0.550 32
Duc Tran Dang Trung, Byeongkeun Kang, Yeejin Lee: MSTA3D: Multi-scale Twin-attention for 3D Instance Segmentation. ACM Multimedia 2024
PointRel | 0.816 1 | 1.000 1 | 0.971 8 | 0.908 6 | 0.743 2 | 0.923 7 | 0.573 7 | 0.714 22 | 0.695 18 | 0.734 9 | 0.747 2 | 0.725 11 | 0.809 1 | 1.000 1 | 0.814 8 | 0.899 3 | 0.820 3 | 1.000 1 | 0.610 17
Relation3D (PointRel): Enhancing Relation Modeling for Point Cloud Instance Segmentation.
Competitor-SPFormer | 0.800 7 | 1.000 1 | 0.986 2 | 0.845 16 | 0.705 13 | 0.915 10 | 0.532 13 | 0.733 20 | 0.757 10 | 0.733 10 | 0.708 9 | 0.698 18 | 0.648 35 | 0.981 38 | 0.890 1 | 0.830 18 | 0.796 8 | 0.997 39 | 0.644 4
SIM3D | 0.803 5 | 1.000 1 | 0.967 10 | 0.863 13 | 0.692 18 | 0.924 6 | 0.552 11 | 0.732 21 | 0.667 22 | 0.732 11 | 0.662 16 | 0.796 1 | 0.789 9 | 1.000 1 | 0.803 9 | 0.864 6 | 0.766 20 | 1.000 1 | 0.643 5
EV3D | 0.811 4 | 1.000 1 | 0.968 9 | 0.852 14 | 0.717 7 | 0.921 8 | 0.574 6 | 0.677 29 | 0.748 11 | 0.730 12 | 0.703 12 | 0.795 2 | 0.809 1 | 1.000 1 | 0.831 4 | 0.854 9 | 0.778 15 | 1.000 1 | 0.638 7
Spherical Mask(CtoF) | 0.812 3 | 1.000 1 | 0.973 7 | 0.852 14 | 0.718 6 | 0.917 9 | 0.574 5 | 0.677 29 | 0.748 11 | 0.729 13 | 0.715 7 | 0.795 2 | 0.809 1 | 1.000 1 | 0.831 4 | 0.854 9 | 0.787 11 | 1.000 1 | 0.638 6
KmaxOneFormerNet (permissive) | 0.783 15 | 0.903 55 | 0.981 4 | 0.794 27 | 0.706 11 | 0.931 3 | 0.561 9 | 0.701 26 | 0.706 16 | 0.727 14 | 0.697 13 | 0.731 9 | 0.689 22 | 1.000 1 | 0.856 2 | 0.750 39 | 0.761 22 | 1.000 1 | 0.599 21
SPFormer (permissive) | 0.770 17 | 0.903 55 | 0.903 16 | 0.806 24 | 0.609 32 | 0.886 16 | 0.568 8 | 0.815 6 | 0.705 17 | 0.711 15 | 0.655 18 | 0.652 24 | 0.685 23 | 1.000 1 | 0.789 16 | 0.809 28 | 0.776 17 | 1.000 1 | 0.583 25
Sun Jiahao, Qing Chunmei, Tan Junpeng, Xu Xiangmin: Superpoint Transformer for 3D Scene Instance Segmentation. AAAI 2023 [Oral]
ISBNet (permissive) | 0.757 20 | 1.000 1 | 0.904 15 | 0.731 37 | 0.678 20 | 0.895 13 | 0.458 17 | 0.644 42 | 0.670 21 | 0.710 16 | 0.620 26 | 0.732 8 | 0.650 25 | 1.000 1 | 0.756 19 | 0.778 31 | 0.779 14 | 1.000 1 | 0.614 14
Tuan Duc Ngo, Binh-Son Hua, Khoi Nguyen: ISBNet: a 3D Point Cloud Instance Segmentation Network with Instance-aware Sampling and Box-aware Dynamic Convolution. CVPR 2023
DKNet | 0.718 26 | 1.000 1 | 0.814 29 | 0.782 28 | 0.619 29 | 0.872 20 | 0.224 33 | 0.751 17 | 0.569 32 | 0.677 17 | 0.585 32 | 0.724 12 | 0.633 37 | 0.981 38 | 0.515 47 | 0.819 24 | 0.736 27 | 1.000 1 | 0.617 12
Yizheng Wu, Min Shi, Shuaiyuan Du, Hao Lu, Zhiguo Cao, Weicai Zhong: 3D Instances as 1D Kernels. ECCV 2022
Mask3D | 0.780 16 | 1.000 1 | 0.786 43 | 0.716 39 | 0.696 16 | 0.885 18 | 0.500 14 | 0.714 22 | 0.810 4 | 0.672 18 | 0.715 7 | 0.679 21 | 0.809 1 | 1.000 1 | 0.831 4 | 0.833 17 | 0.787 11 | 1.000 1 | 0.602 19
Jonas Schult, Francis Engelmann, Alexander Hermans, Or Litany, Siyu Tang, Bastian Leibe: Mask3D for 3D Semantic Instance Segmentation. ICRA 2023
Queryformer | 0.787 13 | 1.000 1 | 0.933 11 | 0.601 50 | 0.754 1 | 0.886 17 | 0.558 10 | 0.661 36 | 0.767 9 | 0.665 19 | 0.716 6 | 0.639 25 | 0.808 5 | 1.000 1 | 0.844 3 | 0.897 4 | 0.804 6 | 1.000 1 | 0.624 11
SoftGroup (permissive) | 0.761 19 | 1.000 1 | 0.808 32 | 0.845 16 | 0.716 8 | 0.862 23 | 0.243 32 | 0.824 4 | 0.655 25 | 0.620 20 | 0.734 4 | 0.699 16 | 0.791 8 | 0.981 38 | 0.716 22 | 0.844 14 | 0.769 18 | 1.000 1 | 0.594 23
Thang Vu, Kookhoi Kim, Tung M. Luu, Xuan Thanh Nguyen, Chang D. Yoo: SoftGroup for 3D Instance Segmentation on Point Clouds. CVPR 2022 [Oral]
PBNet (permissive) | 0.747 22 | 1.000 1 | 0.818 28 | 0.837 20 | 0.713 9 | 0.844 26 | 0.457 18 | 0.647 40 | 0.711 15 | 0.614 21 | 0.617 28 | 0.657 23 | 0.650 25 | 1.000 1 | 0.692 25 | 0.822 22 | 0.765 21 | 1.000 1 | 0.595 22
W. Zhao, Y. Yan, C. Yang, J. Ye, X. Yang, K. Huang: Divide and Conquer: 3D Instance Segmentation With Point-Wise Binarization. ICCV 2023
DANCENET | 0.680 31 | 1.000 1 | 0.807 33 | 0.733 36 | 0.600 33 | 0.768 45 | 0.375 24 | 0.543 52 | 0.538 33 | 0.610 22 | 0.599 29 | 0.498 35 | 0.632 39 | 0.981 38 | 0.739 21 | 0.856 8 | 0.633 40 | 0.882 63 | 0.454 47
TD3D (permissive) | 0.751 21 | 1.000 1 | 0.774 44 | 0.867 12 | 0.621 28 | 0.934 2 | 0.404 21 | 0.706 24 | 0.812 3 | 0.605 23 | 0.633 24 | 0.626 26 | 0.690 21 | 1.000 1 | 0.640 33 | 0.820 23 | 0.777 16 | 1.000 1 | 0.612 15
Maksim Kolodiazhnyi, Anna Vorontsova, Anton Konushin, Danila Rukhovich: Top-Down Beats Bottom-Up in 3D Instance Segmentation. WACV 2024
GraphCut | 0.732 23 | 1.000 1 | 0.788 41 | 0.724 38 | 0.642 25 | 0.859 25 | 0.248 31 | 0.787 11 | 0.618 28 | 0.596 24 | 0.653 20 | 0.722 13 | 0.583 47 | 1.000 1 | 0.766 17 | 0.861 7 | 0.825 2 | 1.000 1 | 0.504 38
SoftGroup++ | 0.769 18 | 1.000 1 | 0.803 36 | 0.937 1 | 0.684 19 | 0.865 21 | 0.213 35 | 0.870 2 | 0.664 23 | 0.571 25 | 0.758 1 | 0.702 15 | 0.807 6 | 1.000 1 | 0.653 31 | 0.902 1 | 0.792 9 | 1.000 1 | 0.626 9
OccuSeg+instance | 0.672 34 | 1.000 1 | 0.758 52 | 0.682 43 | 0.576 37 | 0.842 27 | 0.477 16 | 0.504 58 | 0.524 34 | 0.567 26 | 0.585 34 | 0.451 39 | 0.557 49 | 1.000 1 | 0.751 20 | 0.797 29 | 0.563 50 | 1.000 1 | 0.467 46
Lei Han, Tian Zheng, Lan Xu, Lu Fang: OccuSeg: Occupancy-aware 3D Instance Segmentation. CVPR 2020
Box2Mask | 0.677 33 | 1.000 1 | 0.847 23 | 0.771 30 | 0.509 48 | 0.816 32 | 0.277 28 | 0.558 51 | 0.482 35 | 0.562 27 | 0.640 22 | 0.448 40 | 0.700 16 | 1.000 1 | 0.666 26 | 0.852 12 | 0.578 47 | 0.997 39 | 0.488 42
Julian Chibane, Francis Engelmann, Tuan Anh Tran, Gerard Pons-Moll: Box2Mask: Weakly Supervised 3D Semantic Instance Segmentation Using Bounding Boxes. ECCV 2022
SSTNet (permissive) | 0.698 29 | 1.000 1 | 0.697 60 | 0.888 8 | 0.556 39 | 0.803 37 | 0.387 22 | 0.626 44 | 0.417 43 | 0.556 28 | 0.585 33 | 0.702 14 | 0.600 41 | 1.000 1 | 0.824 7 | 0.720 47 | 0.692 29 | 1.000 1 | 0.509 37
Zhihao Liang, Zhihao Li, Songcen Xu, Mingkui Tan, Kui Jia: Instance Segmentation in 3D Scenes using Semantic Superpoint Tree Networks. ICCV 2021
TopoSeg | 0.725 25 | 1.000 1 | 0.806 35 | 0.933 2 | 0.668 22 | 0.758 46 | 0.272 29 | 0.734 19 | 0.630 27 | 0.549 29 | 0.654 19 | 0.606 27 | 0.697 20 | 0.966 42 | 0.612 37 | 0.839 15 | 0.754 25 | 1.000 1 | 0.573 26
SphereSeg | 0.680 31 | 1.000 1 | 0.856 20 | 0.744 35 | 0.618 30 | 0.893 14 | 0.151 38 | 0.651 39 | 0.713 14 | 0.537 30 | 0.579 35 | 0.430 44 | 0.651 24 | 1.000 1 | 0.389 58 | 0.744 42 | 0.697 28 | 0.991 50 | 0.601 20
SSEC | 0.707 27 | 1.000 1 | 0.850 21 | 0.924 3 | 0.648 23 | 0.747 49 | 0.162 37 | 0.862 3 | 0.572 31 | 0.520 31 | 0.624 25 | 0.549 30 | 0.649 34 | 1.000 1 | 0.560 42 | 0.706 49 | 0.768 19 | 1.000 1 | 0.591 24
HAIS (permissive) | 0.699 28 | 1.000 1 | 0.849 22 | 0.820 21 | 0.675 21 | 0.808 36 | 0.279 27 | 0.757 16 | 0.465 38 | 0.517 32 | 0.596 30 | 0.559 29 | 0.600 41 | 1.000 1 | 0.654 30 | 0.767 34 | 0.676 31 | 0.994 48 | 0.560 30
Shaoyu Chen, Jiemin Fang, Qian Zhang, Wenyu Liu, Xinggang Wang: Hierarchical Aggregation for 3D Instance Segmentation. ICCV 2021
IPCA-Inst | 0.731 24 | 1.000 1 | 0.788 42 | 0.884 9 | 0.698 14 | 0.788 42 | 0.252 30 | 0.760 14 | 0.646 26 | 0.511 33 | 0.637 23 | 0.665 22 | 0.804 7 | 1.000 1 | 0.644 32 | 0.778 32 | 0.747 26 | 1.000 1 | 0.561 29
Mask3D_evaluation | 0.631 44 | 1.000 1 | 0.829 26 | 0.606 49 | 0.646 24 | 0.836 28 | 0.068 51 | 0.511 56 | 0.462 39 | 0.507 34 | 0.619 27 | 0.389 49 | 0.610 40 | 1.000 1 | 0.432 53 | 0.828 21 | 0.673 32 | 0.788 67 | 0.552 31
OSIS | 0.605 47 | 1.000 1 | 0.801 37 | 0.599 51 | 0.535 42 | 0.728 51 | 0.286 26 | 0.436 62 | 0.679 20 | 0.491 35 | 0.433 50 | 0.256 55 | 0.404 62 | 0.857 47 | 0.620 36 | 0.724 45 | 0.510 60 | 1.000 1 | 0.539 33
PCJC | 0.578 49 | 1.000 1 | 0.810 31 | 0.583 55 | 0.449 55 | 0.813 35 | 0.042 56 | 0.603 47 | 0.341 51 | 0.490 36 | 0.465 47 | 0.410 46 | 0.650 25 | 0.835 55 | 0.264 64 | 0.694 53 | 0.561 51 | 0.889 60 | 0.504 39
DualGroup | 0.694 30 | 1.000 1 | 0.799 38 | 0.811 22 | 0.622 27 | 0.817 31 | 0.376 23 | 0.805 9 | 0.590 30 | 0.487 37 | 0.568 36 | 0.525 34 | 0.650 25 | 0.835 55 | 0.600 38 | 0.829 19 | 0.655 34 | 1.000 1 | 0.526 34
DD-UNet+Group | 0.635 43 | 0.667 58 | 0.797 40 | 0.714 40 | 0.562 38 | 0.774 44 | 0.146 39 | 0.810 8 | 0.429 42 | 0.476 38 | 0.546 41 | 0.399 47 | 0.633 37 | 1.000 1 | 0.632 34 | 0.722 46 | 0.609 41 | 1.000 1 | 0.514 35
H. Liu, R. Liu, K. Yang, J. Zhang, K. Peng, R. Stiefelhagen: HIDA: Towards Holistic Indoor Understanding for the Visually Impaired via Semantic Instance Segmentation with a Wearable Solid-State LiDAR Sensor. ICCVW 2021
INS-Conv-instance | 0.657 36 | 1.000 1 | 0.760 50 | 0.667 45 | 0.581 35 | 0.863 22 | 0.323 25 | 0.655 37 | 0.477 36 | 0.473 39 | 0.549 39 | 0.432 43 | 0.650 25 | 1.000 1 | 0.655 29 | 0.738 43 | 0.585 46 | 0.944 55 | 0.472 45
SSEN | 0.575 50 | 1.000 1 | 0.761 48 | 0.473 58 | 0.477 53 | 0.795 40 | 0.066 52 | 0.529 54 | 0.658 24 | 0.460 40 | 0.461 48 | 0.380 50 | 0.331 64 | 0.859 46 | 0.401 57 | 0.692 55 | 0.653 35 | 1.000 1 | 0.348 56
Dongsu Zhang, Junha Chun, Sang Kyun Cha, Young Min Kim: Spatial Semantic Embedding Network: Fast 3D Instance Segmentation with Deep Metric Learning. arXiv
Mask-Group | 0.664 35 | 1.000 1 | 0.822 27 | 0.764 33 | 0.616 31 | 0.815 33 | 0.139 42 | 0.694 28 | 0.597 29 | 0.459 41 | 0.566 37 | 0.599 28 | 0.600 41 | 0.516 65 | 0.715 23 | 0.819 25 | 0.635 38 | 1.000 1 | 0.603 18
Min Zhong, Xinghao Chen, Xiaokang Chen, Gang Zeng, Yunhe Wang: MaskGroup: Hierarchical Point Grouping and Masking for 3D Instance Segmentation. ICME 2022
GICN | 0.638 41 | 1.000 1 | 0.895 18 | 0.800 25 | 0.480 52 | 0.676 54 | 0.144 40 | 0.737 18 | 0.354 49 | 0.447 42 | 0.400 56 | 0.365 51 | 0.700 16 | 1.000 1 | 0.569 40 | 0.836 16 | 0.599 42 | 1.000 1 | 0.473 44
PointGroup | 0.636 42 | 1.000 1 | 0.765 47 | 0.624 47 | 0.505 50 | 0.797 39 | 0.116 46 | 0.696 27 | 0.384 47 | 0.441 43 | 0.559 38 | 0.476 37 | 0.596 44 | 1.000 1 | 0.666 26 | 0.756 38 | 0.556 54 | 0.997 39 | 0.513 36
Li Jiang, Hengshuang Zhao, Shaoshuai Shi, Shu Liu, Chi-Wing Fu, Jiaya Jia: PointGroup: Dual-Set Point Grouping for 3D Instance Segmentation. CVPR 2020 [Oral]
NeuralBF | 0.555 52 | 0.667 58 | 0.896 17 | 0.843 18 | 0.517 47 | 0.751 48 | 0.029 57 | 0.519 55 | 0.414 44 | 0.439 44 | 0.465 46 | 0.000 74 | 0.484 53 | 0.857 47 | 0.287 62 | 0.693 54 | 0.651 37 | 1.000 1 | 0.485 43
Weiwei Sun, Daniel Rebain, Renjie Liao, Vladimir Tankovich, Soroosh Yazdani, Kwang Moo Yi, Andrea Tagliasacchi: NeuralBF: Neural Bilateral Filtering for Top-down Instance Segmentation on Point Clouds. WACV 2023
Dyco3D (copyleft) | 0.641 40 | 1.000 1 | 0.841 24 | 0.893 7 | 0.531 43 | 0.802 38 | 0.115 47 | 0.588 49 | 0.448 40 | 0.438 45 | 0.537 42 | 0.430 45 | 0.550 50 | 0.857 47 | 0.534 45 | 0.764 36 | 0.657 33 | 0.987 51 | 0.568 27
Tong He, Chunhua Shen, Anton van den Hengel: DyCo3D: Robust Instance Segmentation of 3D Point Clouds through Dynamic Convolution. CVPR 2021
3D-MPA | 0.611 46 | 1.000 1 | 0.833 25 | 0.765 32 | 0.526 45 | 0.756 47 | 0.136 44 | 0.588 49 | 0.470 37 | 0.438 46 | 0.432 52 | 0.358 53 | 0.650 25 | 0.857 47 | 0.429 54 | 0.765 35 | 0.557 53 | 1.000 1 | 0.430 49
Francis Engelmann, Martin Bokeloh, Alireza Fathi, Bastian Leibe, Matthias Nießner: 3D-MPA: Multi Proposal Aggregation for 3D Semantic Instance Segmentation. CVPR 2020
RPGN | 0.643 39 | 1.000 1 | 0.758 51 | 0.582 56 | 0.539 40 | 0.826 30 | 0.046 55 | 0.765 12 | 0.372 48 | 0.436 47 | 0.588 31 | 0.539 33 | 0.650 25 | 1.000 1 | 0.577 39 | 0.750 40 | 0.653 36 | 0.997 39 | 0.495 41
Shichao Dong, Guosheng Lin, Tzu-Yi Hung: Learning Regional Purity for Instance Segmentation on 3D Point Clouds. ECCV 2022
PE | 0.645 38 | 1.000 1 | 0.773 46 | 0.798 26 | 0.538 41 | 0.786 43 | 0.088 50 | 0.799 10 | 0.350 50 | 0.435 48 | 0.547 40 | 0.545 31 | 0.646 36 | 0.933 44 | 0.562 41 | 0.761 37 | 0.556 55 | 0.997 39 | 0.501 40
Biao Zhang, Peter Wonka: Point Cloud Instance Segmentation using Probabilistic Embeddings. CVPR 2021
ClickSeg_Instance | 0.539 54 | 1.000 1 | 0.621 63 | 0.300 61 | 0.530 44 | 0.698 52 | 0.127 45 | 0.533 53 | 0.222 57 | 0.430 49 | 0.400 55 | 0.365 51 | 0.574 48 | 0.938 43 | 0.472 50 | 0.659 58 | 0.543 56 | 0.944 55 | 0.347 57
CSC-Pretrained | 0.648 37 | 1.000 1 | 0.810 30 | 0.768 31 | 0.523 46 | 0.813 34 | 0.143 41 | 0.819 5 | 0.389 46 | 0.422 50 | 0.511 43 | 0.443 41 | 0.650 25 | 1.000 1 | 0.624 35 | 0.732 44 | 0.634 39 | 1.000 1 | 0.375 54
MTML | 0.549 53 | 1.000 1 | 0.807 34 | 0.588 54 | 0.327 60 | 0.647 56 | 0.004 64 | 0.815 7 | 0.180 59 | 0.418 51 | 0.364 60 | 0.182 58 | 0.445 56 | 1.000 1 | 0.442 52 | 0.688 56 | 0.571 49 | 1.000 1 | 0.396 52
Jean Lahoud, Bernard Ghanem, Marc Pollefeys, Martin R. Oswald: 3D Instance Segmentation via Multi-task Metric Learning. ICCV 2019 [Oral]
SALoss-ResNet | 0.459 61 | 1.000 1 | 0.737 54 | 0.159 72 | 0.259 64 | 0.587 61 | 0.138 43 | 0.475 60 | 0.217 58 | 0.416 52 | 0.408 54 | 0.128 61 | 0.315 65 | 0.714 58 | 0.411 55 | 0.536 67 | 0.590 45 | 0.873 64 | 0.304 59
Zhidong Liang, Ming Yang, Hao Li, Chunxiang Wang: 3D Instance Embedding Learning With a Structure-Aware Loss Function for Point Cloud Segmentation. IEEE Robotics and Automation Letters (IROS 2020)
Sparse R-CNN | 0.515 56 | 1.000 1 | 0.538 68 | 0.282 62 | 0.468 54 | 0.790 41 | 0.173 36 | 0.345 64 | 0.429 41 | 0.413 53 | 0.484 45 | 0.176 59 | 0.595 46 | 0.591 63 | 0.522 46 | 0.668 57 | 0.476 61 | 0.986 53 | 0.327 58
AOIA | 0.601 48 | 1.000 1 | 0.761 49 | 0.687 42 | 0.485 51 | 0.828 29 | 0.008 62 | 0.663 35 | 0.405 45 | 0.405 54 | 0.425 53 | 0.490 36 | 0.596 44 | 0.714 58 | 0.553 44 | 0.779 30 | 0.597 43 | 0.992 49 | 0.424 51
DENet | 0.629 45 | 1.000 1 | 0.797 39 | 0.608 48 | 0.589 34 | 0.627 58 | 0.219 34 | 0.882 1 | 0.310 52 | 0.402 55 | 0.383 58 | 0.396 48 | 0.650 25 | 1.000 1 | 0.663 28 | 0.543 66 | 0.691 30 | 1.000 1 | 0.568 28
RWSeg | 0.567 51 | 0.528 68 | 0.708 59 | 0.626 46 | 0.580 36 | 0.745 50 | 0.063 53 | 0.627 43 | 0.240 56 | 0.400 56 | 0.497 44 | 0.464 38 | 0.515 51 | 1.000 1 | 0.475 49 | 0.745 41 | 0.571 48 | 1.000 1 | 0.429 50
One_Thing_One_Click (permissive) | 0.529 55 | 0.667 58 | 0.718 55 | 0.777 29 | 0.399 56 | 0.683 53 | 0.000 67 | 0.669 33 | 0.138 62 | 0.391 57 | 0.374 59 | 0.539 32 | 0.360 63 | 0.641 62 | 0.556 43 | 0.774 33 | 0.593 44 | 0.997 39 | 0.251 62
Zhengzhe Liu, Xiaojuan Qi, Chi-Wing Fu: One Thing One Click: A Self-Training Approach for Weakly Supervised 3D Semantic Segmentation. CVPR 2021
SegGroup_ins (permissive) | 0.445 63 | 0.667 58 | 0.773 45 | 0.185 69 | 0.317 61 | 0.656 55 | 0.000 67 | 0.407 63 | 0.134 63 | 0.381 58 | 0.267 63 | 0.217 57 | 0.476 54 | 0.714 58 | 0.452 51 | 0.629 61 | 0.514 58 | 1.000 1 | 0.222 65
An Tao, Yueqi Duan, Yi Wei, Jiwen Lu, Jie Zhou: SegGroup: Seg-Level Supervision for 3D Instance and Semantic Segmentation. TIP 2022
MASC (permissive) | 0.447 62 | 0.528 68 | 0.555 66 | 0.381 59 | 0.382 57 | 0.633 57 | 0.002 65 | 0.509 57 | 0.260 54 | 0.361 59 | 0.432 51 | 0.327 54 | 0.451 55 | 0.571 64 | 0.367 60 | 0.639 60 | 0.386 62 | 0.980 54 | 0.276 61
Chen Liu, Yasutaka Furukawa: MASC: Multi-scale Affinity with Sparse Convolution for 3D Instance Segmentation.
Occipital-SCS | 0.512 57 | 1.000 1 | 0.716 56 | 0.509 57 | 0.506 49 | 0.611 59 | 0.092 49 | 0.602 48 | 0.177 60 | 0.346 60 | 0.383 57 | 0.165 60 | 0.442 57 | 0.850 54 | 0.386 59 | 0.618 62 | 0.543 57 | 0.889 60 | 0.389 53
3D-BoNet | 0.488 58 | 1.000 1 | 0.672 62 | 0.590 53 | 0.301 62 | 0.484 69 | 0.098 48 | 0.620 45 | 0.306 53 | 0.341 61 | 0.259 64 | 0.125 62 | 0.434 59 | 0.796 57 | 0.402 56 | 0.499 68 | 0.513 59 | 0.909 59 | 0.439 48
Bo Yang, Jianan Wang, Ronald Clark, Qingyong Hu, Sen Wang, Andrew Markham, Niki Trigoni: Learning Object Bounding Boxes for 3D Instance Segmentation on Point Clouds. NeurIPS 2019 Spotlight
3D-SIS (permissive) | 0.382 64 | 1.000 1 | 0.432 71 | 0.245 64 | 0.190 66 | 0.577 62 | 0.013 61 | 0.263 66 | 0.033 69 | 0.320 62 | 0.240 65 | 0.075 65 | 0.422 60 | 0.857 47 | 0.117 69 | 0.699 51 | 0.271 70 | 0.883 62 | 0.235 64
Ji Hou, Angela Dai, Matthias Niessner: 3D-SIS: 3D Semantic Instance Segmentation of RGB-D Scans. CVPR 2019
SPG_WSIS | 0.470 60 | 0.667 58 | 0.685 61 | 0.677 44 | 0.372 58 | 0.562 63 | 0.000 67 | 0.482 59 | 0.244 55 | 0.316 63 | 0.298 61 | 0.052 69 | 0.442 58 | 0.857 47 | 0.267 63 | 0.702 50 | 0.559 52 | 1.000 1 | 0.287 60
R-PointNet | 0.306 67 | 0.500 70 | 0.405 72 | 0.311 60 | 0.348 59 | 0.589 60 | 0.054 54 | 0.068 72 | 0.126 64 | 0.283 64 | 0.290 62 | 0.028 70 | 0.219 68 | 0.214 69 | 0.331 61 | 0.396 72 | 0.275 68 | 0.821 66 | 0.245 63
Hier3D (copyleft) | 0.323 65 | 0.667 58 | 0.542 67 | 0.264 63 | 0.157 69 | 0.550 64 | 0.000 67 | 0.205 69 | 0.009 71 | 0.270 65 | 0.218 66 | 0.075 65 | 0.500 52 | 0.688 61 | 0.007 75 | 0.698 52 | 0.301 67 | 0.459 72 | 0.200 66
Tan: HCFS3D: Hierarchical Coupled Feature Selection Network for 3D Semantic and Instance Segmentation.
PanopticFusion-inst | 0.478 59 | 0.667 58 | 0.712 58 | 0.595 52 | 0.259 65 | 0.550 65 | 0.000 67 | 0.613 46 | 0.175 61 | 0.250 66 | 0.434 49 | 0.437 42 | 0.411 61 | 0.857 47 | 0.485 48 | 0.591 65 | 0.267 71 | 0.944 55 | 0.359 55
Gaku Narita, Takashi Seno, Tomoya Ishikawa, Yohsuke Kaji: PanopticFusion: Online Volumetric Semantic Mapping at the Level of Stuff and Things. IROS 2019 (to appear)
Region-18class | 0.284 68 | 0.250 74 | 0.751 53 | 0.228 67 | 0.270 63 | 0.521 66 | 0.000 67 | 0.468 61 | 0.008 73 | 0.205 67 | 0.127 68 | 0.000 74 | 0.068 72 | 0.070 73 | 0.262 65 | 0.652 59 | 0.323 65 | 0.740 68 | 0.173 67
UNet-backbone | 0.319 66 | 0.667 58 | 0.715 57 | 0.233 65 | 0.189 67 | 0.479 70 | 0.008 62 | 0.218 67 | 0.067 68 | 0.201 68 | 0.173 67 | 0.107 63 | 0.123 70 | 0.438 66 | 0.150 66 | 0.615 63 | 0.355 63 | 0.916 58 | 0.093 74
Sem_Recon_ins | 0.227 72 | 0.764 57 | 0.486 69 | 0.069 74 | 0.098 72 | 0.426 72 | 0.017 60 | 0.067 73 | 0.015 70 | 0.172 69 | 0.100 69 | 0.096 64 | 0.054 74 | 0.183 70 | 0.135 67 | 0.366 73 | 0.260 72 | 0.614 71 | 0.168 69
tmp | 0.248 70 | 0.667 58 | 0.437 70 | 0.188 68 | 0.153 70 | 0.491 68 | 0.000 67 | 0.208 68 | 0.094 67 | 0.153 70 | 0.099 71 | 0.057 67 | 0.217 69 | 0.119 71 | 0.039 71 | 0.466 69 | 0.302 66 | 0.640 70 | 0.140 71
SemRegionNet-20cls | 0.250 69 | 0.333 71 | 0.613 64 | 0.229 66 | 0.163 68 | 0.493 67 | 0.000 67 | 0.304 65 | 0.107 65 | 0.147 71 | 0.100 70 | 0.052 68 | 0.231 66 | 0.119 71 | 0.039 71 | 0.445 70 | 0.325 64 | 0.654 69 | 0.141 70
ASIS | 0.199 73 | 0.333 71 | 0.253 74 | 0.167 71 | 0.140 71 | 0.438 71 | 0.000 67 | 0.177 70 | 0.008 72 | 0.121 72 | 0.069 72 | 0.004 73 | 0.231 67 | 0.429 67 | 0.036 73 | 0.445 71 | 0.273 69 | 0.333 74 | 0.119 73
3D-BEVIS | 0.248 70 | 0.667 58 | 0.566 65 | 0.076 73 | 0.035 75 | 0.394 73 | 0.027 59 | 0.035 74 | 0.098 66 | 0.099 73 | 0.030 74 | 0.025 71 | 0.098 71 | 0.375 68 | 0.126 68 | 0.604 64 | 0.181 73 | 0.854 65 | 0.171 68
Cathrin Elich, Francis Engelmann, Jonas Schult, Theodora Kontogianni, Bastian Leibe: 3D-BEVIS: Birds-Eye-View Instance Segmentation.
Sgpn_scannet | 0.143 74 | 0.208 75 | 0.390 73 | 0.169 70 | 0.065 73 | 0.275 74 | 0.029 58 | 0.069 71 | 0.000 74 | 0.087 74 | 0.043 73 | 0.014 72 | 0.027 75 | 0.000 74 | 0.112 70 | 0.351 74 | 0.168 74 | 0.438 73 | 0.138 72
MaskRCNN 2d->3d Proj | 0.058 75 | 0.333 71 | 0.002 75 | 0.000 75 | 0.053 74 | 0.002 75 | 0.002 66 | 0.021 75 | 0.000 74 | 0.045 75 | 0.024 75 | 0.238 56 | 0.065 73 | 0.000 74 | 0.014 74 | 0.107 75 | 0.020 75 | 0.110 75 | 0.006 75