The 3D semantic instance prediction task involves detecting and segmenting the objects in a 3D scan mesh.

Evaluation and metrics

Our evaluation ranks all methods according to the average precision for each class. We report the mean average precision AP at overlap 0.25 (AP 25%), overlap 0.5 (AP 50%), and over overlaps in the range [0.5:0.95:0.05] (AP). Note that multiple predictions of the same ground truth instance are penalized as false positives.
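The matching rule described above can be illustrated with a small sketch. This is not the benchmark's official evaluation script; it is a simplified, Pascal-VOC-style computation under stated assumptions: instance masks are represented as sets of vertex indices, each prediction is assigned to its highest-overlap ground truth instance, and the names `iou`, `average_precision`, `mean_ap`, and `OVERLAPS` are hypothetical. Whether the endpoint 0.95 is included in the overlap range is also an assumption here.

```python
from typing import List, Sequence, Set, Tuple

# Illustrative representation (an assumption): a mask is the set of mesh-vertex
# indices it covers, and a prediction is a (confidence, mask) pair.
Mask = Set[int]
Prediction = Tuple[float, Mask]


def iou(a: Mask, b: Mask) -> float:
    """Intersection over union of two vertex-index sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def average_precision(preds: Sequence[Prediction], gts: Sequence[Mask],
                      overlap: float) -> float:
    """AP for one class at one overlap threshold (simplified sketch)."""
    if not gts:
        raise ValueError("need at least one ground truth instance")
    matched = [False] * len(gts)
    hits: List[int] = []  # 1 = true positive, 0 = false positive
    # Visit predictions from most to least confident.
    for _, mask in sorted(preds, key=lambda p: p[0], reverse=True):
        best = max(range(len(gts)), key=lambda j: iou(mask, gts[j]))
        if iou(mask, gts[best]) >= overlap and not matched[best]:
            matched[best] = True
            hits.append(1)
        else:
            # Too little overlap, or a duplicate of an already matched
            # ground truth instance: counted as a false positive.
            hits.append(0)

    # Precision and recall after each prediction.
    precisions, recalls, tp = [], [], 0
    for k, hit in enumerate(hits, start=1):
        tp += hit
        precisions.append(tp / k)
        recalls.append(tp / len(gts))
    # Make precision monotonically non-increasing, then integrate over recall.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def mean_ap(preds: Sequence[Prediction], gts: Sequence[Mask],
            overlaps: Sequence[float]) -> float:
    """Average the AP over several overlap thresholds."""
    return sum(average_precision(preds, gts, t) for t in overlaps) / len(overlaps)


# Overlaps 0.50, 0.55, ..., 0.95 (inclusion of 0.95 is an assumption).
OVERLAPS = [round(0.5 + 0.05 * i, 2) for i in range(10)]

# Toy data: two ground truth instances, three predictions. The second
# prediction overlaps the first instance again, so it is a duplicate.
gts = [{0, 1, 2, 3}, {10, 11, 12, 13}]
preds = [(0.9, {0, 1, 2, 3}), (0.8, {0, 1, 2}), (0.7, {10, 11, 12, 13})]

print(average_precision(preds, gts, 0.5))
print(mean_ap(preds, gts, OVERLAPS))
```

In this toy case the duplicate prediction lowers the AP at overlap 0.5 from 1.0 to 5/6, which is the penalty the note above refers to. Per-class values computed this way would then be averaged over classes to obtain the reported mean.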



This table lists the benchmark results for the 3D semantic instance scenario.




Each cell gives the AP 50% score followed by the method's rank for that column in parentheses. The Info column carries the tag listed with the submission (permissive or copyleft), where one is given. A line below a row is the publication associated with that method.

Method | Info | avg AP 50% | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | window
Competitor-MAFT | - | 0.816 (1) | 1.000 (1) | 0.983 (3) | 0.872 (10) | 0.718 (5) | 0.941 (1) | 0.588 (4) | 0.652 (39) | 0.819 (2) | 0.776 (3) | 0.720 (5) | 0.780 (5) | 0.769 (12) | 1.000 (1) | 0.797 (11) | 0.813 (29) | 0.798 (8) | 1.000 (1) | 0.659 (4)
PointRel | - | 0.816 (1) | 1.000 (1) | 0.971 (8) | 0.908 (6) | 0.743 (2) | 0.923 (8) | 0.573 (8) | 0.714 (22) | 0.695 (18) | 0.734 (10) | 0.747 (2) | 0.725 (12) | 0.809 (1) | 1.000 (1) | 0.814 (9) | 0.899 (3) | 0.820 (4) | 1.000 (1) | 0.610 (18)
Relation3D (PointRel): Enhancing Relation Modeling for Point Cloud Instance Segmentation.
Spherical Mask(CtoF) | - | 0.812 (3) | 1.000 (1) | 0.973 (7) | 0.852 (14) | 0.718 (6) | 0.917 (10) | 0.574 (6) | 0.677 (30) | 0.748 (11) | 0.729 (14) | 0.715 (8) | 0.795 (2) | 0.809 (1) | 1.000 (1) | 0.831 (4) | 0.854 (9) | 0.787 (12) | 1.000 (1) | 0.638 (7)
EV3D | - | 0.811 (4) | 1.000 (1) | 0.968 (9) | 0.852 (14) | 0.717 (7) | 0.921 (9) | 0.574 (7) | 0.677 (30) | 0.748 (11) | 0.730 (13) | 0.703 (13) | 0.795 (2) | 0.809 (1) | 1.000 (1) | 0.831 (4) | 0.854 (9) | 0.778 (16) | 1.000 (1) | 0.638 (8)
SIM3D | - | 0.803 (5) | 1.000 (1) | 0.967 (10) | 0.863 (13) | 0.692 (19) | 0.924 (7) | 0.552 (12) | 0.732 (21) | 0.667 (23) | 0.732 (12) | 0.662 (17) | 0.796 (1) | 0.789 (9) | 1.000 (1) | 0.803 (10) | 0.864 (6) | 0.766 (21) | 1.000 (1) | 0.643 (6)
OneFormer3D | copyleft | 0.801 (6) | 1.000 (1) | 0.973 (6) | 0.909 (5) | 0.698 (15) | 0.928 (5) | 0.582 (5) | 0.668 (35) | 0.685 (19) | 0.780 (2) | 0.687 (15) | 0.698 (20) | 0.702 (15) | 1.000 (1) | 0.794 (13) | 0.900 (2) | 0.784 (14) | 0.986 (53) | 0.635 (9)
Maxim Kolodiazhnyi, Anna Vorontsova, Anton Konushin, Danila Rukhovich: OneFormer3D: One Transformer for Unified Point Cloud Segmentation.
UniPerception | - | 0.800 (7) | 1.000 (1) | 0.930 (12) | 0.872 (10) | 0.727 (4) | 0.862 (25) | 0.454 (20) | 0.764 (13) | 0.820 (1) | 0.746 (7) | 0.706 (11) | 0.750 (7) | 0.772 (10) | 0.926 (47) | 0.764 (19) | 0.818 (27) | 0.826 (2) | 0.997 (40) | 0.660 (3)
Competitor-SPFormer | - | 0.800 (7) | 1.000 (1) | 0.986 (2) | 0.845 (16) | 0.705 (13) | 0.915 (11) | 0.532 (14) | 0.733 (20) | 0.757 (10) | 0.733 (11) | 0.708 (10) | 0.698 (19) | 0.648 (37) | 0.981 (40) | 0.890 (1) | 0.830 (19) | 0.796 (9) | 0.997 (40) | 0.644 (5)
InsSSM | - | 0.799 (9) | 1.000 (1) | 0.915 (14) | 0.710 (42) | 0.729 (3) | 0.925 (6) | 0.664 (1) | 0.670 (33) | 0.770 (7) | 0.766 (4) | 0.739 (3) | 0.737 (8) | 0.700 (16) | 1.000 (1) | 0.792 (14) | 0.829 (21) | 0.815 (5) | 0.997 (40) | 0.625 (11)
Lei Yao, Yi Wang, Moyun Liu, Lap-Pui Chau: SGIFormer: Semantic-guided and Geometric-enhanced Interleaving Transformer for 3D Instance Segmentation. TCSVT, 2024
DCD | - | 0.798 (10) | 1.000 (1) | 0.878 (21) | 0.792 (28) | 0.693 (18) | 0.936 (2) | 0.596 (2) | 0.685 (29) | 0.663 (25) | 0.736 (8) | 0.717 (6) | 0.788 (4) | 0.693 (21) | 1.000 (1) | 0.825 (7) | 0.840 (15) | 0.837 (1) | 1.000 (1) | 0.689 (1)
TST3D | - | 0.795 (11) | 1.000 (1) | 0.929 (13) | 0.918 (4) | 0.709 (10) | 0.884 (20) | 0.596 (3) | 0.704 (25) | 0.769 (8) | 0.734 (9) | 0.644 (22) | 0.699 (18) | 0.751 (13) | 1.000 (1) | 0.794 (12) | 0.876 (5) | 0.757 (24) | 0.997 (40) | 0.550 (34)
Duc Tran Dang Trung, Byeongkeun Kang, Yeejin Lee: MSTA3D: Multi-scale Twin-attention for 3D Instance Segmentation. ACM Multimedia 2024
MG-Former | - | 0.791 (12) | 1.000 (1) | 0.980 (5) | 0.837 (19) | 0.626 (27) | 0.897 (13) | 0.543 (13) | 0.759 (15) | 0.800 (6) | 0.766 (5) | 0.659 (18) | 0.769 (6) | 0.697 (19) | 1.000 (1) | 0.791 (15) | 0.707 (50) | 0.791 (11) | 1.000 (1) | 0.610 (17)
ExtMask3D | - | 0.789 (13) | 1.000 (1) | 0.988 (1) | 0.756 (35) | 0.706 (12) | 0.912 (12) | 0.429 (21) | 0.647 (41) | 0.806 (5) | 0.755 (6) | 0.673 (16) | 0.689 (21) | 0.772 (11) | 1.000 (1) | 0.789 (16) | 0.852 (11) | 0.811 (6) | 1.000 (1) | 0.617 (14)
Queryformer | - | 0.787 (14) | 1.000 (1) | 0.933 (11) | 0.601 (52) | 0.754 (1) | 0.886 (18) | 0.558 (11) | 0.661 (37) | 0.767 (9) | 0.665 (20) | 0.716 (7) | 0.639 (27) | 0.808 (5) | 1.000 (1) | 0.844 (3) | 0.897 (4) | 0.804 (7) | 1.000 (1) | 0.624 (12)
MAFT | - | 0.786 (15) | 1.000 (1) | 0.894 (19) | 0.807 (23) | 0.694 (17) | 0.893 (16) | 0.486 (16) | 0.674 (32) | 0.740 (13) | 0.786 (1) | 0.704 (12) | 0.727 (11) | 0.739 (14) | 1.000 (1) | 0.707 (26) | 0.849 (13) | 0.756 (25) | 1.000 (1) | 0.685 (2)
KmaxOneFormerNet | permissive | 0.783 (16) | 0.903 (57) | 0.981 (4) | 0.794 (27) | 0.706 (11) | 0.931 (4) | 0.561 (10) | 0.701 (26) | 0.706 (16) | 0.727 (15) | 0.697 (14) | 0.731 (10) | 0.689 (23) | 1.000 (1) | 0.856 (2) | 0.750 (41) | 0.761 (23) | 1.000 (1) | 0.599 (22)
Mask3D | - | 0.780 (17) | 1.000 (1) | 0.786 (45) | 0.716 (40) | 0.696 (16) | 0.885 (19) | 0.500 (15) | 0.714 (22) | 0.810 (4) | 0.672 (19) | 0.715 (8) | 0.679 (22) | 0.809 (1) | 1.000 (1) | 0.831 (4) | 0.833 (18) | 0.787 (12) | 1.000 (1) | 0.602 (20)
Jonas Schult, Francis Engelmann, Alexander Hermans, Or Litany, Siyu Tang, Bastian Leibe: Mask3D for 3D Semantic Instance Segmentation. ICRA 2023
SPFormer | permissive | 0.770 (18) | 0.903 (57) | 0.903 (16) | 0.806 (24) | 0.609 (34) | 0.886 (17) | 0.568 (9) | 0.815 (6) | 0.705 (17) | 0.711 (16) | 0.655 (19) | 0.652 (26) | 0.685 (24) | 1.000 (1) | 0.789 (17) | 0.809 (30) | 0.776 (18) | 1.000 (1) | 0.583 (26)
Sun Jiahao, Qing Chunmei, Tan Junpeng, Xu Xiangmin: Superpoint Transformer for 3D Scene Instance Segmentation. AAAI 2023 [Oral]
SoftGroup++ | - | 0.769 (19) | 1.000 (1) | 0.803 (38) | 0.937 (1) | 0.684 (20) | 0.865 (22) | 0.213 (37) | 0.870 (2) | 0.664 (24) | 0.571 (27) | 0.758 (1) | 0.702 (16) | 0.807 (6) | 1.000 (1) | 0.653 (33) | 0.902 (1) | 0.792 (10) | 1.000 (1) | 0.626 (10)
SoftGroup | permissive | 0.761 (20) | 1.000 (1) | 0.808 (34) | 0.845 (16) | 0.716 (8) | 0.862 (24) | 0.243 (34) | 0.824 (4) | 0.655 (27) | 0.620 (21) | 0.734 (4) | 0.699 (17) | 0.791 (8) | 0.981 (40) | 0.716 (23) | 0.844 (14) | 0.769 (19) | 1.000 (1) | 0.594 (24)
Thang Vu, Kookhoi Kim, Tung M. Luu, Xuan Thanh Nguyen, Chang D. Yoo: SoftGroup for 3D Instance Segmentation on Point Clouds. CVPR 2022 [Oral]
ISBNet | permissive | 0.757 (21) | 1.000 (1) | 0.904 (15) | 0.731 (38) | 0.678 (21) | 0.895 (14) | 0.458 (18) | 0.644 (43) | 0.670 (22) | 0.710 (17) | 0.620 (27) | 0.732 (9) | 0.650 (27) | 1.000 (1) | 0.756 (20) | 0.778 (33) | 0.779 (15) | 1.000 (1) | 0.614 (15)
Tuan Duc Ngo, Binh-Son Hua, Khoi Nguyen: ISBNet: a 3D Point Cloud Instance Segmentation Network with Instance-aware Sampling and Box-aware Dynamic Convolution. CVPR 2023
TD3D | permissive | 0.751 (22) | 1.000 (1) | 0.774 (46) | 0.867 (12) | 0.621 (29) | 0.934 (3) | 0.404 (22) | 0.706 (24) | 0.812 (3) | 0.605 (24) | 0.633 (25) | 0.626 (28) | 0.690 (22) | 1.000 (1) | 0.640 (35) | 0.820 (24) | 0.777 (17) | 1.000 (1) | 0.612 (16)
Maksim Kolodiazhnyi, Anna Vorontsova, Anton Konushin, Danila Rukhovich: Top-Down Beats Bottom-Up in 3D Instance Segmentation. WACV 2024
PBNet | permissive | 0.747 (23) | 1.000 (1) | 0.818 (30) | 0.837 (20) | 0.713 (9) | 0.844 (27) | 0.457 (19) | 0.647 (41) | 0.711 (15) | 0.614 (22) | 0.617 (29) | 0.657 (25) | 0.650 (27) | 1.000 (1) | 0.692 (27) | 0.822 (23) | 0.765 (22) | 1.000 (1) | 0.595 (23)
Weiguang Zhao, Yuyao Yan, Chaolong Yang, Jianan Ye, Xi Yang, Kaizhu Huang: Divide and Conquer: 3D Instance Segmentation With Point-Wise Binarization. ICCV 2023
GraphCut | - | 0.732 (24) | 1.000 (1) | 0.788 (43) | 0.724 (39) | 0.642 (26) | 0.859 (26) | 0.248 (33) | 0.787 (11) | 0.618 (30) | 0.596 (25) | 0.653 (21) | 0.722 (14) | 0.583 (49) | 1.000 (1) | 0.766 (18) | 0.861 (7) | 0.825 (3) | 1.000 (1) | 0.504 (40)
IPCA-Inst | - | 0.731 (25) | 1.000 (1) | 0.788 (44) | 0.884 (9) | 0.698 (14) | 0.788 (43) | 0.252 (32) | 0.760 (14) | 0.646 (28) | 0.511 (35) | 0.637 (24) | 0.665 (24) | 0.804 (7) | 1.000 (1) | 0.644 (34) | 0.778 (34) | 0.747 (27) | 1.000 (1) | 0.561 (30)
TopoSeg | - | 0.725 (26) | 1.000 (1) | 0.806 (37) | 0.933 (2) | 0.668 (23) | 0.758 (48) | 0.272 (31) | 0.734 (19) | 0.630 (29) | 0.549 (31) | 0.654 (20) | 0.606 (29) | 0.697 (20) | 0.966 (44) | 0.612 (39) | 0.839 (16) | 0.754 (26) | 1.000 (1) | 0.573 (27)
DKNet | - | 0.718 (27) | 1.000 (1) | 0.814 (31) | 0.782 (29) | 0.619 (31) | 0.872 (21) | 0.224 (35) | 0.751 (17) | 0.569 (34) | 0.677 (18) | 0.585 (34) | 0.724 (13) | 0.633 (39) | 0.981 (40) | 0.515 (49) | 0.819 (25) | 0.736 (28) | 1.000 (1) | 0.617 (13)
Yizheng Wu, Min Shi, Shuaiyuan Du, Hao Lu, Zhiguo Cao, Weicai Zhong: 3D Instances as 1D Kernels. ECCV 2022
SSEC | - | 0.707 (28) | 1.000 (1) | 0.850 (23) | 0.924 (3) | 0.648 (24) | 0.747 (51) | 0.162 (39) | 0.862 (3) | 0.572 (33) | 0.520 (33) | 0.624 (26) | 0.549 (32) | 0.649 (36) | 1.000 (1) | 0.560 (44) | 0.706 (51) | 0.768 (20) | 1.000 (1) | 0.591 (25)
HAIS | permissive | 0.699 (29) | 1.000 (1) | 0.849 (24) | 0.820 (21) | 0.675 (22) | 0.808 (37) | 0.279 (29) | 0.757 (16) | 0.465 (40) | 0.517 (34) | 0.596 (31) | 0.559 (31) | 0.600 (43) | 1.000 (1) | 0.654 (32) | 0.767 (36) | 0.676 (32) | 0.994 (49) | 0.560 (31)
Shaoyu Chen, Jiemin Fang, Qian Zhang, Wenyu Liu, Xinggang Wang: Hierarchical Aggregation for 3D Instance Segmentation. ICCV 2021
SSTNet | permissive | 0.698 (30) | 1.000 (1) | 0.697 (62) | 0.888 (8) | 0.556 (41) | 0.803 (38) | 0.387 (23) | 0.626 (45) | 0.417 (45) | 0.556 (30) | 0.585 (35) | 0.702 (15) | 0.600 (43) | 1.000 (1) | 0.824 (8) | 0.720 (49) | 0.692 (30) | 1.000 (1) | 0.509 (39)
Zhihao Liang, Zhihao Li, Songcen Xu, Mingkui Tan, Kui Jia: Instance Segmentation in 3D Scenes using Semantic Superpoint Tree Networks. ICCV 2021
DualGroup | - | 0.694 (31) | 1.000 (1) | 0.799 (40) | 0.811 (22) | 0.622 (28) | 0.817 (32) | 0.376 (24) | 0.805 (9) | 0.590 (32) | 0.487 (39) | 0.568 (38) | 0.525 (36) | 0.650 (27) | 0.835 (57) | 0.600 (40) | 0.829 (20) | 0.655 (35) | 1.000 (1) | 0.526 (36)
ODIN - Ins | permissive | 0.693 (32) | 1.000 (1) | 0.880 (20) | 0.647 (47) | 0.620 (30) | 0.779 (45) | 0.336 (26) | 0.501 (60) | 0.681 (20) | 0.577 (26) | 0.595 (32) | 0.679 (23) | 0.683 (25) | 1.000 (1) | 0.709 (25) | 0.816 (28) | 0.637 (39) | 0.770 (69) | 0.557 (32)
Ayush Jain, Pushkal Katara, Nikolaos Gkanatsios, Adam W. Harley, Gabriel Sarch, Kriti Aggarwal, Vishrav Chaudhary, Katerina Fragkiadaki: ODIN: A Single Model for 2D and 3D Segmentation. CVPR 2024
DANCENET | - | 0.680 (33) | 1.000 (1) | 0.807 (35) | 0.733 (37) | 0.600 (35) | 0.768 (47) | 0.375 (25) | 0.543 (53) | 0.538 (35) | 0.610 (23) | 0.599 (30) | 0.498 (37) | 0.632 (41) | 0.981 (40) | 0.739 (22) | 0.856 (8) | 0.633 (42) | 0.882 (64) | 0.454 (49)
SphereSeg | - | 0.680 (33) | 1.000 (1) | 0.856 (22) | 0.744 (36) | 0.618 (32) | 0.893 (15) | 0.151 (40) | 0.651 (40) | 0.713 (14) | 0.537 (32) | 0.579 (37) | 0.430 (46) | 0.651 (26) | 1.000 (1) | 0.389 (60) | 0.744 (44) | 0.697 (29) | 0.991 (51) | 0.601 (21)
Box2Mask | - | 0.677 (35) | 1.000 (1) | 0.847 (25) | 0.771 (31) | 0.509 (50) | 0.816 (33) | 0.277 (30) | 0.558 (52) | 0.482 (37) | 0.562 (29) | 0.640 (23) | 0.448 (42) | 0.700 (16) | 1.000 (1) | 0.666 (28) | 0.852 (12) | 0.578 (49) | 0.997 (40) | 0.488 (44)
Julian Chibane, Francis Engelmann, Tuan Anh Tran, Gerard Pons-Moll: Box2Mask: Weakly Supervised 3D Semantic Instance Segmentation Using Bounding Boxes. ECCV 2022
OccuSeg+instance | - | 0.672 (36) | 1.000 (1) | 0.758 (54) | 0.682 (44) | 0.576 (39) | 0.842 (28) | 0.477 (17) | 0.504 (59) | 0.524 (36) | 0.567 (28) | 0.585 (36) | 0.451 (41) | 0.557 (51) | 1.000 (1) | 0.751 (21) | 0.797 (31) | 0.563 (52) | 1.000 (1) | 0.467 (48)
Lei Han, Tian Zheng, Lan Xu, Lu Fang: OccuSeg: Occupancy-aware 3D Instance Segmentation. CVPR 2020
Mask-Group | - | 0.664 (37) | 1.000 (1) | 0.822 (29) | 0.764 (34) | 0.616 (33) | 0.815 (34) | 0.139 (44) | 0.694 (28) | 0.597 (31) | 0.459 (43) | 0.566 (39) | 0.599 (30) | 0.600 (43) | 0.516 (67) | 0.715 (24) | 0.819 (26) | 0.635 (40) | 1.000 (1) | 0.603 (19)
Min Zhong, Xinghao Chen, Xiaokang Chen, Gang Zeng, Yunhe Wang: MaskGroup: Hierarchical Point Grouping and Masking for 3D Instance Segmentation. ICME 2022
INS-Conv-instance | - | 0.657 (38) | 1.000 (1) | 0.760 (52) | 0.667 (46) | 0.581 (37) | 0.863 (23) | 0.323 (27) | 0.655 (38) | 0.477 (38) | 0.473 (41) | 0.549 (41) | 0.432 (45) | 0.650 (27) | 1.000 (1) | 0.655 (31) | 0.738 (45) | 0.585 (48) | 0.944 (56) | 0.472 (47)
CSC-Pretrained | - | 0.648 (39) | 1.000 (1) | 0.810 (32) | 0.768 (32) | 0.523 (48) | 0.813 (35) | 0.143 (43) | 0.819 (5) | 0.389 (48) | 0.422 (52) | 0.511 (45) | 0.443 (43) | 0.650 (27) | 1.000 (1) | 0.624 (37) | 0.732 (46) | 0.634 (41) | 1.000 (1) | 0.375 (56)
PE | - | 0.645 (40) | 1.000 (1) | 0.773 (48) | 0.798 (26) | 0.538 (43) | 0.786 (44) | 0.088 (52) | 0.799 (10) | 0.350 (52) | 0.435 (50) | 0.547 (42) | 0.545 (33) | 0.646 (38) | 0.933 (46) | 0.562 (43) | 0.761 (39) | 0.556 (57) | 0.997 (40) | 0.501 (42)
Biao Zhang, Peter Wonka: Point Cloud Instance Segmentation using Probabilistic Embeddings. CVPR 2021
RPGN | - | 0.643 (41) | 1.000 (1) | 0.758 (53) | 0.582 (58) | 0.539 (42) | 0.826 (31) | 0.046 (57) | 0.765 (12) | 0.372 (50) | 0.436 (49) | 0.588 (33) | 0.539 (35) | 0.650 (27) | 1.000 (1) | 0.577 (41) | 0.750 (42) | 0.653 (37) | 0.997 (40) | 0.495 (43)
Shichao Dong, Guosheng Lin, Tzu-Yi Hung: Learning Regional Purity for Instance Segmentation on 3D Point Clouds. ECCV 2022
Dyco3D | copyleft | 0.641 (42) | 1.000 (1) | 0.841 (26) | 0.893 (7) | 0.531 (45) | 0.802 (39) | 0.115 (49) | 0.588 (50) | 0.448 (42) | 0.438 (47) | 0.537 (44) | 0.430 (47) | 0.550 (52) | 0.857 (49) | 0.534 (47) | 0.764 (38) | 0.657 (34) | 0.987 (52) | 0.568 (28)
Tong He, Chunhua Shen, Anton van den Hengel: DyCo3D: Robust Instance Segmentation of 3D Point Clouds through Dynamic Convolution. CVPR 2021
GICN | - | 0.638 (43) | 1.000 (1) | 0.895 (18) | 0.800 (25) | 0.480 (54) | 0.676 (56) | 0.144 (42) | 0.737 (18) | 0.354 (51) | 0.447 (44) | 0.400 (58) | 0.365 (53) | 0.700 (16) | 1.000 (1) | 0.569 (42) | 0.836 (17) | 0.599 (44) | 1.000 (1) | 0.473 (46)
PointGroup | - | 0.636 (44) | 1.000 (1) | 0.765 (49) | 0.624 (49) | 0.505 (52) | 0.797 (40) | 0.116 (48) | 0.696 (27) | 0.384 (49) | 0.441 (45) | 0.559 (40) | 0.476 (39) | 0.596 (46) | 1.000 (1) | 0.666 (28) | 0.756 (40) | 0.556 (56) | 0.997 (40) | 0.513 (38)
Li Jiang, Hengshuang Zhao, Shaoshuai Shi, Shu Liu, Chi-Wing Fu, Jiaya Jia: PointGroup: Dual-Set Point Grouping for 3D Instance Segmentation. CVPR 2020 [oral]
DD-UNet+Group | - | 0.635 (45) | 0.667 (60) | 0.797 (42) | 0.714 (41) | 0.562 (40) | 0.774 (46) | 0.146 (41) | 0.810 (8) | 0.429 (44) | 0.476 (40) | 0.546 (43) | 0.399 (49) | 0.633 (39) | 1.000 (1) | 0.632 (36) | 0.722 (48) | 0.609 (43) | 1.000 (1) | 0.514 (37)
H. Liu, R. Liu, K. Yang, J. Zhang, K. Peng, R. Stiefelhagen: HIDA: Towards Holistic Indoor Understanding for the Visually Impaired via Semantic Instance Segmentation with a Wearable Solid-State LiDAR Sensor. ICCVW 2021
Mask3D_evaluation | - | 0.631 (46) | 1.000 (1) | 0.829 (28) | 0.606 (51) | 0.646 (25) | 0.836 (29) | 0.068 (53) | 0.511 (57) | 0.462 (41) | 0.507 (36) | 0.619 (28) | 0.389 (51) | 0.610 (42) | 1.000 (1) | 0.432 (55) | 0.828 (22) | 0.673 (33) | 0.788 (68) | 0.552 (33)
DENet | - | 0.629 (47) | 1.000 (1) | 0.797 (41) | 0.608 (50) | 0.589 (36) | 0.627 (60) | 0.219 (36) | 0.882 (1) | 0.310 (54) | 0.402 (57) | 0.383 (60) | 0.396 (50) | 0.650 (27) | 1.000 (1) | 0.663 (30) | 0.543 (68) | 0.691 (31) | 1.000 (1) | 0.568 (29)
3D-MPA | - | 0.611 (48) | 1.000 (1) | 0.833 (27) | 0.765 (33) | 0.526 (47) | 0.756 (49) | 0.136 (46) | 0.588 (50) | 0.470 (39) | 0.438 (48) | 0.432 (54) | 0.358 (55) | 0.650 (27) | 0.857 (49) | 0.429 (56) | 0.765 (37) | 0.557 (55) | 1.000 (1) | 0.430 (51)
Francis Engelmann, Martin Bokeloh, Alireza Fathi, Bastian Leibe, Matthias Nießner: 3D-MPA: Multi Proposal Aggregation for 3D Semantic Instance Segmentation. CVPR 2020
OSIS | - | 0.605 (49) | 1.000 (1) | 0.801 (39) | 0.599 (53) | 0.535 (44) | 0.728 (53) | 0.286 (28) | 0.436 (64) | 0.679 (21) | 0.491 (37) | 0.433 (52) | 0.256 (57) | 0.404 (64) | 0.857 (49) | 0.620 (38) | 0.724 (47) | 0.510 (62) | 1.000 (1) | 0.539 (35)
AOIA | - | 0.601 (50) | 1.000 (1) | 0.761 (51) | 0.687 (43) | 0.485 (53) | 0.828 (30) | 0.008 (64) | 0.663 (36) | 0.405 (47) | 0.405 (56) | 0.425 (55) | 0.490 (38) | 0.596 (46) | 0.714 (60) | 0.553 (46) | 0.779 (32) | 0.597 (45) | 0.992 (50) | 0.424 (53)
PCJC | - | 0.578 (51) | 1.000 (1) | 0.810 (33) | 0.583 (57) | 0.449 (57) | 0.813 (36) | 0.042 (58) | 0.603 (48) | 0.341 (53) | 0.490 (38) | 0.465 (49) | 0.410 (48) | 0.650 (27) | 0.835 (57) | 0.264 (66) | 0.694 (55) | 0.561 (53) | 0.889 (61) | 0.504 (41)
SSEN | - | 0.575 (52) | 1.000 (1) | 0.761 (50) | 0.473 (60) | 0.477 (55) | 0.795 (41) | 0.066 (54) | 0.529 (55) | 0.658 (26) | 0.460 (42) | 0.461 (50) | 0.380 (52) | 0.331 (66) | 0.859 (48) | 0.401 (59) | 0.692 (57) | 0.653 (36) | 1.000 (1) | 0.348 (58)
Dongsu Zhang, Junha Chun, Sang Kyun Cha, Young Min Kim: Spatial Semantic Embedding Network: Fast 3D Instance Segmentation with Deep Metric Learning. arXiv
RWSeg | - | 0.567 (53) | 0.528 (70) | 0.708 (61) | 0.626 (48) | 0.580 (38) | 0.745 (52) | 0.063 (55) | 0.627 (44) | 0.240 (58) | 0.400 (58) | 0.497 (46) | 0.464 (40) | 0.515 (53) | 1.000 (1) | 0.475 (51) | 0.745 (43) | 0.571 (50) | 1.000 (1) | 0.429 (52)
NeuralBF | - | 0.555 (54) | 0.667 (60) | 0.896 (17) | 0.843 (18) | 0.517 (49) | 0.751 (50) | 0.029 (59) | 0.519 (56) | 0.414 (46) | 0.439 (46) | 0.465 (48) | 0.000 (76) | 0.484 (55) | 0.857 (49) | 0.287 (64) | 0.693 (56) | 0.651 (38) | 1.000 (1) | 0.485 (45)
Weiwei Sun, Daniel Rebain, Renjie Liao, Vladimir Tankovich, Soroosh Yazdani, Kwang Moo Yi, Andrea Tagliasacchi: NeuralBF: Neural Bilateral Filtering for Top-down Instance Segmentation on Point Clouds. WACV 2023
MTML | - | 0.549 (55) | 1.000 (1) | 0.807 (36) | 0.588 (56) | 0.327 (62) | 0.647 (58) | 0.004 (66) | 0.815 (7) | 0.180 (61) | 0.418 (53) | 0.364 (62) | 0.182 (60) | 0.445 (58) | 1.000 (1) | 0.442 (54) | 0.688 (58) | 0.571 (51) | 1.000 (1) | 0.396 (54)
Jean Lahoud, Bernard Ghanem, Marc Pollefeys, Martin R. Oswald: 3D Instance Segmentation via Multi-task Metric Learning. ICCV 2019 [oral]
ClickSeg_Instance | - | 0.539 (56) | 1.000 (1) | 0.621 (65) | 0.300 (63) | 0.530 (46) | 0.698 (54) | 0.127 (47) | 0.533 (54) | 0.222 (59) | 0.430 (51) | 0.400 (57) | 0.365 (53) | 0.574 (50) | 0.938 (45) | 0.472 (52) | 0.659 (60) | 0.543 (58) | 0.944 (56) | 0.347 (59)
One_Thing_One_Click | permissive | 0.529 (57) | 0.667 (60) | 0.718 (57) | 0.777 (30) | 0.399 (58) | 0.683 (55) | 0.000 (69) | 0.669 (34) | 0.138 (64) | 0.391 (59) | 0.374 (61) | 0.539 (34) | 0.360 (65) | 0.641 (64) | 0.556 (45) | 0.774 (35) | 0.593 (46) | 0.997 (40) | 0.251 (64)
Zhengzhe Liu, Xiaojuan Qi, Chi-Wing Fu: One Thing One Click: A Self-Training Approach for Weakly Supervised 3D Semantic Segmentation. CVPR 2021
Sparse R-CNN | - | 0.515 (58) | 1.000 (1) | 0.538 (70) | 0.282 (64) | 0.468 (56) | 0.790 (42) | 0.173 (38) | 0.345 (66) | 0.429 (43) | 0.413 (55) | 0.484 (47) | 0.176 (61) | 0.595 (48) | 0.591 (65) | 0.522 (48) | 0.668 (59) | 0.476 (63) | 0.986 (54) | 0.327 (60)
Occipital-SCS | - | 0.512 (59) | 1.000 (1) | 0.716 (58) | 0.509 (59) | 0.506 (51) | 0.611 (61) | 0.092 (51) | 0.602 (49) | 0.177 (62) | 0.346 (62) | 0.383 (59) | 0.165 (62) | 0.442 (59) | 0.850 (56) | 0.386 (61) | 0.618 (64) | 0.543 (59) | 0.889 (61) | 0.389 (55)
3D-BoNet | - | 0.488 (60) | 1.000 (1) | 0.672 (64) | 0.590 (55) | 0.301 (64) | 0.484 (71) | 0.098 (50) | 0.620 (46) | 0.306 (55) | 0.341 (63) | 0.259 (66) | 0.125 (64) | 0.434 (61) | 0.796 (59) | 0.402 (58) | 0.499 (70) | 0.513 (61) | 0.909 (60) | 0.439 (50)
Bo Yang, Jianan Wang, Ronald Clark, Qingyong Hu, Sen Wang, Andrew Markham, Niki Trigoni: Learning Object Bounding Boxes for 3D Instance Segmentation on Point Clouds. NeurIPS 2019 Spotlight
PanopticFusion-inst | - | 0.478 (61) | 0.667 (60) | 0.712 (60) | 0.595 (54) | 0.259 (67) | 0.550 (67) | 0.000 (69) | 0.613 (47) | 0.175 (63) | 0.250 (68) | 0.434 (51) | 0.437 (44) | 0.411 (63) | 0.857 (49) | 0.485 (50) | 0.591 (67) | 0.267 (73) | 0.944 (56) | 0.359 (57)
Gaku Narita, Takashi Seno, Tomoya Ishikawa, Yohsuke Kaji: PanopticFusion: Online Volumetric Semantic Mapping at the Level of Stuff and Things. IROS 2019 (to appear)
SPG_WSIS | - | 0.470 (62) | 0.667 (60) | 0.685 (63) | 0.677 (45) | 0.372 (60) | 0.562 (65) | 0.000 (69) | 0.482 (61) | 0.244 (57) | 0.316 (65) | 0.298 (63) | 0.052 (71) | 0.442 (60) | 0.857 (49) | 0.267 (65) | 0.702 (52) | 0.559 (54) | 1.000 (1) | 0.287 (62)
SALoss-ResNet | - | 0.459 (63) | 1.000 (1) | 0.737 (56) | 0.159 (74) | 0.259 (66) | 0.587 (63) | 0.138 (45) | 0.475 (62) | 0.217 (60) | 0.416 (54) | 0.408 (56) | 0.128 (63) | 0.315 (67) | 0.714 (60) | 0.411 (57) | 0.536 (69) | 0.590 (47) | 0.873 (65) | 0.304 (61)
Zhidong Liang, Ming Yang, Hao Li, Chunxiang Wang: 3D Instance Embedding Learning With a Structure-Aware Loss Function for Point Cloud Segmentation. IEEE Robotics and Automation Letters (IROS2020)
MASC | permissive | 0.447 (64) | 0.528 (70) | 0.555 (68) | 0.381 (61) | 0.382 (59) | 0.633 (59) | 0.002 (67) | 0.509 (58) | 0.260 (56) | 0.361 (61) | 0.432 (53) | 0.327 (56) | 0.451 (57) | 0.571 (66) | 0.367 (62) | 0.639 (62) | 0.386 (64) | 0.980 (55) | 0.276 (63)
Chen Liu, Yasutaka Furukawa: MASC: Multi-scale Affinity with Sparse Convolution for 3D Instance Segmentation.
SegGroup_ins | permissive | 0.445 (65) | 0.667 (60) | 0.773 (47) | 0.185 (71) | 0.317 (63) | 0.656 (57) | 0.000 (69) | 0.407 (65) | 0.134 (65) | 0.381 (60) | 0.267 (65) | 0.217 (59) | 0.476 (56) | 0.714 (60) | 0.452 (53) | 0.629 (63) | 0.514 (60) | 1.000 (1) | 0.222 (67)
An Tao, Yueqi Duan, Yi Wei, Jiwen Lu, Jie Zhou: SegGroup: Seg-Level Supervision for 3D Instance and Semantic Segmentation. TIP 2022
3D-SIS | permissive | 0.382 (66) | 1.000 (1) | 0.432 (73) | 0.245 (66) | 0.190 (68) | 0.577 (64) | 0.013 (63) | 0.263 (68) | 0.033 (71) | 0.320 (64) | 0.240 (67) | 0.075 (67) | 0.422 (62) | 0.857 (49) | 0.117 (71) | 0.699 (53) | 0.271 (72) | 0.883 (63) | 0.235 (66)
Ji Hou, Angela Dai, Matthias Niessner: 3D-SIS: 3D Semantic Instance Segmentation of RGB-D Scans. CVPR 2019
Hier3D | copyleft | 0.323 (67) | 0.667 (60) | 0.542 (69) | 0.264 (65) | 0.157 (71) | 0.550 (66) | 0.000 (69) | 0.205 (71) | 0.009 (73) | 0.270 (67) | 0.218 (68) | 0.075 (67) | 0.500 (54) | 0.688 (63) | 0.007 (77) | 0.698 (54) | 0.301 (69) | 0.459 (74) | 0.200 (68)
Tan: HCFS3D: Hierarchical Coupled Feature Selection Network for 3D Semantic and Instance Segmentation.
UNet-backbone | - | 0.319 (68) | 0.667 (60) | 0.715 (59) | 0.233 (67) | 0.189 (69) | 0.479 (72) | 0.008 (64) | 0.218 (69) | 0.067 (70) | 0.201 (70) | 0.173 (69) | 0.107 (65) | 0.123 (72) | 0.438 (68) | 0.150 (68) | 0.615 (65) | 0.355 (65) | 0.916 (59) | 0.093 (76)
R-PointNet | - | 0.306 (69) | 0.500 (72) | 0.405 (74) | 0.311 (62) | 0.348 (61) | 0.589 (62) | 0.054 (56) | 0.068 (74) | 0.126 (66) | 0.283 (66) | 0.290 (64) | 0.028 (72) | 0.219 (70) | 0.214 (71) | 0.331 (63) | 0.396 (74) | 0.275 (70) | 0.821 (67) | 0.245 (65)
Region-18class | - | 0.284 (70) | 0.250 (76) | 0.751 (55) | 0.228 (69) | 0.270 (65) | 0.521 (68) | 0.000 (69) | 0.468 (63) | 0.008 (75) | 0.205 (69) | 0.127 (70) | 0.000 (76) | 0.068 (74) | 0.070 (75) | 0.262 (67) | 0.652 (61) | 0.323 (67) | 0.740 (70) | 0.173 (69)
SemRegionNet-20cls | - | 0.250 (71) | 0.333 (73) | 0.613 (66) | 0.229 (68) | 0.163 (70) | 0.493 (69) | 0.000 (69) | 0.304 (67) | 0.107 (67) | 0.147 (73) | 0.100 (72) | 0.052 (70) | 0.231 (68) | 0.119 (73) | 0.039 (73) | 0.445 (72) | 0.325 (66) | 0.654 (71) | 0.141 (72)
tmp | - | 0.248 (72) | 0.667 (60) | 0.437 (72) | 0.188 (70) | 0.153 (72) | 0.491 (70) | 0.000 (69) | 0.208 (70) | 0.094 (69) | 0.153 (72) | 0.099 (73) | 0.057 (69) | 0.217 (71) | 0.119 (73) | 0.039 (73) | 0.466 (71) | 0.302 (68) | 0.640 (72) | 0.140 (73)
3D-BEVIS | - | 0.248 (72) | 0.667 (60) | 0.566 (67) | 0.076 (75) | 0.035 (77) | 0.394 (75) | 0.027 (61) | 0.035 (76) | 0.098 (68) | 0.099 (75) | 0.030 (76) | 0.025 (73) | 0.098 (73) | 0.375 (70) | 0.126 (70) | 0.604 (66) | 0.181 (75) | 0.854 (66) | 0.171 (70)
Cathrin Elich, Francis Engelmann, Jonas Schult, Theodora Kontogianni, Bastian Leibe: 3D-BEVIS: Birds-Eye-View Instance Segmentation.
Sem_Recon_ins | - | 0.227 (74) | 0.764 (59) | 0.486 (71) | 0.069 (76) | 0.098 (74) | 0.426 (74) | 0.017 (62) | 0.067 (75) | 0.015 (72) | 0.172 (71) | 0.100 (71) | 0.096 (66) | 0.054 (76) | 0.183 (72) | 0.135 (69) | 0.366 (75) | 0.260 (74) | 0.614 (73) | 0.168 (71)
ASIS | - | 0.199 (75) | 0.333 (73) | 0.253 (76) | 0.167 (73) | 0.140 (73) | 0.438 (73) | 0.000 (69) | 0.177 (72) | 0.008 (74) | 0.121 (74) | 0.069 (74) | 0.004 (75) | 0.231 (69) | 0.429 (69) | 0.036 (75) | 0.445 (73) | 0.273 (71) | 0.333 (76) | 0.119 (75)
Sgpn_scannet | - | 0.143 (76) | 0.208 (77) | 0.390 (75) | 0.169 (72) | 0.065 (75) | 0.275 (76) | 0.029 (60) | 0.069 (73) | 0.000 (76) | 0.087 (76) | 0.043 (75) | 0.014 (74) | 0.027 (77) | 0.000 (76) | 0.112 (72) | 0.351 (76) | 0.168 (76) | 0.438 (75) | 0.138 (74)
MaskRCNN 2d->3d Proj | - | 0.058 (77) | 0.333 (73) | 0.002 (77) | 0.000 (77) | 0.053 (76) | 0.002 (77) | 0.002 (68) | 0.021 (77) | 0.000 (76) | 0.045 (77) | 0.024 (77) | 0.238 (58) | 0.065 (75) | 0.000 (76) | 0.014 (76) | 0.107 (77) | 0.020 (77) | 0.110 (77) | 0.006 (77)