The 3D semantic instance prediction task involves detecting and segmenting the objects in a 3D scan mesh.

Evaluation and metrics

Our evaluation ranks all methods according to the average precision for each class. We report the mean average precision (AP) at an overlap of 0.25 (AP 25%), at an overlap of 0.5 (AP 50%), and averaged over overlaps from 0.5 to 0.95 in steps of 0.05, written [0.5:0.95:0.05] (AP). Note that multiple predictions of the same ground truth instance are penalized as false positives.
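The following is a minimal sketch of how such a per-class AP could be computed; it is not the official evaluation script. It assumes that "overlap" means intersection over union (IoU) between sets of mesh vertex indices, that a prediction counts as a match when its IoU is at least the threshold, and that AP is the non-interpolated average of precision at each true positive. The names `iou`, `average_precision`, and `ap_over_range` are hypothetical.

```python
from typing import List, Set, Tuple

Prediction = Tuple[float, Set[int]]  # (confidence, vertex indices); assumed layout


def iou(a: Set[int], b: Set[int]) -> float:
    """Intersection over union of two vertex-index sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def average_precision(preds: List[Prediction], gts: List[Set[int]],
                      threshold: float) -> float:
    """Non-interpolated AP for one class at one overlap threshold."""
    if not gts:
        return 0.0
    matched = [False] * len(gts)
    true_positives = 0
    precision_sum = 0.0
    # Visit predictions from most to least confident.
    ranked = sorted(preds, key=lambda p: p[0], reverse=True)
    for rank, (_, mask) in enumerate(ranked, start=1):
        overlaps = [iou(mask, gt) for gt in gts]
        best = max(range(len(gts)), key=lambda i: overlaps[i])
        if overlaps[best] >= threshold and not matched[best]:
            matched[best] = True
            true_positives += 1
            precision_sum += true_positives / rank
        # Otherwise the prediction is a false positive. This includes a
        # second prediction of an already matched ground truth instance.
    return precision_sum / len(gts)


def ap_over_range(preds: List[Prediction], gts: List[Set[int]]) -> float:
    """Mean AP over thresholds 0.5, 0.55, ..., 0.95."""
    thresholds = [round(0.5 + 0.05 * i, 2) for i in range(10)]
    return sum(average_precision(preds, gts, t) for t in thresholds) / len(thresholds)


gts = [{0, 1, 2, 3}, {10, 11, 12, 13}]
preds = [
    (0.9, {0, 1, 2, 3}),      # matches the first instance
    (0.8, {0, 1, 2}),         # duplicate of the first instance: false positive
    (0.7, {10, 11, 12, 13}),  # matches the second instance
]
ap50 = average_precision(preds, gts, 0.5)
print(round(ap50, 3))  # 0.833
```

In this toy example the duplicate prediction lowers AP 50% from 1.0 to about 0.833, which illustrates the penalty described above. The per-class values would then be averaged over classes to obtain the mean reported in the table.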



This table lists the benchmark results for the 3D semantic instance scenario. Each cell gives a method's score followed by its rank in that column.




Method | Info | avg AP 50% | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | window
Mask3D | | 0.780 (1) | 1.000 (1) | 0.786 (21) | 0.716 (20) | 0.696 (3) | 0.885 (3) | 0.500 (2) | 0.714 (17) | 0.810 (1) | 0.672 (3) | 0.715 (3) | 0.679 (6) | 0.809 (1) | 1.000 (1) | 0.831 (1) | 0.833 (7) | 0.787 (3) | 1.000 (1) | 0.602 (4)
SPFormer | | 0.770 (2) | 0.903 (32) | 0.903 (1) | 0.806 (8) | 0.609 (11) | 0.886 (2) | 0.568 (1) | 0.815 (6) | 0.705 (3) | 0.711 (1) | 0.655 (4) | 0.652 (8) | 0.685 (8) | 1.000 (1) | 0.789 (3) | 0.809 (10) | 0.776 (4) | 1.000 (1) | 0.583 (7)
SoftGroup++ | | 0.769 (3) | 1.000 (1) | 0.803 (16) | 0.937 (1) | 0.684 (4) | 0.865 (5) | 0.213 (15) | 0.870 (2) | 0.664 (4) | 0.571 (6) | 0.758 (1) | 0.702 (4) | 0.807 (2) | 1.000 (1) | 0.653 (14) | 0.902 (1) | 0.792 (2) | 1.000 (1) | 0.626 (1)
SoftGroup | permissive | 0.761 (4) | 1.000 (1) | 0.808 (13) | 0.845 (6) | 0.716 (1) | 0.862 (7) | 0.243 (12) | 0.824 (3) | 0.655 (6) | 0.620 (4) | 0.734 (2) | 0.699 (5) | 0.791 (4) | 0.981 (21) | 0.716 (6) | 0.844 (4) | 0.769 (5) | 1.000 (1) | 0.594 (6)
    Thang Vu, Kookhoi Kim, Tung M. Luu, Xuan Thanh Nguyen, Chang D. Yoo: SoftGroup for 3D Instance Segmentation on Point Clouds. CVPR 2022 [Oral]
GraphCut | | 0.732 (5) | 1.000 (1) | 0.788 (19) | 0.724 (19) | 0.642 (7) | 0.859 (8) | 0.248 (11) | 0.787 (10) | 0.618 (10) | 0.596 (5) | 0.653 (6) | 0.722 (2) | 0.583 (25) | 1.000 (1) | 0.766 (4) | 0.861 (2) | 0.825 (1) | 1.000 (1) | 0.504 (17)
IPCA-Inst | | 0.731 (6) | 1.000 (1) | 0.788 (20) | 0.884 (5) | 0.698 (2) | 0.788 (22) | 0.252 (10) | 0.760 (12) | 0.646 (7) | 0.511 (14) | 0.637 (8) | 0.665 (7) | 0.804 (3) | 1.000 (1) | 0.644 (15) | 0.778 (12) | 0.747 (7) | 1.000 (1) | 0.561 (12)
TopoSeg | | 0.725 (7) | 1.000 (1) | 0.806 (15) | 0.933 (2) | 0.668 (6) | 0.758 (25) | 0.272 (8) | 0.734 (16) | 0.630 (8) | 0.549 (10) | 0.654 (5) | 0.606 (9) | 0.697 (7) | 0.966 (23) | 0.612 (18) | 0.839 (5) | 0.754 (6) | 1.000 (1) | 0.573 (8)
DKNet | | 0.718 (8) | 1.000 (1) | 0.814 (10) | 0.782 (11) | 0.619 (8) | 0.872 (4) | 0.224 (13) | 0.751 (14) | 0.569 (12) | 0.677 (2) | 0.585 (11) | 0.724 (1) | 0.633 (17) | 0.981 (21) | 0.515 (25) | 0.819 (8) | 0.736 (8) | 1.000 (1) | 0.617 (2)
    Yizheng Wu, Min Shi, Shuaiyuan Du, Hao Lu, Zhiguo Cao, Weicai Zhong: 3D Instances as 1D Kernels. ECCV 2022
SSEC | | 0.700 (9) | 1.000 (1) | 0.848 (5) | 0.763 (17) | 0.609 (12) | 0.792 (20) | 0.262 (9) | 0.824 (3) | 0.627 (9) | 0.535 (12) | 0.547 (19) | 0.493 (15) | 0.600 (19) | 1.000 (1) | 0.712 (8) | 0.731 (24) | 0.689 (12) | 1.000 (1) | 0.563 (11)
HAIS | permissive | 0.699 (10) | 1.000 (1) | 0.849 (4) | 0.820 (7) | 0.675 (5) | 0.808 (15) | 0.279 (6) | 0.757 (13) | 0.465 (17) | 0.517 (13) | 0.596 (9) | 0.559 (11) | 0.600 (19) | 1.000 (1) | 0.654 (13) | 0.767 (14) | 0.676 (13) | 0.994 (28) | 0.560 (13)
    Shaoyu Chen, Jiemin Fang, Qian Zhang, Wenyu Liu, Xinggang Wang: Hierarchical Aggregation for 3D Instance Segmentation. ICCV 2021
SSTNet | permissive | 0.698 (11) | 1.000 (1) | 0.697 (36) | 0.888 (4) | 0.556 (18) | 0.803 (16) | 0.387 (4) | 0.626 (24) | 0.417 (21) | 0.556 (9) | 0.585 (12) | 0.702 (3) | 0.600 (19) | 1.000 (1) | 0.824 (2) | 0.720 (26) | 0.692 (10) | 1.000 (1) | 0.509 (16)
    Zhihao Liang, Zhihao Li, Songcen Xu, Mingkui Tan, Kui Jia: Instance Segmentation in 3D Scenes using Semantic Superpoint Tree Networks. ICCV 2021
SphereSeg | | 0.680 (12) | 1.000 (1) | 0.856 (3) | 0.744 (18) | 0.618 (9) | 0.893 (1) | 0.151 (17) | 0.651 (22) | 0.713 (2) | 0.537 (11) | 0.579 (14) | 0.430 (23) | 0.651 (9) | 1.000 (1) | 0.389 (34) | 0.744 (21) | 0.697 (9) | 0.991 (29) | 0.601 (5)
Box2Mask | | 0.677 (13) | 1.000 (1) | 0.847 (6) | 0.771 (13) | 0.509 (24) | 0.816 (11) | 0.277 (7) | 0.558 (31) | 0.482 (14) | 0.562 (8) | 0.640 (7) | 0.448 (19) | 0.700 (5) | 1.000 (1) | 0.666 (9) | 0.852 (3) | 0.578 (24) | 0.997 (23) | 0.488 (21)
    Julian Chibane, Francis Engelmann, Tuan Anh Tran, Gerard Pons-Moll: Box2Mask: Weakly Supervised 3D Semantic Instance Segmentation Using Bounding Boxes. ECCV 2022
OccuSeg+instance | | 0.672 (14) | 1.000 (1) | 0.758 (28) | 0.682 (22) | 0.576 (16) | 0.842 (9) | 0.477 (3) | 0.504 (34) | 0.524 (13) | 0.567 (7) | 0.585 (13) | 0.451 (18) | 0.557 (26) | 1.000 (1) | 0.751 (5) | 0.797 (11) | 0.563 (27) | 1.000 (1) | 0.467 (24)
    Lei Han, Tian Zheng, Lan Xu, Lu Fang: OccuSeg: Occupancy-aware 3D Instance Segmentation. CVPR 2020
Mask-Group | | 0.664 (15) | 1.000 (1) | 0.822 (9) | 0.764 (16) | 0.616 (10) | 0.815 (12) | 0.139 (21) | 0.694 (19) | 0.597 (11) | 0.459 (19) | 0.566 (15) | 0.599 (10) | 0.600 (19) | 0.516 (40) | 0.715 (7) | 0.819 (9) | 0.635 (17) | 1.000 (1) | 0.603 (3)
    Min Zhong, Xinghao Chen, Xiaokang Chen, Gang Zeng, Yunhe Wang: MaskGroup: Hierarchical Point Grouping and Masking for 3D Instance Segmentation. ICME 2022
INS-Conv-instance | | 0.657 (16) | 1.000 (1) | 0.760 (26) | 0.667 (24) | 0.581 (14) | 0.863 (6) | 0.323 (5) | 0.655 (21) | 0.477 (15) | 0.473 (17) | 0.549 (17) | 0.432 (22) | 0.650 (10) | 1.000 (1) | 0.655 (12) | 0.738 (22) | 0.585 (23) | 0.944 (33) | 0.472 (23)
CSC-Pretrained | | 0.648 (17) | 1.000 (1) | 0.810 (11) | 0.768 (14) | 0.523 (23) | 0.813 (13) | 0.143 (20) | 0.819 (5) | 0.389 (22) | 0.422 (26) | 0.511 (22) | 0.443 (20) | 0.650 (10) | 1.000 (1) | 0.624 (17) | 0.732 (23) | 0.634 (18) | 1.000 (1) | 0.375 (30)
PE | | 0.645 (18) | 1.000 (1) | 0.773 (23) | 0.798 (10) | 0.538 (20) | 0.786 (23) | 0.088 (28) | 0.799 (9) | 0.350 (26) | 0.435 (25) | 0.547 (18) | 0.545 (12) | 0.646 (16) | 0.933 (24) | 0.562 (21) | 0.761 (17) | 0.556 (32) | 0.997 (23) | 0.501 (19)
    Biao Zhang, Peter Wonka: Point Cloud Instance Segmentation using Probabilistic Embeddings. CVPR 2021
RPGN | | 0.643 (19) | 1.000 (1) | 0.758 (27) | 0.582 (32) | 0.539 (19) | 0.826 (10) | 0.046 (32) | 0.765 (11) | 0.372 (24) | 0.436 (24) | 0.588 (10) | 0.539 (14) | 0.650 (10) | 1.000 (1) | 0.577 (19) | 0.750 (19) | 0.653 (16) | 0.997 (23) | 0.495 (20)
Dyco3D | copyleft | 0.641 (20) | 1.000 (1) | 0.841 (7) | 0.893 (3) | 0.531 (21) | 0.802 (17) | 0.115 (25) | 0.588 (29) | 0.448 (18) | 0.438 (22) | 0.537 (21) | 0.430 (24) | 0.550 (27) | 0.857 (26) | 0.534 (23) | 0.764 (16) | 0.657 (14) | 0.987 (30) | 0.568 (9)
    Tong He, Chunhua Shen, Anton van den Hengel: DyCo3D: Robust Instance Segmentation of 3D Point Clouds through Dynamic Convolution. CVPR 2021
GICN | | 0.638 (21) | 1.000 (1) | 0.895 (2) | 0.800 (9) | 0.480 (27) | 0.676 (29) | 0.144 (19) | 0.737 (15) | 0.354 (25) | 0.447 (20) | 0.400 (31) | 0.365 (29) | 0.700 (5) | 1.000 (1) | 0.569 (20) | 0.836 (6) | 0.599 (20) | 1.000 (1) | 0.473 (22)
PointGroup | | 0.636 (22) | 1.000 (1) | 0.765 (24) | 0.624 (26) | 0.505 (26) | 0.797 (18) | 0.116 (24) | 0.696 (18) | 0.384 (23) | 0.441 (21) | 0.559 (16) | 0.476 (16) | 0.596 (23) | 1.000 (1) | 0.666 (9) | 0.756 (18) | 0.556 (31) | 0.997 (23) | 0.513 (15)
    Li Jiang, Hengshuang Zhao, Shaoshuai Shi, Shu Liu, Chi-Wing Fu, Jiaya Jia: PointGroup: Dual-Set Point Grouping for 3D Instance Segmentation. CVPR 2020 [oral]
DD-UNet+Group | | 0.635 (23) | 0.667 (33) | 0.797 (18) | 0.714 (21) | 0.562 (17) | 0.774 (24) | 0.146 (18) | 0.810 (8) | 0.429 (20) | 0.476 (16) | 0.546 (20) | 0.399 (26) | 0.633 (17) | 1.000 (1) | 0.632 (16) | 0.722 (25) | 0.609 (19) | 1.000 (1) | 0.514 (14)
    H. Liu, R. Liu, K. Yang, J. Zhang, K. Peng, R. Stiefelhagen: HIDA: Towards Holistic Indoor Understanding for the Visually Impaired via Semantic Instance Segmentation with a Wearable Solid-State LiDAR Sensor. ICCVW 2021
DENet | | 0.629 (24) | 1.000 (1) | 0.797 (17) | 0.608 (27) | 0.589 (13) | 0.627 (33) | 0.219 (14) | 0.882 (1) | 0.310 (28) | 0.402 (30) | 0.383 (33) | 0.396 (27) | 0.650 (10) | 1.000 (1) | 0.663 (11) | 0.543 (41) | 0.691 (11) | 1.000 (1) | 0.568 (10)
3D-MPA | | 0.611 (25) | 1.000 (1) | 0.833 (8) | 0.765 (15) | 0.526 (22) | 0.756 (26) | 0.136 (23) | 0.588 (29) | 0.470 (16) | 0.438 (23) | 0.432 (29) | 0.358 (30) | 0.650 (10) | 0.857 (26) | 0.429 (30) | 0.765 (15) | 0.557 (30) | 1.000 (1) | 0.430 (26)
    Francis Engelmann, Martin Bokeloh, Alireza Fathi, Bastian Leibe, Matthias Nießner: 3D-MPA: Multi Proposal Aggregation for 3D Semantic Instance Segmentation. CVPR 2020
PCJC | | 0.578 (26) | 1.000 (1) | 0.810 (12) | 0.583 (31) | 0.449 (30) | 0.813 (14) | 0.042 (33) | 0.603 (27) | 0.341 (27) | 0.490 (15) | 0.465 (25) | 0.410 (25) | 0.650 (10) | 0.835 (32) | 0.264 (39) | 0.694 (30) | 0.561 (28) | 0.889 (37) | 0.504 (18)
SSEN | | 0.575 (27) | 1.000 (1) | 0.761 (25) | 0.473 (34) | 0.477 (28) | 0.795 (19) | 0.066 (29) | 0.529 (32) | 0.658 (5) | 0.460 (18) | 0.461 (26) | 0.380 (28) | 0.331 (39) | 0.859 (25) | 0.401 (33) | 0.692 (31) | 0.653 (15) | 1.000 (1) | 0.348 (32)
    Dongsu Zhang, Junha Chun, Sang Kyun Cha, Young Min Kim: Spatial Semantic Embedding Network: Fast 3D Instance Segmentation with Deep Metric Learning. arXiv
RWSeg | | 0.567 (28) | 0.528 (42) | 0.708 (35) | 0.626 (25) | 0.580 (15) | 0.745 (27) | 0.063 (30) | 0.627 (23) | 0.240 (32) | 0.400 (31) | 0.497 (23) | 0.464 (17) | 0.515 (28) | 1.000 (1) | 0.475 (27) | 0.745 (20) | 0.571 (25) | 1.000 (1) | 0.429 (27)
MTML | | 0.549 (29) | 1.000 (1) | 0.807 (14) | 0.588 (30) | 0.327 (35) | 0.647 (31) | 0.004 (38) | 0.815 (7) | 0.180 (34) | 0.418 (27) | 0.364 (35) | 0.182 (34) | 0.445 (32) | 1.000 (1) | 0.442 (29) | 0.688 (32) | 0.571 (26) | 1.000 (1) | 0.396 (28)
    Jean Lahoud, Bernard Ghanem, Marc Pollefeys, Martin R. Oswald: 3D Instance Segmentation via Multi-task Metric Learning. ICCV 2019 [oral]
One_Thing_One_Click | permissive | 0.529 (30) | 0.667 (33) | 0.718 (31) | 0.777 (12) | 0.399 (31) | 0.683 (28) | 0.000 (41) | 0.669 (20) | 0.138 (37) | 0.391 (32) | 0.374 (34) | 0.539 (13) | 0.360 (38) | 0.641 (37) | 0.556 (22) | 0.774 (13) | 0.593 (21) | 0.997 (23) | 0.251 (37)
    Zhengzhe Liu, Xiaojuan Qi, Chi-Wing Fu: One Thing One Click: A Self-Training Approach for Weakly Supervised 3D Semantic Segmentation. CVPR 2021
Sparse R-CNN | | 0.515 (31) | 1.000 (1) | 0.538 (43) | 0.282 (37) | 0.468 (29) | 0.790 (21) | 0.173 (16) | 0.345 (39) | 0.429 (19) | 0.413 (29) | 0.484 (24) | 0.176 (35) | 0.595 (24) | 0.591 (38) | 0.522 (24) | 0.668 (33) | 0.476 (36) | 0.986 (31) | 0.327 (33)
Occipital-SCS | | 0.512 (32) | 1.000 (1) | 0.716 (32) | 0.509 (33) | 0.506 (25) | 0.611 (34) | 0.092 (27) | 0.602 (28) | 0.177 (35) | 0.346 (35) | 0.383 (32) | 0.165 (36) | 0.442 (33) | 0.850 (31) | 0.386 (35) | 0.618 (37) | 0.543 (33) | 0.889 (37) | 0.389 (29)
3D-BoNet | | 0.488 (33) | 1.000 (1) | 0.672 (38) | 0.590 (29) | 0.301 (37) | 0.484 (44) | 0.098 (26) | 0.620 (25) | 0.306 (29) | 0.341 (36) | 0.259 (39) | 0.125 (38) | 0.434 (35) | 0.796 (33) | 0.402 (32) | 0.499 (43) | 0.513 (35) | 0.909 (36) | 0.439 (25)
    Bo Yang, Jianan Wang, Ronald Clark, Qingyong Hu, Sen Wang, Andrew Markham, Niki Trigoni: Learning Object Bounding Boxes for 3D Instance Segmentation on Point Clouds. NeurIPS 2019 Spotlight
PanopticFusion-inst | | 0.478 (34) | 0.667 (33) | 0.712 (34) | 0.595 (28) | 0.259 (40) | 0.550 (40) | 0.000 (41) | 0.613 (26) | 0.175 (36) | 0.250 (41) | 0.434 (27) | 0.437 (21) | 0.411 (37) | 0.857 (26) | 0.485 (26) | 0.591 (40) | 0.267 (46) | 0.944 (33) | 0.359 (31)
    Gaku Narita, Takashi Seno, Tomoya Ishikawa, Yohsuke Kaji: PanopticFusion: Online Volumetric Semantic Mapping at the Level of Stuff and Things. IROS 2019 (to appear)
SPG_WSIS | | 0.470 (35) | 0.667 (33) | 0.685 (37) | 0.677 (23) | 0.372 (33) | 0.562 (38) | 0.000 (41) | 0.482 (35) | 0.244 (31) | 0.316 (38) | 0.298 (36) | 0.052 (44) | 0.442 (34) | 0.857 (26) | 0.267 (38) | 0.702 (27) | 0.559 (29) | 1.000 (1) | 0.287 (35)
SALoss-ResNet | | 0.459 (36) | 1.000 (1) | 0.737 (30) | 0.159 (47) | 0.259 (39) | 0.587 (36) | 0.138 (22) | 0.475 (36) | 0.217 (33) | 0.416 (28) | 0.408 (30) | 0.128 (37) | 0.315 (40) | 0.714 (34) | 0.411 (31) | 0.536 (42) | 0.590 (22) | 0.873 (40) | 0.304 (34)
    Zhidong Liang, Ming Yang, Hao Li, Chunxiang Wang: 3D Instance Embedding Learning With a Structure-Aware Loss Function for Point Cloud Segmentation. IEEE Robotics and Automation Letters (IROS2020)
MASC | permissive | 0.447 (37) | 0.528 (42) | 0.555 (41) | 0.381 (35) | 0.382 (32) | 0.633 (32) | 0.002 (39) | 0.509 (33) | 0.260 (30) | 0.361 (34) | 0.432 (28) | 0.327 (31) | 0.451 (31) | 0.571 (39) | 0.367 (36) | 0.639 (35) | 0.386 (37) | 0.980 (32) | 0.276 (36)
    Chen Liu, Yasutaka Furukawa: MASC: Multi-scale Affinity with Sparse Convolution for 3D Instance Segmentation.
SegGroup_ins | permissive | 0.445 (38) | 0.667 (33) | 0.773 (22) | 0.185 (44) | 0.317 (36) | 0.656 (30) | 0.000 (41) | 0.407 (38) | 0.134 (38) | 0.381 (33) | 0.267 (38) | 0.217 (33) | 0.476 (30) | 0.714 (34) | 0.452 (28) | 0.629 (36) | 0.514 (34) | 1.000 (1) | 0.222 (40)
    An Tao, Yueqi Duan, Yi Wei, Jiwen Lu, Jie Zhou: SegGroup: Seg-Level Supervision for 3D Instance and Semantic Segmentation. TIP 2022
3D-SIS | permissive | 0.382 (39) | 1.000 (1) | 0.432 (45) | 0.245 (39) | 0.190 (41) | 0.577 (37) | 0.013 (36) | 0.263 (41) | 0.033 (44) | 0.320 (37) | 0.240 (40) | 0.075 (40) | 0.422 (36) | 0.857 (26) | 0.117 (43) | 0.699 (28) | 0.271 (45) | 0.883 (39) | 0.235 (39)
    Ji Hou, Angela Dai, Matthias Niessner: 3D-SIS: 3D Semantic Instance Segmentation of RGB-D Scans. CVPR 2019
Hier3D | copyleft | 0.323 (40) | 0.667 (33) | 0.542 (42) | 0.264 (38) | 0.157 (44) | 0.550 (39) | 0.000 (41) | 0.205 (44) | 0.009 (45) | 0.270 (40) | 0.218 (41) | 0.075 (40) | 0.500 (29) | 0.688 (36) | 0.007 (49) | 0.698 (29) | 0.301 (42) | 0.459 (46) | 0.200 (41)
    Tan: HCFS3D: Hierarchical Coupled Feature Selection Network for 3D Semantic and Instance Segmentation.
UNet-backbone | | 0.319 (41) | 0.667 (33) | 0.715 (33) | 0.233 (40) | 0.189 (42) | 0.479 (45) | 0.008 (37) | 0.218 (42) | 0.067 (43) | 0.201 (43) | 0.173 (42) | 0.107 (39) | 0.123 (45) | 0.438 (41) | 0.150 (41) | 0.615 (38) | 0.355 (38) | 0.916 (35) | 0.093 (48)
R-PointNet | | 0.306 (42) | 0.500 (44) | 0.405 (46) | 0.311 (36) | 0.348 (34) | 0.589 (35) | 0.054 (31) | 0.068 (47) | 0.126 (39) | 0.283 (39) | 0.290 (37) | 0.028 (45) | 0.219 (43) | 0.214 (44) | 0.331 (37) | 0.396 (47) | 0.275 (43) | 0.821 (42) | 0.245 (38)
Region-18class | | 0.284 (43) | 0.250 (48) | 0.751 (29) | 0.228 (42) | 0.270 (38) | 0.521 (41) | 0.000 (41) | 0.468 (37) | 0.008 (47) | 0.205 (42) | 0.127 (43) | 0.000 (49) | 0.068 (47) | 0.070 (47) | 0.262 (40) | 0.652 (34) | 0.323 (40) | 0.740 (43) | 0.173 (42)
SemRegionNet-20cls | | 0.250 (44) | 0.333 (45) | 0.613 (39) | 0.229 (41) | 0.163 (43) | 0.493 (42) | 0.000 (41) | 0.304 (40) | 0.107 (40) | 0.147 (45) | 0.100 (44) | 0.052 (43) | 0.231 (41) | 0.119 (45) | 0.039 (45) | 0.445 (45) | 0.325 (39) | 0.654 (44) | 0.141 (44)
3D-BEVIS | | 0.248 (45) | 0.667 (33) | 0.566 (40) | 0.076 (48) | 0.035 (49) | 0.394 (47) | 0.027 (35) | 0.035 (48) | 0.098 (41) | 0.099 (47) | 0.030 (48) | 0.025 (46) | 0.098 (46) | 0.375 (43) | 0.126 (42) | 0.604 (39) | 0.181 (47) | 0.854 (41) | 0.171 (43)
    Cathrin Elich, Francis Engelmann, Jonas Schult, Theodora Kontogianni, Bastian Leibe: 3D-BEVIS: Birds-Eye-View Instance Segmentation.
tmp | | 0.248 (45) | 0.667 (33) | 0.437 (44) | 0.188 (43) | 0.153 (45) | 0.491 (43) | 0.000 (41) | 0.208 (43) | 0.094 (42) | 0.153 (44) | 0.099 (45) | 0.057 (42) | 0.217 (44) | 0.119 (45) | 0.039 (45) | 0.466 (44) | 0.302 (41) | 0.640 (45) | 0.140 (45)
ASIS | | 0.199 (47) | 0.333 (45) | 0.253 (48) | 0.167 (46) | 0.140 (46) | 0.438 (46) | 0.000 (41) | 0.177 (45) | 0.008 (46) | 0.121 (46) | 0.069 (46) | 0.004 (48) | 0.231 (42) | 0.429 (42) | 0.036 (47) | 0.445 (46) | 0.273 (44) | 0.333 (48) | 0.119 (47)
Sgpn_scannet | | 0.143 (48) | 0.208 (49) | 0.390 (47) | 0.169 (45) | 0.065 (47) | 0.275 (48) | 0.029 (34) | 0.069 (46) | 0.000 (48) | 0.087 (48) | 0.043 (47) | 0.014 (47) | 0.027 (49) | 0.000 (48) | 0.112 (44) | 0.351 (48) | 0.168 (48) | 0.438 (47) | 0.138 (46)
MaskRCNN 2d->3d Proj | | 0.058 (49) | 0.333 (45) | 0.002 (49) | 0.000 (49) | 0.053 (48) | 0.002 (49) | 0.002 (40) | 0.021 (49) | 0.000 (48) | 0.045 (49) | 0.024 (49) | 0.238 (32) | 0.065 (48) | 0.000 (48) | 0.014 (48) | 0.107 (49) | 0.020 (49) | 0.110 (49) | 0.006 (49)