The 3D semantic instance prediction task involves detecting and segmenting the objects in a 3D scan mesh.

Evaluation and metrics

Our evaluation ranks all methods according to the average precision for each class. We report the mean average precision (AP) at an overlap of 0.25 (AP 25%), at an overlap of 0.5 (AP 50%), and averaged over overlaps from 0.5 to 0.95 in steps of 0.05 (AP). Note that multiple predictions of the same ground truth instance are penalized as false positives.
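As a rough illustration of the metric described above, the following Python sketch computes a simplified, non-interpolated AP for one class at a single overlap threshold and then averages it over a range of thresholds. It is not the official evaluation script. The function names, the representation of each instance as a set of point indices, the use of intersection over union as the overlap measure, and the greedy matching in order of confidence are all assumptions made for the example.

```python
from typing import List, Set, Tuple

Prediction = Tuple[float, Set[int]]  # (confidence, point indices), hypothetical format


def iou(a: Set[int], b: Set[int]) -> float:
    """Intersection over union of two point-index sets (assumed overlap measure)."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def average_precision(preds: List[Prediction], gts: List[Set[int]], thresh: float) -> float:
    """Simplified AP for one class: mean of the precision at each true positive."""
    if not gts:
        return 0.0
    matched = [False] * len(gts)
    true_positives = 0
    precision_sum = 0.0
    ranked = sorted(preds, key=lambda p: p[0], reverse=True)
    for rank, (_, mask) in enumerate(ranked, start=1):
        # Find the ground-truth instance that overlaps this prediction the most.
        best_j, best_iou = -1, 0.0
        for j, gt in enumerate(gts):
            overlap = iou(mask, gt)
            if overlap > best_iou:
                best_j, best_iou = j, overlap
        if best_j >= 0 and best_iou >= thresh and not matched[best_j]:
            matched[best_j] = True
            true_positives += 1
            precision_sum += true_positives / rank
        # Otherwise the prediction is a false positive. This includes a second
        # prediction of an already matched ground-truth instance.
    return precision_sum / len(gts)


def ap_over_range(preds: List[Prediction], gts: List[Set[int]],
                  start: float = 0.5, stop: float = 0.95, step: float = 0.05) -> float:
    """Average the simplified AP over thresholds start, start + step, ..., stop."""
    n = int(round((stop - start) / step)) + 1
    thresholds = [round(start + i * step, 2) for i in range(n)]
    return sum(average_precision(preds, gts, t) for t in thresholds) / len(thresholds)


# Toy scene: two ground-truth instances and three predictions.
# The second prediction duplicates the first ground-truth instance.
gts = [{0, 1, 2, 3}, {10, 11, 12, 13}]
preds = [(0.9, {0, 1, 2, 3}), (0.8, {0, 1, 2}), (0.7, {10, 11, 12, 13})]

ap_25 = average_precision(preds, gts, 0.25)
ap_50 = average_precision(preds, gts, 0.5)
ap = ap_over_range(preds, gts)
print(round(ap_25, 3), round(ap_50, 3), round(ap, 3))  # 0.833 0.833 0.833
```

In this toy scene the duplicate prediction lowers the AP from 1.0 to about 0.833, because it counts as a false positive ranked ahead of the second correct prediction. Averaging the per-class values over all classes would give the "avg ap" figure.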



This table lists the benchmark results for the 3D semantic instance scenario. Each entry gives a method's AP followed by its rank in that column.




Rows are sorted by the picture column.
Method | avg ap | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | window
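Competitor-MAFT | 0.618 (2) | 0.866 (16) | 0.724 (1) | 0.628 (1) | 0.484 (4) | 0.803 (2) | 0.300 (7) | 0.509 (32) | 0.496 (1) | 0.539 (1) | 0.547 (6) | 0.703 (1) | 0.668 (8) | 0.708 (32) | 0.463 (17) | 0.708 (16) | 0.595 (4) | 0.959 (6) | 0.418 (8)
SIM3D | 0.617 (3) | 0.952 (4) | 0.629 (17) | 0.539 (11) | 0.426 (16) | 0.768 (11) | 0.302 (6) | 0.681 (2) | 0.425 (9) | 0.473 (16) | 0.511 (16) | 0.701 (2) | 0.717 (1) | 0.821 (6) | 0.467 (15) | 0.774 (1) | 0.559 (15) | 0.914 (18) | 0.448 (3)
EV3D | 0.615 (5) | 0.946 (5) | 0.652 (13) | 0.555 (6) | 0.433 (14) | 0.773 (6) | 0.271 (13) | 0.604 (8) | 0.447 (5) | 0.506 (7) | 0.544 (7) | 0.698 (3) | 0.716 (2) | 0.775 (16) | 0.480 (9) | 0.747 (5) | 0.572 (13) | 0.925 (14) | 0.435 (6)
Spherical Mask(CtoF) | 0.616 (4) | 0.946 (5) | 0.654 (12) | 0.555 (6) | 0.434 (13) | 0.769 (10) | 0.271 (12) | 0.604 (8) | 0.447 (5) | 0.505 (8) | 0.549 (3) | 0.698 (3) | 0.716 (2) | 0.775 (16) | 0.480 (9) | 0.747 (5) | 0.575 (11) | 0.925 (14) | 0.436 (5)
DCD | 0.614 (6) | 0.892 (13) | 0.633 (16) | 0.434 (28) | 0.495 (3) | 0.810 (1) | 0.292 (8) | 0.501 (33) | 0.408 (10) | 0.525 (4) | 0.582 (1) | 0.688 (5) | 0.625 (10) | 0.801 (8) | 0.608 (1) | 0.672 (20) | 0.649 (1) | 0.965 (3) | 0.476 (1)
MG-Former | 0.587 (10) | 0.852 (17) | 0.639 (15) | 0.454 (22) | 0.393 (21) | 0.758 (14) | 0.338 (2) | 0.572 (16) | 0.480 (3) | 0.527 (3) | 0.491 (22) | 0.671 (6) | 0.527 (22) | 0.867 (1) | 0.485 (6) | 0.601 (31) | 0.590 (8) | 0.938 (12) | 0.390 (12)
PointRel | 0.622 (1) | 0.926 (8) | 0.710 (3) | 0.541 (10) | 0.502 (2) | 0.772 (7) | 0.314 (4) | 0.598 (11) | 0.425 (8) | 0.504 (10) | 0.565 (2) | 0.650 (7) | 0.716 (2) | 0.809 (7) | 0.476 (12) | 0.747 (4) | 0.618 (2) | 0.963 (4) | 0.364 (20)
Relation3D (PointRel): Enhancing Relation Modeling for Point Cloud Instance Segmentation.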
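UniPerception | 0.588 (9) | 0.963 (3) | 0.667 (10) | 0.493 (15) | 0.472 (7) | 0.750 (15) | 0.229 (19) | 0.528 (27) | 0.468 (4) | 0.498 (13) | 0.542 (8) | 0.643 (8) | 0.530 (21) | 0.661 (39) | 0.463 (16) | 0.695 (18) | 0.599 (3) | 0.972 (1) | 0.420 (7)
KmaxOneFormerNet [permissive] | 0.581 (13) | 0.745 (28) | 0.692 (8) | 0.551 (8) | 0.458 (10) | 0.798 (3) | 0.264 (15) | 0.531 (26) | 0.369 (19) | 0.513 (5) | 0.531 (14) | 0.632 (9) | 0.494 (25) | 0.798 (9) | 0.567 (3) | 0.648 (24) | 0.558 (17) | 0.950 (9) | 0.362 (22)
ISBNet [permissive] | 0.559 (19) | 0.939 (7) | 0.655 (11) | 0.383 (40) | 0.426 (17) | 0.763 (13) | 0.180 (21) | 0.534 (25) | 0.386 (13) | 0.499 (11) | 0.509 (18) | 0.621 (10) | 0.427 (39) | 0.704 (34) | 0.467 (14) | 0.649 (23) | 0.571 (14) | 0.948 (10) | 0.401 (9)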
Tuan Duc Ngo, Binh-Son Hua, Khoi Nguyen: ISBNet: a 3D Point Cloud Instance Segmentation Network with Instance-aware Sampling and Box-aware Dynamic Convolution. CVPR 2023
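MAFT | 0.596 (8) | 0.889 (14) | 0.721 (2) | 0.448 (23) | 0.460 (9) | 0.768 (12) | 0.251 (16) | 0.558 (21) | 0.408 (11) | 0.504 (9) | 0.539 (9) | 0.616 (11) | 0.618 (12) | 0.858 (3) | 0.482 (8) | 0.684 (19) | 0.551 (18) | 0.931 (13) | 0.450 (2)
GraphCut | 0.552 (20) | 1.000 (1) | 0.611 (19) | 0.438 (27) | 0.392 (22) | 0.714 (23) | 0.139 (25) | 0.598 (13) | 0.327 (23) | 0.389 (23) | 0.510 (17) | 0.598 (12) | 0.427 (40) | 0.754 (21) | 0.463 (18) | 0.761 (3) | 0.588 (9) | 0.903 (23) | 0.329 (30)
ExtMask3D | 0.598 (7) | 0.852 (17) | 0.692 (7) | 0.433 (31) | 0.461 (8) | 0.791 (4) | 0.264 (14) | 0.488 (36) | 0.493 (2) | 0.508 (6) | 0.528 (15) | 0.594 (13) | 0.706 (6) | 0.791 (10) | 0.483 (7) | 0.734 (9) | 0.595 (5) | 0.911 (20) | 0.437 (4)
TST3D | 0.569 (16) | 0.778 (25) | 0.675 (9) | 0.598 (3) | 0.451 (11) | 0.727 (22) | 0.280 (10) | 0.476 (39) | 0.395 (12) | 0.472 (17) | 0.457 (28) | 0.583 (14) | 0.580 (16) | 0.777 (13) | 0.462 (19) | 0.735 (8) | 0.547 (20) | 0.919 (17) | 0.333 (28)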
Duc Tran Dang Trung, Byeongkeun Kang, Yeejin Lee: MSTA3D: Multi-scale Twin-attention for 3D Instance Segmentation. ACM Multimedia 2024
DKNet | 0.532 (22) | 0.815 (21) | 0.624 (18) | 0.517 (12) | 0.377 (24) | 0.749 (16) | 0.107 (27) | 0.509 (31) | 0.304 (25) | 0.437 (21) | 0.475 (23) | 0.581 (15) | 0.539 (19) | 0.775 (15) | 0.339 (30) | 0.640 (27) | 0.506 (24) | 0.901 (24) | 0.385 (15)
Yizheng Wu, Min Shi, Shuaiyuan Du, Hao Lu, Zhiguo Cao, Weicai Zhong: 3D Instances as 1D Kernels. ECCV 2022
InsSSM | 0.586 (11) | 1.000 (1) | 0.593 (21) | 0.440 (26) | 0.480 (5) | 0.771 (8) | 0.345 (1) | 0.437 (40) | 0.444 (7) | 0.495 (14) | 0.548 (5) | 0.579 (16) | 0.621 (11) | 0.720 (28) | 0.409 (23) | 0.712 (11) | 0.593 (6) | 0.960 (5) | 0.395 (10)
Lei Yao, Yi Wang, Moyun Liu, Lap-Pui Chau: SGIFormer: Semantic-guided and Geometric-enhanced Interleaving Transformer for 3D Instance Segmentation. TCSVT, 2024
SSTNet [permissive] | 0.506 (25) | 0.738 (32) | 0.549 (32) | 0.497 (14) | 0.316 (36) | 0.693 (27) | 0.178 (22) | 0.377 (50) | 0.198 (39) | 0.330 (29) | 0.463 (27) | 0.576 (17) | 0.515 (23) | 0.857 (4) | 0.494 (4) | 0.637 (28) | 0.457 (30) | 0.943 (11) | 0.290 (39)
Zhihao Liang, Zhihao Li, Songcen Xu, Mingkui Tan, Kui Jia: Instance Segmentation in 3D Scenes using Semantic Superpoint Tree Networks. ICCV 2021
OneFormer3D [copyleft] | 0.566 (17) | 0.781 (24) | 0.697 (6) | 0.562 (5) | 0.431 (15) | 0.770 (9) | 0.331 (3) | 0.400 (46) | 0.373 (18) | 0.529 (2) | 0.504 (20) | 0.568 (18) | 0.475 (29) | 0.732 (26) | 0.470 (13) | 0.762 (2) | 0.550 (19) | 0.871 (35) | 0.379 (17)
Maxim Kolodiazhnyi, Anna Vorontsova, Anton Konushin, Danila Rukhovich: OneFormer3D: One Transformer for Unified Point Cloud Segmentation.
Mask3D | 0.566 (17) | 0.926 (8) | 0.597 (20) | 0.408 (34) | 0.420 (18) | 0.737 (18) | 0.239 (17) | 0.598 (11) | 0.386 (14) | 0.458 (20) | 0.549 (3) | 0.568 (19) | 0.716 (2) | 0.601 (45) | 0.480 (9) | 0.646 (25) | 0.575 (11) | 0.922 (16) | 0.364 (19)
Jonas Schult, Francis Engelmann, Alexander Hermans, Or Litany, Siyu Tang, Bastian Leibe: Mask3D for 3D Semantic Instance Segmentation. ICRA 2023
SPFormer [permissive] | 0.549 (21) | 0.745 (28) | 0.640 (14) | 0.484 (16) | 0.395 (20) | 0.739 (17) | 0.311 (5) | 0.566 (18) | 0.335 (21) | 0.468 (18) | 0.492 (21) | 0.555 (20) | 0.478 (28) | 0.747 (23) | 0.436 (20) | 0.712 (12) | 0.540 (21) | 0.893 (27) | 0.343 (27)
Sun Jiahao, Qing Chunmei, Tan Junpeng, Xu Xiangmin: Superpoint Transformer for 3D Scene Instance Segmentation. AAAI 2023 [Oral]
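Competitor-SPFormer | 0.580 (14) | 0.721 (35) | 0.705 (4) | 0.593 (4) | 0.444 (12) | 0.786 (5) | 0.286 (9) | 0.564 (19) | 0.376 (16) | 0.498 (12) | 0.534 (12) | 0.546 (21) | 0.390 (45) | 0.785 (12) | 0.577 (2) | 0.708 (15) | 0.579 (10) | 0.954 (8) | 0.388 (13)
PBNet [permissive] | 0.573 (15) | 0.926 (8) | 0.575 (27) | 0.619 (2) | 0.472 (6) | 0.736 (19) | 0.239 (18) | 0.487 (37) | 0.383 (15) | 0.459 (19) | 0.506 (19) | 0.533 (22) | 0.585 (14) | 0.767 (18) | 0.404 (24) | 0.717 (10) | 0.559 (16) | 0.969 (2) | 0.381 (16)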
Weiguang Zhao, Yuyao Yan, Chaolong Yang, Jianan Ye, Xi Yang, Kaizhu Huang: Divide and Conquer: 3D Instance Segmentation With Point-Wise Binarization. ICCV 2023
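Queryformer | 0.583 (12) | 0.926 (8) | 0.702 (5) | 0.393 (37) | 0.504 (1) | 0.733 (21) | 0.276 (11) | 0.527 (28) | 0.373 (17) | 0.479 (15) | 0.534 (11) | 0.533 (23) | 0.697 (7) | 0.720 (29) | 0.436 (21) | 0.745 (7) | 0.592 (7) | 0.958 (7) | 0.363 (21)
IPCA-Inst | 0.520 (23) | 0.889 (14) | 0.551 (31) | 0.548 (9) | 0.418 (19) | 0.665 (33) | 0.064 (36) | 0.585 (14) | 0.260 (33) | 0.277 (38) | 0.471 (25) | 0.500 (24) | 0.644 (9) | 0.785 (11) | 0.369 (26) | 0.591 (35) | 0.511 (22) | 0.878 (32) | 0.362 (23)
ODIN - Ins [permissive] | 0.463 (33) | 0.738 (32) | 0.589 (22) | 0.344 (46) | 0.358 (32) | 0.560 (52) | 0.139 (24) | 0.393 (49) | 0.331 (22) | 0.373 (27) | 0.392 (37) | 0.496 (25) | 0.493 (26) | 0.709 (31) | 0.377 (25) | 0.599 (33) | 0.359 (47) | 0.752 (55) | 0.332 (29)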
Ayush Jain, Pushkal Katara, Nikolaos Gkanatsios, Adam W. Harley, Gabriel Sarch, Kriti Aggarwal, Vishrav Chaudhary, Katerina Fragkiadaki: ODIN: A Single Model for 2D and 3D Segmentation. CVPR 2024
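DANCENET | 0.504 (26) | 0.926 (8) | 0.579 (23) | 0.472 (18) | 0.367 (27) | 0.626 (43) | 0.165 (23) | 0.432 (41) | 0.221 (35) | 0.408 (22) | 0.449 (30) | 0.411 (26) | 0.564 (17) | 0.746 (24) | 0.421 (22) | 0.707 (17) | 0.438 (33) | 0.846 (43) | 0.288 (40)
One_Thing_One_Click [permissive] | 0.326 (56) | 0.472 (62) | 0.361 (57) | 0.232 (57) | 0.183 (57) | 0.555 (53) | 0.000 (69) | 0.498 (34) | 0.038 (64) | 0.195 (56) | 0.226 (58) | 0.362 (27) | 0.168 (63) | 0.469 (57) | 0.251 (43) | 0.553 (46) | 0.335 (53) | 0.846 (44) | 0.117 (64)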
Zhengzhe Liu, Xiaojuan Qi, Chi-Wing Fu: One Thing One Click: A Self-Training Approach for Weakly Supervised 3D Semantic Segmentation. CVPR 2021
SoftGroup++ | 0.513 (24) | 0.704 (37) | 0.578 (26) | 0.398 (36) | 0.363 (30) | 0.704 (24) | 0.061 (37) | 0.647 (5) | 0.297 (30) | 0.378 (26) | 0.537 (10) | 0.343 (28) | 0.614 (13) | 0.828 (5) | 0.295 (35) | 0.710 (14) | 0.505 (26) | 0.875 (34) | 0.394 (11)
OccuSeg+instance | 0.486 (29) | 0.802 (23) | 0.536 (34) | 0.428 (32) | 0.369 (26) | 0.702 (25) | 0.205 (20) | 0.331 (55) | 0.301 (27) | 0.379 (25) | 0.474 (24) | 0.327 (29) | 0.437 (34) | 0.862 (2) | 0.485 (5) | 0.601 (32) | 0.394 (41) | 0.846 (45) | 0.273 (43)
Lei Han, Tian Zheng, Lan Xu, Lu Fang: OccuSeg: Occupancy-aware 3D Instance Segmentation. CVPR 2020
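TopoSeg | 0.479 (30) | 0.704 (37) | 0.564 (28) | 0.467 (20) | 0.366 (28) | 0.633 (41) | 0.068 (34) | 0.554 (22) | 0.262 (32) | 0.328 (30) | 0.447 (31) | 0.323 (30) | 0.534 (20) | 0.722 (27) | 0.288 (37) | 0.614 (29) | 0.482 (28) | 0.912 (19) | 0.358 (25)
Box2Mask | 0.433 (38) | 0.741 (30) | 0.463 (51) | 0.433 (30) | 0.283 (41) | 0.625 (44) | 0.103 (28) | 0.298 (60) | 0.125 (50) | 0.260 (40) | 0.424 (33) | 0.322 (31) | 0.472 (30) | 0.701 (35) | 0.363 (28) | 0.711 (13) | 0.309 (59) | 0.882 (30) | 0.272 (45)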
Julian Chibane, Francis Engelmann, Tuan Anh Tran, Gerard Pons-Moll: Box2Mask: Weakly Supervised 3D Semantic Instance Segmentation Using Bounding Boxes. ECCV 2022
SoftGroup [permissive] | 0.504 (26) | 0.667 (44) | 0.579 (24) | 0.372 (42) | 0.381 (23) | 0.694 (26) | 0.072 (33) | 0.677 (3) | 0.303 (26) | 0.387 (24) | 0.531 (13) | 0.319 (32) | 0.582 (15) | 0.754 (20) | 0.318 (31) | 0.643 (26) | 0.492 (27) | 0.907 (22) | 0.388 (14)
Thang Vu, Kookhoi Kim, Tung M. Luu, Xuan Thanh Nguyen, Chang D. Yoo: SoftGroup for 3D Instance Segmentation on Point Clouds. CVPR 2022 [Oral]
Mask-Group | 0.434 (37) | 0.778 (25) | 0.516 (39) | 0.471 (19) | 0.330 (34) | 0.658 (34) | 0.029 (48) | 0.526 (29) | 0.249 (34) | 0.256 (41) | 0.400 (35) | 0.309 (33) | 0.384 (48) | 0.296 (65) | 0.368 (27) | 0.575 (39) | 0.425 (36) | 0.877 (33) | 0.362 (24)
Min Zhong, Xinghao Chen, Xiaokang Chen, Gang Zeng, Yunhe Wang: MaskGroup: Hierarchical Point Grouping and Masking for 3D Instance Segmentation. ICME 2022
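SSEC | 0.465 (32) | 0.667 (44) | 0.578 (25) | 0.502 (13) | 0.362 (31) | 0.641 (40) | 0.035 (46) | 0.605 (7) | 0.291 (31) | 0.323 (32) | 0.451 (29) | 0.296 (34) | 0.417 (43) | 0.677 (37) | 0.245 (44) | 0.501 (53) | 0.506 (25) | 0.900 (25) | 0.366 (18)
HAIS [permissive] | 0.457 (34) | 0.704 (37) | 0.561 (29) | 0.457 (21) | 0.364 (29) | 0.673 (30) | 0.046 (45) | 0.547 (23) | 0.194 (40) | 0.308 (33) | 0.426 (32) | 0.288 (35) | 0.454 (32) | 0.711 (30) | 0.262 (41) | 0.563 (43) | 0.434 (35) | 0.889 (29) | 0.344 (26)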
Shaoyu Chen, Jiemin Fang, Qian Zhang, Wenyu Liu, Xinggang Wang: Hierarchical Aggregation for 3D Instance Segmentation. ICCV 2021
TD3D [permissive] | 0.489 (28) | 0.852 (17) | 0.511 (41) | 0.434 (29) | 0.322 (35) | 0.735 (20) | 0.101 (30) | 0.512 (30) | 0.355 (20) | 0.349 (28) | 0.468 (26) | 0.283 (36) | 0.514 (24) | 0.676 (38) | 0.268 (40) | 0.671 (21) | 0.510 (23) | 0.908 (21) | 0.329 (31)
Maksim Kolodiazhnyi, Anna Vorontsova, Anton Konushin, Danila Rukhovich: Top-Down Beats Bottom-Up in 3D Instance Segmentation. WACV 2024
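DualGroup | 0.469 (31) | 0.815 (21) | 0.552 (30) | 0.398 (35) | 0.374 (25) | 0.683 (29) | 0.130 (26) | 0.539 (24) | 0.310 (24) | 0.327 (31) | 0.407 (34) | 0.276 (37) | 0.447 (33) | 0.535 (49) | 0.342 (29) | 0.659 (22) | 0.455 (31) | 0.900 (26) | 0.301 (35)
RPGN | 0.428 (39) | 0.630 (52) | 0.508 (43) | 0.367 (43) | 0.249 (48) | 0.658 (35) | 0.016 (56) | 0.673 (4) | 0.131 (48) | 0.234 (45) | 0.383 (40) | 0.270 (38) | 0.434 (35) | 0.748 (22) | 0.274 (39) | 0.609 (30) | 0.406 (38) | 0.842 (47) | 0.267 (46)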
Shichao Dong, Guosheng Lin, Tzu-Yi Hung: Learning Regional Purity for Instance Segmentation on 3D Point Clouds. ECCV 2022
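RWSeg | 0.348 (54) | 0.475 (61) | 0.456 (52) | 0.320 (50) | 0.275 (44) | 0.476 (57) | 0.020 (54) | 0.491 (35) | 0.056 (62) | 0.212 (52) | 0.320 (46) | 0.261 (39) | 0.302 (53) | 0.520 (50) | 0.182 (53) | 0.557 (45) | 0.285 (61) | 0.867 (36) | 0.197 (53)
INS-Conv-instance | 0.435 (36) | 0.716 (36) | 0.495 (46) | 0.355 (44) | 0.331 (33) | 0.689 (28) | 0.102 (29) | 0.394 (48) | 0.208 (38) | 0.280 (36) | 0.395 (36) | 0.250 (40) | 0.544 (18) | 0.741 (25) | 0.309 (33) | 0.536 (49) | 0.391 (42) | 0.842 (48) | 0.258 (47)
ClickSeg_Instance | 0.366 (50) | 0.654 (48) | 0.375 (55) | 0.184 (60) | 0.302 (39) | 0.592 (49) | 0.050 (43) | 0.300 (59) | 0.093 (54) | 0.283 (34) | 0.277 (51) | 0.249 (41) | 0.426 (42) | 0.615 (44) | 0.299 (34) | 0.504 (52) | 0.367 (45) | 0.832 (49) | 0.191 (54)
PE | 0.396 (43) | 0.667 (44) | 0.467 (49) | 0.446 (25) | 0.243 (49) | 0.624 (46) | 0.022 (52) | 0.577 (15) | 0.106 (52) | 0.219 (48) | 0.340 (43) | 0.239 (42) | 0.487 (27) | 0.475 (56) | 0.225 (46) | 0.541 (48) | 0.350 (49) | 0.818 (50) | 0.273 (44)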
Biao Zhang, Peter Wonka: Point Cloud Instance Segmentation using Probabilistic Embeddings. CVPR 2021
DENet | 0.413 (40) | 0.741 (30) | 0.520 (36) | 0.237 (56) | 0.284 (40) | 0.523 (55) | 0.097 (31) | 0.691 (1) | 0.138 (45) | 0.209 (55) | 0.229 (57) | 0.238 (43) | 0.390 (46) | 0.707 (33) | 0.310 (32) | 0.448 (60) | 0.470 (29) | 0.892 (28) | 0.310 (33)
Dyco3D [copyleft] | 0.395 (44) | 0.642 (50) | 0.518 (38) | 0.447 (24) | 0.259 (47) | 0.666 (32) | 0.050 (42) | 0.251 (65) | 0.166 (42) | 0.231 (46) | 0.362 (41) | 0.232 (44) | 0.331 (51) | 0.535 (48) | 0.229 (45) | 0.587 (36) | 0.438 (34) | 0.850 (40) | 0.317 (32)
Tong He, Chunhua Shen, Anton van den Hengel: DyCo3D: Robust Instance Segmentation of 3D Point Clouds through Dynamic Convolution. CVPR 2021
SSEN | 0.384 (47) | 0.852 (17) | 0.494 (47) | 0.192 (59) | 0.226 (52) | 0.648 (38) | 0.022 (51) | 0.398 (47) | 0.299 (29) | 0.277 (37) | 0.317 (47) | 0.231 (45) | 0.194 (62) | 0.514 (52) | 0.196 (49) | 0.586 (37) | 0.444 (32) | 0.843 (46) | 0.184 (55)
Dongsu Zhang, Junha Chun, Sang Kyun Cha, Young Min Kim: Spatial Semantic Embedding Network: Fast 3D Instance Segmentation with Deep Metric Learning. arXiv
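AOIA | 0.387 (46) | 0.704 (37) | 0.515 (40) | 0.385 (39) | 0.225 (53) | 0.669 (31) | 0.005 (63) | 0.482 (38) | 0.126 (49) | 0.181 (58) | 0.269 (54) | 0.221 (46) | 0.426 (41) | 0.478 (55) | 0.218 (47) | 0.592 (34) | 0.371 (44) | 0.851 (39) | 0.242 (49)
PointGroup | 0.407 (41) | 0.639 (51) | 0.496 (45) | 0.415 (33) | 0.243 (50) | 0.645 (39) | 0.021 (53) | 0.570 (17) | 0.114 (51) | 0.211 (53) | 0.359 (42) | 0.217 (47) | 0.428 (38) | 0.660 (40) | 0.256 (42) | 0.562 (44) | 0.341 (51) | 0.860 (38) | 0.291 (37)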
Li Jiang, Hengshuang Zhao, Shaoshuai Shi, Shu Liu, Chi-Wing Fu, Jiaya Jia: PointGroup: Dual-Set Point Grouping for 3D Instance Segmentation. CVPR 2020 [oral]
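CSC-Pretrained | 0.405 (42) | 0.738 (32) | 0.465 (50) | 0.331 (49) | 0.205 (54) | 0.655 (36) | 0.051 (41) | 0.601 (10) | 0.092 (55) | 0.211 (54) | 0.329 (45) | 0.198 (48) | 0.459 (31) | 0.775 (14) | 0.195 (51) | 0.524 (51) | 0.400 (40) | 0.878 (31) | 0.184 (56)
DD-UNet+Group | 0.436 (35) | 0.630 (52) | 0.508 (44) | 0.480 (17) | 0.310 (38) | 0.624 (45) | 0.065 (35) | 0.638 (6) | 0.174 (41) | 0.256 (42) | 0.384 (39) | 0.194 (49) | 0.428 (37) | 0.759 (19) | 0.289 (36) | 0.574 (40) | 0.400 (39) | 0.849 (42) | 0.291 (38)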
H. Liu, R. Liu, K. Yang, J. Zhang, K. Peng, R. Stiefelhagen: HIDA: Towards Holistic Indoor Understanding for the Visually Impaired via Semantic Instance Segmentation with a Wearable Solid-State LiDAR Sensor. ICCVW 2021
3D-MPA | 0.355 (52) | 0.457 (64) | 0.484 (48) | 0.299 (51) | 0.277 (43) | 0.591 (50) | 0.047 (44) | 0.332 (53) | 0.212 (37) | 0.217 (49) | 0.278 (50) | 0.193 (50) | 0.413 (44) | 0.410 (59) | 0.195 (50) | 0.574 (41) | 0.352 (48) | 0.849 (41) | 0.213 (52)
Francis Engelmann, Martin Bokeloh, Alireza Fathi, Bastian Leibe, Matthias Nießner: 3D-MPA: Multi Proposal Aggregation for 3D Semantic Instance Segmentation. CVPR 2020
PanopticFusion-inst | 0.214 (65) | 0.250 (71) | 0.330 (61) | 0.275 (54) | 0.103 (66) | 0.228 (74) | 0.000 (69) | 0.345 (52) | 0.024 (66) | 0.088 (67) | 0.203 (61) | 0.186 (51) | 0.167 (64) | 0.367 (62) | 0.125 (63) | 0.221 (72) | 0.112 (74) | 0.666 (63) | 0.162 (59)
Gaku Narita, Takashi Seno, Tomoya Ishikawa, Yohsuke Kaji: PanopticFusion: Online Volumetric Semantic Mapping at the Level of Stuff and Things. IROS 2019 (to appear)
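PCJC | 0.375 (49) | 0.704 (37) | 0.542 (33) | 0.284 (53) | 0.197 (56) | 0.649 (37) | 0.006 (60) | 0.426 (42) | 0.138 (46) | 0.242 (43) | 0.304 (49) | 0.183 (52) | 0.388 (47) | 0.629 (42) | 0.141 (62) | 0.546 (47) | 0.344 (50) | 0.738 (57) | 0.283 (41)
Mask3D_evaluation | 0.382 (48) | 0.593 (54) | 0.520 (37) | 0.390 (38) | 0.314 (37) | 0.600 (47) | 0.018 (55) | 0.287 (63) | 0.151 (44) | 0.281 (35) | 0.387 (38) | 0.169 (53) | 0.429 (36) | 0.654 (41) | 0.172 (55) | 0.578 (38) | 0.384 (43) | 0.670 (62) | 0.278 (42)
SphereSeg | 0.357 (51) | 0.651 (49) | 0.411 (53) | 0.345 (45) | 0.264 (46) | 0.630 (42) | 0.059 (38) | 0.289 (62) | 0.212 (36) | 0.240 (44) | 0.336 (44) | 0.158 (54) | 0.305 (52) | 0.557 (46) | 0.159 (58) | 0.455 (59) | 0.341 (52) | 0.726 (59) | 0.294 (36)
OSIS | 0.392 (45) | 0.778 (25) | 0.530 (35) | 0.220 (58) | 0.278 (42) | 0.567 (51) | 0.083 (32) | 0.330 (56) | 0.299 (28) | 0.270 (39) | 0.310 (48) | 0.143 (55) | 0.260 (55) | 0.624 (43) | 0.277 (38) | 0.568 (42) | 0.361 (46) | 0.865 (37) | 0.301 (34)
MASC [permissive] | 0.254 (61) | 0.463 (63) | 0.249 (67) | 0.113 (63) | 0.167 (58) | 0.412 (63) | 0.000 (68) | 0.374 (51) | 0.073 (56) | 0.173 (61) | 0.243 (55) | 0.130 (56) | 0.228 (58) | 0.368 (61) | 0.160 (57) | 0.356 (64) | 0.208 (64) | 0.711 (61) | 0.136 (62)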
Chen Liu, Yasutaka Furukawa: MASC: Multi-scale Affinity with Sparse Convolution for 3D Instance Segmentation.
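GICN | 0.341 (55) | 0.580 (56) | 0.371 (56) | 0.344 (47) | 0.198 (55) | 0.469 (58) | 0.052 (40) | 0.564 (20) | 0.093 (53) | 0.212 (51) | 0.212 (59) | 0.127 (57) | 0.347 (50) | 0.537 (47) | 0.206 (48) | 0.525 (50) | 0.329 (54) | 0.729 (58) | 0.241 (50)
MaskRCNN 2d->3d Proj | 0.022 (77) | 0.185 (74) | 0.000 (77) | 0.000 (77) | 0.015 (75) | 0.000 (77) | 0.000 (67) | 0.006 (77) | 0.000 (76) | 0.010 (77) | 0.006 (77) | 0.107 (58) | 0.012 (76) | 0.000 (76) | 0.002 (77) | 0.027 (77) | 0.004 (77) | 0.022 (77) | 0.001 (77)
Occipital-SCS | 0.320 (57) | 0.679 (43) | 0.352 (58) | 0.334 (48) | 0.229 (51) | 0.436 (59) | 0.025 (49) | 0.412 (45) | 0.058 (60) | 0.161 (63) | 0.240 (56) | 0.085 (59) | 0.262 (54) | 0.496 (54) | 0.187 (52) | 0.467 (57) | 0.328 (55) | 0.775 (52) | 0.231 (51)
SegGroup_ins [permissive] | 0.246 (64) | 0.556 (58) | 0.335 (60) | 0.062 (71) | 0.115 (64) | 0.490 (56) | 0.000 (69) | 0.297 (61) | 0.018 (68) | 0.186 (57) | 0.142 (63) | 0.083 (60) | 0.233 (56) | 0.216 (67) | 0.153 (60) | 0.469 (56) | 0.251 (62) | 0.744 (56) | 0.083 (67)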
An Tao, Yueqi Duan, Yi Wei, Jiwen Lu, Jie Zhou: SegGroup: Seg-Level Supervision for 3D Instance and Semantic Segmentation. TIP 2022
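Sparse R-CNN | 0.292 (58) | 0.704 (37) | 0.213 (68) | 0.153 (62) | 0.154 (59) | 0.551 (54) | 0.053 (39) | 0.212 (66) | 0.132 (47) | 0.174 (60) | 0.274 (52) | 0.070 (61) | 0.363 (49) | 0.441 (58) | 0.176 (54) | 0.424 (62) | 0.234 (63) | 0.758 (54) | 0.161 (60)
MTML | 0.282 (59) | 0.577 (57) | 0.380 (54) | 0.182 (61) | 0.107 (65) | 0.430 (60) | 0.001 (66) | 0.422 (43) | 0.057 (61) | 0.179 (59) | 0.162 (62) | 0.070 (62) | 0.229 (57) | 0.511 (53) | 0.161 (56) | 0.491 (54) | 0.313 (56) | 0.650 (65) | 0.162 (58)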
Jean Lahoud, Bernard Ghanem, Marc Pollefeys, Martin R. Oswald: 3D Instance Segmentation via Multi-task Metric Learning. ICCV 2019 [oral]
3D-BoNet | 0.253 (62) | 0.519 (59) | 0.324 (62) | 0.251 (55) | 0.137 (62) | 0.345 (68) | 0.031 (47) | 0.419 (44) | 0.069 (57) | 0.162 (62) | 0.131 (64) | 0.052 (63) | 0.202 (61) | 0.338 (63) | 0.147 (61) | 0.301 (69) | 0.303 (60) | 0.651 (64) | 0.178 (57)
Bo Yang, Jianan Wang, Ronald Clark, Qingyong Hu, Sen Wang, Andrew Markham, Niki Trigoni: Learning Object Bounding Boxes for 3D Instance Segmentation on Point Clouds. NeurIPS 2019 Spotlight
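UNet-backbone | 0.161 (66) | 0.519 (59) | 0.259 (66) | 0.084 (65) | 0.059 (68) | 0.325 (70) | 0.002 (64) | 0.093 (71) | 0.009 (70) | 0.077 (69) | 0.064 (68) | 0.045 (64) | 0.044 (72) | 0.161 (69) | 0.045 (68) | 0.331 (67) | 0.180 (66) | 0.566 (66) | 0.033 (76)
SALoss-ResNet | 0.262 (60) | 0.667 (44) | 0.335 (59) | 0.067 (69) | 0.123 (63) | 0.427 (61) | 0.022 (50) | 0.280 (64) | 0.058 (59) | 0.216 (50) | 0.211 (60) | 0.039 (65) | 0.142 (65) | 0.519 (51) | 0.106 (66) | 0.338 (66) | 0.310 (58) | 0.721 (60) | 0.138 (61)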
Zhidong Liang, Ming Yang, Hao Li, Chunxiang Wang: 3D Instance Embedding Learning With a Structure-Aware Loss Function for Point Cloud Segmentation. IEEE Robotics and Automation Letters (IROS2020)
3D-SIS [permissive] | 0.161 (66) | 0.407 (65) | 0.155 (73) | 0.068 (68) | 0.043 (72) | 0.346 (67) | 0.001 (65) | 0.134 (68) | 0.005 (71) | 0.088 (66) | 0.106 (67) | 0.037 (66) | 0.135 (67) | 0.321 (64) | 0.028 (72) | 0.339 (65) | 0.116 (73) | 0.466 (69) | 0.093 (66)
Ji Hou, Angela Dai, Matthias Niessner: 3D-SIS: 3D Semantic Instance Segmentation of RGB-D Scans. CVPR 2019
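Sem_Recon_ins | 0.098 (74) | 0.295 (70) | 0.187 (70) | 0.015 (76) | 0.036 (73) | 0.213 (75) | 0.005 (62) | 0.038 (73) | 0.003 (72) | 0.056 (71) | 0.037 (71) | 0.036 (67) | 0.015 (75) | 0.051 (72) | 0.044 (69) | 0.209 (74) | 0.098 (75) | 0.354 (72) | 0.071 (69)
SPG_WSIS | 0.251 (63) | 0.380 (66) | 0.274 (65) | 0.289 (52) | 0.144 (60) | 0.413 (62) | 0.000 (69) | 0.311 (57) | 0.065 (58) | 0.113 (65) | 0.130 (65) | 0.029 (68) | 0.204 (60) | 0.388 (60) | 0.108 (65) | 0.459 (58) | 0.311 (57) | 0.769 (53) | 0.127 (63)
Hier3D [copyleft] | 0.117 (71) | 0.222 (73) | 0.161 (72) | 0.054 (73) | 0.027 (74) | 0.289 (71) | 0.000 (69) | 0.124 (69) | 0.001 (75) | 0.079 (68) | 0.061 (69) | 0.027 (69) | 0.141 (66) | 0.240 (66) | 0.005 (76) | 0.310 (68) | 0.129 (70) | 0.153 (76) | 0.081 (68)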
Tan: HCFS3D: Hierarchical Coupled Feature Selection Network for 3D Semantic and Instance Segmentation.
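tmp | 0.113 (73) | 0.333 (68) | 0.151 (74) | 0.056 (72) | 0.053 (70) | 0.344 (69) | 0.000 (69) | 0.105 (70) | 0.016 (69) | 0.049 (73) | 0.035 (72) | 0.020 (70) | 0.053 (69) | 0.048 (73) | 0.013 (74) | 0.183 (75) | 0.173 (67) | 0.344 (73) | 0.054 (73)
SemRegionNet-20cls | 0.121 (70) | 0.296 (69) | 0.203 (69) | 0.071 (67) | 0.058 (69) | 0.349 (66) | 0.000 (69) | 0.150 (67) | 0.019 (67) | 0.054 (72) | 0.034 (73) | 0.017 (71) | 0.052 (70) | 0.042 (74) | 0.013 (75) | 0.209 (73) | 0.183 (65) | 0.371 (70) | 0.057 (72)
R-PointNet | 0.158 (68) | 0.356 (67) | 0.173 (71) | 0.113 (64) | 0.140 (61) | 0.359 (64) | 0.012 (57) | 0.023 (74) | 0.039 (63) | 0.134 (64) | 0.123 (66) | 0.008 (72) | 0.089 (68) | 0.149 (70) | 0.117 (64) | 0.221 (71) | 0.128 (71) | 0.563 (67) | 0.094 (65)
Sgpn_scannet | 0.049 (76) | 0.023 (77) | 0.134 (75) | 0.031 (74) | 0.013 (76) | 0.144 (76) | 0.006 (59) | 0.008 (76) | 0.000 (76) | 0.028 (76) | 0.017 (75) | 0.003 (73) | 0.009 (77) | 0.000 (76) | 0.021 (73) | 0.122 (76) | 0.095 (76) | 0.175 (74) | 0.054 (74)
3D-BEVIS | 0.117 (71) | 0.250 (71) | 0.308 (64) | 0.020 (75) | 0.009 (77) | 0.269 (73) | 0.006 (61) | 0.008 (75) | 0.029 (65) | 0.037 (75) | 0.014 (76) | 0.003 (74) | 0.036 (73) | 0.147 (71) | 0.042 (70) | 0.381 (63) | 0.118 (72) | 0.362 (71) | 0.069 (70)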
Cathrin Elich, Francis Engelmann, Jonas Schult, Theodora Kontogianni, Bastian Leibe: 3D-BEVIS: Birds-Eye-View Instance Segmentation.
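ASIS | 0.085 (75) | 0.037 (76) | 0.080 (76) | 0.066 (70) | 0.047 (71) | 0.282 (72) | 0.000 (69) | 0.052 (72) | 0.002 (74) | 0.047 (74) | 0.026 (74) | 0.001 (75) | 0.046 (71) | 0.194 (68) | 0.031 (71) | 0.264 (70) | 0.140 (68) | 0.167 (75) | 0.047 (75)
NeuralBF | 0.353 (53) | 0.593 (54) | 0.511 (42) | 0.375 (41) | 0.264 (45) | 0.597 (48) | 0.008 (58) | 0.332 (54) | 0.160 (43) | 0.229 (47) | 0.274 (53) | 0.000 (76) | 0.206 (59) | 0.678 (36) | 0.155 (59) | 0.485 (55) | 0.422 (37) | 0.816 (51) | 0.254 (48)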
Weiwei Sun, Daniel Rebain, Renjie Liao, Vladimir Tankovich, Soroosh Yazdani, Kwang Moo Yi, Andrea Tagliasacchi: NeuralBF: Neural Bilateral Filtering for Top-down Instance Segmentation on Point Clouds. WACV 2023
Region-18class | 0.146 (69) | 0.175 (75) | 0.321 (63) | 0.080 (66) | 0.062 (67) | 0.357 (65) | 0.000 (69) | 0.307 (58) | 0.002 (73) | 0.066 (70) | 0.044 (70) | 0.000 (76) | 0.018 (74) | 0.036 (75) | 0.054 (67) | 0.447 (61) | 0.133 (69) | 0.472 (68) | 0.060 (71)