The 3D semantic instance prediction task involves detecting and segmenting the objects in a 3D scan mesh.

Evaluation and metrics

Our evaluation ranks all methods by the average precision (AP) for each class. We report the mean average precision at an overlap threshold of 0.25 (AP 25%), at an overlap threshold of 0.5 (AP 50%), and averaged over overlap thresholds in the range [0.5:0.95:0.05] (AP). Note that multiple predictions of the same ground-truth instance are penalized as false positives.
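The metric described above can be illustrated with a minimal sketch. This is not the benchmark's evaluation script: the function names `average_precision` and `summarize`, the input layout (a prediction-by-ground-truth overlap matrix for a single class), the greedy matching to the best-overlapping instance, and the inclusion of the endpoint 0.95 in the threshold range are all assumptions made for illustration. Averaging over classes is left out.

```python
import numpy as np


def average_precision(ious, scores, threshold):
    """AP for one class at one overlap threshold (simplified sketch).

    ious[i, j] is the overlap between prediction i and ground-truth
    instance j; scores[i] is the confidence of prediction i.
    Both inputs are assumed to be non-empty.
    """
    ious = np.asarray(ious, dtype=float)
    num_pred, num_gt = ious.shape
    # Visit predictions from most to least confident.
    order = np.argsort(-np.asarray(scores, dtype=float), kind="stable")

    matched = np.zeros(num_gt, dtype=bool)
    is_tp = np.zeros(num_pred, dtype=bool)
    for position, i in enumerate(order):
        j = int(np.argmax(ious[i]))  # best-overlapping ground-truth instance
        if ious[i, j] > threshold and not matched[j]:
            matched[j] = True
            is_tp[position] = True
        # Otherwise the prediction overlaps too little or duplicates an
        # already matched instance, so it counts as a false positive.

    tp = np.cumsum(is_tp)
    fp = np.cumsum(~is_tp)
    recall = tp / num_gt
    precision = tp / (tp + fp)
    # Make precision non-increasing in recall, then integrate the curve.
    precision = np.maximum.accumulate(precision[::-1])[::-1]
    recall_steps = np.diff(np.concatenate(([0.0], recall)))
    return float(np.sum(recall_steps * precision))


def summarize(ious, scores):
    """Report the three numbers named in the text for a single class."""
    thresholds = np.linspace(0.5, 0.95, 10)  # assumption: 0.95 is included
    return {
        "AP 25%": average_precision(ious, scores, 0.25),
        "AP 50%": average_precision(ious, scores, 0.5),
        "AP": float(np.mean([average_precision(ious, scores, t)
                             for t in thresholds])),
    }


# Hypothetical data: predictions 0 and 1 both cover ground-truth instance 0,
# so the lower-scoring one is counted as a false positive.
example_ious = [[0.82, 0.0], [0.72, 0.0], [0.0, 0.62]]
example_scores = [0.9, 0.8, 0.7]
print(summarize(example_ious, example_scores))
```

In the example, removing the duplicate prediction raises AP 50% from 5/6 to 1.0, which is the penalty for multiple predictions of the same instance.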



This table lists the benchmark results for the 3D semantic instance scenario.




Each cell gives the score followed by the method's rank for that column. Rows are sorted by the bathtub column.

Method (info) | avg ap | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | window
GraphCut | 0.552 (20) | 1.000 (1) | 0.611 (19) | 0.438 (27) | 0.392 (22) | 0.714 (23) | 0.139 (24) | 0.598 (13) | 0.327 (22) | 0.389 (23) | 0.510 (17) | 0.598 (12) | 0.427 (39) | 0.754 (21) | 0.463 (18) | 0.761 (3) | 0.588 (9) | 0.903 (23) | 0.329 (29)
InsSSM | 0.586 (11) | 1.000 (1) | 0.593 (21) | 0.440 (26) | 0.480 (5) | 0.771 (8) | 0.345 (1) | 0.437 (40) | 0.444 (7) | 0.495 (14) | 0.548 (5) | 0.579 (16) | 0.621 (11) | 0.720 (28) | 0.409 (23) | 0.712 (11) | 0.593 (6) | 0.960 (5) | 0.395 (10)
Lei Yao, Yi Wang, Moyun Liu, Lap-Pui Chau: SGIFormer: Semantic-guided and Geometric-enhanced Interleaving Transformer for 3D Instance Segmentation. TCSVT, 2024
UniPerception | 0.588 (9) | 0.963 (3) | 0.667 (10) | 0.493 (15) | 0.472 (7) | 0.750 (15) | 0.229 (19) | 0.528 (27) | 0.468 (4) | 0.498 (13) | 0.542 (8) | 0.643 (8) | 0.530 (21) | 0.661 (38) | 0.463 (16) | 0.695 (18) | 0.599 (3) | 0.972 (1) | 0.420 (7)
SIM3D | 0.617 (3) | 0.952 (4) | 0.629 (17) | 0.539 (11) | 0.426 (16) | 0.768 (11) | 0.302 (6) | 0.681 (2) | 0.425 (9) | 0.473 (16) | 0.511 (16) | 0.701 (2) | 0.717 (1) | 0.821 (6) | 0.467 (15) | 0.774 (1) | 0.559 (15) | 0.914 (18) | 0.448 (3)
EV3D | 0.615 (5) | 0.946 (5) | 0.652 (13) | 0.555 (6) | 0.433 (14) | 0.773 (6) | 0.271 (13) | 0.604 (8) | 0.447 (5) | 0.506 (7) | 0.544 (7) | 0.698 (3) | 0.716 (2) | 0.775 (16) | 0.480 (9) | 0.747 (5) | 0.572 (13) | 0.925 (14) | 0.435 (6)
Spherical Mask(CtoF) | 0.616 (4) | 0.946 (5) | 0.654 (12) | 0.555 (6) | 0.434 (13) | 0.769 (10) | 0.271 (12) | 0.604 (8) | 0.447 (5) | 0.505 (8) | 0.549 (3) | 0.698 (3) | 0.716 (2) | 0.775 (16) | 0.480 (9) | 0.747 (5) | 0.575 (11) | 0.925 (14) | 0.436 (5)
ISBNet [permissive] | 0.559 (19) | 0.939 (7) | 0.655 (11) | 0.383 (40) | 0.426 (17) | 0.763 (13) | 0.180 (21) | 0.534 (25) | 0.386 (13) | 0.499 (11) | 0.509 (18) | 0.621 (10) | 0.427 (38) | 0.704 (33) | 0.467 (14) | 0.649 (23) | 0.571 (14) | 0.948 (10) | 0.401 (9)
Tuan Duc Ngo, Binh-Son Hua, Khoi Nguyen: ISBNet: a 3D Point Cloud Instance Segmentation Network with Instance-aware Sampling and Box-aware Dynamic Convolution. CVPR 2023
DANCENET | 0.504 (26) | 0.926 (8) | 0.579 (22) | 0.472 (18) | 0.367 (27) | 0.626 (43) | 0.165 (23) | 0.432 (41) | 0.221 (34) | 0.408 (22) | 0.449 (30) | 0.411 (25) | 0.564 (17) | 0.746 (24) | 0.421 (22) | 0.707 (17) | 0.438 (33) | 0.846 (43) | 0.288 (39)
PBNet [permissive] | 0.573 (15) | 0.926 (8) | 0.575 (26) | 0.619 (2) | 0.472 (6) | 0.736 (19) | 0.239 (18) | 0.487 (37) | 0.383 (15) | 0.459 (19) | 0.506 (19) | 0.533 (22) | 0.585 (14) | 0.767 (18) | 0.404 (24) | 0.717 (10) | 0.559 (16) | 0.969 (2) | 0.381 (16)
Weiguang Zhao, Yuyao Yan, Chaolong Yang, Jianan Ye, Xi Yang, Kaizhu Huang: Divide and Conquer: 3D Instance Segmentation With Point-Wise Binarization. ICCV 2023
PointRel | 0.622 (1) | 0.926 (8) | 0.710 (3) | 0.541 (10) | 0.502 (2) | 0.772 (7) | 0.314 (4) | 0.598 (11) | 0.425 (8) | 0.504 (10) | 0.565 (2) | 0.650 (7) | 0.716 (2) | 0.809 (7) | 0.476 (12) | 0.747 (4) | 0.618 (2) | 0.963 (4) | 0.364 (20)
: Relation3D (PointRel): Enhancing Relation Modeling for Point Cloud Instance Segmentation.
Queryformer | 0.583 (12) | 0.926 (8) | 0.702 (5) | 0.393 (37) | 0.504 (1) | 0.733 (21) | 0.276 (11) | 0.527 (28) | 0.373 (17) | 0.479 (15) | 0.534 (11) | 0.533 (23) | 0.697 (7) | 0.720 (29) | 0.436 (21) | 0.745 (7) | 0.592 (7) | 0.958 (7) | 0.363 (21)
Mask3D | 0.566 (17) | 0.926 (8) | 0.597 (20) | 0.408 (34) | 0.420 (18) | 0.737 (18) | 0.239 (17) | 0.598 (11) | 0.386 (14) | 0.458 (20) | 0.549 (3) | 0.568 (19) | 0.716 (2) | 0.601 (44) | 0.480 (9) | 0.646 (25) | 0.575 (11) | 0.922 (16) | 0.364 (19)
Jonas Schult, Francis Engelmann, Alexander Hermans, Or Litany, Siyu Tang, Bastian Leibe: Mask3D for 3D Semantic Instance Segmentation. ICRA 2023
DCD | 0.614 (6) | 0.892 (13) | 0.633 (16) | 0.434 (28) | 0.495 (3) | 0.810 (1) | 0.292 (8) | 0.501 (33) | 0.408 (10) | 0.525 (4) | 0.582 (1) | 0.688 (5) | 0.625 (10) | 0.801 (8) | 0.608 (1) | 0.672 (20) | 0.649 (1) | 0.965 (3) | 0.476 (1)
IPCA-Inst | 0.520 (23) | 0.889 (14) | 0.551 (30) | 0.548 (9) | 0.418 (19) | 0.665 (33) | 0.064 (35) | 0.585 (14) | 0.260 (32) | 0.277 (37) | 0.471 (25) | 0.500 (24) | 0.644 (9) | 0.785 (11) | 0.369 (25) | 0.591 (34) | 0.511 (22) | 0.878 (32) | 0.362 (23)
MAFT | 0.596 (8) | 0.889 (14) | 0.721 (2) | 0.448 (23) | 0.460 (9) | 0.768 (12) | 0.251 (16) | 0.558 (21) | 0.408 (11) | 0.504 (9) | 0.539 (9) | 0.616 (11) | 0.618 (12) | 0.858 (3) | 0.482 (8) | 0.684 (19) | 0.551 (18) | 0.931 (13) | 0.450 (2)
Competitor-MAFT | 0.618 (2) | 0.866 (16) | 0.724 (1) | 0.628 (1) | 0.484 (4) | 0.803 (2) | 0.300 (7) | 0.509 (32) | 0.496 (1) | 0.539 (1) | 0.547 (6) | 0.703 (1) | 0.668 (8) | 0.708 (31) | 0.463 (17) | 0.708 (16) | 0.595 (4) | 0.959 (6) | 0.418 (8)
ExtMask3D | 0.598 (7) | 0.852 (17) | 0.692 (7) | 0.433 (31) | 0.461 (8) | 0.791 (4) | 0.264 (14) | 0.488 (36) | 0.493 (2) | 0.508 (6) | 0.528 (15) | 0.594 (13) | 0.706 (6) | 0.791 (10) | 0.483 (7) | 0.734 (9) | 0.595 (5) | 0.911 (20) | 0.437 (4)
TD3D [permissive] | 0.489 (28) | 0.852 (17) | 0.511 (40) | 0.434 (29) | 0.322 (34) | 0.735 (20) | 0.101 (29) | 0.512 (30) | 0.355 (20) | 0.349 (27) | 0.468 (26) | 0.283 (35) | 0.514 (24) | 0.676 (37) | 0.268 (39) | 0.671 (21) | 0.510 (23) | 0.908 (21) | 0.329 (30)
Maksim Kolodiazhnyi, Anna Vorontsova, Anton Konushin, Danila Rukhovich: Top-Down Beats Bottom-Up in 3D Instance Segmentation. WACV 2024
MG-Former | 0.587 (10) | 0.852 (17) | 0.639 (15) | 0.454 (22) | 0.393 (21) | 0.758 (14) | 0.338 (2) | 0.572 (16) | 0.480 (3) | 0.527 (3) | 0.491 (22) | 0.671 (6) | 0.527 (22) | 0.867 (1) | 0.485 (6) | 0.601 (31) | 0.590 (8) | 0.938 (12) | 0.390 (12)
SSEN | 0.384 (46) | 0.852 (17) | 0.494 (46) | 0.192 (58) | 0.226 (51) | 0.648 (38) | 0.022 (50) | 0.398 (47) | 0.299 (28) | 0.277 (36) | 0.317 (46) | 0.231 (44) | 0.194 (61) | 0.514 (51) | 0.196 (48) | 0.586 (36) | 0.444 (32) | 0.843 (46) | 0.184 (54)
Dongsu Zhang, Junha Chun, Sang Kyun Cha, Young Min Kim: Spatial Semantic Embedding Network: Fast 3D Instance Segmentation with Deep Metric Learning. Arxiv
DualGroup | 0.469 (31) | 0.815 (21) | 0.552 (29) | 0.398 (35) | 0.374 (25) | 0.683 (29) | 0.130 (25) | 0.539 (24) | 0.310 (23) | 0.327 (30) | 0.407 (34) | 0.276 (36) | 0.447 (32) | 0.535 (48) | 0.342 (28) | 0.659 (22) | 0.455 (31) | 0.900 (26) | 0.301 (34)
DKNet | 0.532 (22) | 0.815 (21) | 0.624 (18) | 0.517 (12) | 0.377 (24) | 0.749 (16) | 0.107 (26) | 0.509 (31) | 0.304 (24) | 0.437 (21) | 0.475 (23) | 0.581 (15) | 0.539 (19) | 0.775 (15) | 0.339 (29) | 0.640 (27) | 0.506 (24) | 0.901 (24) | 0.385 (15)
Yizheng Wu, Min Shi, Shuaiyuan Du, Hao Lu, Zhiguo Cao, Weicai Zhong: 3D Instances as 1D Kernels. ECCV 2022
OccuSeg+instance | 0.486 (29) | 0.802 (23) | 0.536 (33) | 0.428 (32) | 0.369 (26) | 0.702 (25) | 0.205 (20) | 0.331 (54) | 0.301 (26) | 0.379 (25) | 0.474 (24) | 0.327 (28) | 0.437 (33) | 0.862 (2) | 0.485 (5) | 0.601 (32) | 0.394 (41) | 0.846 (45) | 0.273 (42)
Lei Han, Tian Zheng, Lan Xu, Lu Fang: OccuSeg: Occupancy-aware 3D Instance Segmentation. CVPR2020
OneFormer3D [copyleft] | 0.566 (17) | 0.781 (24) | 0.697 (6) | 0.562 (5) | 0.431 (15) | 0.770 (9) | 0.331 (3) | 0.400 (46) | 0.373 (18) | 0.529 (2) | 0.504 (20) | 0.568 (18) | 0.475 (28) | 0.732 (26) | 0.470 (13) | 0.762 (2) | 0.550 (19) | 0.871 (35) | 0.379 (17)
Maxim Kolodiazhnyi, Anna Vorontsova, Anton Konushin, Danila Rukhovich: OneFormer3D: One Transformer for Unified Point Cloud Segmentation.
Mask-Group | 0.434 (36) | 0.778 (25) | 0.516 (38) | 0.471 (19) | 0.330 (33) | 0.658 (34) | 0.029 (47) | 0.526 (29) | 0.249 (33) | 0.256 (40) | 0.400 (35) | 0.309 (32) | 0.384 (47) | 0.296 (64) | 0.368 (26) | 0.575 (38) | 0.425 (36) | 0.877 (33) | 0.362 (24)
Min Zhong, Xinghao Chen, Xiaokang Chen, Gang Zeng, Yunhe Wang: MaskGroup: Hierarchical Point Grouping and Masking for 3D Instance Segmentation. ICME 2022
OSIS | 0.392 (44) | 0.778 (25) | 0.530 (34) | 0.220 (57) | 0.278 (41) | 0.567 (51) | 0.083 (31) | 0.330 (55) | 0.299 (27) | 0.270 (38) | 0.310 (47) | 0.143 (54) | 0.260 (54) | 0.624 (42) | 0.277 (37) | 0.568 (41) | 0.361 (46) | 0.865 (37) | 0.301 (33)
TST3D | 0.569 (16) | 0.778 (25) | 0.675 (9) | 0.598 (3) | 0.451 (11) | 0.727 (22) | 0.280 (10) | 0.476 (39) | 0.395 (12) | 0.472 (17) | 0.457 (28) | 0.583 (14) | 0.580 (16) | 0.777 (13) | 0.462 (19) | 0.735 (8) | 0.547 (20) | 0.919 (17) | 0.333 (28)
Duc Tran Dang Trung, Byeongkeun Kang, Yeejin Lee: MSTA3D: Multi-scale Twin-attention for 3D Instance Segmentation. ACM Multimedia 2024
KmaxOneFormerNet [permissive] | 0.581 (13) | 0.745 (28) | 0.692 (8) | 0.551 (8) | 0.458 (10) | 0.798 (3) | 0.264 (15) | 0.531 (26) | 0.369 (19) | 0.513 (5) | 0.531 (14) | 0.632 (9) | 0.494 (25) | 0.798 (9) | 0.567 (3) | 0.648 (24) | 0.558 (17) | 0.950 (9) | 0.362 (22)
SPFormer [permissive] | 0.549 (21) | 0.745 (28) | 0.640 (14) | 0.484 (16) | 0.395 (20) | 0.739 (17) | 0.311 (5) | 0.566 (18) | 0.335 (21) | 0.468 (18) | 0.492 (21) | 0.555 (20) | 0.478 (27) | 0.747 (23) | 0.436 (20) | 0.712 (12) | 0.540 (21) | 0.893 (27) | 0.343 (27)
Sun Jiahao, Qing Chunmei, Tan Junpeng, Xu Xiangmin: Superpoint Transformer for 3D Scene Instance Segmentation. AAAI 2023 [Oral]
Box2Mask | 0.433 (37) | 0.741 (30) | 0.463 (50) | 0.433 (30) | 0.283 (40) | 0.625 (44) | 0.103 (27) | 0.298 (59) | 0.125 (49) | 0.260 (39) | 0.424 (33) | 0.322 (30) | 0.472 (29) | 0.701 (34) | 0.363 (27) | 0.711 (13) | 0.309 (58) | 0.882 (30) | 0.272 (44)
Julian Chibane, Francis Engelmann, Tuan Anh Tran, Gerard Pons-Moll: Box2Mask: Weakly Supervised 3D Semantic Instance Segmentation Using Bounding Boxes. ECCV 2022
DENet | 0.413 (39) | 0.741 (30) | 0.520 (35) | 0.237 (55) | 0.284 (39) | 0.523 (54) | 0.097 (30) | 0.691 (1) | 0.138 (44) | 0.209 (54) | 0.229 (56) | 0.238 (42) | 0.390 (45) | 0.707 (32) | 0.310 (31) | 0.448 (59) | 0.470 (29) | 0.892 (28) | 0.310 (32)
SSTNet [permissive] | 0.506 (25) | 0.738 (32) | 0.549 (31) | 0.497 (14) | 0.316 (35) | 0.693 (27) | 0.178 (22) | 0.377 (49) | 0.198 (38) | 0.330 (28) | 0.463 (27) | 0.576 (17) | 0.515 (23) | 0.857 (4) | 0.494 (4) | 0.637 (28) | 0.457 (30) | 0.943 (11) | 0.290 (38)
Zhihao Liang, Zhihao Li, Songcen Xu, Mingkui Tan, Kui Jia: Instance Segmentation in 3D Scenes using Semantic Superpoint Tree Networks. ICCV2021
CSC-Pretrained | 0.405 (41) | 0.738 (32) | 0.465 (49) | 0.331 (48) | 0.205 (53) | 0.655 (36) | 0.051 (40) | 0.601 (10) | 0.092 (54) | 0.211 (53) | 0.329 (44) | 0.198 (47) | 0.459 (30) | 0.775 (14) | 0.195 (50) | 0.524 (50) | 0.400 (40) | 0.878 (31) | 0.184 (55)
Competitor-SPFormer | 0.580 (14) | 0.721 (34) | 0.705 (4) | 0.593 (4) | 0.444 (12) | 0.786 (5) | 0.286 (9) | 0.564 (19) | 0.376 (16) | 0.498 (12) | 0.534 (12) | 0.546 (21) | 0.390 (44) | 0.785 (12) | 0.577 (2) | 0.708 (15) | 0.579 (10) | 0.954 (8) | 0.388 (13)
INS-Conv-instance | 0.435 (35) | 0.716 (35) | 0.495 (45) | 0.355 (44) | 0.331 (32) | 0.689 (28) | 0.102 (28) | 0.394 (48) | 0.208 (37) | 0.280 (35) | 0.395 (36) | 0.250 (39) | 0.544 (18) | 0.741 (25) | 0.309 (32) | 0.536 (48) | 0.391 (42) | 0.842 (48) | 0.258 (46)
Sparse R-CNN | 0.292 (57) | 0.704 (36) | 0.213 (67) | 0.153 (61) | 0.154 (58) | 0.551 (53) | 0.053 (38) | 0.212 (65) | 0.132 (46) | 0.174 (59) | 0.274 (51) | 0.070 (60) | 0.363 (48) | 0.441 (57) | 0.176 (53) | 0.424 (61) | 0.234 (62) | 0.758 (54) | 0.161 (59)
SoftGroup++ | 0.513 (24) | 0.704 (36) | 0.578 (25) | 0.398 (36) | 0.363 (30) | 0.704 (24) | 0.061 (36) | 0.647 (5) | 0.297 (29) | 0.378 (26) | 0.537 (10) | 0.343 (27) | 0.614 (13) | 0.828 (5) | 0.295 (34) | 0.710 (14) | 0.505 (26) | 0.875 (34) | 0.394 (11)
TopoSeg | 0.479 (30) | 0.704 (36) | 0.564 (27) | 0.467 (20) | 0.366 (28) | 0.633 (41) | 0.068 (33) | 0.554 (22) | 0.262 (31) | 0.328 (29) | 0.447 (31) | 0.323 (29) | 0.534 (20) | 0.722 (27) | 0.288 (36) | 0.614 (29) | 0.482 (28) | 0.912 (19) | 0.358 (25)
HAIS [permissive] | 0.457 (33) | 0.704 (36) | 0.561 (28) | 0.457 (21) | 0.364 (29) | 0.673 (30) | 0.046 (44) | 0.547 (23) | 0.194 (39) | 0.308 (32) | 0.426 (32) | 0.288 (34) | 0.454 (31) | 0.711 (30) | 0.262 (40) | 0.563 (42) | 0.434 (35) | 0.889 (29) | 0.344 (26)
Shaoyu Chen, Jiemin Fang, Qian Zhang, Wenyu Liu, Xinggang Wang: Hierarchical Aggregation for 3D Instance Segmentation. ICCV 2021
PCJC | 0.375 (48) | 0.704 (36) | 0.542 (32) | 0.284 (52) | 0.197 (55) | 0.649 (37) | 0.006 (59) | 0.426 (42) | 0.138 (45) | 0.242 (42) | 0.304 (48) | 0.183 (51) | 0.388 (46) | 0.629 (41) | 0.141 (61) | 0.546 (46) | 0.344 (49) | 0.738 (56) | 0.283 (40)
AOIA | 0.387 (45) | 0.704 (36) | 0.515 (39) | 0.385 (39) | 0.225 (52) | 0.669 (31) | 0.005 (62) | 0.482 (38) | 0.126 (48) | 0.181 (57) | 0.269 (53) | 0.221 (45) | 0.426 (40) | 0.478 (54) | 0.218 (46) | 0.592 (33) | 0.371 (44) | 0.851 (39) | 0.242 (48)
Occipital-SCS | 0.320 (56) | 0.679 (42) | 0.352 (57) | 0.334 (47) | 0.229 (50) | 0.436 (58) | 0.025 (48) | 0.412 (45) | 0.058 (59) | 0.161 (62) | 0.240 (55) | 0.085 (58) | 0.262 (53) | 0.496 (53) | 0.187 (51) | 0.467 (56) | 0.328 (54) | 0.775 (52) | 0.231 (50)
SSEC | 0.465 (32) | 0.667 (43) | 0.578 (24) | 0.502 (13) | 0.362 (31) | 0.641 (40) | 0.035 (45) | 0.605 (7) | 0.291 (30) | 0.323 (31) | 0.451 (29) | 0.296 (33) | 0.417 (42) | 0.677 (36) | 0.245 (43) | 0.501 (52) | 0.506 (25) | 0.900 (25) | 0.366 (18)
SALoss-ResNet | 0.262 (59) | 0.667 (43) | 0.335 (58) | 0.067 (68) | 0.123 (62) | 0.427 (60) | 0.022 (49) | 0.280 (63) | 0.058 (58) | 0.216 (49) | 0.211 (59) | 0.039 (64) | 0.142 (64) | 0.519 (50) | 0.106 (65) | 0.338 (65) | 0.310 (57) | 0.721 (59) | 0.138 (60)
Zhidong Liang, Ming Yang, Hao Li, Chunxiang Wang: 3D Instance Embedding Learning With a Structure-Aware Loss Function for Point Cloud Segmentation. IEEE Robotics and Automation Letters (IROS2020)
SoftGroup [permissive] | 0.504 (26) | 0.667 (43) | 0.579 (23) | 0.372 (42) | 0.381 (23) | 0.694 (26) | 0.072 (32) | 0.677 (3) | 0.303 (25) | 0.387 (24) | 0.531 (13) | 0.319 (31) | 0.582 (15) | 0.754 (20) | 0.318 (30) | 0.643 (26) | 0.492 (27) | 0.907 (22) | 0.388 (14)
Thang Vu, Kookhoi Kim, Tung M. Luu, Xuan Thanh Nguyen, Chang D. Yoo: SoftGroup for 3D Instance Segmentation on Point Clouds. CVPR 2022 [Oral]
PE | 0.396 (42) | 0.667 (43) | 0.467 (48) | 0.446 (25) | 0.243 (48) | 0.624 (46) | 0.022 (51) | 0.577 (15) | 0.106 (51) | 0.219 (47) | 0.340 (42) | 0.239 (41) | 0.487 (26) | 0.475 (55) | 0.225 (45) | 0.541 (47) | 0.350 (48) | 0.818 (50) | 0.273 (43)
Biao Zhang, Peter Wonka: Point Cloud Instance Segmentation using Probabilistic Embeddings. CVPR 2021
ClickSeg_Instance | 0.366 (49) | 0.654 (47) | 0.375 (54) | 0.184 (59) | 0.302 (38) | 0.592 (49) | 0.050 (42) | 0.300 (58) | 0.093 (53) | 0.283 (33) | 0.277 (50) | 0.249 (40) | 0.426 (41) | 0.615 (43) | 0.299 (33) | 0.504 (51) | 0.367 (45) | 0.832 (49) | 0.191 (53)
SphereSeg | 0.357 (50) | 0.651 (48) | 0.411 (52) | 0.345 (45) | 0.264 (45) | 0.630 (42) | 0.059 (37) | 0.289 (61) | 0.212 (35) | 0.240 (43) | 0.336 (43) | 0.158 (53) | 0.305 (51) | 0.557 (45) | 0.159 (57) | 0.455 (58) | 0.341 (51) | 0.726 (58) | 0.294 (35)
Dyco3D [copyleft] | 0.395 (43) | 0.642 (49) | 0.518 (37) | 0.447 (24) | 0.259 (46) | 0.666 (32) | 0.050 (41) | 0.251 (64) | 0.166 (41) | 0.231 (45) | 0.362 (40) | 0.232 (43) | 0.331 (50) | 0.535 (47) | 0.229 (44) | 0.587 (35) | 0.438 (34) | 0.850 (40) | 0.317 (31)
Tong He; Chunhua Shen; Anton van den Hengel: DyCo3D: Robust Instance Segmentation of 3D Point Clouds through Dynamic Convolution. CVPR2021
PointGroup | 0.407 (40) | 0.639 (50) | 0.496 (44) | 0.415 (33) | 0.243 (49) | 0.645 (39) | 0.021 (52) | 0.570 (17) | 0.114 (50) | 0.211 (52) | 0.359 (41) | 0.217 (46) | 0.428 (37) | 0.660 (39) | 0.256 (41) | 0.562 (43) | 0.341 (50) | 0.860 (38) | 0.291 (36)
Li Jiang, Hengshuang Zhao, Shaoshuai Shi, Shu Liu, Chi-Wing Fu, Jiaya Jia: PointGroup: Dual-Set Point Grouping for 3D Instance Segmentation. CVPR 2020 [oral]
RPGN | 0.428 (38) | 0.630 (51) | 0.508 (42) | 0.367 (43) | 0.249 (47) | 0.658 (35) | 0.016 (55) | 0.673 (4) | 0.131 (47) | 0.234 (44) | 0.383 (39) | 0.270 (37) | 0.434 (34) | 0.748 (22) | 0.274 (38) | 0.609 (30) | 0.406 (38) | 0.842 (47) | 0.267 (45)
Shichao Dong, Guosheng Lin, Tzu-Yi Hung: Learning Regional Purity for Instance Segmentation on 3D Point Clouds. ECCV 2022
DD-UNet+Group | 0.436 (34) | 0.630 (51) | 0.508 (43) | 0.480 (17) | 0.310 (37) | 0.624 (45) | 0.065 (34) | 0.638 (6) | 0.174 (40) | 0.256 (41) | 0.384 (38) | 0.194 (48) | 0.428 (36) | 0.759 (19) | 0.289 (35) | 0.574 (39) | 0.400 (39) | 0.849 (42) | 0.291 (37)
H. Liu, R. Liu, K. Yang, J. Zhang, K. Peng, R. Stiefelhagen: HIDA: Towards Holistic Indoor Understanding for the Visually Impaired via Semantic Instance Segmentation with a Wearable Solid-State LiDAR Sensor. ICCVW 2021
NeuralBF | 0.353 (52) | 0.593 (53) | 0.511 (41) | 0.375 (41) | 0.264 (44) | 0.597 (48) | 0.008 (57) | 0.332 (53) | 0.160 (42) | 0.229 (46) | 0.274 (52) | 0.000 (75) | 0.206 (58) | 0.678 (35) | 0.155 (58) | 0.485 (54) | 0.422 (37) | 0.816 (51) | 0.254 (47)
Weiwei Sun, Daniel Rebain, Renjie Liao, Vladimir Tankovich, Soroosh Yazdani, Kwang Moo Yi, Andrea Tagliasacchi: NeuralBF: Neural Bilateral Filtering for Top-down Instance Segmentation on Point Clouds. WACV 2023
Mask3D_evaluation | 0.382 (47) | 0.593 (53) | 0.520 (36) | 0.390 (38) | 0.314 (36) | 0.600 (47) | 0.018 (54) | 0.287 (62) | 0.151 (43) | 0.281 (34) | 0.387 (37) | 0.169 (52) | 0.429 (35) | 0.654 (40) | 0.172 (54) | 0.578 (37) | 0.384 (43) | 0.670 (61) | 0.278 (41)
GICN | 0.341 (54) | 0.580 (55) | 0.371 (55) | 0.344 (46) | 0.198 (54) | 0.469 (57) | 0.052 (39) | 0.564 (20) | 0.093 (52) | 0.212 (50) | 0.212 (58) | 0.127 (56) | 0.347 (49) | 0.537 (46) | 0.206 (47) | 0.525 (49) | 0.329 (53) | 0.729 (57) | 0.241 (49)
MTML | 0.282 (58) | 0.577 (56) | 0.380 (53) | 0.182 (60) | 0.107 (64) | 0.430 (59) | 0.001 (65) | 0.422 (43) | 0.057 (60) | 0.179 (58) | 0.162 (61) | 0.070 (61) | 0.229 (56) | 0.511 (52) | 0.161 (55) | 0.491 (53) | 0.313 (55) | 0.650 (64) | 0.162 (57)
Jean Lahoud, Bernard Ghanem, Marc Pollefeys, Martin R. Oswald: 3D Instance Segmentation via Multi-task Metric Learning. ICCV 2019 [oral]
SegGroup_ins [permissive] | 0.246 (63) | 0.556 (57) | 0.335 (59) | 0.062 (70) | 0.115 (63) | 0.490 (55) | 0.000 (68) | 0.297 (60) | 0.018 (67) | 0.186 (56) | 0.142 (62) | 0.083 (59) | 0.233 (55) | 0.216 (66) | 0.153 (59) | 0.469 (55) | 0.251 (61) | 0.744 (55) | 0.083 (66)
An Tao, Yueqi Duan, Yi Wei, Jiwen Lu, Jie Zhou: SegGroup: Seg-Level Supervision for 3D Instance and Semantic Segmentation. TIP 2022
UNet-backbone | 0.161 (65) | 0.519 (58) | 0.259 (65) | 0.084 (64) | 0.059 (67) | 0.325 (69) | 0.002 (63) | 0.093 (70) | 0.009 (69) | 0.077 (68) | 0.064 (67) | 0.045 (63) | 0.044 (71) | 0.161 (68) | 0.045 (67) | 0.331 (66) | 0.180 (65) | 0.566 (65) | 0.033 (75)
3D-BoNet | 0.253 (61) | 0.519 (58) | 0.324 (61) | 0.251 (54) | 0.137 (61) | 0.345 (67) | 0.031 (46) | 0.419 (44) | 0.069 (56) | 0.162 (61) | 0.131 (63) | 0.052 (62) | 0.202 (60) | 0.338 (62) | 0.147 (60) | 0.301 (68) | 0.303 (59) | 0.651 (63) | 0.178 (56)
Bo Yang, Jianan Wang, Ronald Clark, Qingyong Hu, Sen Wang, Andrew Markham, Niki Trigoni: Learning Object Bounding Boxes for 3D Instance Segmentation on Point Clouds. NeurIPS 2019 Spotlight
RWSeg | 0.348 (53) | 0.475 (60) | 0.456 (51) | 0.320 (49) | 0.275 (43) | 0.476 (56) | 0.020 (53) | 0.491 (35) | 0.056 (61) | 0.212 (51) | 0.320 (45) | 0.261 (38) | 0.302 (52) | 0.520 (49) | 0.182 (52) | 0.557 (44) | 0.285 (60) | 0.867 (36) | 0.197 (52)
One_Thing_One_Click [permissive] | 0.326 (55) | 0.472 (61) | 0.361 (56) | 0.232 (56) | 0.183 (56) | 0.555 (52) | 0.000 (68) | 0.498 (34) | 0.038 (63) | 0.195 (55) | 0.226 (57) | 0.362 (26) | 0.168 (62) | 0.469 (56) | 0.251 (42) | 0.553 (45) | 0.335 (52) | 0.846 (44) | 0.117 (63)
Zhengzhe Liu, Xiaojuan Qi, Chi-Wing Fu: One Thing One Click: A Self-Training Approach for Weakly Supervised 3D Semantic Segmentation. CVPR 2021
MASC [permissive] | 0.254 (60) | 0.463 (62) | 0.249 (66) | 0.113 (62) | 0.167 (57) | 0.412 (62) | 0.000 (67) | 0.374 (50) | 0.073 (55) | 0.173 (60) | 0.243 (54) | 0.130 (55) | 0.228 (57) | 0.368 (60) | 0.160 (56) | 0.356 (63) | 0.208 (63) | 0.711 (60) | 0.136 (61)
Chen Liu, Yasutaka Furukawa: MASC: Multi-scale Affinity with Sparse Convolution for 3D Instance Segmentation.
3D-MPA | 0.355 (51) | 0.457 (63) | 0.484 (47) | 0.299 (50) | 0.277 (42) | 0.591 (50) | 0.047 (43) | 0.332 (52) | 0.212 (36) | 0.217 (48) | 0.278 (49) | 0.193 (49) | 0.413 (43) | 0.410 (58) | 0.195 (49) | 0.574 (40) | 0.352 (47) | 0.849 (41) | 0.213 (51)
Francis Engelmann, Martin Bokeloh, Alireza Fathi, Bastian Leibe, Matthias Nießner: 3D-MPA: Multi Proposal Aggregation for 3D Semantic Instance Segmentation. CVPR 2020
3D-SIS [permissive] | 0.161 (65) | 0.407 (64) | 0.155 (72) | 0.068 (67) | 0.043 (71) | 0.346 (66) | 0.001 (64) | 0.134 (67) | 0.005 (70) | 0.088 (65) | 0.106 (66) | 0.037 (65) | 0.135 (66) | 0.321 (63) | 0.028 (71) | 0.339 (64) | 0.116 (72) | 0.466 (68) | 0.093 (65)
Ji Hou, Angela Dai, Matthias Niessner: 3D-SIS: 3D Semantic Instance Segmentation of RGB-D Scans. CVPR 2019
SPG_WSIS | 0.251 (62) | 0.380 (65) | 0.274 (64) | 0.289 (51) | 0.144 (59) | 0.413 (61) | 0.000 (68) | 0.311 (56) | 0.065 (57) | 0.113 (64) | 0.130 (64) | 0.029 (67) | 0.204 (59) | 0.388 (59) | 0.108 (64) | 0.459 (57) | 0.311 (56) | 0.769 (53) | 0.127 (62)
R-PointNet | 0.158 (67) | 0.356 (66) | 0.173 (70) | 0.113 (63) | 0.140 (60) | 0.359 (63) | 0.012 (56) | 0.023 (73) | 0.039 (62) | 0.134 (63) | 0.123 (65) | 0.008 (71) | 0.089 (67) | 0.149 (69) | 0.117 (63) | 0.221 (70) | 0.128 (70) | 0.563 (66) | 0.094 (64)
tmp | 0.113 (72) | 0.333 (67) | 0.151 (73) | 0.056 (71) | 0.053 (69) | 0.344 (68) | 0.000 (68) | 0.105 (69) | 0.016 (68) | 0.049 (72) | 0.035 (71) | 0.020 (69) | 0.053 (68) | 0.048 (72) | 0.013 (73) | 0.183 (74) | 0.173 (66) | 0.344 (72) | 0.054 (72)
SemRegionNet-20cls | 0.121 (69) | 0.296 (68) | 0.203 (68) | 0.071 (66) | 0.058 (68) | 0.349 (65) | 0.000 (68) | 0.150 (66) | 0.019 (66) | 0.054 (71) | 0.034 (72) | 0.017 (70) | 0.052 (69) | 0.042 (73) | 0.013 (74) | 0.209 (72) | 0.183 (64) | 0.371 (69) | 0.057 (71)
Sem_Recon_ins | 0.098 (73) | 0.295 (69) | 0.187 (69) | 0.015 (75) | 0.036 (72) | 0.213 (74) | 0.005 (61) | 0.038 (72) | 0.003 (71) | 0.056 (70) | 0.037 (70) | 0.036 (66) | 0.015 (74) | 0.051 (71) | 0.044 (68) | 0.209 (73) | 0.098 (74) | 0.354 (71) | 0.071 (68)
3D-BEVIS | 0.117 (70) | 0.250 (70) | 0.308 (63) | 0.020 (74) | 0.009 (76) | 0.269 (72) | 0.006 (60) | 0.008 (74) | 0.029 (64) | 0.037 (74) | 0.014 (75) | 0.003 (73) | 0.036 (72) | 0.147 (70) | 0.042 (69) | 0.381 (62) | 0.118 (71) | 0.362 (70) | 0.069 (69)
Cathrin Elich, Francis Engelmann, Jonas Schult, Theodora Kontogianni, Bastian Leibe: 3D-BEVIS: Birds-Eye-View Instance Segmentation.
PanopticFusion-inst | 0.214 (64) | 0.250 (70) | 0.330 (60) | 0.275 (53) | 0.103 (65) | 0.228 (73) | 0.000 (68) | 0.345 (51) | 0.024 (65) | 0.088 (66) | 0.203 (60) | 0.186 (50) | 0.167 (63) | 0.367 (61) | 0.125 (62) | 0.221 (71) | 0.112 (73) | 0.666 (62) | 0.162 (58)
Gaku Narita, Takashi Seno, Tomoya Ishikawa, Yohsuke Kaji: PanopticFusion: Online Volumetric Semantic Mapping at the Level of Stuff and Things. IROS 2019 (to appear)
Hier3D [copyleft] | 0.117 (70) | 0.222 (72) | 0.161 (71) | 0.054 (72) | 0.027 (73) | 0.289 (70) | 0.000 (68) | 0.124 (68) | 0.001 (74) | 0.079 (67) | 0.061 (68) | 0.027 (68) | 0.141 (65) | 0.240 (65) | 0.005 (75) | 0.310 (67) | 0.129 (69) | 0.153 (75) | 0.081 (67)
Tan: HCFS3D: Hierarchical Coupled Feature Selection Network for 3D Semantic and Instance Segmentation.
MaskRCNN 2d->3d Proj | 0.022 (76) | 0.185 (73) | 0.000 (76) | 0.000 (76) | 0.015 (74) | 0.000 (76) | 0.000 (66) | 0.006 (76) | 0.000 (75) | 0.010 (76) | 0.006 (76) | 0.107 (57) | 0.012 (75) | 0.000 (75) | 0.002 (76) | 0.027 (76) | 0.004 (76) | 0.022 (76) | 0.001 (76)
Region-18class | 0.146 (68) | 0.175 (74) | 0.321 (62) | 0.080 (65) | 0.062 (66) | 0.357 (64) | 0.000 (68) | 0.307 (57) | 0.002 (72) | 0.066 (69) | 0.044 (69) | 0.000 (75) | 0.018 (73) | 0.036 (74) | 0.054 (66) | 0.447 (60) | 0.133 (68) | 0.472 (67) | 0.060 (70)
ASIS | 0.085 (74) | 0.037 (75) | 0.080 (75) | 0.066 (69) | 0.047 (70) | 0.282 (71) | 0.000 (68) | 0.052 (71) | 0.002 (73) | 0.047 (73) | 0.026 (73) | 0.001 (74) | 0.046 (70) | 0.194 (67) | 0.031 (70) | 0.264 (69) | 0.140 (67) | 0.167 (74) | 0.047 (74)
Sgpn_scannet | 0.049 (75) | 0.023 (76) | 0.134 (74) | 0.031 (73) | 0.013 (75) | 0.144 (75) | 0.006 (58) | 0.008 (75) | 0.000 (75) | 0.028 (75) | 0.017 (74) | 0.003 (72) | 0.009 (76) | 0.000 (75) | 0.021 (72) | 0.122 (75) | 0.095 (75) | 0.175 (73) | 0.054 (73)
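
The per-column ranks in the table appear to follow standard competition ranking: methods with equal scores share a rank and the following rank is skipped (in the avg ap column, two methods at 0.566 both rank 17 and the next method ranks 19). The sketch below illustrates that scheme. The function name `competition_ranks` is hypothetical, and the benchmark presumably ranks on unrounded scores, so values that look tied at three decimals can still receive different ranks.

```python
def competition_ranks(scores):
    """Rank scores in descending order; tied scores share the best rank."""
    ordered = sorted(scores, reverse=True)
    first_position = {}
    for position, score in enumerate(ordered, start=1):
        # Keep only the first (best) position at which each score occurs.
        first_position.setdefault(score, position)
    return [first_position[score] for score in scores]


# Four avg ap values taken from the table, ranked among themselves only.
print(competition_ranks([0.569, 0.566, 0.566, 0.559]))
```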