The 3D semantic instance prediction task involves detecting and segmenting the objects in a 3D scan mesh.

Evaluation and metrics

Our evaluation ranks all methods according to the average precision for each class. We report the mean average precision (AP) at an overlap threshold of 0.25 (AP 25%), at an overlap threshold of 0.5 (AP 50%), and averaged over overlap thresholds in the range [0.5:0.95:0.05] (AP). Note that multiple predictions of the same ground truth instance are penalized as false positives.
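As a rough illustration of how these three numbers relate, the following Python sketch computes them for a single class. It is not the benchmark's evaluation script. The function names, the `(confidence, gt_id, overlap)` input format, the `>=` comparison, the non-interpolated AP, and the reading of the range as 0.50, 0.55, ..., 0.95 inclusive are all assumptions made for this example.

```python
def average_precision(predictions, num_gt, threshold):
    """Non-interpolated AP for one class at one overlap threshold.

    predictions: iterable of (confidence, gt_id, overlap), where gt_id is the
    best-overlapping ground truth instance or None (hypothetical input format).
    """
    matched = set()
    true_positives = 0
    precision_sum = 0.0
    ranked = sorted(predictions, key=lambda p: p[0], reverse=True)
    for rank, (_, gt_id, overlap) in enumerate(ranked, start=1):
        is_match = gt_id is not None and overlap >= threshold
        if is_match and gt_id not in matched:
            matched.add(gt_id)
            true_positives += 1
            precision_sum += true_positives / rank
        # Anything else is a false positive, including a second
        # prediction of an already matched ground truth instance.
    return precision_sum / num_gt if num_gt else 0.0


def ap_summary(predictions, num_gt):
    """Return AP, AP 50% and AP 25% for one class."""
    # Assumed reading of [0.5:0.95:0.05]: 0.50, 0.55, ..., 0.95.
    thresholds = [round(0.5 + 0.05 * i, 2) for i in range(10)]
    per_threshold = [average_precision(predictions, num_gt, t) for t in thresholds]
    return {
        "AP": sum(per_threshold) / len(per_threshold),
        "AP 50%": average_precision(predictions, num_gt, 0.5),
        "AP 25%": average_precision(predictions, num_gt, 0.25),
    }


# Two ground truth instances; the second prediction duplicates instance 0.
example = [(0.9, 0, 0.8), (0.8, 0, 0.7), (0.7, 1, 0.6), (0.6, None, 0.0)]
summary = ap_summary(example, num_gt=2)
```

In this example the duplicate prediction of instance 0 lowers the precision at the point where instance 1 is found. Averaging each value over the 18 classes would then give the mean AP.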



This table lists the benchmark results for the 3D semantic instance scenario.




Each cell gives the AP value followed by the method's rank in that column. Rows are sorted by chair AP.
Method (Info) | avg ap | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | window
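ExtMask3D | 0.598 (5) | 0.852 (15) | 0.692 (6) | 0.433 (28) | 0.461 (6) | 0.791 (1) | 0.264 (12) | 0.488 (33) | 0.493 (1) | 0.508 (3) | 0.528 (12) | 0.594 (10) | 0.706 (6) | 0.791 (8) | 0.483 (5) | 0.734 (9) | 0.595 (3) | 0.911 (17) | 0.437 (3)
Competitor-SPFormer | 0.580 (11) | 0.721 (31) | 0.705 (3) | 0.593 (3) | 0.444 (9) | 0.786 (2) | 0.286 (7) | 0.564 (19) | 0.376 (14) | 0.498 (9) | 0.534 (10) | 0.546 (18) | 0.390 (41) | 0.785 (10) | 0.577 (1) | 0.708 (15) | 0.579 (8) | 0.954 (6) | 0.388 (11)
EV3D | 0.615 (4) | 0.946 (5) | 0.652 (11) | 0.555 (5) | 0.433 (11) | 0.773 (3) | 0.271 (11) | 0.604 (8) | 0.447 (4) | 0.506 (4) | 0.544 (5) | 0.698 (2) | 0.716 (2) | 0.775 (14) | 0.480 (7) | 0.747 (5) | 0.572 (11) | 0.925 (11) | 0.435 (5)
PointRel | 0.622 (1) | 0.926 (8) | 0.710 (2) | 0.541 (8) | 0.502 (2) | 0.772 (4) | 0.314 (4) | 0.598 (11) | 0.425 (7) | 0.504 (7) | 0.565 (1) | 0.650 (5) | 0.716 (2) | 0.809 (7) | 0.476 (10) | 0.747 (4) | 0.618 (1) | 0.963 (3) | 0.364 (18)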
Relation3D (PointRel): Enhancing Relation Modeling for Point Cloud Instance Segmentation.
InsSSM | 0.586 (9) | 1.000 (1) | 0.593 (18) | 0.440 (24) | 0.480 (3) | 0.771 (5) | 0.345 (1) | 0.437 (37) | 0.444 (6) | 0.495 (11) | 0.548 (4) | 0.579 (13) | 0.621 (9) | 0.720 (26) | 0.409 (20) | 0.712 (11) | 0.593 (4) | 0.960 (4) | 0.395 (8)
Lei Yao, Yi Wang, Moyun Liu, Lap-Pui Chau: SGIFormer: Semantic-guided and Geometric-enhanced Interleaving Transformer for 3D Instance Segmentation. TCSVT, 2024
OneFormer3D (copyleft) | 0.566 (14) | 0.781 (22) | 0.697 (5) | 0.562 (4) | 0.431 (12) | 0.770 (6) | 0.331 (3) | 0.400 (43) | 0.373 (16) | 0.529 (1) | 0.504 (17) | 0.568 (15) | 0.475 (25) | 0.732 (24) | 0.470 (11) | 0.762 (2) | 0.550 (16) | 0.871 (32) | 0.379 (15)
Maxim Kolodiazhnyi, Anna Vorontsova, Anton Konushin, Danila Rukhovich: OneFormer3D: One Transformer for Unified Point Cloud Segmentation.
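Spherical Mask(CtoF) | 0.616 (3) | 0.946 (5) | 0.654 (10) | 0.555 (5) | 0.434 (10) | 0.769 (7) | 0.271 (10) | 0.604 (8) | 0.447 (4) | 0.505 (5) | 0.549 (2) | 0.698 (2) | 0.716 (2) | 0.775 (14) | 0.480 (7) | 0.747 (5) | 0.575 (9) | 0.925 (11) | 0.436 (4)
SIM3D | 0.617 (2) | 0.952 (4) | 0.629 (14) | 0.539 (9) | 0.426 (13) | 0.768 (8) | 0.302 (6) | 0.681 (2) | 0.425 (8) | 0.473 (13) | 0.511 (13) | 0.701 (1) | 0.717 (1) | 0.821 (6) | 0.467 (13) | 0.774 (1) | 0.559 (13) | 0.914 (15) | 0.448 (2)
MAFT | 0.596 (6) | 0.889 (13) | 0.721 (1) | 0.448 (21) | 0.460 (7) | 0.768 (9) | 0.251 (13) | 0.558 (21) | 0.408 (9) | 0.504 (6) | 0.539 (7) | 0.616 (8) | 0.618 (10) | 0.858 (3) | 0.482 (6) | 0.684 (18) | 0.551 (15) | 0.931 (10) | 0.450 (1)
ISBNet (permissive) | 0.559 (16) | 0.939 (7) | 0.655 (9) | 0.383 (37) | 0.426 (14) | 0.763 (10) | 0.180 (18) | 0.534 (25) | 0.386 (11) | 0.499 (8) | 0.509 (15) | 0.621 (7) | 0.427 (35) | 0.704 (30) | 0.467 (12) | 0.649 (21) | 0.571 (12) | 0.948 (7) | 0.401 (7)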
Tuan Duc Ngo, Binh-Son Hua, Khoi Nguyen: ISBNet: a 3D Point Cloud Instance Segmentation Network with Instance-aware Sampling and Box-aware Dynamic Convolution. CVPR 2023
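MG-Former | 0.587 (8) | 0.852 (15) | 0.639 (13) | 0.454 (20) | 0.393 (18) | 0.758 (11) | 0.338 (2) | 0.572 (16) | 0.480 (2) | 0.527 (2) | 0.491 (19) | 0.671 (4) | 0.527 (20) | 0.867 (1) | 0.485 (4) | 0.601 (28) | 0.590 (6) | 0.938 (9) | 0.390 (10)
UniPerception | 0.588 (7) | 0.963 (3) | 0.667 (8) | 0.493 (13) | 0.472 (5) | 0.750 (12) | 0.229 (16) | 0.528 (26) | 0.468 (3) | 0.498 (10) | 0.542 (6) | 0.643 (6) | 0.530 (19) | 0.661 (35) | 0.463 (14) | 0.695 (17) | 0.599 (2) | 0.972 (1) | 0.420 (6)
DKNet | 0.532 (19) | 0.815 (19) | 0.624 (15) | 0.517 (10) | 0.377 (21) | 0.749 (13) | 0.107 (23) | 0.509 (30) | 0.304 (21) | 0.437 (18) | 0.475 (20) | 0.581 (12) | 0.539 (17) | 0.775 (13) | 0.339 (26) | 0.640 (24) | 0.506 (21) | 0.901 (21) | 0.385 (13)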
Yizheng Wu, Min Shi, Shuaiyuan Du, Hao Lu, Zhiguo Cao, Weicai Zhong: 3D Instances as 1D Kernels. ECCV 2022
SPFormer (permissive) | 0.549 (18) | 0.745 (26) | 0.640 (12) | 0.484 (14) | 0.395 (17) | 0.739 (14) | 0.311 (5) | 0.566 (18) | 0.335 (18) | 0.468 (15) | 0.492 (18) | 0.555 (17) | 0.478 (24) | 0.747 (21) | 0.436 (17) | 0.712 (12) | 0.540 (18) | 0.893 (24) | 0.343 (24)
Sun Jiahao, Qing Chunmei, Tan Junpeng, Xu Xiangmin: Superpoint Transformer for 3D Scene Instance Segmentation. AAAI 2023 [Oral]
Mask3D | 0.566 (14) | 0.926 (8) | 0.597 (17) | 0.408 (31) | 0.420 (15) | 0.737 (15) | 0.239 (14) | 0.598 (11) | 0.386 (12) | 0.458 (17) | 0.549 (2) | 0.568 (16) | 0.716 (2) | 0.601 (41) | 0.480 (7) | 0.646 (22) | 0.575 (9) | 0.922 (13) | 0.364 (17)
Jonas Schult, Francis Engelmann, Alexander Hermans, Or Litany, Siyu Tang, Bastian Leibe: Mask3D for 3D Semantic Instance Segmentation. ICRA 2023
PBNet (permissive) | 0.573 (12) | 0.926 (8) | 0.575 (23) | 0.619 (1) | 0.472 (4) | 0.736 (16) | 0.239 (15) | 0.487 (34) | 0.383 (13) | 0.459 (16) | 0.506 (16) | 0.533 (19) | 0.585 (12) | 0.767 (16) | 0.404 (21) | 0.717 (10) | 0.559 (14) | 0.969 (2) | 0.381 (14)
W. Zhao, Y. Yan, C. Yang, J. Ye, X. Yang, K. Huang: Divide and Conquer: 3D Instance Segmentation With Point-Wise Binarization. ICCV 2023
TD3D (permissive) | 0.489 (25) | 0.852 (15) | 0.511 (37) | 0.434 (26) | 0.322 (31) | 0.735 (17) | 0.101 (26) | 0.512 (29) | 0.355 (17) | 0.349 (24) | 0.468 (23) | 0.283 (32) | 0.514 (22) | 0.676 (34) | 0.268 (36) | 0.671 (19) | 0.510 (20) | 0.908 (18) | 0.329 (27)
Maksim Kolodiazhnyi, Anna Vorontsova, Anton Konushin, Danila Rukhovich: Top-Down Beats Bottom-Up in 3D Instance Segmentation. WACV 2024
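Queryformer | 0.583 (10) | 0.926 (8) | 0.702 (4) | 0.393 (34) | 0.504 (1) | 0.733 (18) | 0.276 (9) | 0.527 (27) | 0.373 (15) | 0.479 (12) | 0.534 (9) | 0.533 (20) | 0.697 (7) | 0.720 (27) | 0.436 (18) | 0.745 (7) | 0.592 (5) | 0.958 (5) | 0.363 (19)
TST3D | 0.569 (13) | 0.778 (23) | 0.675 (7) | 0.598 (2) | 0.451 (8) | 0.727 (19) | 0.280 (8) | 0.476 (36) | 0.395 (10) | 0.472 (14) | 0.457 (25) | 0.583 (11) | 0.580 (14) | 0.777 (11) | 0.462 (16) | 0.735 (8) | 0.547 (17) | 0.919 (14) | 0.333 (25)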
Duc Tran Dang Trung, Byeongkeun Kang, Yeejin Lee: MSTA3D: Multi-scale Twin-attention for 3D Instance Segmentation. ACM Multimedia 2024
GraphCut | 0.552 (17) | 1.000 (1) | 0.611 (16) | 0.438 (25) | 0.392 (19) | 0.714 (20) | 0.139 (21) | 0.598 (13) | 0.327 (19) | 0.389 (20) | 0.510 (14) | 0.598 (9) | 0.427 (36) | 0.754 (19) | 0.463 (15) | 0.761 (3) | 0.588 (7) | 0.903 (20) | 0.329 (26)
SoftGroup++ | 0.513 (21) | 0.704 (33) | 0.578 (22) | 0.398 (33) | 0.363 (27) | 0.704 (21) | 0.061 (33) | 0.647 (5) | 0.297 (26) | 0.378 (23) | 0.537 (8) | 0.343 (24) | 0.614 (11) | 0.828 (5) | 0.295 (31) | 0.710 (14) | 0.505 (23) | 0.875 (31) | 0.394 (9)
OccuSeg+instance | 0.486 (26) | 0.802 (21) | 0.536 (30) | 0.428 (29) | 0.369 (23) | 0.702 (22) | 0.205 (17) | 0.331 (51) | 0.301 (23) | 0.379 (22) | 0.474 (21) | 0.327 (25) | 0.437 (30) | 0.862 (2) | 0.485 (3) | 0.601 (29) | 0.394 (38) | 0.846 (42) | 0.273 (39)
Lei Han, Tian Zheng, Lan Xu, Lu Fang: OccuSeg: Occupancy-aware 3D Instance Segmentation. CVPR 2020
SoftGroup (permissive) | 0.504 (23) | 0.667 (40) | 0.579 (20) | 0.372 (39) | 0.381 (20) | 0.694 (23) | 0.072 (29) | 0.677 (3) | 0.303 (22) | 0.387 (21) | 0.531 (11) | 0.319 (28) | 0.582 (13) | 0.754 (18) | 0.318 (27) | 0.643 (23) | 0.492 (24) | 0.907 (19) | 0.388 (12)
Thang Vu, Kookhoi Kim, Tung M. Luu, Xuan Thanh Nguyen, Chang D. Yoo: SoftGroup for 3D Instance Segmentation on Point Clouds. CVPR 2022 [Oral]
SSTNet (permissive) | 0.506 (22) | 0.738 (29) | 0.549 (28) | 0.497 (12) | 0.316 (32) | 0.693 (24) | 0.178 (19) | 0.377 (46) | 0.198 (35) | 0.330 (25) | 0.463 (24) | 0.576 (14) | 0.515 (21) | 0.857 (4) | 0.494 (2) | 0.637 (25) | 0.457 (27) | 0.943 (8) | 0.290 (35)
Zhihao Liang, Zhihao Li, Songcen Xu, Mingkui Tan, Kui Jia: Instance Segmentation in 3D Scenes using Semantic Superpoint Tree Networks. ICCV 2021
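INS-Conv-instance | 0.435 (32) | 0.716 (32) | 0.495 (42) | 0.355 (41) | 0.331 (29) | 0.689 (25) | 0.102 (25) | 0.394 (45) | 0.208 (34) | 0.280 (32) | 0.395 (33) | 0.250 (36) | 0.544 (16) | 0.741 (23) | 0.309 (29) | 0.536 (45) | 0.391 (39) | 0.842 (45) | 0.258 (43)
DualGroup | 0.469 (28) | 0.815 (19) | 0.552 (26) | 0.398 (32) | 0.374 (22) | 0.683 (26) | 0.130 (22) | 0.539 (24) | 0.310 (20) | 0.327 (27) | 0.407 (31) | 0.276 (33) | 0.447 (29) | 0.535 (45) | 0.342 (25) | 0.659 (20) | 0.455 (28) | 0.900 (23) | 0.301 (31)
HAIS (permissive) | 0.457 (30) | 0.704 (33) | 0.561 (25) | 0.457 (19) | 0.364 (26) | 0.673 (27) | 0.046 (41) | 0.547 (23) | 0.194 (36) | 0.308 (29) | 0.426 (29) | 0.288 (31) | 0.454 (28) | 0.711 (28) | 0.262 (37) | 0.563 (39) | 0.434 (32) | 0.889 (26) | 0.344 (23)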
Shaoyu Chen, Jiemin Fang, Qian Zhang, Wenyu Liu, Xinggang Wang: Hierarchical Aggregation for 3D Instance Segmentation. ICCV 2021
AOIA | 0.387 (42) | 0.704 (33) | 0.515 (36) | 0.385 (36) | 0.225 (49) | 0.669 (28) | 0.005 (59) | 0.482 (35) | 0.126 (45) | 0.181 (54) | 0.269 (50) | 0.221 (42) | 0.426 (37) | 0.478 (51) | 0.218 (43) | 0.592 (30) | 0.371 (41) | 0.851 (36) | 0.242 (45)
Dyco3D (copyleft) | 0.395 (40) | 0.642 (46) | 0.518 (34) | 0.447 (22) | 0.259 (43) | 0.666 (29) | 0.050 (38) | 0.251 (61) | 0.166 (38) | 0.231 (42) | 0.362 (37) | 0.232 (40) | 0.331 (47) | 0.535 (44) | 0.229 (41) | 0.587 (32) | 0.438 (31) | 0.850 (37) | 0.317 (28)
Tong He, Chunhua Shen, Anton van den Hengel: DyCo3D: Robust Instance Segmentation of 3D Point Clouds through Dynamic Convolution. CVPR 2021
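IPCA-Inst | 0.520 (20) | 0.889 (13) | 0.551 (27) | 0.548 (7) | 0.418 (16) | 0.665 (30) | 0.064 (32) | 0.585 (14) | 0.260 (29) | 0.277 (34) | 0.471 (22) | 0.500 (21) | 0.644 (8) | 0.785 (9) | 0.369 (22) | 0.591 (31) | 0.511 (19) | 0.878 (29) | 0.362 (20)
Mask-Group | 0.434 (33) | 0.778 (23) | 0.516 (35) | 0.471 (17) | 0.330 (30) | 0.658 (31) | 0.029 (44) | 0.526 (28) | 0.249 (30) | 0.256 (37) | 0.400 (32) | 0.309 (29) | 0.384 (44) | 0.296 (61) | 0.368 (23) | 0.575 (35) | 0.425 (33) | 0.877 (30) | 0.362 (21)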
Min Zhong, Xinghao Chen, Xiaokang Chen, Gang Zeng, Yunhe Wang: MaskGroup: Hierarchical Point Grouping and Masking for 3D Instance Segmentation. ICME 2022
RPGN | 0.428 (35) | 0.630 (48) | 0.508 (39) | 0.367 (40) | 0.249 (44) | 0.658 (32) | 0.016 (52) | 0.673 (4) | 0.131 (44) | 0.234 (41) | 0.383 (36) | 0.270 (34) | 0.434 (31) | 0.748 (20) | 0.274 (35) | 0.609 (27) | 0.406 (35) | 0.842 (44) | 0.267 (42)
Shichao Dong, Guosheng Lin, Tzu-Yi Hung: Learning Regional Purity for Instance Segmentation on 3D Point Clouds. ECCV 2022
CSC-Pretrained | 0.405 (38) | 0.738 (29) | 0.465 (46) | 0.331 (45) | 0.205 (50) | 0.655 (33) | 0.051 (37) | 0.601 (10) | 0.092 (51) | 0.211 (50) | 0.329 (41) | 0.198 (44) | 0.459 (27) | 0.775 (12) | 0.195 (47) | 0.524 (47) | 0.400 (37) | 0.878 (28) | 0.184 (52)
PCJC | 0.375 (45) | 0.704 (33) | 0.542 (29) | 0.284 (49) | 0.197 (52) | 0.649 (34) | 0.006 (56) | 0.426 (39) | 0.138 (42) | 0.242 (39) | 0.304 (45) | 0.183 (48) | 0.388 (43) | 0.629 (38) | 0.141 (58) | 0.546 (43) | 0.344 (46) | 0.738 (53) | 0.283 (37)
SSEN | 0.384 (43) | 0.852 (15) | 0.494 (43) | 0.192 (55) | 0.226 (48) | 0.648 (35) | 0.022 (47) | 0.398 (44) | 0.299 (25) | 0.277 (33) | 0.317 (43) | 0.231 (41) | 0.194 (58) | 0.514 (48) | 0.196 (45) | 0.586 (33) | 0.444 (29) | 0.843 (43) | 0.184 (51)
Dongsu Zhang, Junha Chun, Sang Kyun Cha, Young Min Kim: Spatial Semantic Embedding Network: Fast 3D Instance Segmentation with Deep Metric Learning. arXiv
PointGroup | 0.407 (37) | 0.639 (47) | 0.496 (41) | 0.415 (30) | 0.243 (46) | 0.645 (36) | 0.021 (49) | 0.570 (17) | 0.114 (47) | 0.211 (49) | 0.359 (38) | 0.217 (43) | 0.428 (34) | 0.660 (36) | 0.256 (38) | 0.562 (40) | 0.341 (47) | 0.860 (35) | 0.291 (33)
Li Jiang, Hengshuang Zhao, Shaoshuai Shi, Shu Liu, Chi-Wing Fu, Jiaya Jia: PointGroup: Dual-Set Point Grouping for 3D Instance Segmentation. CVPR 2020 [oral]
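SSEC | 0.465 (29) | 0.667 (40) | 0.578 (21) | 0.502 (11) | 0.362 (28) | 0.641 (37) | 0.035 (42) | 0.605 (7) | 0.291 (27) | 0.323 (28) | 0.451 (26) | 0.296 (30) | 0.417 (39) | 0.677 (33) | 0.245 (40) | 0.501 (49) | 0.506 (22) | 0.900 (22) | 0.366 (16)
TopoSeg | 0.479 (27) | 0.704 (33) | 0.564 (24) | 0.467 (18) | 0.366 (25) | 0.633 (38) | 0.068 (30) | 0.554 (22) | 0.262 (28) | 0.328 (26) | 0.447 (28) | 0.323 (26) | 0.534 (18) | 0.722 (25) | 0.288 (33) | 0.614 (26) | 0.482 (25) | 0.912 (16) | 0.358 (22)
SphereSeg | 0.357 (47) | 0.651 (45) | 0.411 (49) | 0.345 (42) | 0.264 (42) | 0.630 (39) | 0.059 (34) | 0.289 (58) | 0.212 (32) | 0.240 (40) | 0.336 (40) | 0.158 (50) | 0.305 (48) | 0.557 (42) | 0.159 (54) | 0.455 (55) | 0.341 (48) | 0.726 (55) | 0.294 (32)
DANCENET | 0.504 (23) | 0.926 (8) | 0.579 (19) | 0.472 (16) | 0.367 (24) | 0.626 (40) | 0.165 (20) | 0.432 (38) | 0.221 (31) | 0.408 (19) | 0.449 (27) | 0.411 (22) | 0.564 (15) | 0.746 (22) | 0.421 (19) | 0.707 (16) | 0.438 (30) | 0.846 (40) | 0.288 (36)
Box2Mask | 0.433 (34) | 0.741 (27) | 0.463 (47) | 0.433 (27) | 0.283 (37) | 0.625 (41) | 0.103 (24) | 0.298 (56) | 0.125 (46) | 0.260 (36) | 0.424 (30) | 0.322 (27) | 0.472 (26) | 0.701 (31) | 0.363 (24) | 0.711 (13) | 0.309 (55) | 0.882 (27) | 0.272 (41)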
Julian Chibane, Francis Engelmann, Tuan Anh Tran, Gerard Pons-Moll: Box2Mask: Weakly Supervised 3D Semantic Instance Segmentation Using Bounding Boxes. ECCV 2022
DD-UNet+Group | 0.436 (31) | 0.630 (48) | 0.508 (40) | 0.480 (15) | 0.310 (34) | 0.624 (42) | 0.065 (31) | 0.638 (6) | 0.174 (37) | 0.256 (38) | 0.384 (35) | 0.194 (45) | 0.428 (33) | 0.759 (17) | 0.289 (32) | 0.574 (36) | 0.400 (36) | 0.849 (39) | 0.291 (34)
H. Liu, R. Liu, K. Yang, J. Zhang, K. Peng, R. Stiefelhagen: HIDA: Towards Holistic Indoor Understanding for the Visually Impaired via Semantic Instance Segmentation with a Wearable Solid-State LiDAR Sensor. ICCVW 2021
PE | 0.396 (39) | 0.667 (40) | 0.467 (45) | 0.446 (23) | 0.243 (45) | 0.624 (43) | 0.022 (48) | 0.577 (15) | 0.106 (48) | 0.219 (44) | 0.340 (39) | 0.239 (38) | 0.487 (23) | 0.475 (52) | 0.225 (42) | 0.541 (44) | 0.350 (45) | 0.818 (47) | 0.273 (40)
Biao Zhang, Peter Wonka: Point Cloud Instance Segmentation using Probabilistic Embeddings. CVPR 2021
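Mask3D_evaluation | 0.382 (44) | 0.593 (50) | 0.520 (33) | 0.390 (35) | 0.314 (33) | 0.600 (44) | 0.018 (51) | 0.287 (59) | 0.151 (40) | 0.281 (31) | 0.387 (34) | 0.169 (49) | 0.429 (32) | 0.654 (37) | 0.172 (51) | 0.578 (34) | 0.384 (40) | 0.670 (58) | 0.278 (38)
NeuralBF | 0.353 (49) | 0.593 (50) | 0.511 (38) | 0.375 (38) | 0.264 (41) | 0.597 (45) | 0.008 (54) | 0.332 (50) | 0.160 (39) | 0.229 (43) | 0.274 (49) | 0.000 (72) | 0.206 (55) | 0.678 (32) | 0.155 (55) | 0.485 (51) | 0.422 (34) | 0.816 (48) | 0.254 (44)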
Weiwei Sun, Daniel Rebain, Renjie Liao, Vladimir Tankovich, Soroosh Yazdani, Kwang Moo Yi, Andrea Tagliasacchi: NeuralBF: Neural Bilateral Filtering for Top-down Instance Segmentation on Point Clouds. WACV 2023
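ClickSeg_Instance | 0.366 (46) | 0.654 (44) | 0.375 (51) | 0.184 (56) | 0.302 (35) | 0.592 (46) | 0.050 (39) | 0.300 (55) | 0.093 (50) | 0.283 (30) | 0.277 (47) | 0.249 (37) | 0.426 (38) | 0.615 (40) | 0.299 (30) | 0.504 (48) | 0.367 (42) | 0.832 (46) | 0.191 (50)
3D-MPA | 0.355 (48) | 0.457 (60) | 0.484 (44) | 0.299 (47) | 0.277 (39) | 0.591 (47) | 0.047 (40) | 0.332 (49) | 0.212 (33) | 0.217 (45) | 0.278 (46) | 0.193 (46) | 0.413 (40) | 0.410 (55) | 0.195 (46) | 0.574 (37) | 0.352 (44) | 0.849 (38) | 0.213 (48)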
Francis Engelmann, Martin Bokeloh, Alireza Fathi, Bastian Leibe, Matthias Nießner: 3D-MPA: Multi Proposal Aggregation for 3D Semantic Instance Segmentation. CVPR 2020
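OSIS | 0.392 (41) | 0.778 (23) | 0.530 (31) | 0.220 (54) | 0.278 (38) | 0.567 (48) | 0.083 (28) | 0.330 (52) | 0.299 (24) | 0.270 (35) | 0.310 (44) | 0.143 (51) | 0.260 (51) | 0.624 (39) | 0.277 (34) | 0.568 (38) | 0.361 (43) | 0.865 (34) | 0.301 (30)
One_Thing_One_Click (permissive) | 0.326 (52) | 0.472 (58) | 0.361 (53) | 0.232 (53) | 0.183 (53) | 0.555 (49) | 0.000 (65) | 0.498 (31) | 0.038 (60) | 0.195 (52) | 0.226 (54) | 0.362 (23) | 0.168 (59) | 0.469 (53) | 0.251 (39) | 0.553 (42) | 0.335 (49) | 0.846 (41) | 0.117 (60)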
Zhengzhe Liu, Xiaojuan Qi, Chi-Wing Fu: One Thing One Click: A Self-Training Approach for Weakly Supervised 3D Semantic Segmentation. CVPR 2021
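Sparse R-CNN | 0.292 (54) | 0.704 (33) | 0.213 (64) | 0.153 (58) | 0.154 (55) | 0.551 (50) | 0.053 (35) | 0.212 (62) | 0.132 (43) | 0.174 (56) | 0.274 (48) | 0.070 (57) | 0.363 (45) | 0.441 (54) | 0.176 (50) | 0.424 (58) | 0.234 (59) | 0.758 (51) | 0.161 (56)
DENet | 0.413 (36) | 0.741 (27) | 0.520 (32) | 0.237 (52) | 0.284 (36) | 0.523 (51) | 0.097 (27) | 0.691 (1) | 0.138 (41) | 0.209 (51) | 0.229 (53) | 0.238 (39) | 0.390 (42) | 0.707 (29) | 0.310 (28) | 0.448 (56) | 0.470 (26) | 0.892 (25) | 0.310 (29)
SegGroup_ins (permissive) | 0.246 (60) | 0.556 (54) | 0.335 (56) | 0.062 (67) | 0.115 (60) | 0.490 (52) | 0.000 (65) | 0.297 (57) | 0.018 (64) | 0.186 (53) | 0.142 (59) | 0.083 (56) | 0.233 (52) | 0.216 (63) | 0.153 (56) | 0.469 (52) | 0.251 (58) | 0.744 (52) | 0.083 (63)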
An Tao, Yueqi Duan, Yi Wei, Jiwen Lu, Jie Zhou: SegGroup: Seg-Level Supervision for 3D Instance and Semantic Segmentation. TIP 2022
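RWSeg | 0.348 (50) | 0.475 (57) | 0.456 (48) | 0.320 (46) | 0.275 (40) | 0.476 (53) | 0.020 (50) | 0.491 (32) | 0.056 (58) | 0.212 (48) | 0.320 (42) | 0.261 (35) | 0.302 (49) | 0.520 (46) | 0.182 (49) | 0.557 (41) | 0.285 (57) | 0.867 (33) | 0.197 (49)
GICN | 0.341 (51) | 0.580 (52) | 0.371 (52) | 0.344 (43) | 0.198 (51) | 0.469 (54) | 0.052 (36) | 0.564 (20) | 0.093 (49) | 0.212 (47) | 0.212 (55) | 0.127 (53) | 0.347 (46) | 0.537 (43) | 0.206 (44) | 0.525 (46) | 0.329 (50) | 0.729 (54) | 0.241 (46)
Occipital-SCS | 0.320 (53) | 0.679 (39) | 0.352 (54) | 0.334 (44) | 0.229 (47) | 0.436 (55) | 0.025 (45) | 0.412 (42) | 0.058 (56) | 0.161 (59) | 0.240 (52) | 0.085 (55) | 0.262 (50) | 0.496 (50) | 0.187 (48) | 0.467 (53) | 0.328 (51) | 0.775 (49) | 0.231 (47)
MTML | 0.282 (55) | 0.577 (53) | 0.380 (50) | 0.182 (57) | 0.107 (61) | 0.430 (56) | 0.001 (62) | 0.422 (40) | 0.057 (57) | 0.179 (55) | 0.162 (58) | 0.070 (58) | 0.229 (53) | 0.511 (49) | 0.161 (52) | 0.491 (50) | 0.313 (52) | 0.650 (61) | 0.162 (54)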
Jean Lahoud, Bernard Ghanem, Marc Pollefeys, Martin R. Oswald: 3D Instance Segmentation via Multi-task Metric Learning. ICCV 2019 [oral]
SALoss-ResNet | 0.262 (56) | 0.667 (40) | 0.335 (55) | 0.067 (65) | 0.123 (59) | 0.427 (57) | 0.022 (46) | 0.280 (60) | 0.058 (55) | 0.216 (46) | 0.211 (56) | 0.039 (61) | 0.142 (61) | 0.519 (47) | 0.106 (62) | 0.338 (62) | 0.310 (54) | 0.721 (56) | 0.138 (57)
Zhidong Liang, Ming Yang, Hao Li, Chunxiang Wang: 3D Instance Embedding Learning With a Structure-Aware Loss Function for Point Cloud Segmentation. IEEE Robotics and Automation Letters (IROS2020)
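SPG_WSIS | 0.251 (59) | 0.380 (62) | 0.274 (61) | 0.289 (48) | 0.144 (56) | 0.413 (58) | 0.000 (65) | 0.311 (53) | 0.065 (54) | 0.113 (61) | 0.130 (61) | 0.029 (64) | 0.204 (56) | 0.388 (56) | 0.108 (61) | 0.459 (54) | 0.311 (53) | 0.769 (50) | 0.127 (59)
MASC (permissive) | 0.254 (57) | 0.463 (59) | 0.249 (63) | 0.113 (59) | 0.167 (54) | 0.412 (59) | 0.000 (64) | 0.374 (47) | 0.073 (52) | 0.173 (57) | 0.243 (51) | 0.130 (52) | 0.228 (54) | 0.368 (57) | 0.160 (53) | 0.356 (60) | 0.208 (60) | 0.711 (57) | 0.136 (58)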
Chen Liu, Yasutaka Furukawa: MASC: Multi-scale Affinity with Sparse Convolution for 3D Instance Segmentation.
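R-PointNet | 0.158 (64) | 0.356 (63) | 0.173 (67) | 0.113 (60) | 0.140 (57) | 0.359 (60) | 0.012 (53) | 0.023 (70) | 0.039 (59) | 0.134 (60) | 0.123 (62) | 0.008 (68) | 0.089 (64) | 0.149 (66) | 0.117 (60) | 0.221 (67) | 0.128 (67) | 0.563 (63) | 0.094 (61)
Region-18class | 0.146 (65) | 0.175 (71) | 0.321 (59) | 0.080 (62) | 0.062 (63) | 0.357 (61) | 0.000 (65) | 0.307 (54) | 0.002 (69) | 0.066 (66) | 0.044 (66) | 0.000 (72) | 0.018 (70) | 0.036 (71) | 0.054 (63) | 0.447 (57) | 0.133 (65) | 0.472 (64) | 0.060 (67)
SemRegionNet-20cls | 0.121 (66) | 0.296 (65) | 0.203 (65) | 0.071 (63) | 0.058 (65) | 0.349 (62) | 0.000 (65) | 0.150 (63) | 0.019 (63) | 0.054 (68) | 0.034 (69) | 0.017 (67) | 0.052 (66) | 0.042 (70) | 0.013 (71) | 0.209 (69) | 0.183 (61) | 0.371 (66) | 0.057 (68)
3D-SIS (permissive) | 0.161 (62) | 0.407 (61) | 0.155 (69) | 0.068 (64) | 0.043 (68) | 0.346 (63) | 0.001 (61) | 0.134 (64) | 0.005 (67) | 0.088 (62) | 0.106 (63) | 0.037 (62) | 0.135 (63) | 0.321 (60) | 0.028 (68) | 0.339 (61) | 0.116 (69) | 0.466 (65) | 0.093 (62)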
Ji Hou, Angela Dai, Matthias Niessner: 3D-SIS: 3D Semantic Instance Segmentation of RGB-D Scans. CVPR 2019
3D-BoNet | 0.253 (58) | 0.519 (55) | 0.324 (58) | 0.251 (51) | 0.137 (58) | 0.345 (64) | 0.031 (43) | 0.419 (41) | 0.069 (53) | 0.162 (58) | 0.131 (60) | 0.052 (59) | 0.202 (57) | 0.338 (59) | 0.147 (57) | 0.301 (65) | 0.303 (56) | 0.651 (60) | 0.178 (53)
Bo Yang, Jianan Wang, Ronald Clark, Qingyong Hu, Sen Wang, Andrew Markham, Niki Trigoni: Learning Object Bounding Boxes for 3D Instance Segmentation on Point Clouds. NeurIPS 2019 Spotlight
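tmp | 0.113 (69) | 0.333 (64) | 0.151 (70) | 0.056 (68) | 0.053 (66) | 0.344 (65) | 0.000 (65) | 0.105 (66) | 0.016 (65) | 0.049 (69) | 0.035 (68) | 0.020 (66) | 0.053 (65) | 0.048 (69) | 0.013 (70) | 0.183 (71) | 0.173 (63) | 0.344 (69) | 0.054 (69)
UNet-backbone | 0.161 (62) | 0.519 (55) | 0.259 (62) | 0.084 (61) | 0.059 (64) | 0.325 (66) | 0.002 (60) | 0.093 (67) | 0.009 (66) | 0.077 (65) | 0.064 (64) | 0.045 (60) | 0.044 (68) | 0.161 (65) | 0.045 (64) | 0.331 (63) | 0.180 (62) | 0.566 (62) | 0.033 (72)
Hier3D (copyleft) | 0.117 (67) | 0.222 (69) | 0.161 (68) | 0.054 (69) | 0.027 (70) | 0.289 (67) | 0.000 (65) | 0.124 (65) | 0.001 (71) | 0.079 (64) | 0.061 (65) | 0.027 (65) | 0.141 (62) | 0.240 (62) | 0.005 (72) | 0.310 (64) | 0.129 (66) | 0.153 (72) | 0.081 (64)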
Tan: HCFS3D: Hierarchical Coupled Feature Selection Network for 3D Semantic and Instance Segmentation.
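ASIS | 0.085 (71) | 0.037 (72) | 0.080 (72) | 0.066 (66) | 0.047 (67) | 0.282 (68) | 0.000 (65) | 0.052 (68) | 0.002 (70) | 0.047 (70) | 0.026 (70) | 0.001 (71) | 0.046 (67) | 0.194 (64) | 0.031 (67) | 0.264 (66) | 0.140 (64) | 0.167 (71) | 0.047 (71)
3D-BEVIS | 0.117 (67) | 0.250 (67) | 0.308 (60) | 0.020 (71) | 0.009 (73) | 0.269 (69) | 0.006 (57) | 0.008 (71) | 0.029 (61) | 0.037 (71) | 0.014 (72) | 0.003 (70) | 0.036 (69) | 0.147 (67) | 0.042 (66) | 0.381 (59) | 0.118 (68) | 0.362 (67) | 0.069 (66)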
Cathrin Elich, Francis Engelmann, Jonas Schult, Theodora Kontogianni, Bastian Leibe: 3D-BEVIS: Birds-Eye-View Instance Segmentation.
PanopticFusion-inst | 0.214 (61) | 0.250 (67) | 0.330 (57) | 0.275 (50) | 0.103 (62) | 0.228 (70) | 0.000 (65) | 0.345 (48) | 0.024 (62) | 0.088 (63) | 0.203 (57) | 0.186 (47) | 0.167 (60) | 0.367 (58) | 0.125 (59) | 0.221 (68) | 0.112 (70) | 0.666 (59) | 0.162 (55)
Gaku Narita, Takashi Seno, Tomoya Ishikawa, Yohsuke Kaji: PanopticFusion: Online Volumetric Semantic Mapping at the Level of Stuff and Things. IROS 2019 (to appear)
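Sem_Recon_ins | 0.098 (70) | 0.295 (66) | 0.187 (66) | 0.015 (72) | 0.036 (69) | 0.213 (71) | 0.005 (58) | 0.038 (69) | 0.003 (68) | 0.056 (67) | 0.037 (67) | 0.036 (63) | 0.015 (71) | 0.051 (68) | 0.044 (65) | 0.209 (70) | 0.098 (71) | 0.354 (68) | 0.071 (65)
Sgpn_scannet | 0.049 (72) | 0.023 (73) | 0.134 (71) | 0.031 (70) | 0.013 (72) | 0.144 (72) | 0.006 (55) | 0.008 (72) | 0.000 (72) | 0.028 (72) | 0.017 (71) | 0.003 (69) | 0.009 (73) | 0.000 (72) | 0.021 (69) | 0.122 (72) | 0.095 (72) | 0.175 (70) | 0.054 (70)
MaskRCNN 2d->3d Proj | 0.022 (73) | 0.185 (70) | 0.000 (73) | 0.000 (73) | 0.015 (71) | 0.000 (73) | 0.000 (63) | 0.006 (73) | 0.000 (72) | 0.010 (73) | 0.006 (73) | 0.107 (54) | 0.012 (72) | 0.000 (72) | 0.002 (73) | 0.027 (73) | 0.004 (73) | 0.022 (73) | 0.001 (73)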