The 3D semantic instance prediction task involves detecting and segmenting the objects in a 3D scan mesh.

Evaluation and metrics

Our evaluation ranks all methods according to the average precision for each class. We report the mean average precision (AP) at overlap 0.25 (AP 25%), at overlap 0.5 (AP 50%), and averaged over overlaps in the range [0.5:0.95:0.05] (AP). Note that multiple predictions of the same ground truth instance are penalized as false positives.
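As a rough illustration of the metric described above, the following sketch computes AP for a single class at one overlap threshold and then averages it over the thresholds 0.5 to 0.95 in steps of 0.05. It is a simplified stand-in, not the benchmark's official evaluation script: the function names, the greedy best-overlap matching, and the way the precision-recall curve is integrated are all assumptions made for this example.

```python
import numpy as np


def mask_iou(a, b):
    """Intersection over union of two boolean point masks."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    union = np.logical_or(a, b).sum()
    return float(np.logical_and(a, b).sum() / union) if union else 0.0


def average_precision(preds, gts, iou_thresh):
    """AP for one class (hypothetical helper, simplified).

    preds: list of (confidence, mask) pairs; gts: list of ground truth masks.
    """
    if not preds or not gts:
        return 0.0
    # Visit predictions from most to least confident.
    order = sorted(range(len(preds)), key=lambda i: -preds[i][0])
    matched = [False] * len(gts)
    tp = []
    for i in order:
        ious = [mask_iou(preds[i][1], gt) for gt in gts]
        j = int(np.argmax(ious))
        if ious[j] >= iou_thresh and not matched[j]:
            matched[j] = True
            tp.append(1.0)
        else:
            # Low overlap, or a duplicate of an already matched instance:
            # both count as false positives.
            tp.append(0.0)
    tp = np.array(tp)
    cum_tp = np.cumsum(tp)
    cum_fp = np.cumsum(1.0 - tp)
    recall = cum_tp / len(gts)
    precision = cum_tp / (cum_tp + cum_fp)
    # Make precision non-increasing in recall, then integrate the curve.
    precision = np.maximum.accumulate(precision[::-1])[::-1]
    recall = np.concatenate(([0.0], recall))
    return float(np.sum((recall[1:] - recall[:-1]) * precision))


def mean_ap_over_thresholds(preds, gts):
    """Average AP over overlaps 0.5, 0.55, ..., 0.95."""
    thresholds = np.linspace(0.5, 0.95, 10)
    return float(np.mean([average_precision(preds, gts, t) for t in thresholds]))
```

In this sketch, a second prediction that overlaps an already matched ground truth instance lowers precision without raising recall, which is how the duplicate penalty mentioned above shows up in the score.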



This table lists the benchmark results for the 3D semantic instance scenario. Each entry gives the AP score followed by the method's rank in that column.




Method Info | avg ap | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | window
PointComp | 0.629 1 | 0.787 25 | 0.679 10 | 0.574 5 | 0.502 3 | 0.824 1 | 0.378 1 | 0.480 39 | 0.483 3 | 0.480 16 | 0.601 1 | 0.744 1 | 0.682 8 | 0.809 8 | 0.460 21 | 0.819 1 | 0.643 2 | 0.935 13 | 0.449 3
PointRel | 0.622 2 | 0.926 8 | 0.710 3 | 0.541 11 | 0.502 2 | 0.772 8 | 0.314 5 | 0.598 11 | 0.425 10 | 0.504 11 | 0.565 3 | 0.650 8 | 0.716 2 | 0.809 7 | 0.476 12 | 0.747 6 | 0.618 3 | 0.963 4 | 0.364 21
: Relation3D: Enhancing Relation Modeling for Point Cloud Instance Segmentation. CVPR 2025
Competitor-MAFT | 0.618 3 | 0.866 16 | 0.724 1 | 0.628 1 | 0.484 5 | 0.803 3 | 0.300 9 | 0.509 32 | 0.496 1 | 0.539 1 | 0.547 7 | 0.703 2 | 0.668 9 | 0.708 34 | 0.463 18 | 0.708 18 | 0.595 5 | 0.959 6 | 0.418 9
SIM3D | 0.617 4 | 0.952 4 | 0.629 19 | 0.539 12 | 0.426 17 | 0.768 12 | 0.302 8 | 0.681 2 | 0.425 11 | 0.473 18 | 0.511 17 | 0.701 3 | 0.717 1 | 0.821 6 | 0.467 15 | 0.774 2 | 0.559 16 | 0.914 20 | 0.448 4
Spherical Mask(CtoF) | 0.616 5 | 0.946 5 | 0.654 14 | 0.555 7 | 0.434 14 | 0.769 11 | 0.271 14 | 0.604 8 | 0.447 6 | 0.505 9 | 0.549 4 | 0.698 4 | 0.716 2 | 0.775 17 | 0.480 9 | 0.747 7 | 0.575 12 | 0.925 15 | 0.436 6
EV3D | 0.615 6 | 0.946 5 | 0.652 15 | 0.555 7 | 0.433 15 | 0.773 7 | 0.271 15 | 0.604 8 | 0.447 6 | 0.506 8 | 0.544 8 | 0.698 4 | 0.716 2 | 0.775 17 | 0.480 9 | 0.747 7 | 0.572 14 | 0.925 15 | 0.435 7
DCD | 0.614 7 | 0.892 13 | 0.633 18 | 0.434 30 | 0.495 4 | 0.810 2 | 0.292 10 | 0.501 33 | 0.408 12 | 0.525 5 | 0.582 2 | 0.688 6 | 0.625 11 | 0.801 9 | 0.608 1 | 0.672 22 | 0.649 1 | 0.965 3 | 0.476 1
ExtMask3D | 0.598 8 | 0.852 17 | 0.692 8 | 0.433 33 | 0.461 9 | 0.791 5 | 0.264 16 | 0.488 36 | 0.493 2 | 0.508 7 | 0.528 16 | 0.594 14 | 0.706 6 | 0.791 11 | 0.483 7 | 0.734 11 | 0.595 6 | 0.911 22 | 0.437 5
MAFT | 0.596 9 | 0.889 14 | 0.721 2 | 0.448 25 | 0.460 10 | 0.768 13 | 0.251 18 | 0.558 21 | 0.408 13 | 0.504 10 | 0.539 10 | 0.616 12 | 0.618 13 | 0.858 3 | 0.482 8 | 0.684 21 | 0.551 19 | 0.931 14 | 0.450 2
UniPerception | 0.588 10 | 0.963 3 | 0.667 12 | 0.493 16 | 0.472 8 | 0.750 17 | 0.229 21 | 0.528 27 | 0.468 5 | 0.498 14 | 0.542 9 | 0.643 9 | 0.530 23 | 0.661 41 | 0.463 17 | 0.695 20 | 0.599 4 | 0.972 1 | 0.420 8
MG-Former | 0.587 11 | 0.852 17 | 0.639 17 | 0.454 24 | 0.393 23 | 0.758 16 | 0.338 3 | 0.572 16 | 0.480 4 | 0.527 3 | 0.491 24 | 0.671 7 | 0.527 24 | 0.867 1 | 0.485 6 | 0.601 33 | 0.590 9 | 0.938 12 | 0.390 13
InsSSM | 0.586 12 | 1.000 1 | 0.593 23 | 0.440 28 | 0.480 6 | 0.771 9 | 0.345 2 | 0.437 42 | 0.444 9 | 0.495 15 | 0.548 6 | 0.579 18 | 0.621 12 | 0.720 30 | 0.409 25 | 0.712 13 | 0.593 7 | 0.960 5 | 0.395 11
Lei Yao, Yi Wang, Moyun Liu, Lap-Pui Chau: SGIFormer: Semantic-guided and Geometric-enhanced Interleaving Transformer for 3D Instance Segmentation. TCSVT, 2024
Queryformer | 0.583 13 | 0.926 8 | 0.702 5 | 0.393 39 | 0.504 1 | 0.733 23 | 0.276 13 | 0.527 28 | 0.373 19 | 0.479 17 | 0.534 12 | 0.533 25 | 0.697 7 | 0.720 31 | 0.436 23 | 0.745 9 | 0.592 8 | 0.958 7 | 0.363 22
KmaxOneFormerNet (permissive) | 0.581 14 | 0.745 30 | 0.692 9 | 0.551 9 | 0.458 11 | 0.798 4 | 0.264 17 | 0.531 26 | 0.369 21 | 0.513 6 | 0.531 15 | 0.632 10 | 0.494 27 | 0.798 10 | 0.567 3 | 0.648 26 | 0.558 18 | 0.950 9 | 0.362 24
Competitor-SPFormer | 0.580 15 | 0.721 37 | 0.705 4 | 0.593 4 | 0.444 13 | 0.786 6 | 0.286 11 | 0.564 19 | 0.376 18 | 0.498 13 | 0.534 13 | 0.546 23 | 0.390 47 | 0.785 13 | 0.577 2 | 0.708 17 | 0.579 11 | 0.954 8 | 0.388 14
VDG-Uni3DSeg | 0.576 16 | 0.833 21 | 0.699 6 | 0.483 18 | 0.412 21 | 0.767 14 | 0.313 6 | 0.461 41 | 0.446 8 | 0.526 4 | 0.498 22 | 0.584 15 | 0.551 19 | 0.743 26 | 0.464 16 | 0.766 3 | 0.538 23 | 0.919 18 | 0.363 23
PBNet (permissive) | 0.573 17 | 0.926 8 | 0.575 29 | 0.619 2 | 0.472 7 | 0.736 21 | 0.239 20 | 0.487 37 | 0.383 17 | 0.459 21 | 0.506 20 | 0.533 24 | 0.585 15 | 0.767 19 | 0.404 26 | 0.717 12 | 0.559 17 | 0.969 2 | 0.381 17
Weiguang Zhao, Yuyao Yan, Chaolong Yang, Jianan Ye, Xi Yang, Kaizhu Huang: Divide and Conquer: 3D Instance Segmentation With Point-Wise Binarization. ICCV 2023
TST3D | 0.569 18 | 0.778 27 | 0.675 11 | 0.598 3 | 0.451 12 | 0.727 24 | 0.280 12 | 0.476 40 | 0.395 14 | 0.472 19 | 0.457 30 | 0.583 16 | 0.580 17 | 0.777 14 | 0.462 20 | 0.735 10 | 0.547 21 | 0.919 19 | 0.333 30
Duc Tran Dang Trung, Byeongkeun Kang, Yeejin Lee: MSTA3D: Multi-scale Twin-attention for 3D Instance Segmentation. ACM Multimedia 2024
Mask3D | 0.566 19 | 0.926 8 | 0.597 22 | 0.408 36 | 0.420 19 | 0.737 20 | 0.239 19 | 0.598 11 | 0.386 16 | 0.458 22 | 0.549 4 | 0.568 21 | 0.716 2 | 0.601 47 | 0.480 9 | 0.646 27 | 0.575 12 | 0.922 17 | 0.364 20
Jonas Schult, Francis Engelmann, Alexander Hermans, Or Litany, Siyu Tang, Bastian Leibe: Mask3D for 3D Semantic Instance Segmentation. ICRA 2023
OneFormer3D (copyleft) | 0.566 19 | 0.781 26 | 0.697 7 | 0.562 6 | 0.431 16 | 0.770 10 | 0.331 4 | 0.400 48 | 0.373 20 | 0.529 2 | 0.504 21 | 0.568 20 | 0.475 31 | 0.732 28 | 0.470 13 | 0.762 4 | 0.550 20 | 0.871 37 | 0.379 18
Maxim Kolodiazhnyi, Anna Vorontsova, Anton Konushin, Danila Rukhovich: OneFormer3D: One Transformer for Unified Point Cloud Segmentation.
ISBNet (permissive) | 0.559 21 | 0.939 7 | 0.655 13 | 0.383 42 | 0.426 18 | 0.763 15 | 0.180 23 | 0.534 25 | 0.386 15 | 0.499 12 | 0.509 19 | 0.621 11 | 0.427 41 | 0.704 36 | 0.467 14 | 0.649 25 | 0.571 15 | 0.948 10 | 0.401 10
Tuan Duc Ngo, Binh-Son Hua, Khoi Nguyen: ISBNet: a 3D Point Cloud Instance Segmentation Network with Instance-aware Sampling and Box-aware Dynamic Convolution. CVPR 2023
GraphCut | 0.552 22 | 1.000 1 | 0.611 21 | 0.438 29 | 0.392 24 | 0.714 25 | 0.139 27 | 0.598 13 | 0.327 25 | 0.389 25 | 0.510 18 | 0.598 13 | 0.427 42 | 0.754 22 | 0.463 19 | 0.761 5 | 0.588 10 | 0.903 25 | 0.329 32
SPFormer (permissive) | 0.549 23 | 0.745 30 | 0.640 16 | 0.484 17 | 0.395 22 | 0.739 19 | 0.311 7 | 0.566 18 | 0.335 23 | 0.468 20 | 0.492 23 | 0.555 22 | 0.478 30 | 0.747 24 | 0.436 22 | 0.712 14 | 0.540 22 | 0.893 29 | 0.343 29
Sun Jiahao, Qing Chunmei, Tan Junpeng, Xu Xiangmin: Superpoint Transformer for 3D Scene Instance Segmentation. AAAI 2023 [Oral]
DKNet | 0.532 24 | 0.815 22 | 0.624 20 | 0.517 13 | 0.377 26 | 0.749 18 | 0.107 29 | 0.509 31 | 0.304 27 | 0.437 23 | 0.475 25 | 0.581 17 | 0.539 21 | 0.775 16 | 0.339 32 | 0.640 29 | 0.506 26 | 0.901 26 | 0.385 16
Yizheng Wu, Min Shi, Shuaiyuan Du, Hao Lu, Zhiguo Cao, Weicai Zhong: 3D Instances as 1D Kernels. ECCV 2022
IPCA-Inst | 0.520 25 | 0.889 14 | 0.551 33 | 0.548 10 | 0.418 20 | 0.665 35 | 0.064 38 | 0.585 14 | 0.260 35 | 0.277 40 | 0.471 27 | 0.500 26 | 0.644 10 | 0.785 12 | 0.369 28 | 0.591 37 | 0.511 24 | 0.878 34 | 0.362 25
SoftGroup++ | 0.513 26 | 0.704 39 | 0.578 28 | 0.398 38 | 0.363 32 | 0.704 26 | 0.061 39 | 0.647 5 | 0.297 32 | 0.378 28 | 0.537 11 | 0.343 30 | 0.614 14 | 0.828 5 | 0.295 37 | 0.710 16 | 0.505 28 | 0.875 36 | 0.394 12
SSTNet (permissive) | 0.506 27 | 0.738 34 | 0.549 34 | 0.497 15 | 0.316 38 | 0.693 29 | 0.178 24 | 0.377 52 | 0.198 41 | 0.330 31 | 0.463 29 | 0.576 19 | 0.515 25 | 0.857 4 | 0.494 4 | 0.637 30 | 0.457 32 | 0.943 11 | 0.290 41
Zhihao Liang, Zhihao Li, Songcen Xu, Mingkui Tan, Kui Jia: Instance Segmentation in 3D Scenes using Semantic Superpoint Tree Networks. ICCV 2021
SoftGroup (permissive) | 0.504 28 | 0.667 46 | 0.579 26 | 0.372 44 | 0.381 25 | 0.694 28 | 0.072 35 | 0.677 3 | 0.303 28 | 0.387 26 | 0.531 14 | 0.319 34 | 0.582 16 | 0.754 21 | 0.318 33 | 0.643 28 | 0.492 29 | 0.907 24 | 0.388 15
Thang Vu, Kookhoi Kim, Tung M. Luu, Xuan Thanh Nguyen, Chang D. Yoo: SoftGroup for 3D Instance Segmentation on Point Clouds. CVPR 2022 [Oral]
DANCENET | 0.504 28 | 0.926 8 | 0.579 25 | 0.472 20 | 0.367 29 | 0.626 45 | 0.165 25 | 0.432 43 | 0.221 37 | 0.408 24 | 0.449 32 | 0.411 28 | 0.564 18 | 0.746 25 | 0.421 24 | 0.707 19 | 0.438 35 | 0.846 45 | 0.288 42
TD3D (permissive) | 0.489 30 | 0.852 17 | 0.511 43 | 0.434 31 | 0.322 37 | 0.735 22 | 0.101 32 | 0.512 30 | 0.355 22 | 0.349 30 | 0.468 28 | 0.283 38 | 0.514 26 | 0.676 40 | 0.268 42 | 0.671 23 | 0.510 25 | 0.908 23 | 0.329 33
Maksim Kolodiazhnyi, Anna Vorontsova, Anton Konushin, Danila Rukhovich: Top-Down Beats Bottom-Up in 3D Instance Segmentation. WACV 2024
OccuSeg+instance | 0.486 31 | 0.802 24 | 0.536 36 | 0.428 34 | 0.369 28 | 0.702 27 | 0.205 22 | 0.331 57 | 0.301 29 | 0.379 27 | 0.474 26 | 0.327 31 | 0.437 36 | 0.862 2 | 0.485 5 | 0.601 34 | 0.394 43 | 0.846 47 | 0.273 45
Lei Han, Tian Zheng, Lan Xu, Lu Fang: OccuSeg: Occupancy-aware 3D Instance Segmentation. CVPR 2020
TopoSeg | 0.479 32 | 0.704 39 | 0.564 30 | 0.467 22 | 0.366 30 | 0.633 43 | 0.068 36 | 0.554 22 | 0.262 34 | 0.328 32 | 0.447 33 | 0.323 32 | 0.534 22 | 0.722 29 | 0.288 39 | 0.614 31 | 0.482 30 | 0.912 21 | 0.358 27
DualGroup | 0.469 33 | 0.815 22 | 0.552 32 | 0.398 37 | 0.374 27 | 0.683 31 | 0.130 28 | 0.539 24 | 0.310 26 | 0.327 33 | 0.407 36 | 0.276 39 | 0.447 35 | 0.535 51 | 0.342 31 | 0.659 24 | 0.455 33 | 0.900 28 | 0.301 37
SSEC | 0.465 34 | 0.667 46 | 0.578 27 | 0.502 14 | 0.362 33 | 0.641 42 | 0.035 48 | 0.605 7 | 0.291 33 | 0.323 34 | 0.451 31 | 0.296 36 | 0.417 45 | 0.677 39 | 0.245 46 | 0.501 55 | 0.506 27 | 0.900 27 | 0.366 19
ODIN - Ins (permissive) | 0.463 35 | 0.738 34 | 0.589 24 | 0.344 48 | 0.358 34 | 0.560 54 | 0.139 26 | 0.393 51 | 0.331 24 | 0.373 29 | 0.392 39 | 0.496 27 | 0.493 28 | 0.709 33 | 0.377 27 | 0.599 35 | 0.359 49 | 0.752 57 | 0.332 31
Ayush Jain, Pushkal Katara, Nikolaos Gkanatsios, Adam W. Harley, Gabriel Sarch, Kriti Aggarwal, Vishrav Chaudhary, Katerina Fragkiadaki: ODIN: A Single Model for 2D and 3D Segmentation. CVPR 2024
HAIS (permissive) | 0.457 36 | 0.704 39 | 0.561 31 | 0.457 23 | 0.364 31 | 0.673 32 | 0.046 47 | 0.547 23 | 0.194 42 | 0.308 35 | 0.426 34 | 0.288 37 | 0.454 34 | 0.711 32 | 0.262 43 | 0.563 45 | 0.434 37 | 0.889 31 | 0.344 28
Shaoyu Chen, Jiemin Fang, Qian Zhang, Wenyu Liu, Xinggang Wang: Hierarchical Aggregation for 3D Instance Segmentation. ICCV 2021
DD-UNet+Group | 0.436 37 | 0.630 54 | 0.508 46 | 0.480 19 | 0.310 40 | 0.624 47 | 0.065 37 | 0.638 6 | 0.174 43 | 0.256 44 | 0.384 41 | 0.194 51 | 0.428 39 | 0.759 20 | 0.289 38 | 0.574 42 | 0.400 41 | 0.849 44 | 0.291 40
H. Liu, R. Liu, K. Yang, J. Zhang, K. Peng, R. Stiefelhagen: HIDA: Towards Holistic Indoor Understanding for the Visually Impaired via Semantic Instance Segmentation with a Wearable Solid-State LiDAR Sensor. ICCVW 2021
INS-Conv-instance | 0.435 38 | 0.716 38 | 0.495 48 | 0.355 46 | 0.331 35 | 0.689 30 | 0.102 31 | 0.394 50 | 0.208 40 | 0.280 38 | 0.395 38 | 0.250 42 | 0.544 20 | 0.741 27 | 0.309 35 | 0.536 51 | 0.391 44 | 0.842 50 | 0.258 49
Mask-Group | 0.434 39 | 0.778 27 | 0.516 41 | 0.471 21 | 0.330 36 | 0.658 36 | 0.029 50 | 0.526 29 | 0.249 36 | 0.256 43 | 0.400 37 | 0.309 35 | 0.384 50 | 0.296 67 | 0.368 29 | 0.575 41 | 0.425 38 | 0.877 35 | 0.362 26
Min Zhong, Xinghao Chen, Xiaokang Chen, Gang Zeng, Yunhe Wang: MaskGroup: Hierarchical Point Grouping and Masking for 3D Instance Segmentation. ICME 2022
Box2Mask | 0.433 40 | 0.741 32 | 0.463 53 | 0.433 32 | 0.283 43 | 0.625 46 | 0.103 30 | 0.298 62 | 0.125 52 | 0.260 42 | 0.424 35 | 0.322 33 | 0.472 32 | 0.701 37 | 0.363 30 | 0.711 15 | 0.309 61 | 0.882 32 | 0.272 47
Julian Chibane, Francis Engelmann, Tuan Anh Tran, Gerard Pons-Moll: Box2Mask: Weakly Supervised 3D Semantic Instance Segmentation Using Bounding Boxes. ECCV 2022
RPGN | 0.428 41 | 0.630 54 | 0.508 45 | 0.367 45 | 0.249 50 | 0.658 37 | 0.016 58 | 0.673 4 | 0.131 50 | 0.234 47 | 0.383 42 | 0.270 40 | 0.434 37 | 0.748 23 | 0.274 41 | 0.609 32 | 0.406 40 | 0.842 49 | 0.267 48
Shichao Dong, Guosheng Lin, Tzu-Yi Hung: Learning Regional Purity for Instance Segmentation on 3D Point Clouds. ECCV 2022
DENet | 0.413 42 | 0.741 32 | 0.520 38 | 0.237 58 | 0.284 42 | 0.523 57 | 0.097 33 | 0.691 1 | 0.138 47 | 0.209 57 | 0.229 59 | 0.238 45 | 0.390 48 | 0.707 35 | 0.310 34 | 0.448 62 | 0.470 31 | 0.892 30 | 0.310 35
PointGroup | 0.407 43 | 0.639 53 | 0.496 47 | 0.415 35 | 0.243 52 | 0.645 41 | 0.021 55 | 0.570 17 | 0.114 53 | 0.211 55 | 0.359 44 | 0.217 49 | 0.428 40 | 0.660 42 | 0.256 44 | 0.562 46 | 0.341 53 | 0.860 40 | 0.291 39
Li Jiang, Hengshuang Zhao, Shaoshuai Shi, Shu Liu, Chi-Wing Fu, Jiaya Jia: PointGroup: Dual-Set Point Grouping for 3D Instance Segmentation. CVPR 2020 [oral]
CSC-Pretrained | 0.405 44 | 0.738 34 | 0.465 52 | 0.331 51 | 0.205 56 | 0.655 38 | 0.051 43 | 0.601 10 | 0.092 57 | 0.211 56 | 0.329 47 | 0.198 50 | 0.459 33 | 0.775 15 | 0.195 53 | 0.524 53 | 0.400 42 | 0.878 33 | 0.184 58
PE | 0.396 45 | 0.667 46 | 0.467 51 | 0.446 27 | 0.243 51 | 0.624 48 | 0.022 54 | 0.577 15 | 0.106 54 | 0.219 50 | 0.340 45 | 0.239 44 | 0.487 29 | 0.475 58 | 0.225 48 | 0.541 50 | 0.350 51 | 0.818 52 | 0.273 46
Biao Zhang, Peter Wonka: Point Cloud Instance Segmentation using Probabilistic Embeddings. CVPR 2021
Dyco3D (copyleft) | 0.395 46 | 0.642 52 | 0.518 40 | 0.447 26 | 0.259 49 | 0.666 34 | 0.050 44 | 0.251 67 | 0.166 44 | 0.231 48 | 0.362 43 | 0.232 46 | 0.331 53 | 0.535 50 | 0.229 47 | 0.587 38 | 0.438 36 | 0.850 42 | 0.317 34
Tong He, Chunhua Shen, Anton van den Hengel: DyCo3D: Robust Instance Segmentation of 3D Point Clouds through Dynamic Convolution. CVPR 2021
OSIS | 0.392 47 | 0.778 27 | 0.530 37 | 0.220 60 | 0.278 44 | 0.567 53 | 0.083 34 | 0.330 58 | 0.299 30 | 0.270 41 | 0.310 50 | 0.143 57 | 0.260 57 | 0.624 45 | 0.277 40 | 0.568 44 | 0.361 48 | 0.865 39 | 0.301 36
AOIA | 0.387 48 | 0.704 39 | 0.515 42 | 0.385 41 | 0.225 55 | 0.669 33 | 0.005 65 | 0.482 38 | 0.126 51 | 0.181 60 | 0.269 56 | 0.221 48 | 0.426 43 | 0.478 57 | 0.218 49 | 0.592 36 | 0.371 46 | 0.851 41 | 0.242 51
SSEN | 0.384 49 | 0.852 17 | 0.494 49 | 0.192 61 | 0.226 54 | 0.648 40 | 0.022 53 | 0.398 49 | 0.299 31 | 0.277 39 | 0.317 49 | 0.231 47 | 0.194 64 | 0.514 54 | 0.196 51 | 0.586 39 | 0.444 34 | 0.843 48 | 0.184 57
Dongsu Zhang, Junha Chun, Sang Kyun Cha, Young Min Kim: Spatial Semantic Embedding Network: Fast 3D Instance Segmentation with Deep Metric Learning. arXiv
Mask3D_evaluation | 0.382 50 | 0.593 56 | 0.520 39 | 0.390 40 | 0.314 39 | 0.600 49 | 0.018 57 | 0.287 65 | 0.151 46 | 0.281 37 | 0.387 40 | 0.169 55 | 0.429 38 | 0.654 43 | 0.172 57 | 0.578 40 | 0.384 45 | 0.670 64 | 0.278 44
PCJC | 0.375 51 | 0.704 39 | 0.542 35 | 0.284 55 | 0.197 58 | 0.649 39 | 0.006 62 | 0.426 44 | 0.138 48 | 0.242 45 | 0.304 51 | 0.183 54 | 0.388 49 | 0.629 44 | 0.141 64 | 0.546 49 | 0.344 52 | 0.738 59 | 0.283 43
ClickSeg_Instance | 0.366 52 | 0.654 50 | 0.375 57 | 0.184 62 | 0.302 41 | 0.592 51 | 0.050 45 | 0.300 61 | 0.093 56 | 0.283 36 | 0.277 53 | 0.249 43 | 0.426 44 | 0.615 46 | 0.299 36 | 0.504 54 | 0.367 47 | 0.832 51 | 0.191 56
SphereSeg | 0.357 53 | 0.651 51 | 0.411 55 | 0.345 47 | 0.264 48 | 0.630 44 | 0.059 40 | 0.289 64 | 0.212 38 | 0.240 46 | 0.336 46 | 0.158 56 | 0.305 54 | 0.557 48 | 0.159 60 | 0.455 61 | 0.341 54 | 0.726 61 | 0.294 38
3D-MPA | 0.355 54 | 0.457 66 | 0.484 50 | 0.299 53 | 0.277 45 | 0.591 52 | 0.047 46 | 0.332 55 | 0.212 39 | 0.217 51 | 0.278 52 | 0.193 52 | 0.413 46 | 0.410 61 | 0.195 52 | 0.574 43 | 0.352 50 | 0.849 43 | 0.213 54
Francis Engelmann, Martin Bokeloh, Alireza Fathi, Bastian Leibe, Matthias Nießner: 3D-MPA: Multi Proposal Aggregation for 3D Semantic Instance Segmentation. CVPR 2020
NeuralBF | 0.353 55 | 0.593 56 | 0.511 44 | 0.375 43 | 0.264 47 | 0.597 50 | 0.008 60 | 0.332 56 | 0.160 45 | 0.229 49 | 0.274 55 | 0.000 78 | 0.206 61 | 0.678 38 | 0.155 61 | 0.485 57 | 0.422 39 | 0.816 53 | 0.254 50
Weiwei Sun, Daniel Rebain, Renjie Liao, Vladimir Tankovich, Soroosh Yazdani, Kwang Moo Yi, Andrea Tagliasacchi: NeuralBF: Neural Bilateral Filtering for Top-down Instance Segmentation on Point Clouds. WACV 2023
RWSeg | 0.348 56 | 0.475 63 | 0.456 54 | 0.320 52 | 0.275 46 | 0.476 59 | 0.020 56 | 0.491 35 | 0.056 64 | 0.212 54 | 0.320 48 | 0.261 41 | 0.302 55 | 0.520 52 | 0.182 55 | 0.557 47 | 0.285 63 | 0.867 38 | 0.197 55
GICN | 0.341 57 | 0.580 58 | 0.371 58 | 0.344 49 | 0.198 57 | 0.469 60 | 0.052 42 | 0.564 20 | 0.093 55 | 0.212 53 | 0.212 61 | 0.127 59 | 0.347 52 | 0.537 49 | 0.206 50 | 0.525 52 | 0.329 56 | 0.729 60 | 0.241 52
One_Thing_One_Click (permissive) | 0.326 58 | 0.472 64 | 0.361 59 | 0.232 59 | 0.183 59 | 0.555 55 | 0.000 71 | 0.498 34 | 0.038 66 | 0.195 58 | 0.226 60 | 0.362 29 | 0.168 65 | 0.469 59 | 0.251 45 | 0.553 48 | 0.335 55 | 0.846 46 | 0.117 66
Zhengzhe Liu, Xiaojuan Qi, Chi-Wing Fu: One Thing One Click: A Self-Training Approach for Weakly Supervised 3D Semantic Segmentation. CVPR 2021
Occipital-SCS | 0.320 59 | 0.679 45 | 0.352 60 | 0.334 50 | 0.229 53 | 0.436 61 | 0.025 51 | 0.412 47 | 0.058 62 | 0.161 65 | 0.240 58 | 0.085 61 | 0.262 56 | 0.496 56 | 0.187 54 | 0.467 59 | 0.328 57 | 0.775 54 | 0.231 53
Sparse R-CNN | 0.292 60 | 0.704 39 | 0.213 70 | 0.153 64 | 0.154 61 | 0.551 56 | 0.053 41 | 0.212 68 | 0.132 49 | 0.174 62 | 0.274 54 | 0.070 63 | 0.363 51 | 0.441 60 | 0.176 56 | 0.424 64 | 0.234 65 | 0.758 56 | 0.161 62
MTML | 0.282 61 | 0.577 59 | 0.380 56 | 0.182 63 | 0.107 67 | 0.430 62 | 0.001 68 | 0.422 45 | 0.057 63 | 0.179 61 | 0.162 64 | 0.070 64 | 0.229 59 | 0.511 55 | 0.161 58 | 0.491 56 | 0.313 58 | 0.650 67 | 0.162 60
Jean Lahoud, Bernard Ghanem, Marc Pollefeys, Martin R. Oswald: 3D Instance Segmentation via Multi-task Metric Learning. ICCV 2019 [oral]
SALoss-ResNet | 0.262 62 | 0.667 46 | 0.335 61 | 0.067 71 | 0.123 65 | 0.427 63 | 0.022 52 | 0.280 66 | 0.058 61 | 0.216 52 | 0.211 62 | 0.039 67 | 0.142 67 | 0.519 53 | 0.106 68 | 0.338 68 | 0.310 60 | 0.721 62 | 0.138 63
Zhidong Liang, Ming Yang, Hao Li, Chunxiang Wang: 3D Instance Embedding Learning With a Structure-Aware Loss Function for Point Cloud Segmentation. IEEE Robotics and Automation Letters (IROS2020)
MASC (permissive) | 0.254 63 | 0.463 65 | 0.249 69 | 0.113 65 | 0.167 60 | 0.412 65 | 0.000 70 | 0.374 53 | 0.073 58 | 0.173 63 | 0.243 57 | 0.130 58 | 0.228 60 | 0.368 63 | 0.160 59 | 0.356 66 | 0.208 66 | 0.711 63 | 0.136 64
Chen Liu, Yasutaka Furukawa: MASC: Multi-scale Affinity with Sparse Convolution for 3D Instance Segmentation.
3D-BoNet | 0.253 64 | 0.519 61 | 0.324 64 | 0.251 57 | 0.137 64 | 0.345 70 | 0.031 49 | 0.419 46 | 0.069 59 | 0.162 64 | 0.131 66 | 0.052 65 | 0.202 63 | 0.338 65 | 0.147 63 | 0.301 71 | 0.303 62 | 0.651 66 | 0.178 59
Bo Yang, Jianan Wang, Ronald Clark, Qingyong Hu, Sen Wang, Andrew Markham, Niki Trigoni: Learning Object Bounding Boxes for 3D Instance Segmentation on Point Clouds. NeurIPS 2019 Spotlight
SPG_WSIS | 0.251 65 | 0.380 68 | 0.274 67 | 0.289 54 | 0.144 62 | 0.413 64 | 0.000 71 | 0.311 59 | 0.065 60 | 0.113 67 | 0.130 67 | 0.029 70 | 0.204 62 | 0.388 62 | 0.108 67 | 0.459 60 | 0.311 59 | 0.769 55 | 0.127 65
SegGroup_ins (permissive) | 0.246 66 | 0.556 60 | 0.335 62 | 0.062 73 | 0.115 66 | 0.490 58 | 0.000 71 | 0.297 63 | 0.018 70 | 0.186 59 | 0.142 65 | 0.083 62 | 0.233 58 | 0.216 69 | 0.153 62 | 0.469 58 | 0.251 64 | 0.744 58 | 0.083 69
An Tao, Yueqi Duan, Yi Wei, Jiwen Lu, Jie Zhou: SegGroup: Seg-Level Supervision for 3D Instance and Semantic Segmentation. TIP 2022
PanopticFusion-inst | 0.214 67 | 0.250 73 | 0.330 63 | 0.275 56 | 0.103 68 | 0.228 76 | 0.000 71 | 0.345 54 | 0.024 68 | 0.088 69 | 0.203 63 | 0.186 53 | 0.167 66 | 0.367 64 | 0.125 65 | 0.221 74 | 0.112 76 | 0.666 65 | 0.162 61
Gaku Narita, Takashi Seno, Tomoya Ishikawa, Yohsuke Kaji: PanopticFusion: Online Volumetric Semantic Mapping at the Level of Stuff and Things. IROS 2019 (to appear)
UNet-backbone | 0.161 68 | 0.519 61 | 0.259 68 | 0.084 67 | 0.059 70 | 0.325 72 | 0.002 66 | 0.093 73 | 0.009 72 | 0.077 71 | 0.064 70 | 0.045 66 | 0.044 74 | 0.161 71 | 0.045 70 | 0.331 69 | 0.180 68 | 0.566 68 | 0.033 78
3D-SIS (permissive) | 0.161 68 | 0.407 67 | 0.155 75 | 0.068 70 | 0.043 74 | 0.346 69 | 0.001 67 | 0.134 70 | 0.005 73 | 0.088 68 | 0.106 69 | 0.037 68 | 0.135 69 | 0.321 66 | 0.028 74 | 0.339 67 | 0.116 75 | 0.466 71 | 0.093 68
Ji Hou, Angela Dai, Matthias Niessner: 3D-SIS: 3D Semantic Instance Segmentation of RGB-D Scans. CVPR 2019
R-PointNet | 0.158 70 | 0.356 69 | 0.173 73 | 0.113 66 | 0.140 63 | 0.359 66 | 0.012 59 | 0.023 76 | 0.039 65 | 0.134 66 | 0.123 68 | 0.008 74 | 0.089 70 | 0.149 72 | 0.117 66 | 0.221 73 | 0.128 73 | 0.563 69 | 0.094 67
Region-18class | 0.146 71 | 0.175 77 | 0.321 65 | 0.080 68 | 0.062 69 | 0.357 67 | 0.000 71 | 0.307 60 | 0.002 75 | 0.066 72 | 0.044 72 | 0.000 78 | 0.018 76 | 0.036 77 | 0.054 69 | 0.447 63 | 0.133 71 | 0.472 70 | 0.060 73
SemRegionNet-20cls | 0.121 72 | 0.296 71 | 0.203 71 | 0.071 69 | 0.058 71 | 0.349 68 | 0.000 71 | 0.150 69 | 0.019 69 | 0.054 74 | 0.034 75 | 0.017 73 | 0.052 72 | 0.042 76 | 0.013 77 | 0.209 75 | 0.183 67 | 0.371 72 | 0.057 74
3D-BEVIS | 0.117 73 | 0.250 73 | 0.308 66 | 0.020 77 | 0.009 79 | 0.269 75 | 0.006 63 | 0.008 77 | 0.029 67 | 0.037 77 | 0.014 78 | 0.003 76 | 0.036 75 | 0.147 73 | 0.042 72 | 0.381 65 | 0.118 74 | 0.362 73 | 0.069 72
Cathrin Elich, Francis Engelmann, Jonas Schult, Theodora Kontogianni, Bastian Leibe: 3D-BEVIS: Birds-Eye-View Instance Segmentation.
Hier3D (copyleft) | 0.117 73 | 0.222 75 | 0.161 74 | 0.054 75 | 0.027 76 | 0.289 73 | 0.000 71 | 0.124 71 | 0.001 77 | 0.079 70 | 0.061 71 | 0.027 71 | 0.141 68 | 0.240 68 | 0.005 78 | 0.310 70 | 0.129 72 | 0.153 78 | 0.081 70
Tan: HCFS3D: Hierarchical Coupled Feature Selection Network for 3D Semantic and Instance Segmentation.
tmp | 0.113 75 | 0.333 70 | 0.151 76 | 0.056 74 | 0.053 72 | 0.344 71 | 0.000 71 | 0.105 72 | 0.016 71 | 0.049 75 | 0.035 74 | 0.020 72 | 0.053 71 | 0.048 75 | 0.013 76 | 0.183 77 | 0.173 69 | 0.344 75 | 0.054 75
Sem_Recon_ins | 0.098 76 | 0.295 72 | 0.187 72 | 0.015 78 | 0.036 75 | 0.213 77 | 0.005 64 | 0.038 75 | 0.003 74 | 0.056 73 | 0.037 73 | 0.036 69 | 0.015 77 | 0.051 74 | 0.044 71 | 0.209 76 | 0.098 77 | 0.354 74 | 0.071 71
ASIS | 0.085 77 | 0.037 78 | 0.080 78 | 0.066 72 | 0.047 73 | 0.282 74 | 0.000 71 | 0.052 74 | 0.002 76 | 0.047 76 | 0.026 76 | 0.001 77 | 0.046 73 | 0.194 70 | 0.031 73 | 0.264 72 | 0.140 70 | 0.167 77 | 0.047 77
Sgpn_scannet | 0.049 78 | 0.023 79 | 0.134 77 | 0.031 76 | 0.013 78 | 0.144 78 | 0.006 61 | 0.008 78 | 0.000 78 | 0.028 78 | 0.017 77 | 0.003 75 | 0.009 79 | 0.000 78 | 0.021 75 | 0.122 78 | 0.095 78 | 0.175 76 | 0.054 76
MaskRCNN 2d->3d Proj | 0.022 79 | 0.185 76 | 0.000 79 | 0.000 79 | 0.015 77 | 0.000 79 | 0.000 69 | 0.006 79 | 0.000 78 | 0.010 79 | 0.006 79 | 0.107 60 | 0.012 78 | 0.000 78 | 0.002 79 | 0.027 79 | 0.004 79 | 0.022 79 | 0.001 79
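
The "avg ap" column appears to be the unweighted mean of the 18 per-class AP values in the same row. The page does not state this explicitly, so treat it as an assumption; the short check below reproduces the value for the PointComp row using the per-class numbers copied from the table. The variable names are chosen for this example only.

```python
# Per-class AP values for the PointComp row, copied from the table
# (bathtub through window, in column order).
pointcomp_class_ap = [
    0.787, 0.679, 0.574, 0.502, 0.824, 0.378,
    0.480, 0.483, 0.480, 0.601, 0.744, 0.682,
    0.809, 0.460, 0.819, 0.643, 0.935, 0.449,
]

# Assumption: "avg ap" is the plain (unweighted) mean over the classes.
avg_ap = sum(pointcomp_class_ap) / len(pointcomp_class_ap)
print(round(avg_ap, 3))  # 0.629, matching the "avg ap" entry for PointComp
```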