3D Semantic Instance Benchmark
The 3D semantic instance prediction task involves detecting and segmenting the objects in a 3D scan mesh.
Evaluation and metrics
Our evaluation ranks all methods according to the average precision for each class. We report the mean average precision (AP) at an overlap of 0.25 (AP 25%), at an overlap of 0.5 (AP 50%), and averaged over overlaps in the range [0.5:0.95:0.05] (AP). Note that multiple predictions of the same ground truth instance are penalized as false positives.
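The metric described above can be illustrated with a minimal sketch. This is not the benchmark's official evaluation script: the representation of instances as sets of vertex indices, the function names `iou`, `average_precision`, and `mean_ap`, the strict `>` comparison against the threshold, and the inclusive reading of the range [0.5:0.95:0.05] as ten thresholds are all assumptions made for illustration. It shows how a duplicate prediction of an already matched ground truth instance counts as a false positive.

```python
from typing import List, Set, Tuple

Prediction = Tuple[float, Set[int]]  # (confidence, predicted vertex indices)


def iou(a: Set[int], b: Set[int]) -> float:
    """Overlap of two instance masks, measured as intersection over union."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def average_precision(preds: List[Prediction], gts: List[Set[int]], thresh: float) -> float:
    """AP for one class at one overlap threshold (illustrative, not the official code)."""
    if not gts:
        return 0.0
    matched = [False] * len(gts)
    tp_flags = []
    # Visit predictions from most to least confident.
    for _, mask in sorted(preds, key=lambda p: p[0], reverse=True):
        best_iou, best_j = 0.0, -1
        for j, gt in enumerate(gts):
            overlap = iou(mask, gt)
            if overlap > best_iou:
                best_iou, best_j = overlap, j
        # Assumption: a match requires overlap strictly above the threshold.
        if best_j >= 0 and best_iou > thresh and not matched[best_j]:
            matched[best_j] = True
            tp_flags.append(True)
        else:
            # Too little overlap, or a duplicate of an already matched instance.
            tp_flags.append(False)

    precisions, recalls, tp = [], [], 0
    for k, flag in enumerate(tp_flags, start=1):
        tp += flag
        precisions.append(tp / k)
        recalls.append(tp / len(gts))
    # Make precision non-increasing in recall, then integrate step-wise.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += p * (r - prev_recall)
        prev_recall = r
    return ap


def mean_ap(preds: List[Prediction], gts: List[Set[int]], thresholds: List[float]) -> float:
    """Average the per-threshold AP values."""
    return sum(average_precision(preds, gts, t) for t in thresholds) / len(thresholds)


# Assumed inclusive reading of [0.5:0.95:0.05]: 0.50, 0.55, ..., 0.95.
thresholds = [round(0.5 + 0.05 * i, 2) for i in range(10)]

# Toy data: two ground truth instances and three predictions,
# the second of which duplicates the first ground truth instance.
gts = [{0, 1, 2, 3}, {10, 11, 12, 13}]
preds = [(0.9, {0, 1, 2, 3}), (0.8, {0, 1, 2}), (0.7, {10, 11, 12, 13, 14})]

print("AP 25%:", average_precision(preds, gts, 0.25))
print("AP 50%:", average_precision(preds, gts, 0.5))
print("AP:", mean_ap(preds, gts, thresholds))
```

In the toy data the second prediction overlaps the first ground truth instance well, but that instance is already claimed by a more confident prediction, so it lowers precision instead of adding a match.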
This table lists the benchmark results for the 3D semantic instance scenario. Each cell shows a method's score followed by its rank in that column.
| Method | avg AP 50% | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | window | Info |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| PointComp | 0.811 4 | 0.850 60 | 0.969 10 | 0.864 14 | 0.739 3 | 0.946 1 | 0.539 15 | 0.671 34 | 0.835 1 | 0.700 19 | 0.742 3 | 0.817 1 | 0.766 13 | 1.000 1 | 0.755 22 | 0.909 1 | 0.808 7 | 1.000 1 | 0.687 2 | |
| UniPerception | 0.800 9 | 1.000 1 | 0.930 14 | 0.872 11 | 0.727 5 | 0.862 27 | 0.454 22 | 0.764 13 | 0.820 2 | 0.746 8 | 0.706 13 | 0.750 8 | 0.772 10 | 0.926 49 | 0.764 20 | 0.818 29 | 0.826 2 | 0.997 42 | 0.660 4 | |
| Competitor-MAFT | 0.816 1 | 1.000 1 | 0.983 4 | 0.872 11 | 0.718 6 | 0.941 2 | 0.588 5 | 0.652 41 | 0.819 3 | 0.776 3 | 0.720 6 | 0.780 6 | 0.769 12 | 1.000 1 | 0.797 11 | 0.813 31 | 0.798 9 | 1.000 1 | 0.659 5 | |
| TD3D | 0.751 24 | 1.000 1 | 0.774 48 | 0.867 13 | 0.621 31 | 0.934 4 | 0.404 24 | 0.706 24 | 0.812 4 | 0.605 26 | 0.633 27 | 0.626 30 | 0.690 23 | 1.000 1 | 0.640 37 | 0.820 26 | 0.777 18 | 1.000 1 | 0.612 17 | |
| Maksim Kolodiazhnyi, Anna Vorontsova, Anton Konushin, Danila Rukhovich: Top-Down Beats Bottom-Up in 3D Instance Segmentation. WACV 2024 | ||||||||||||||||||||
| Mask3D | 0.780 19 | 1.000 1 | 0.786 47 | 0.716 42 | 0.696 17 | 0.885 21 | 0.500 17 | 0.714 22 | 0.810 5 | 0.672 21 | 0.715 9 | 0.679 24 | 0.809 1 | 1.000 1 | 0.831 4 | 0.833 20 | 0.787 13 | 1.000 1 | 0.602 22 | |
| Jonas Schult, Francis Engelmann, Alexander Hermans, Or Litany, Siyu Tang, Bastian Leibe: Mask3D for 3D Semantic Instance Segmentation. ICRA 2023 | ||||||||||||||||||||
| ExtMask3D | 0.789 15 | 1.000 1 | 0.988 2 | 0.756 37 | 0.706 13 | 0.912 14 | 0.429 23 | 0.647 43 | 0.806 6 | 0.755 7 | 0.673 18 | 0.689 23 | 0.772 11 | 1.000 1 | 0.789 16 | 0.852 13 | 0.811 6 | 1.000 1 | 0.617 15 | |
| MG-Former | 0.791 14 | 1.000 1 | 0.980 6 | 0.837 21 | 0.626 29 | 0.897 15 | 0.543 14 | 0.759 15 | 0.800 7 | 0.766 6 | 0.659 20 | 0.769 7 | 0.697 20 | 1.000 1 | 0.791 15 | 0.707 52 | 0.791 12 | 1.000 1 | 0.610 18 | |
| VDG-Uni3DSeg | 0.804 6 | 1.000 1 | 0.990 1 | 0.886 9 | 0.688 21 | 0.912 13 | 0.602 2 | 0.703 26 | 0.786 8 | 0.771 4 | 0.708 12 | 0.700 18 | 0.669 27 | 0.981 41 | 0.789 17 | 0.903 2 | 0.772 20 | 1.000 1 | 0.609 20 | |
| InsSSM | 0.799 11 | 1.000 1 | 0.915 16 | 0.710 44 | 0.729 4 | 0.925 7 | 0.664 1 | 0.670 35 | 0.770 9 | 0.766 5 | 0.739 4 | 0.737 9 | 0.700 17 | 1.000 1 | 0.792 14 | 0.829 23 | 0.815 5 | 0.997 42 | 0.625 12 | |
| Lei Yao, Yi Wang, Moyun Liu, Lap-Pui Chau: SGIFormer: Semantic-guided and Geometric-enhanced Interleaving Transformer for 3D Instance Segmentation. TCSVT, 2024 | ||||||||||||||||||||
| TST3D | 0.795 13 | 1.000 1 | 0.929 15 | 0.918 4 | 0.709 11 | 0.884 22 | 0.596 4 | 0.704 25 | 0.769 10 | 0.734 10 | 0.644 24 | 0.699 20 | 0.751 14 | 1.000 1 | 0.794 12 | 0.876 7 | 0.757 26 | 0.997 42 | 0.550 36 | |
| Duc Tran Dang Trung, Byeongkeun Kang, Yeejin Lee: MSTA3D: Multi-scale Twin-attention for 3D Instance Segmentation. ACM Multimedia 2024 | ||||||||||||||||||||
| Queryformer | 0.787 16 | 1.000 1 | 0.933 13 | 0.601 54 | 0.754 1 | 0.886 20 | 0.558 12 | 0.661 39 | 0.767 11 | 0.665 22 | 0.716 8 | 0.639 29 | 0.808 5 | 1.000 1 | 0.844 3 | 0.897 6 | 0.804 8 | 1.000 1 | 0.624 13 | |
| Competitor-SPFormer | 0.800 9 | 1.000 1 | 0.986 3 | 0.845 18 | 0.705 14 | 0.915 12 | 0.532 16 | 0.733 20 | 0.757 12 | 0.733 12 | 0.708 11 | 0.698 21 | 0.648 39 | 0.981 41 | 0.890 1 | 0.830 21 | 0.796 10 | 0.997 42 | 0.644 6 | |
| EV3D | 0.811 4 | 1.000 1 | 0.968 11 | 0.852 16 | 0.717 8 | 0.921 10 | 0.574 8 | 0.677 31 | 0.748 13 | 0.730 14 | 0.703 15 | 0.795 3 | 0.809 1 | 1.000 1 | 0.831 4 | 0.854 11 | 0.778 17 | 1.000 1 | 0.638 9 | |
| Spherical Mask(CtoF) | 0.812 3 | 1.000 1 | 0.973 8 | 0.852 16 | 0.718 7 | 0.917 11 | 0.574 7 | 0.677 31 | 0.748 13 | 0.729 15 | 0.715 9 | 0.795 3 | 0.809 1 | 1.000 1 | 0.831 4 | 0.854 11 | 0.787 13 | 1.000 1 | 0.638 8 | |
| MAFT | 0.786 17 | 1.000 1 | 0.894 21 | 0.807 25 | 0.694 18 | 0.893 18 | 0.486 18 | 0.674 33 | 0.740 15 | 0.786 1 | 0.704 14 | 0.727 12 | 0.739 15 | 1.000 1 | 0.707 28 | 0.849 15 | 0.756 27 | 1.000 1 | 0.685 3 | |
| SphereSeg | 0.680 35 | 1.000 1 | 0.856 24 | 0.744 38 | 0.618 34 | 0.893 17 | 0.151 42 | 0.651 42 | 0.713 16 | 0.537 34 | 0.579 39 | 0.430 48 | 0.651 28 | 1.000 1 | 0.389 62 | 0.744 46 | 0.697 31 | 0.991 53 | 0.601 23 | |
| PBNet | 0.747 25 | 1.000 1 | 0.818 32 | 0.837 22 | 0.713 10 | 0.844 29 | 0.457 21 | 0.647 43 | 0.711 17 | 0.614 24 | 0.617 31 | 0.657 27 | 0.650 29 | 1.000 1 | 0.692 29 | 0.822 25 | 0.765 24 | 1.000 1 | 0.595 25 | |
| Weiguang Zhao, Yuyao Yan, Chaolong Yang, Jianan Ye, Xi Yang, Kaizhu Huang: Divide and Conquer: 3D Instance Segmentation With Point-Wise Binarization. ICCV 2023 | ||||||||||||||||||||
| KmaxOneFormerNet | 0.783 18 | 0.903 58 | 0.981 5 | 0.794 29 | 0.706 12 | 0.931 5 | 0.561 11 | 0.701 27 | 0.706 18 | 0.727 16 | 0.697 16 | 0.731 11 | 0.689 24 | 1.000 1 | 0.856 2 | 0.750 43 | 0.761 25 | 1.000 1 | 0.599 24 | |
| SPFormer | 0.770 20 | 0.903 58 | 0.903 18 | 0.806 26 | 0.609 36 | 0.886 19 | 0.568 10 | 0.815 6 | 0.705 19 | 0.711 17 | 0.655 21 | 0.652 28 | 0.685 25 | 1.000 1 | 0.789 18 | 0.809 32 | 0.776 19 | 1.000 1 | 0.583 28 | |
| Sun Jiahao, Qing Chunmei, Tan Junpeng, Xu Xiangmin: Superpoint Transformer for 3D Scene Instance Segmentation. AAAI 2023 [Oral] | ||||||||||||||||||||
| PointRel | 0.816 1 | 1.000 1 | 0.971 9 | 0.908 6 | 0.743 2 | 0.923 9 | 0.573 9 | 0.714 22 | 0.695 20 | 0.734 11 | 0.747 2 | 0.725 13 | 0.809 1 | 1.000 1 | 0.814 9 | 0.899 5 | 0.820 4 | 1.000 1 | 0.610 19 | |
| : Relation3D: Enhancing Relation Modeling for Point Cloud Instance Segmentation. CVPR 2025 | ||||||||||||||||||||
| OneFormer3D | 0.801 8 | 1.000 1 | 0.973 7 | 0.909 5 | 0.698 16 | 0.928 6 | 0.582 6 | 0.668 37 | 0.685 21 | 0.780 2 | 0.687 17 | 0.698 22 | 0.702 16 | 1.000 1 | 0.794 13 | 0.900 4 | 0.784 15 | 0.986 55 | 0.635 10 | |
| Maxim Kolodiazhnyi, Anna Vorontsova, Anton Konushin, Danila Rukhovich: OneFormer3D: One Transformer for Unified Point Cloud Segmentation. | ||||||||||||||||||||
| ODIN - Ins | 0.693 34 | 1.000 1 | 0.880 22 | 0.647 49 | 0.620 32 | 0.779 47 | 0.336 28 | 0.501 62 | 0.681 22 | 0.577 28 | 0.595 34 | 0.679 25 | 0.683 26 | 1.000 1 | 0.709 27 | 0.816 30 | 0.637 41 | 0.770 71 | 0.557 34 | |
| Ayush Jain, Pushkal Katara, Nikolaos Gkanatsios, Adam W. Harley, Gabriel Sarch, Kriti Aggarwal, Vishrav Chaudhary, Katerina Fragkiadaki: ODIN: A Single Model for 2D and 3D Segmentation. CVPR 2024 | ||||||||||||||||||||
| OSIS | 0.605 51 | 1.000 1 | 0.801 41 | 0.599 55 | 0.535 46 | 0.728 55 | 0.286 30 | 0.436 66 | 0.679 23 | 0.491 39 | 0.433 54 | 0.256 59 | 0.404 66 | 0.857 51 | 0.620 40 | 0.724 49 | 0.510 64 | 1.000 1 | 0.539 37 | |
| ISBNet | 0.757 23 | 1.000 1 | 0.904 17 | 0.731 40 | 0.678 23 | 0.895 16 | 0.458 20 | 0.644 45 | 0.670 24 | 0.710 18 | 0.620 29 | 0.732 10 | 0.650 29 | 1.000 1 | 0.756 21 | 0.778 35 | 0.779 16 | 1.000 1 | 0.614 16 | |
| Tuan Duc Ngo, Binh-Son Hua, Khoi Nguyen: ISBNet: a 3D Point Cloud Instance Segmentation Network with Instance-aware Sampling and Box-aware Dynamic Convolution. CVPR 2023 | ||||||||||||||||||||
| SIM3D | 0.803 7 | 1.000 1 | 0.967 12 | 0.863 15 | 0.692 20 | 0.924 8 | 0.552 13 | 0.732 21 | 0.667 25 | 0.732 13 | 0.662 19 | 0.796 2 | 0.789 9 | 1.000 1 | 0.803 10 | 0.864 8 | 0.766 23 | 1.000 1 | 0.643 7 | |
| SoftGroup++ | 0.769 21 | 1.000 1 | 0.803 40 | 0.937 1 | 0.684 22 | 0.865 24 | 0.213 39 | 0.870 2 | 0.664 26 | 0.571 29 | 0.758 1 | 0.702 17 | 0.807 6 | 1.000 1 | 0.653 35 | 0.902 3 | 0.792 11 | 1.000 1 | 0.626 11 | |
| DCD | 0.798 12 | 1.000 1 | 0.878 23 | 0.792 30 | 0.693 19 | 0.936 3 | 0.596 3 | 0.685 30 | 0.663 27 | 0.736 9 | 0.717 7 | 0.788 5 | 0.693 22 | 1.000 1 | 0.825 7 | 0.840 17 | 0.837 1 | 1.000 1 | 0.689 1 | |
| SSEN | 0.575 54 | 1.000 1 | 0.761 52 | 0.473 62 | 0.477 57 | 0.795 43 | 0.066 56 | 0.529 57 | 0.658 28 | 0.460 44 | 0.461 52 | 0.380 54 | 0.331 68 | 0.859 50 | 0.401 61 | 0.692 59 | 0.653 38 | 1.000 1 | 0.348 60 | |
| Dongsu Zhang, Junha Chun, Sang Kyun Cha, Young Min Kim: Spatial Semantic Embedding Network: Fast 3D Instance Segmentation with Deep Metric Learning. Arxiv | ||||||||||||||||||||
| SoftGroup | 0.761 22 | 1.000 1 | 0.808 36 | 0.845 18 | 0.716 9 | 0.862 26 | 0.243 36 | 0.824 4 | 0.655 29 | 0.620 23 | 0.734 5 | 0.699 19 | 0.791 8 | 0.981 41 | 0.716 25 | 0.844 16 | 0.769 21 | 1.000 1 | 0.594 26 | |
| Thang Vu, Kookhoi Kim, Tung M. Luu, Xuan Thanh Nguyen, Chang D. Yoo: SoftGroup for 3D Instance Segmentation on Point Clouds. CVPR 2022 [Oral] | ||||||||||||||||||||
| IPCA-Inst | 0.731 27 | 1.000 1 | 0.788 46 | 0.884 10 | 0.698 15 | 0.788 45 | 0.252 34 | 0.760 14 | 0.646 30 | 0.511 37 | 0.637 26 | 0.665 26 | 0.804 7 | 1.000 1 | 0.644 36 | 0.778 36 | 0.747 29 | 1.000 1 | 0.561 32 | |
| TopoSeg | 0.725 28 | 1.000 1 | 0.806 39 | 0.933 2 | 0.668 25 | 0.758 50 | 0.272 33 | 0.734 19 | 0.630 31 | 0.549 33 | 0.654 22 | 0.606 31 | 0.697 21 | 0.966 46 | 0.612 41 | 0.839 18 | 0.754 28 | 1.000 1 | 0.573 29 | |
| GraphCut | 0.732 26 | 1.000 1 | 0.788 45 | 0.724 41 | 0.642 28 | 0.859 28 | 0.248 35 | 0.787 11 | 0.618 32 | 0.596 27 | 0.653 23 | 0.722 15 | 0.583 51 | 1.000 1 | 0.766 19 | 0.861 9 | 0.825 3 | 1.000 1 | 0.504 42 | |
| Mask-Group | 0.664 39 | 1.000 1 | 0.822 31 | 0.764 36 | 0.616 35 | 0.815 36 | 0.139 46 | 0.694 29 | 0.597 33 | 0.459 45 | 0.566 41 | 0.599 32 | 0.600 45 | 0.516 69 | 0.715 26 | 0.819 28 | 0.635 42 | 1.000 1 | 0.603 21 | |
| Min Zhong, Xinghao Chen, Xiaokang Chen, Gang Zeng, Yunhe Wang: MaskGroup: Hierarchical Point Grouping and Masking for 3D Instance Segmentation. ICME 2022 | ||||||||||||||||||||
| DualGroup | 0.694 33 | 1.000 1 | 0.799 42 | 0.811 24 | 0.622 30 | 0.817 34 | 0.376 26 | 0.805 9 | 0.590 34 | 0.487 41 | 0.568 40 | 0.525 38 | 0.650 29 | 0.835 59 | 0.600 42 | 0.829 22 | 0.655 37 | 1.000 1 | 0.526 38 | |
| SSEC | 0.707 30 | 1.000 1 | 0.850 25 | 0.924 3 | 0.648 26 | 0.747 53 | 0.162 41 | 0.862 3 | 0.572 35 | 0.520 35 | 0.624 28 | 0.549 34 | 0.649 38 | 1.000 1 | 0.560 46 | 0.706 53 | 0.768 22 | 1.000 1 | 0.591 27 | |
| DKNet | 0.718 29 | 1.000 1 | 0.814 33 | 0.782 31 | 0.619 33 | 0.872 23 | 0.224 37 | 0.751 17 | 0.569 36 | 0.677 20 | 0.585 36 | 0.724 14 | 0.633 41 | 0.981 41 | 0.515 51 | 0.819 27 | 0.736 30 | 1.000 1 | 0.617 14 | |
| Yizheng Wu, Min Shi, Shuaiyuan Du, Hao Lu, Zhiguo Cao, Weicai Zhong: 3D Instances as 1D Kernels. ECCV 2022 | ||||||||||||||||||||
| DANCENET | 0.680 35 | 1.000 1 | 0.807 37 | 0.733 39 | 0.600 37 | 0.768 49 | 0.375 27 | 0.543 55 | 0.538 37 | 0.610 25 | 0.599 32 | 0.498 39 | 0.632 43 | 0.981 41 | 0.739 24 | 0.856 10 | 0.633 44 | 0.882 66 | 0.454 51 | |
| OccuSeg+instance | 0.672 38 | 1.000 1 | 0.758 56 | 0.682 46 | 0.576 41 | 0.842 30 | 0.477 19 | 0.504 61 | 0.524 38 | 0.567 30 | 0.585 38 | 0.451 43 | 0.557 53 | 1.000 1 | 0.751 23 | 0.797 33 | 0.563 54 | 1.000 1 | 0.467 50 | |
| Lei Han, Tian Zheng, Lan Xu, Lu Fang: OccuSeg: Occupancy-aware 3D Instance Segmentation. CVPR 2020 | ||||||||||||||||||||
| Box2Mask | 0.677 37 | 1.000 1 | 0.847 27 | 0.771 33 | 0.509 52 | 0.816 35 | 0.277 32 | 0.558 54 | 0.482 39 | 0.562 31 | 0.640 25 | 0.448 44 | 0.700 17 | 1.000 1 | 0.666 30 | 0.852 14 | 0.578 51 | 0.997 42 | 0.488 46 | |
| Julian Chibane, Francis Engelmann, Tuan Anh Tran, Gerard Pons-Moll: Box2Mask: Weakly Supervised 3D Semantic Instance Segmentation Using Bounding Boxes. ECCV 2022 | ||||||||||||||||||||
| INS-Conv-instance | 0.657 40 | 1.000 1 | 0.760 54 | 0.667 48 | 0.581 39 | 0.863 25 | 0.323 29 | 0.655 40 | 0.477 40 | 0.473 43 | 0.549 43 | 0.432 47 | 0.650 29 | 1.000 1 | 0.655 33 | 0.738 47 | 0.585 50 | 0.944 58 | 0.472 49 | |
| 3D-MPA | 0.611 50 | 1.000 1 | 0.833 29 | 0.765 35 | 0.526 49 | 0.756 51 | 0.136 48 | 0.588 52 | 0.470 41 | 0.438 50 | 0.432 56 | 0.358 57 | 0.650 29 | 0.857 51 | 0.429 58 | 0.765 39 | 0.557 57 | 1.000 1 | 0.430 53 | |
| Francis Engelmann, Martin Bokeloh, Alireza Fathi, Bastian Leibe, Matthias Nießner: 3D-MPA: Multi Proposal Aggregation for 3D Semantic Instance Segmentation. CVPR 2020 | ||||||||||||||||||||
| HAIS | 0.699 31 | 1.000 1 | 0.849 26 | 0.820 23 | 0.675 24 | 0.808 39 | 0.279 31 | 0.757 16 | 0.465 42 | 0.517 36 | 0.596 33 | 0.559 33 | 0.600 45 | 1.000 1 | 0.654 34 | 0.767 38 | 0.676 34 | 0.994 51 | 0.560 33 | |
| Shaoyu Chen, Jiemin Fang, Qian Zhang, Wenyu Liu, Xinggang Wang: Hierarchical Aggregation for 3D Instance Segmentation. ICCV 2021 | ||||||||||||||||||||
| Mask3D_evaluation | 0.631 48 | 1.000 1 | 0.829 30 | 0.606 53 | 0.646 27 | 0.836 31 | 0.068 55 | 0.511 59 | 0.462 43 | 0.507 38 | 0.619 30 | 0.389 53 | 0.610 44 | 1.000 1 | 0.432 57 | 0.828 24 | 0.673 35 | 0.788 70 | 0.552 35 | |
| Dyco3D | 0.641 44 | 1.000 1 | 0.841 28 | 0.893 7 | 0.531 47 | 0.802 41 | 0.115 51 | 0.588 52 | 0.448 44 | 0.438 49 | 0.537 46 | 0.430 49 | 0.550 54 | 0.857 51 | 0.534 49 | 0.764 40 | 0.657 36 | 0.987 54 | 0.568 30 | |
| Tong He, Chunhua Shen, Anton van den Hengel: DyCo3D: Robust Instance Segmentation of 3D Point Clouds through Dynamic Convolution. CVPR 2021 | ||||||||||||||||||||
| Sparse R-CNN | 0.515 60 | 1.000 1 | 0.538 72 | 0.282 66 | 0.468 58 | 0.790 44 | 0.173 40 | 0.345 68 | 0.429 45 | 0.413 57 | 0.484 49 | 0.176 63 | 0.595 50 | 0.591 67 | 0.522 50 | 0.668 61 | 0.476 65 | 0.986 56 | 0.327 62 | |
| DD-UNet+Group | 0.635 47 | 0.667 62 | 0.797 44 | 0.714 43 | 0.562 42 | 0.774 48 | 0.146 43 | 0.810 8 | 0.429 46 | 0.476 42 | 0.546 45 | 0.399 51 | 0.633 41 | 1.000 1 | 0.632 38 | 0.722 50 | 0.609 45 | 1.000 1 | 0.514 39 | |
| H. Liu, R. Liu, K. Yang, J. Zhang, K. Peng, R. Stiefelhagen: HIDA: Towards Holistic Indoor Understanding for the Visually Impaired via Semantic Instance Segmentation with a Wearable Solid-State LiDAR Sensor. ICCVW 2021 | ||||||||||||||||||||
| SSTNet | 0.698 32 | 1.000 1 | 0.697 64 | 0.888 8 | 0.556 43 | 0.803 40 | 0.387 25 | 0.626 47 | 0.417 47 | 0.556 32 | 0.585 37 | 0.702 16 | 0.600 45 | 1.000 1 | 0.824 8 | 0.720 51 | 0.692 32 | 1.000 1 | 0.509 41 | |
| Zhihao Liang, Zhihao Li, Songcen Xu, Mingkui Tan, Kui Jia: Instance Segmentation in 3D Scenes using Semantic Superpoint Tree Networks. ICCV 2021 | ||||||||||||||||||||
| NeuralBF | 0.555 56 | 0.667 62 | 0.896 19 | 0.843 20 | 0.517 51 | 0.751 52 | 0.029 61 | 0.519 58 | 0.414 48 | 0.439 48 | 0.465 50 | 0.000 78 | 0.484 57 | 0.857 51 | 0.287 66 | 0.693 58 | 0.651 40 | 1.000 1 | 0.485 47 | |
| Weiwei Sun, Daniel Rebain, Renjie Liao, Vladimir Tankovich, Soroosh Yazdani, Kwang Moo Yi, Andrea Tagliasacchi: NeuralBF: Neural Bilateral Filtering for Top-down Instance Segmentation on Point Clouds. WACV 2023 | ||||||||||||||||||||
| AOIA | 0.601 52 | 1.000 1 | 0.761 53 | 0.687 45 | 0.485 55 | 0.828 32 | 0.008 66 | 0.663 38 | 0.405 49 | 0.405 58 | 0.425 57 | 0.490 40 | 0.596 48 | 0.714 62 | 0.553 48 | 0.779 34 | 0.597 47 | 0.992 52 | 0.424 55 | |
| CSC-Pretrained | 0.648 41 | 1.000 1 | 0.810 34 | 0.768 34 | 0.523 50 | 0.813 37 | 0.143 45 | 0.819 5 | 0.389 50 | 0.422 54 | 0.511 47 | 0.443 45 | 0.650 29 | 1.000 1 | 0.624 39 | 0.732 48 | 0.634 43 | 1.000 1 | 0.375 58 | |
| PointGroup | 0.636 46 | 1.000 1 | 0.765 51 | 0.624 51 | 0.505 54 | 0.797 42 | 0.116 50 | 0.696 28 | 0.384 51 | 0.441 47 | 0.559 42 | 0.476 41 | 0.596 48 | 1.000 1 | 0.666 30 | 0.756 42 | 0.556 58 | 0.997 42 | 0.513 40 | |
| Li Jiang, Hengshuang Zhao, Shaoshuai Shi, Shu Liu, Chi-Wing Fu, Jiaya Jia: PointGroup: Dual-Set Point Grouping for 3D Instance Segmentation. CVPR 2020 [oral] | ||||||||||||||||||||
| RPGN | 0.643 43 | 1.000 1 | 0.758 55 | 0.582 60 | 0.539 44 | 0.826 33 | 0.046 59 | 0.765 12 | 0.372 52 | 0.436 51 | 0.588 35 | 0.539 37 | 0.650 29 | 1.000 1 | 0.577 43 | 0.750 44 | 0.653 39 | 0.997 42 | 0.495 45 | |
| Shichao Dong, Guosheng Lin, Tzu-Yi Hung: Learning Regional Purity for Instance Segmentation on 3D Point Clouds. ECCV 2022 | ||||||||||||||||||||
| GICN | 0.638 45 | 1.000 1 | 0.895 20 | 0.800 27 | 0.480 56 | 0.676 58 | 0.144 44 | 0.737 18 | 0.354 53 | 0.447 46 | 0.400 60 | 0.365 55 | 0.700 17 | 1.000 1 | 0.569 44 | 0.836 19 | 0.599 46 | 1.000 1 | 0.473 48 | |
| PE | 0.645 42 | 1.000 1 | 0.773 50 | 0.798 28 | 0.538 45 | 0.786 46 | 0.088 54 | 0.799 10 | 0.350 54 | 0.435 52 | 0.547 44 | 0.545 35 | 0.646 40 | 0.933 48 | 0.562 45 | 0.761 41 | 0.556 59 | 0.997 42 | 0.501 44 | |
| Biao Zhang, Peter Wonka: Point Cloud Instance Segmentation using Probabilistic Embeddings. CVPR 2021 | ||||||||||||||||||||
| PCJC | 0.578 53 | 1.000 1 | 0.810 35 | 0.583 59 | 0.449 59 | 0.813 38 | 0.042 60 | 0.603 50 | 0.341 55 | 0.490 40 | 0.465 51 | 0.410 50 | 0.650 29 | 0.835 59 | 0.264 68 | 0.694 57 | 0.561 55 | 0.889 63 | 0.504 43 | |
| DENet | 0.629 49 | 1.000 1 | 0.797 43 | 0.608 52 | 0.589 38 | 0.627 62 | 0.219 38 | 0.882 1 | 0.310 56 | 0.402 59 | 0.383 62 | 0.396 52 | 0.650 29 | 1.000 1 | 0.663 32 | 0.543 70 | 0.691 33 | 1.000 1 | 0.568 31 | |
| 3D-BoNet | 0.488 62 | 1.000 1 | 0.672 66 | 0.590 57 | 0.301 66 | 0.484 73 | 0.098 52 | 0.620 48 | 0.306 57 | 0.341 65 | 0.259 68 | 0.125 66 | 0.434 63 | 0.796 61 | 0.402 60 | 0.499 72 | 0.513 63 | 0.909 62 | 0.439 52 | |
| Bo Yang, Jianan Wang, Ronald Clark, Qingyong Hu, Sen Wang, Andrew Markham, Niki Trigoni: Learning Object Bounding Boxes for 3D Instance Segmentation on Point Clouds. NeurIPS 2019 Spotlight | ||||||||||||||||||||
| MASC | 0.447 66 | 0.528 72 | 0.555 70 | 0.381 63 | 0.382 61 | 0.633 61 | 0.002 69 | 0.509 60 | 0.260 58 | 0.361 63 | 0.432 55 | 0.327 58 | 0.451 59 | 0.571 68 | 0.367 64 | 0.639 64 | 0.386 66 | 0.980 57 | 0.276 65 | |
| Chen Liu, Yasutaka Furukawa: MASC: Multi-scale Affinity with Sparse Convolution for 3D Instance Segmentation. | ||||||||||||||||||||
| SPG_WSIS | 0.470 64 | 0.667 62 | 0.685 65 | 0.677 47 | 0.372 62 | 0.562 67 | 0.000 71 | 0.482 63 | 0.244 59 | 0.316 67 | 0.298 65 | 0.052 73 | 0.442 62 | 0.857 51 | 0.267 67 | 0.702 54 | 0.559 56 | 1.000 1 | 0.287 64 | |
| RWSeg | 0.567 55 | 0.528 72 | 0.708 63 | 0.626 50 | 0.580 40 | 0.745 54 | 0.063 57 | 0.627 46 | 0.240 60 | 0.400 60 | 0.497 48 | 0.464 42 | 0.515 55 | 1.000 1 | 0.475 53 | 0.745 45 | 0.571 52 | 1.000 1 | 0.429 54 | |
| ClickSeg_Instance | 0.539 58 | 1.000 1 | 0.621 67 | 0.300 65 | 0.530 48 | 0.698 56 | 0.127 49 | 0.533 56 | 0.222 61 | 0.430 53 | 0.400 59 | 0.365 55 | 0.574 52 | 0.938 47 | 0.472 54 | 0.659 62 | 0.543 60 | 0.944 58 | 0.347 61 | |
| SALoss-ResNet | 0.459 65 | 1.000 1 | 0.737 58 | 0.159 76 | 0.259 68 | 0.587 65 | 0.138 47 | 0.475 64 | 0.217 62 | 0.416 56 | 0.408 58 | 0.128 65 | 0.315 69 | 0.714 62 | 0.411 59 | 0.536 71 | 0.590 49 | 0.873 67 | 0.304 63 | |
| Zhidong Liang, Ming Yang, Hao Li, Chunxiang Wang: 3D Instance Embedding Learning With a Structure-Aware Loss Function for Point Cloud Segmentation. IEEE Robotics and Automation Letters (IROS2020) | ||||||||||||||||||||
| MTML | 0.549 57 | 1.000 1 | 0.807 38 | 0.588 58 | 0.327 64 | 0.647 60 | 0.004 68 | 0.815 7 | 0.180 63 | 0.418 55 | 0.364 64 | 0.182 62 | 0.445 60 | 1.000 1 | 0.442 56 | 0.688 60 | 0.571 53 | 1.000 1 | 0.396 56 | |
| Jean Lahoud, Bernard Ghanem, Marc Pollefeys, Martin R. Oswald: 3D Instance Segmentation via Multi-task Metric Learning. ICCV 2019 [oral] | ||||||||||||||||||||
| Occipital-SCS | 0.512 61 | 1.000 1 | 0.716 60 | 0.509 61 | 0.506 53 | 0.611 63 | 0.092 53 | 0.602 51 | 0.177 64 | 0.346 64 | 0.383 61 | 0.165 64 | 0.442 61 | 0.850 58 | 0.386 63 | 0.618 66 | 0.543 61 | 0.889 63 | 0.389 57 | |
| PanopticFusion-inst | 0.478 63 | 0.667 62 | 0.712 62 | 0.595 56 | 0.259 69 | 0.550 69 | 0.000 71 | 0.613 49 | 0.175 65 | 0.250 70 | 0.434 53 | 0.437 46 | 0.411 65 | 0.857 51 | 0.485 52 | 0.591 69 | 0.267 75 | 0.944 58 | 0.359 59 | |
| Gaku Narita, Takashi Seno, Tomoya Ishikawa, Yohsuke Kaji: PanopticFusion: Online Volumetric Semantic Mapping at the Level of Stuff and Things. IROS 2019 (to appear) | ||||||||||||||||||||
| One_Thing_One_Click | 0.529 59 | 0.667 62 | 0.718 59 | 0.777 32 | 0.399 60 | 0.683 57 | 0.000 71 | 0.669 36 | 0.138 66 | 0.391 61 | 0.374 63 | 0.539 36 | 0.360 67 | 0.641 66 | 0.556 47 | 0.774 37 | 0.593 48 | 0.997 42 | 0.251 66 | |
| Zhengzhe Liu, Xiaojuan Qi, Chi-Wing Fu: One Thing One Click: A Self-Training Approach for Weakly Supervised 3D Semantic Segmentation. CVPR 2021 | ||||||||||||||||||||
| SegGroup_ins | 0.445 67 | 0.667 62 | 0.773 49 | 0.185 73 | 0.317 65 | 0.656 59 | 0.000 71 | 0.407 67 | 0.134 67 | 0.381 62 | 0.267 67 | 0.217 61 | 0.476 58 | 0.714 62 | 0.452 55 | 0.629 65 | 0.514 62 | 1.000 1 | 0.222 69 | |
| An Tao, Yueqi Duan, Yi Wei, Jiwen Lu, Jie Zhou: SegGroup: Seg-Level Supervision for 3D Instance and Semantic Segmentation. TIP 2022 | ||||||||||||||||||||
| R-PointNet | 0.306 71 | 0.500 74 | 0.405 76 | 0.311 64 | 0.348 63 | 0.589 64 | 0.054 58 | 0.068 76 | 0.126 68 | 0.283 68 | 0.290 66 | 0.028 74 | 0.219 72 | 0.214 73 | 0.331 65 | 0.396 76 | 0.275 72 | 0.821 69 | 0.245 67 | |
| SemRegionNet-20cls | 0.250 73 | 0.333 75 | 0.613 68 | 0.229 70 | 0.163 72 | 0.493 71 | 0.000 71 | 0.304 69 | 0.107 69 | 0.147 75 | 0.100 74 | 0.052 72 | 0.231 70 | 0.119 75 | 0.039 75 | 0.445 74 | 0.325 68 | 0.654 73 | 0.141 74 | |
| 3D-BEVIS | 0.248 74 | 0.667 62 | 0.566 69 | 0.076 77 | 0.035 79 | 0.394 77 | 0.027 63 | 0.035 78 | 0.098 70 | 0.099 77 | 0.030 78 | 0.025 75 | 0.098 75 | 0.375 72 | 0.126 72 | 0.604 68 | 0.181 77 | 0.854 68 | 0.171 72 | |
| Cathrin Elich, Francis Engelmann, Jonas Schult, Theodora Kontogianni, Bastian Leibe: 3D-BEVIS: Birds-Eye-View Instance Segmentation. | ||||||||||||||||||||
| tmp | 0.248 74 | 0.667 62 | 0.437 74 | 0.188 72 | 0.153 74 | 0.491 72 | 0.000 71 | 0.208 72 | 0.094 71 | 0.153 74 | 0.099 75 | 0.057 71 | 0.217 73 | 0.119 75 | 0.039 75 | 0.466 73 | 0.302 70 | 0.640 74 | 0.140 75 | |
| UNet-backbone | 0.319 70 | 0.667 62 | 0.715 61 | 0.233 69 | 0.189 71 | 0.479 74 | 0.008 66 | 0.218 71 | 0.067 72 | 0.201 72 | 0.173 71 | 0.107 67 | 0.123 74 | 0.438 70 | 0.150 70 | 0.615 67 | 0.355 67 | 0.916 61 | 0.093 78 | |
| 3D-SIS | 0.382 68 | 1.000 1 | 0.432 75 | 0.245 68 | 0.190 70 | 0.577 66 | 0.013 65 | 0.263 70 | 0.033 73 | 0.320 66 | 0.240 69 | 0.075 69 | 0.422 64 | 0.857 51 | 0.117 73 | 0.699 55 | 0.271 74 | 0.883 65 | 0.235 68 | |
| Ji Hou, Angela Dai, Matthias Niessner: 3D-SIS: 3D Semantic Instance Segmentation of RGB-D Scans. CVPR 2019 | ||||||||||||||||||||
| Sem_Recon_ins | 0.227 76 | 0.764 61 | 0.486 73 | 0.069 78 | 0.098 76 | 0.426 76 | 0.017 64 | 0.067 77 | 0.015 74 | 0.172 73 | 0.100 73 | 0.096 68 | 0.054 78 | 0.183 74 | 0.135 71 | 0.366 77 | 0.260 76 | 0.614 75 | 0.168 73 | |
| Hier3D | 0.323 69 | 0.667 62 | 0.542 71 | 0.264 67 | 0.157 73 | 0.550 68 | 0.000 71 | 0.205 73 | 0.009 75 | 0.270 69 | 0.218 70 | 0.075 69 | 0.500 56 | 0.688 65 | 0.007 79 | 0.698 56 | 0.301 71 | 0.459 76 | 0.200 70 | |
| Tan: HCFS3D: Hierarchical Coupled Feature Selection Network for 3D Semantic and Instance Segmentation. | ||||||||||||||||||||
| ASIS | 0.199 77 | 0.333 75 | 0.253 78 | 0.167 75 | 0.140 75 | 0.438 75 | 0.000 71 | 0.177 74 | 0.008 76 | 0.121 76 | 0.069 76 | 0.004 77 | 0.231 71 | 0.429 71 | 0.036 77 | 0.445 75 | 0.273 73 | 0.333 78 | 0.119 77 | |
| Region-18class | 0.284 72 | 0.250 78 | 0.751 57 | 0.228 71 | 0.270 67 | 0.521 70 | 0.000 71 | 0.468 65 | 0.008 77 | 0.205 71 | 0.127 72 | 0.000 78 | 0.068 76 | 0.070 77 | 0.262 69 | 0.652 63 | 0.323 69 | 0.740 72 | 0.173 71 | |
| Sgpn_scannet | 0.143 78 | 0.208 79 | 0.390 77 | 0.169 74 | 0.065 77 | 0.275 78 | 0.029 62 | 0.069 75 | 0.000 78 | 0.087 78 | 0.043 77 | 0.014 76 | 0.027 79 | 0.000 78 | 0.112 74 | 0.351 78 | 0.168 78 | 0.438 77 | 0.138 76 | |
| MaskRCNN 2d->3d Proj | 0.058 79 | 0.333 75 | 0.002 79 | 0.000 79 | 0.053 78 | 0.002 79 | 0.002 70 | 0.021 79 | 0.000 78 | 0.045 79 | 0.024 79 | 0.238 60 | 0.065 77 | 0.000 78 | 0.014 78 | 0.107 79 | 0.020 79 | 0.110 79 | 0.006 79 | |
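
For readers who want to process the table programmatically, the following is a minimal sketch of how a result row could be split into a method name and (score, rank) pairs. The helper names `parse_cell` and `parse_row` are hypothetical, and the sample row is a shortened excerpt of the PointComp row above, not a complete row.

```python
from typing import List, Optional, Tuple


def parse_cell(cell: str) -> Optional[Tuple[float, int]]:
    """Parse a cell such as '0.811 4' into (score, rank); return None otherwise."""
    parts = cell.split()
    if len(parts) != 2:
        return None
    try:
        return float(parts[0]), int(parts[1])
    except ValueError:
        return None


def parse_row(line: str) -> Tuple[str, List[Tuple[float, int]]]:
    """Split one pipe-delimited table row into the method name and its cells."""
    cells = [c.strip() for c in line.strip().strip("|").split("|")]
    name = cells[0]
    # Empty cells and citation text do not parse and are skipped.
    values = [v for v in (parse_cell(c) for c in cells[1:]) if v is not None]
    return name, values


# Shortened excerpt of the PointComp row (first three value cells only).
sample = "| PointComp | 0.811 4 | 0.850 60 | 0.969 10 | |"
name, values = parse_row(sample)
print(name, values)
```

Citation rows carry no numeric cells, so `parse_row` returns them with an empty value list, which makes them easy to tell apart from result rows.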
