Limited Annotations Semantic Label Results

The 3D semantic labeling task involves predicting a semantic label for every vertex of a 3D scan mesh.

Evaluation and metrics

Our evaluation ranks all methods according to the PASCAL VOC intersection-over-union (IoU) metric: IoU = TP/(TP+FP+FN), where TP, FP, and FN are the numbers of true positive, false positive, and false negative vertices, respectively. Predicted labels are evaluated per vertex over the respective 3D scan mesh. 3D approaches that operate on other representations, such as grids or points, should map their predicted labels onto the mesh vertices (an example of a grid-to-mesh-vertex mapping is provided in the evaluation helpers).
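The metric and the grid-to-vertex mapping described above can be sketched as follows. This is a minimal illustration in plain Python, not the benchmark's evaluation helper: the function names, the `voxel_size` parameter, the dictionary-based voxel grid, and the use of `-1` for unannotated or unpredicted vertices are all assumptions made for the example.

```python
import math
from collections import Counter


def grid_labels_to_vertices(vertices, voxel_labels, voxel_size, unlabeled=-1):
    """Give each mesh vertex the label of the voxel that contains it.

    voxel_labels maps integer (i, j, k) voxel indices to predicted class ids
    (hypothetical layout). Vertices in a voxel without a prediction receive
    `unlabeled`.
    """
    labels = []
    for x, y, z in vertices:
        key = (math.floor(x / voxel_size),
               math.floor(y / voxel_size),
               math.floor(z / voxel_size))
        labels.append(voxel_labels.get(key, unlabeled))
    return labels


def per_class_iou(gt_labels, pred_labels, class_ids, ignore_label=-1):
    """Return {class_id: IoU} with IoU = TP / (TP + FP + FN), counted per vertex.

    Vertices whose ground truth equals `ignore_label` are skipped (assumption).
    A class with no ground-truth or predicted vertices maps to None.
    """
    if len(gt_labels) != len(pred_labels):
        raise ValueError("need exactly one prediction per ground-truth vertex")
    tp, fp, fn = Counter(), Counter(), Counter()
    for gt, pred in zip(gt_labels, pred_labels):
        if gt == ignore_label:
            continue
        if gt == pred:
            tp[gt] += 1
        else:
            fn[gt] += 1    # the true class was missed
            fp[pred] += 1  # the predicted class was wrongly assigned
    ious = {}
    for c in class_ids:
        denom = tp[c] + fp[c] + fn[c]
        ious[c] = tp[c] / denom if denom else None
    return ious


def mean_iou(ious):
    """Average the per-class IoUs, skipping classes that never occur."""
    values = [v for v in ious.values() if v is not None]
    return sum(values) / len(values) if values else None


# Toy scene: six vertices, three classes, voxels of edge length 0.5.
vertices = [(0.1, 0.1, 0.1), (0.2, 0.3, 0.1), (0.6, 0.1, 0.1),
            (0.7, 0.2, 0.4), (1.2, 0.0, 0.0), (1.3, 0.1, 0.1)]
voxel_labels = {(0, 0, 0): 0, (1, 0, 0): 1, (2, 0, 0): 2}
gt = [0, 1, 1, 1, 2, -1]  # the last vertex is unannotated

pred = grid_labels_to_vertices(vertices, voxel_labels, voxel_size=0.5)
ious = per_class_iou(gt, pred, class_ids=[0, 1, 2])
print(ious, mean_iou(ious))
```

In the toy scene, class 0 has one true positive and one false positive, so its IoU is 0.5, while class 1 has two true positives and one false negative, giving 2/3.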

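This table lists the benchmark results for the 3D semantic label scenario with limited annotations. Each cell gives the IoU followed by the method's rank in that column; methods with equal scores share a rank.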

Method	avg IoU	bathtub	bed	bookshelf	cabinet	chair	counter	curtain	desk	door	floor	otherfurniture	picture	refrigerator	shower curtain	sink	sofa	table	toilet	wall	window

GaIA	0.685 8	0.759 10	0.834 1	0.759 5	0.650 8	0.859 3	0.427 10	0.694 10	0.524 10	0.575 7	0.948 6	0.537 1	0.304 3	0.534 12	0.853 2	0.678 7	0.820 1	0.581 10	0.914 4	0.828 5	0.626 8
Min Seok Lee, Seok Woo Yang, Sung Won Han: GaIA: Graphical Information gain based Attention Network for Weakly Supervised 3D Point Cloud Semantic Segmentation. WACV 2023
One-Thing-One-Click	0.694 4	0.760 9	0.815 2	0.706 13	0.684 5	0.840 6	0.492 4	0.701 9	0.557 7	0.596 5	0.972 2	0.497 4	0.281 4	0.709 2	0.757 8	0.689 6	0.789 4	0.600 7	0.907 7	0.864 1	0.671 4
Zhengzhe Liu, Xiaojuan Qi, Chi-Wing Fu: One Thing One Click: A Self-Training Approach for Weakly Supervised 3D Semantic Segmentation. CVPR 2021
ActiveST	0.748 1	0.984 1	0.804 3	0.759 5	0.720 2	0.849 5	0.516 2	0.791 3	0.670 1	0.654 2	0.974 1	0.495 5	0.382 1	0.811 1	0.828 5	0.787 1	0.780 6	0.640 2	0.952 1	0.861 3	0.701 1
Gengxin Liu, Oliver van Kaick, Hui Huang, Ruizhen Hu: Active Self-Training for Weakly Supervised 3D Scene Semantic Segmentation.
Q2E	0.743 2	0.984 1	0.803 4	0.770 1	0.725 1	0.881 1	0.572 1	0.806 2	0.663 2	0.665 1	0.972 2	0.506 3	0.305 2	0.652 6	0.829 4	0.761 2	0.809 2	0.660 1	0.951 2	0.862 2	0.682 2

Scratch_LA_SEM	0.643 12	0.699 13	0.793 5	0.718 12	0.636 10	0.816 11	0.411 11	0.707 8	0.490 12	0.574 8	0.948 6	0.448 10	0.173 13	0.559 10	0.689 12	0.604 11	0.722 12	0.556 11	0.853 12	0.820 8	0.651 5

LE	0.688 7	0.856 7	0.779 6	0.754 7	0.687 4	0.834 8	0.438 8	0.732 7	0.536 9	0.577 6	0.948 6	0.508 2	0.248 7	0.699 3	0.831 3	0.636 8	0.752 11	0.586 9	0.895 9	0.821 7	0.643 6

PointContrast_LA_SEM	0.653 11	0.717 12	0.775 7	0.754 7	0.626 11	0.804 13	0.391 12	0.689 11	0.485 13	0.572 9	0.945 12	0.448 10	0.232 9	0.603 8	0.813 6	0.591 12	0.775 9	0.537 12	0.885 10	0.816 9	0.608 10

DE-3DLearner LA	0.709 3	0.877 4	0.772 8	0.744 9	0.694 3	0.836 7	0.453 6	0.787 4	0.623 4	0.598 4	0.953 4	0.490 7	0.216 11	0.682 5	0.879 1	0.727 3	0.802 3	0.604 5	0.922 3	0.845 4	0.676 3
Ping-Chung Yu, Cheng Sun, Min Sun: Data Efficient 3D Learner via Knowledge Transferred from 2D Model. ECCV 2022
CSC_LA_SEM	0.665 10	0.857 6	0.756 9	0.763 4	0.647 9	0.852 4	0.432 9	0.684 12	0.543 8	0.514 12	0.948 6	0.469 8	0.179 12	0.599 9	0.702 11	0.620 10	0.789 4	0.614 4	0.911 5	0.815 11	0.607 11

WS3D_LA_Sem	0.694 4	0.895 3	0.743 10	0.767 2	0.675 6	0.826 10	0.496 3	0.817 1	0.612 5	0.613 3	0.947 10	0.460 9	0.254 6	0.558 11	0.811 7	0.710 5	0.776 8	0.616 3	0.874 11	0.822 6	0.603 12
Kangcheng Liu: WS3D: Weakly Supervised 3D Scene Segmentation with Region-Level Boundary Awareness and Instance Discrimination. ECCV 2022
Viewpoint_BN_LA_AIR	0.669 9	0.847 8	0.732 11	0.724 11	0.613 12	0.827 9	0.443 7	0.742 6	0.562 6	0.551 10	0.947 10	0.441 12	0.218 10	0.650 7	0.753 9	0.621 9	0.765 10	0.601 6	0.905 8	0.814 12	0.618 9
Liyi Luo, Beiwen Tian, Hao Zhao, Guyue Zhou: Pointly-supervised 3D Scene Parsing with Viewpoint Bottleneck.
VIBUS	0.691 6	0.860 5	0.731 12	0.738 10	0.672 7	0.860 2	0.470 5	0.766 5	0.625 3	0.547 11	0.949 5	0.491 6	0.255 5	0.693 4	0.715 10	0.712 4	0.778 7	0.597 8	0.911 5	0.816 9	0.635 7
Beiwen Tian, Liyi Luo, Hao Zhao, Guyue Zhou: VIBUS: Data-efficient 3D Scene Parsing with VIewpoint Bottleneck and Uncertainty-Spectrum Modeling. ISPRS Journal of Photogrammetry and Remote Sensing
SQN_LA	0.598 13	0.741 11	0.681 13	0.766 3	0.482 13	0.805 12	0.389 13	0.658 13	0.499 11	0.437 13	0.936 13	0.386 13	0.243 8	0.422 13	0.663 13	0.552 13	0.700 13	0.519 13	0.809 13	0.750 13	0.515 13
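
The rank printed after each score in the table is consistent with standard competition ranking, where tied scores share a rank and the next rank is skipped. The sketch below reproduces the avg IoU ranks from the displayed, rounded values. It is an illustration only: the function name is hypothetical, and whether the benchmark ranks on rounded or unrounded scores is an assumption.

```python
def competition_ranks(scores):
    """Map each method to its rank; higher is better and ties share a rank."""
    ordered = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    ranks = {}
    prev_score, prev_rank = None, 0
    for position, (name, score) in enumerate(ordered, start=1):
        # A tied score reuses the previous rank; otherwise the rank is the
        # 1-based position, which skips ranks consumed by earlier ties.
        rank = prev_rank if score == prev_score else position
        ranks[name] = rank
        prev_score, prev_rank = score, rank
    return ranks


# avg IoU values copied from the table above.
avg_iou = {
    "GaIA": 0.685,
    "One-Thing-One-Click": 0.694,
    "ActiveST": 0.748,
    "Q2E": 0.743,
    "Scratch_LA_SEM": 0.643,
    "LE": 0.688,
    "PointContrast_LA_SEM": 0.653,
    "DE-3DLearner LA": 0.709,
    "CSC_LA_SEM": 0.665,
    "WS3D_LA_Sem": 0.694,
    "Viewpoint_BN_LA_AIR": 0.669,
    "VIBUS": 0.691,
    "SQN_LA": 0.598,
}

ranks = competition_ranks(avg_iou)
for name in sorted(ranks, key=ranks.get):
    print(ranks[name], name, avg_iou[name])
```

One-Thing-One-Click and WS3D_LA_Sem both score 0.694, so both receive rank 4 and VIBUS follows at rank 6, matching the table.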
