The mesh surface is manually annotated with long-tail labels, from which we select the top 100 classes for evaluation.
Methods are ranked by mean intersection-over-union (IoU), following the ScanNet benchmark. IoU = TP / (TP + FP + FN), where TP, FP, and FN are the numbers of true positive, false positive, and false negative 3D vertices, respectively.
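The per-class IoU and mean IoU above can be sketched as follows. This is an illustrative implementation, not the official evaluation script; the function name and signature are hypothetical.

```python
import numpy as np

def per_class_iou(pred, gt, num_classes):
    """Per-class IoU = TP / (TP + FP + FN), computed per vertex.

    pred, gt: integer label arrays of shape (num_vertices,).
    Returns (ious, miou): per-class IoUs (NaN for classes absent from
    both prediction and ground truth) and the mean over valid classes.
    """
    ious = np.full(num_classes, np.nan)
    for c in range(num_classes):
        tp = np.sum((pred == c) & (gt == c))  # true positives
        fp = np.sum((pred == c) & (gt != c))  # false positives
        fn = np.sum((pred != c) & (gt == c))  # false negatives
        denom = tp + fp + fn
        if denom > 0:
            ious[c] = tp / denom
    miou = np.nanmean(ious)  # mean over classes that actually occur
    return ious, miou
```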
Predicted labels are evaluated per vertex over the vertices of the 5%-decimated 3D scan mesh (mesh_aligned_0.05.ply); 3D approaches that operate on other representations, such as grids or points, must map their predicted labels onto the mesh vertices.
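One simple way to map predictions from another representation onto the evaluation mesh is a nearest-neighbor transfer; this is only a sketch of one reasonable mapping, not a prescribed procedure.

```python
import numpy as np
from scipy.spatial import cKDTree

def map_labels_to_vertices(pred_points, pred_labels, mesh_vertices):
    """Transfer per-point predictions to mesh vertices by nearest neighbor.

    pred_points:   (N, 3) coordinates the method predicted on.
    pred_labels:   (N,) integer labels for those points.
    mesh_vertices: (M, 3) vertices of mesh_aligned_0.05.ply.
    Returns (M,) labels, one per mesh vertex.
    """
    tree = cKDTree(pred_points)
    _, nearest = tree.query(mesh_vertices, k=1)  # index of closest predicted point
    return pred_labels[nearest]
```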
Evaluation excludes anonymized vertices; these are listed in mesh_aligned_0.05_mask.txt.
The ScanNet++ ground truth contains multilabels: in ambiguous cases, a vertex may carry more than one label. Hence, we evaluate both the Top-1 and Top-3 performance of methods.
Submissions may contain either a single prediction per vertex or 3 predictions per vertex.
Top-1 evaluation considers the top prediction for each vertex; a prediction is correct if it matches any ground-truth label for that vertex.
Top-3 evaluation considers the top 3 predictions for each vertex; a prediction is correct if any of them matches the ground truth. For multilabeled vertices, all ground-truth labels must appear among the top 3 predictions for the vertex to be counted as correct. Submissions with a single prediction per vertex are evaluated as if that prediction were repeated 3 times.
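The per-vertex correctness rules can be sketched as below; the function name is hypothetical and this is an illustration of the stated rules, not the official evaluation code.

```python
def vertex_correct(topk_preds, gt_labels, k):
    """Correctness of one vertex under Top-1 / Top-3 evaluation.

    topk_preds: list of 1 to 3 predicted labels (a single prediction
                is treated as repeated 3 times, per the rules above).
    gt_labels:  set of ground-truth labels (several for ambiguous vertices).
    k:          1 or 3.
    """
    preds = list(topk_preds)
    preds += [preds[-1]] * (3 - len(preds))  # pad single-prediction submissions
    if k == 1:
        # Top-1: the top prediction may match any ground-truth label.
        return preds[0] in gt_labels
    # Top-3: every ground-truth label must appear among the top 3.
    return gt_labels.issubset(set(preds[:3]))
```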
Methods | MIOU | AIR VENT | BACKPACK | BAG | BASKET | BED | BINDER | BLANKET | BLIND RAIL | BLINDS | BOARD | BOOK | BOOKSHELF | BOTTLE | BOWL | BOX | BUCKET | CABINET | CEILING | CEILING LAMP | CHAIR | CLOCK | CLOTH | CLOTHES HANGER | COAT HANGER | COMPUTER TOWER | CONTAINER | CRATE | CUP | CURTAIN | CUSHION | CUTTING BOARD | DOOR | DOORFRAME | ELECTRICAL DUCT | EXHAUST FAN | FILE FOLDER | FLOOR | HEADPHONES | HEATER | JACKET | JAR | KETTLE | KEYBOARD | KITCHEN CABINET | KITCHEN COUNTER | LAPTOP | LIGHT SWITCH | MARKER | MICROWAVE | MONITOR | MOUSE | OFFICE CHAIR | PAINTING | PAN | PAPER | PAPER BAG | PAPER TOWEL | PICTURE | PILLOW | PIPE | PLANT | PLANT POT | POSTER | POT | POWER STRIP | PRINTER | RACK | REFRIGERATOR | SHELF | SHOE RACK | SHOES | SHOWER WALL | SINK | SLIPPERS | SMOKE DETECTOR | SOAP DISPENSER | SOCKET | SOFA | SPEAKER | SPRAY BOTTLE | STAPLER | STORAGE CABINET | SUITCASE | TABLE | TABLE LAMP | TAP | TELEPHONE | TISSUE BOX | TOILET | TOILET BRUSH | TOILET PAPER | TOWEL | TRASH CAN | TV | WALL | WHITEBOARD | WHITEBOARD ERASER | WINDOW | WINDOW FRAME | WINDOWSILL |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
PTv3 - PPT | 0.464 | 0.034 | 0.591 | 0.427 | 0.007 | 0.812 | 0.000 | 0.745 | 0.629 | 0.876 | 0.000 | 0.171 | 0.494 | 0.382 | 0.118 | 0.507 | 0.327 | 0.366 | 0.908 | 0.921 | 0.711 | 0.732 | 0.000 | 0.056 | 0.083 | 0.556 | 0.002 | 0.145 | 0.439 | 0.781 | 0.020 | 0.163 | 0.702 | 0.396 | 0.316 | 0.742 | 0.002 | 0.926 | 0.192 | 0.801 | 0.691 | 0.043 | 0.414 | 0.829 | 0.558 | 0.218 | 0.753 | 0.466 | 0.000 | 0.896 | 0.875 | 0.749 | 0.841 | 0.082 | 0.302 | 0.292 | 0.159 | 0.263 | 0.514 | 0.394 | 0.729 | 0.905 | 0.536 | 0.148 | 0.000 | 0.372 | 0.447 | 0.005 | 0.834 | 0.254 | 0.309 | 0.406 | 0.212 | 0.779 | 0.354 | 0.423 | 0.731 | 0.227 | 0.819 | 0.077 | 0.056 | 0.058 | 0.541 | 0.588 | 0.775 | 0.846 | 0.447 | 0.781 | 0.247 | 0.919 | 0.678 | 0.335 | 0.617 | 0.779 | 0.983 | 0.830 | 0.810 | 0.390 | 0.630 | 0.420 | 0.689 |
Xiaoyang Wu, Zhuotao Tian, Xin Wen, Bohao Peng, Xihui Liu, Kaicheng Yu, Hengshuang Zhao. Towards Large-scale 3D Representation Learning with Multi-dataset Point Prompt Training. CVPR 2024
PT-Fusion-All | 0.450 | 0.058 | 0.467 | 0.301 | 0.076 | 0.815 | 0.000 | 0.689 | 0.566 | 0.821 | 0.080 | 0.346 | 0.578 | 0.378 | 0.136 | 0.468 | 0.211 | 0.342 | 0.925 | 0.935 | 0.723 | 0.596 | 0.000 | 0.058 | 0.123 | 0.512 | 0.003 | 0.712 | 0.447 | 0.807 | 0.203 | 0.242 | 0.674 | 0.417 | 0.322 | 0.669 | 0.000 | 0.945 | 0.124 | 0.803 | 0.599 | 0.094 | 0.410 | 0.841 | 0.616 | 0.274 | 0.414 | 0.471 | 0.000 | 0.905 | 0.885 | 0.724 | 0.856 | 0.150 | 0.176 | 0.185 | 0.005 | 0.000 | 0.372 | 0.443 | 0.764 | 0.857 | 0.525 | 0.038 | 0.031 | 0.346 | 0.186 | 0.000 | 0.734 | 0.242 | 0.257 | 0.370 | 0.142 | 0.729 | 0.378 | 0.471 | 0.768 | 0.481 | 0.785 | 0.041 | 0.022 | 0.056 | 0.408 | 0.601 | 0.781 | 0.758 | 0.623 | 0.778 | 0.224 | 0.922 | 0.668 | 0.244 | 0.493 | 0.767 | 0.898 | 0.835 | 0.764 | 0.411 | 0.535 | 0.445 | 0.660 |
PTv2 | 0.427 | 0.073 | 0.463 | 0.219 | 0.003 | 0.679 | 0.000 | 0.667 | 0.597 | 0.873 | 0.007 | 0.187 | 0.523 | 0.435 | 0.295 | 0.461 | 0.101 | 0.369 | 0.916 | 0.902 | 0.712 | 0.727 | 0.000 | 0.053 | 0.037 | 0.546 | 0.090 | 0.549 | 0.412 | 0.791 | 0.009 | 0.035 | 0.671 | 0.356 | 0.316 | 0.660 | 0.003 | 0.934 | 0.050 | 0.758 | 0.651 | 0.058 | 0.621 | 0.811 | 0.596 | 0.227 | 0.218 | 0.428 | 0.021 | 0.812 | 0.853 | 0.740 | 0.799 | 0.000 | 0.016 | 0.262 | 0.000 | 0.236 | 0.192 | 0.346 | 0.766 | 0.862 | 0.390 | 0.042 | 0.000 | 0.259 | 0.278 | 0.000 | 0.605 | 0.290 | 0.265 | 0.399 | 0.196 | 0.718 | 0.541 | 0.447 | 0.676 | 0.232 | 0.756 | 0.251 | 0.046 | 0.051 | 0.457 | 0.541 | 0.737 | 0.650 | 0.488 | 0.721 | 0.242 | 0.880 | 0.685 | 0.272 | 0.496 | 0.674 | 0.776 | 0.823 | 0.800 | 0.335 | 0.609 | 0.394 | 0.670 |
Xiaoyang Wu, Yixing Lao, Li Jiang, Xihui Liu, Hengshuang Zhao. Point Transformer V2: Grouped Vector Attention and Partition-based Pooling. NeurIPS 2022
PonderV2-SparseUNet-base | 0.386 | 0.000 | 0.389 | 0.110 | 0.054 | 0.739 | 0.000 | 0.565 | 0.530 | 0.822 | 0.000 | 0.382 | 0.548 | 0.325 | 0.000 | 0.361 | 0.315 | 0.396 | 0.917 | 0.929 | 0.730 | 0.547 | 0.000 | 0.050 | 0.000 | 0.445 | 0.000 | 0.339 | 0.471 | 0.832 | 0.232 | 0.212 | 0.715 | 0.457 | 0.250 | 0.397 | 0.000 | 0.945 | 0.000 | 0.775 | 0.623 | 0.000 | 0.550 | 0.804 | 0.526 | 0.199 | 0.404 | 0.473 | 0.000 | 0.735 | 0.874 | 0.759 | 0.858 | 0.000 | 0.003 | 0.154 | 0.000 | 0.118 | 0.230 | 0.377 | 0.770 | 0.844 | 0.325 | 0.037 | 0.000 | 0.000 | 0.139 | 0.034 | 0.750 | 0.289 | 0.237 | 0.372 | 0.107 | 0.668 | 0.315 | 0.404 | 0.000 | 0.445 | 0.738 | 0.164 | 0.000 | 0.037 | 0.442 | 0.481 | 0.765 | 0.601 | 0.350 | 0.634 | 0.093 | 0.894 | 0.000 | 0.155 | 0.299 | 0.743 | 0.910 | 0.834 | 0.775 | 0.000 | 0.478 | 0.342 | 0.641 |
Haoyi Zhu, Honghui Yang, Xiaoyang Wu, Di Huang, Sha Zhang, Xianglong He, Hengshuang Zhao, Chunhua Shen, Yu Qiao, Tong He, Wanli Ouyang. PonderV2: Pave the Way for 3D Foundation Model with A Universal Pre-training Paradigm. arXiv 2023
MinkowskiNet | 0.292 | 0.000 | 0.323 | 0.116 | 0.039 | 0.719 | 0.000 | 0.418 | 0.031 | 0.726 | 0.005 | 0.217 | 0.481 | 0.178 | 0.000 | 0.212 | 0.084 | 0.286 | 0.879 | 0.880 | 0.627 | 0.211 | 0.000 | 0.002 | 0.278 | 0.377 | 0.000 | 0.207 | 0.320 | 0.737 | 0.001 | 0.000 | 0.579 | 0.252 | 0.170 | 0.047 | 0.000 | 0.923 | 0.000 | 0.603 | 0.614 | 0.000 | 0.149 | 0.506 | 0.451 | 0.240 | 0.025 | 0.011 | 0.000 | 0.531 | 0.837 | 0.369 | 0.815 | 0.028 | 0.000 | 0.069 | 0.000 | 0.130 | 0.238 | 0.357 | 0.721 | 0.842 | 0.372 | 0.086 | 0.000 | 0.008 | 0.033 | 0.029 | 0.386 | 0.315 | 0.046 | 0.281 | 0.132 | 0.667 | 0.064 | 0.348 | 0.273 | 0.200 | 0.608 | 0.103 | 0.000 | 0.016 | 0.234 | 0.236 | 0.704 | 0.233 | 0.046 | 0.390 | 0.021 | 0.703 | 0.363 | 0.057 | 0.214 | 0.539 | 0.706 | 0.761 | 0.724 | 0.089 | 0.428 | 0.321 | 0.601 |
Christopher Choy, JunYoung Gwak, Silvio Savarese. 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks. CVPR 2019
PTv3 | 0.458 | 0.057 | 0.613 | 0.423 | 0.023 | 0.710 | 0.000 | 0.707 | 0.626 | 0.871 | 0.002 | 0.239 | 0.524 | 0.421 | 0.204 | 0.452 | 0.140 | 0.383 | 0.915 | 0.915 | 0.749 | 0.706 | 0.000 | 0.052 | 0.010 | 0.444 | 0.047 | 0.436 | 0.472 | 0.772 | 0.230 | 0.194 | 0.710 | 0.389 | 0.277 | 0.757 | 0.000 | 0.933 | 0.155 | 0.773 | 0.687 | 0.106 | 0.797 | 0.855 | 0.615 | 0.263 | 0.704 | 0.461 | 0.000 | 0.814 | 0.878 | 0.766 | 0.879 | 0.000 | 0.210 | 0.228 | 0.138 | 0.245 | 0.251 | 0.421 | 0.746 | 0.871 | 0.545 | 0.020 | 0.000 | 0.426 | 0.318 | 0.002 | 0.729 | 0.275 | 0.305 | 0.397 | 0.182 | 0.665 | 0.335 | 0.454 | 0.721 | 0.205 | 0.796 | 0.191 | 0.051 | 0.046 | 0.491 | 0.550 | 0.771 | 0.745 | 0.467 | 0.811 | 0.258 | 0.904 | 0.679 | 0.258 | 0.542 | 0.743 | 0.857 | 0.825 | 0.758 | 0.463 | 0.618 | 0.458 | 0.691 |
Xiaoyang Wu, Li Jiang, Peng-Shuai Wang, Zhijian Liu, Xihui Liu, Yu Qiao, Wanli Ouyang, Tong He, Hengshuang Zhao. Point Transformer V3: Simpler, Faster, Stronger. CVPR 2024 Oral
OA-CNNs-Small | 0.392 | 0.000 | 0.478 | 0.137 | 0.021 | 0.649 | 0.000 | 0.523 | 0.541 | 0.879 | 0.072 | 0.161 | 0.523 | 0.339 | 0.121 | 0.374 | 0.168 | 0.329 | 0.895 | 0.902 | 0.683 | 0.625 | 0.000 | 0.055 | 0.035 | 0.483 | 0.001 | 0.140 | 0.386 | 0.769 | 0.093 | 0.098 | 0.629 | 0.378 | 0.323 | 0.573 | 0.011 | 0.932 | 0.247 | 0.757 | 0.642 | 0.053 | 0.401 | 0.736 | 0.535 | 0.164 | 0.137 | 0.448 | 0.015 | 0.786 | 0.874 | 0.703 | 0.801 | 0.022 | 0.193 | 0.128 | 0.208 | 0.217 | 0.183 | 0.279 | 0.721 | 0.861 | 0.338 | 0.125 | 0.000 | 0.189 | 0.239 | 0.010 | 0.540 | 0.244 | 0.296 | 0.359 | 0.066 | 0.634 | 0.198 | 0.416 | 0.620 | 0.201 | 0.811 | 0.152 | 0.015 | 0.047 | 0.415 | 0.400 | 0.739 | 0.445 | 0.518 | 0.634 | 0.152 | 0.860 | 0.572 | 0.195 | 0.468 | 0.577 | 0.941 | 0.810 | 0.747 | 0.312 | 0.510 | 0.330 | 0.646 |
Bohao Peng, Xiaoyang Wu, Li Jiang, Yukang Chen, Hengshuang Zhao, Zhuotao Tian, Jiaya Jia. OA-CNNs: Omni-Adaptive Sparse CNNs for 3D Semantic Segmentation. CVPR 2024
KPConv | 0.265 | 0.000 | 0.283 | 0.111 | 0.003 | 0.675 | 0.000 | 0.579 | 0.158 | 0.682 | 0.006 | 0.193 | 0.553 | 0.000 | 0.000 | 0.288 | 0.019 | 0.298 | 0.904 | 0.928 | 0.687 | 0.000 | 0.000 | 0.000 | 0.000 | 0.435 | 0.000 | 0.271 | 0.126 | 0.777 | 0.004 | 0.000 | 0.577 | 0.165 | 0.211 | 0.000 | 0.000 | 0.925 | 0.000 | 0.645 | 0.560 | 0.000 | 0.000 | 0.595 | 0.527 | 0.200 | 0.016 | 0.000 | 0.000 | 0.638 | 0.848 | 0.000 | 0.813 | 0.075 | 0.000 | 0.087 | 0.000 | 0.000 | 0.174 | 0.329 | 0.814 | 0.823 | 0.144 | 0.009 | 0.000 | 0.000 | 0.002 | 0.018 | 0.235 | 0.326 | 0.187 | 0.327 | 0.007 | 0.623 | 0.000 | 0.000 | 0.000 | 0.000 | 0.705 | 0.000 | 0.000 | 0.000 | 0.306 | 0.228 | 0.689 | 0.407 | 0.000 | 0.375 | 0.000 | 0.798 | 0.000 | 0.000 | 0.109 | 0.558 | 0.752 | 0.748 | 0.660 | 0.000 | 0.429 | 0.299 | 0.564 |
Hugues Thomas, Charles R. Qi, Jean-Emmanuel Deschaud, Beatriz Marcotegui, François Goulette, Leonidas J. Guibas. KPConv: Flexible and Deformable Convolution for Point Clouds. ICCV 2019
PointNet++ | 0.201 | 0.000 | 0.128 | 0.061 | 0.000 | 0.426 | 0.000 | 0.466 | 0.161 | 0.676 | 0.000 | 0.154 | 0.404 | 0.146 | 0.000 | 0.194 | 0.031 | 0.221 | 0.894 | 0.918 | 0.515 | 0.000 | 0.000 | 0.000 | 0.000 | 0.303 | 0.000 | 0.000 | 0.090 | 0.396 | 0.000 | 0.000 | 0.292 | 0.081 | 0.047 | 0.000 | 0.000 | 0.912 | 0.000 | 0.578 | 0.462 | 0.000 | 0.014 | 0.408 | 0.269 | 0.228 | 0.017 | 0.000 | 0.000 | 0.281 | 0.775 | 0.087 | 0.670 | 0.001 | 0.000 | 0.000 | 0.000 | 0.000 | 0.060 | 0.282 | 0.700 | 0.735 | 0.000 | 0.000 | 0.000 | 0.000 | 0.016 | 0.000 | 0.000 | 0.201 | 0.010 | 0.254 | 0.000 | 0.458 | 0.000 | 0.000 | 0.000 | 0.105 | 0.542 | 0.000 | 0.000 | 0.000 | 0.251 | 0.075 | 0.593 | 0.142 | 0.000 | 0.173 | 0.000 | 0.467 | 0.000 | 0.000 | 0.062 | 0.481 | 0.705 | 0.698 | 0.586 | 0.000 | 0.446 | 0.255 | 0.515 |
Charles R. Qi, Li Yi, Hao Su, Leonidas J. Guibas. PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space. NIPS 2017
Please refer to the submission instructions before making a submission.