Result details - ScanNet Benchmark

Submitted anonymously.

Full name	Swin3D: A Pretrained Transformer Backbone for 3D Indoor Scene Understanding
Description	Our backbone network is based on a 3D Swin transformer. It is designed to perform self-attention on sparse voxels efficiently, with linear memory complexity, and to capture the irregularity of point signals via a generalized contextual relative positional embedding. Using this backbone design, we pretrained a large Swin3D model on the synthetic Structured3D dataset and fine-tuned it on ScanNet.
Input Data Types	Uses Color, Uses Geometry, Uses 3D
Programming language(s)	Python and C++
Hardware	Tesla V100
Submission creation date	5 Feb, 2023
Last edited	24 Apr, 2023
Last uploaded	23 Apr, 2023

Info	avg iou	bathtub	bed	bookshelf	cabinet	chair	counter	curtain	desk	door	floor	otherfurniture	picture	refrigerator	shower curtain	sink	sofa	table	toilet	wall	window
	0.779	0.861	0.818	0.836	0.790	0.875	0.576	0.905	0.704	0.739	0.969	0.611	0.349	0.756	0.958	0.702	0.805	0.708	0.916	0.898	0.801
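
As a quick sanity check on the row above, the following sketch recomputes the average from the 20 per-class scores. It assumes that "avg iou" is the unweighted mean of the per-class IoU values; the page does not state this explicitly, but the arithmetic is consistent with it. The variable names are illustrative only.

```python
# Per-class IoU values copied from the results row above.
class_iou = {
    "bathtub": 0.861, "bed": 0.818, "bookshelf": 0.836, "cabinet": 0.790,
    "chair": 0.875, "counter": 0.576, "curtain": 0.905, "desk": 0.704,
    "door": 0.739, "floor": 0.969, "otherfurniture": 0.611, "picture": 0.349,
    "refrigerator": 0.756, "shower curtain": 0.958, "sink": 0.702,
    "sofa": 0.805, "table": 0.708, "toilet": 0.916, "wall": 0.898,
    "window": 0.801,
}

# Assumption: "avg iou" is the unweighted mean over all classes.
mean_iou = sum(class_iou.values()) / len(class_iou)

# Identify the strongest and weakest classes in this submission.
best_class = max(class_iou, key=class_iou.get)
worst_class = min(class_iou, key=class_iou.get)

print(f"mean IoU over {len(class_iou)} classes: {mean_iou:.3f}")
print(f"best:  {best_class} ({class_iou[best_class]:.3f})")
print(f"worst: {worst_class} ({class_iou[worst_class]:.3f})")
```

The recomputed mean rounds to the reported 0.779. The scores range from 0.349 (picture) to 0.969 (floor).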

Results for Swin3D