Documentation
Data
Download: The ScanNet Data Efficient Benchmark uses the ScanNet dataset. If you would like to download the ScanNet data, please fill out an agreement to the ScanNet Terms of Use and send it to us at the scannet group email. For more information regarding the ScanNet dataset, please see our git repo.
Submission policy
We release two configurations in this benchmark for the Semantic Segmentation, Instance Segmentation, and Object Detection tasks: Limited Scene Reconstructions (LR) and Limited Annotations (LA). In LR, you are only allowed to train on a limited subset of the 1201 training scene reconstructions ({1%, 5%, 10%, 20%} of the scenes for the instance/semantic segmentation tasks and {10%, 20%, 40%, 80%} for the object detection task). In LA, our benchmark considers four different training configurations on ScanNet: {20, 50, 100, 200} labeled points per scene for semantic segmentation and instance segmentation, and {1, 2, 4, 7} bounding boxes per scene for object detection.
Parameter tuning is only allowed on the training data. Evaluating on the test data via this evaluation server must only be done once, for the final system. It is not permitted to use the test set to train systems, for example by trying out different parameter values and choosing the best. Only one version must be evaluated (the one that performed best on the training data). This is to avoid overfitting on the test data. Results of different parameter settings of an algorithm can therefore only be reported on the training set. To help enforce this policy, we block updates to the test set results of a method for two weeks after a test set submission. You may split the training data into training and validation sets yourself as you wish.
It is not permitted to register on this webpage with multiple e-mail addresses or with information misrepresenting the identity of the user. We will ban users or domains if necessary.
Training Data
Download
Download the ScanNet scans using the ScanNet download script.
Download the limited reconstruction files (for the instance/semantic segmentation tasks here and the object detection task here), the limited annotation files, and the limited bounding box files.
Limited Reconstructions
We provide a scene list for each configuration limiting which scenes can be used for training. The file structure is as follows:
unzip_root/
|-- 1.txt
|-- 5.txt
|-- 10.txt
|-- 20.txt

For example, 20.txt lists the scene ids of 20% of the total training scenes that you may use for training your model.
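As a rough sketch of how such a list might be used (the path limited_reconstructions/10.txt and the scene list all_train_scenes below are hypothetical), training can be restricted to the listed scenes by filtering the scene ids:

all_train_scenes = ["scene0000_00", "scene0000_01", "scene0001_00"]   # hypothetical full training split

with open("limited_reconstructions/10.txt") as f:                     # path is an assumption
    allowed_scenes = {line.strip() for line in f if line.strip()}

train_scenes = [s for s in all_train_scenes if s in allowed_scenes]
print("training on %d of %d scenes" % (len(train_scenes), len(all_train_scenes)))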
Limited Annotations
We provide the list of indices of points (aligned to the .ply files) that can be used for training in each configuration of the instance segmentation and semantic segmentation tasks. The file structure is as follows:
unzip_root/
|-- points20
|-- points50
|-- points100
|-- points200

For example, points20 contains, for each training scene, the indices of the 20 annotated points that can be used for training. The sampled point indices can be loaded in your training code as follows:

import torch

# data efficiency by sampling points
if phase == DatasetPhase.Train:
    sampled_inds = torch.load(PATH_FILE)   # indices of the annotated points for this scene
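As a rough sketch of how the loaded indices might be used, assuming a per-vertex label array and an ignore label value understood by your loss (both hypothetical here), the unlabeled points can be masked out so that only the provided points are supervised:

import numpy as np

IGNORE_LABEL = -100   # label value the loss ignores; an assumption, match your training code

# Toy scene with 8 vertices; only 3 point indices are provided as annotated.
labels = np.array([2, 2, 10, 10, 39, 39, 5, 5])
sampled_inds = np.array([0, 3, 6])                    # in practice: torch.load(PATH_FILE)

masked_labels = np.full_like(labels, IGNORE_LABEL)
masked_labels[sampled_inds] = labels[sampled_inds]    # supervise only the provided points
print(masked_labels)                                  # [2 -100 -100 10 -100 -100 5 -100]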
Limited Bounding Boxes
For Object Detection, we provide the instance ids that can be used for training. The file structure is as follows:
unzip_root/
|-- bbox1
|-- bbox2
|-- bbox4
|-- bbox7

For example, bbox1 contains, for each training scene, the id of one annotated object instance whose bounding box can be used for training. The sampled bounding box ids can be loaded in your training code as follows:

import torch

# data efficiency by sampling bounding boxes
if split_set == 'train':
    sampled_bbox = torch.load(PATH_FILE)   # instance ids of the annotated boxes for this scene
Note that for some methods you may need object centers as ground truth. However, the object centers can only be computed from the provided list of points, so they may be shifted from the real object centers, because we assume that only the provided points are annotated.
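As a rough sketch of how the provided ids might be used, assuming per-scene ground-truth boxes indexed by instance id (the arrays below are hypothetical), only the listed instances would be kept as supervision:

import numpy as np

# Toy scene with 5 ground-truth objects; only 2 instance ids are provided for training.
instance_ids = np.array([1, 2, 3, 4, 5])    # ids of all annotated objects in the scene
gt_boxes = np.random.rand(5, 6)             # min_x, min_y, min_z, max_x, max_y, max_z per object
sampled_bbox = np.array([2, 5])             # in practice: torch.load(PATH_FILE)

keep = np.isin(instance_ids, sampled_bbox)
train_boxes = gt_boxes[keep]                # only these boxes may supervise the detector
print(train_boxes.shape)                    # (2, 6)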
Submission format
For all tasks, you need to upload a zip or .7z file (7z is preferred due to its smaller file size) that includes 4 submissions, one for each of the 4 configurations. Each submission should be one folder containing the .txt files organized according to the task, as described below.
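As a rough sketch, a zip archive of the configuration folders can be created with Python's standard library (the folder name submission below is hypothetical):

import shutil

# Package the four configuration folders (e.g. submission/20, submission/50,
# submission/100, submission/200) into a single zip archive for upload.
shutil.make_archive("my_submission", "zip", root_dir="submission")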
Format for 3D Semantic Label Prediction
There should be a folder for each configuration, i.e. {20, 50, 100, 200} for LA and {1, 5, 10, 20} for LR. A submission under each folder must contain a .txt prediction file for each test scan, named after the corresponding test scan:

unzip_root/
|-- 20
|-- scene0707_00.txt
|-- scene0708_00.txt
|-- scene0709_00.txt
⋮
|-- scene0806_00.txt
|-- 50
|-- scene0707_00.txt
|-- scene0708_00.txt
|-- scene0709_00.txt
⋮
|-- scene0806_00.txt
|-- 100
⋮
|-- 200
⋮
In each prediction file, results must be provided as class labels per vertex of the corresponding 3D scan mesh, i.e., one line per vertex containing the integer label id of the predicted class, with the vertices in the order provided by the scan's .ply mesh file. E.g., a prediction file could look like:

10
10
2
2
2
⋮
39
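As a rough sketch of writing such files, assuming predictions are held as one integer label per vertex (the data and folder names below are hypothetical):

import os
import numpy as np

# Hypothetical predictions: one integer label id per mesh vertex for each test scan.
predictions = {"scene0707_00": np.array([10, 10, 2, 2, 2, 39])}

out_dir = os.path.join("submission", "20")   # one folder per configuration
os.makedirs(out_dir, exist_ok=True)
for scene_id, labels in predictions.items():
    np.savetxt(os.path.join(out_dir, scene_id + ".txt"), labels, fmt="%d")   # one label per line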
Format for 3D Semantic Instance Prediction
There should be a folder for each configuration, i.e. {20, 50, 100, 200} for LA and {1, 5, 10, 20} for LR. A submission under each folder must contain a .txt prediction file for each test scan, named after the corresponding test scan. Each text file should contain a line for each predicted instance, giving the relative path to a binary mask of the instance, the predicted label id, and the confidence of the prediction:

unzip_root/
|-- 1
|-- scene0707_00.txt
|-- scene0708_00.txt
|-- scene0709_00.txt
⋮
|-- scene0806_00.txt
|-- predicted_masks/
|-- scene0707_00_000.txt
|-- scene0707_00_001.txt
⋮
|-- 5
⋮
|-- 10
⋮
|-- 20
⋮
Each prediction file for a scan should contain a list of instances, where an instance is: (1) the relative path to the predicted mask file, (2) the integer class label id, (3) the float confidence score. Each line in the prediction file should correspond to one instance, with the three values above separated by spaces. Thus, the filenames in the prediction files must not contain spaces. E.g., a prediction file could look like:

predicted_masks/scene0707_00_000.txt 10 0.7234
predicted_masks/scene0707_00_001.txt 36 0.9038
⋮

The predicted instance mask file should provide a binary mask over the vertices of the scan mesh, i.e., one line per vertex, with the vertices in the order provided by the scan's .ply mesh file, containing 1 if the vertex belongs to the instance and 0 otherwise. E.g., predicted_masks/scene0707_00_000.txt could look like:

0
0
0
1
1
⋮
0
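As a rough sketch of writing the prediction and mask files, assuming per-vertex binary masks are available as arrays (the data and folder names below are hypothetical):

import os
import numpy as np

# Hypothetical predictions for one test scan: (per-vertex binary mask, label id, confidence).
scene_id = "scene0707_00"
instances = [(np.array([0, 0, 0, 1, 1, 0]), 10, 0.7234),
             (np.array([1, 1, 0, 0, 0, 0]), 36, 0.9038)]

out_dir = os.path.join("submission", "1")              # one folder per configuration
mask_dir = os.path.join(out_dir, "predicted_masks")
os.makedirs(mask_dir, exist_ok=True)

with open(os.path.join(out_dir, scene_id + ".txt"), "w") as f:
    for i, (mask, label_id, score) in enumerate(instances):
        mask_name = "%s_%03d.txt" % (scene_id, i)
        np.savetxt(os.path.join(mask_dir, mask_name), mask, fmt="%d")   # one 0/1 value per vertex
        f.write("predicted_masks/%s %d %f\n" % (mask_name, label_id, score))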
Format for 3D Object Detection
There should be a folder for each configuration, i.e. {1, 2, 4, 7} for LA and {10, 20, 40, 80} for LR. A submission under each folder must contain a .txt prediction file for each test scan, named after the corresponding test scan:

unzip_root/
|-- 1
|-- scene0707_00.txt
|-- scene0708_00.txt
|-- scene0709_00.txt
⋮
|-- scene0806_00.txt
|-- 2
|-- scene0707_00.txt
|-- scene0708_00.txt
|-- scene0709_00.txt
⋮
|-- scene0806_00.txt
|-- 4
⋮
|-- 7
⋮
Each prediction file for a scan should contain a list of instances, where an instance is: (1) the bounding box min_x, min_y, min_z, max_x, max_y, max_z in world space (the coordinate frame of the scan's .ply file), (2) the integer class label id, (3) the float confidence score. Each line in the prediction file should correspond to one instance, with the values separated by spaces. E.g., a prediction file could look like:

5.25 1.64 0.09 5.83 2.24 0.92 5 0.75
5.60 5.67 1.19 6.16 6.44 1.44 9 0.43
⋮
-0.00 0.07 0.08 0.34 0.49 1.31 39 0.25
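As a rough sketch of writing such files, assuming detections are held as box corners with a label id and score (the data and folder names below are hypothetical):

import os

# Hypothetical detections per test scan: (min_x, min_y, min_z, max_x, max_y, max_z, label id, score).
detections = {"scene0707_00": [(5.25, 1.64, 0.09, 5.83, 2.24, 0.92, 5, 0.75),
                               (5.60, 5.67, 1.19, 6.16, 6.44, 1.44, 9, 0.43)]}

out_dir = os.path.join("submission", "1")   # one folder per configuration
os.makedirs(out_dir, exist_ok=True)
for scene_id, boxes in detections.items():
    with open(os.path.join(out_dir, scene_id + ".txt"), "w") as f:
        for x0, y0, z0, x1, y1, z1, label_id, score in boxes:
            f.write("%.2f %.2f %.2f %.2f %.2f %.2f %d %.2f\n"
                    % (x0, y0, z0, x1, y1, z1, label_id, score))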