Documentation - DeepDeform Benchmark

Submission policy

The 370 sequences of the DeepDeform dataset release (training and validation sets) may be used for learning the parameters of the algorithms. The test data should be used strictly for reporting the final results; this benchmark is not meant for iterative testing sessions or parameter tweaking.

Parameter tuning is only allowed on the training and validation data. Evaluation on the test data via this evaluation server must be done only once, for the final system. It is not permitted to use the server to train systems, for example by trying out different parameter values and choosing the best. Only one version may be evaluated (the one that performed best on the training/validation data). This is to avoid overfitting on the test data. Results for different parameter settings of an algorithm can therefore only be reported on the training set. To help enforce this policy, we block updates to the test set results of a method for two weeks after a test set submission.

It is not permitted to register on this webpage with multiple e-mail addresses. We will ban users or domains if required.

Data

Download: If you would like to download the DeepDeform data, please fill out this Google form; once your request is accepted, we will send you the download link.

Tasks Data Requirements: For all tasks, both color and depth images can be used as input.

Evaluation Scripts: Additionally, we provide some helper scripts for loading, storing and visualizing flow images. The evaluation scripts used for each task are also provided.

Data Formats

RGB-D Data: 3D data is provided as RGB-D video sequences, where color and depth images are already aligned. Color images are provided as 8-bit RGB .jpg, and depth images as 16-bit .png (divide by 1000 to obtain depth in meters).
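
The depth scaling described above can be sketched as follows. This is a minimal illustration, assuming the 16-bit depth image has already been loaded into a NumPy array by an image library of your choice; the function name `depth_to_meters` is hypothetical and not part of the provided helper scripts.

```python
import numpy as np


def depth_to_meters(depth_raw):
    """Convert raw 16-bit depth values to meters by dividing by 1000."""
    depth_raw = np.asarray(depth_raw)
    # Cast to float first so the division does not truncate to integers.
    return depth_raw.astype(np.float32) / 1000.0


# Stand-in for a loaded 16-bit .png depth image (values are made up).
raw = np.array([[0, 1000, 2500]], dtype=np.uint16)
print(depth_to_meters(raw))
```

The result is a float32 array with the same shape as the input image.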

Camera Parameters: A 4x4 intrinsic matrix is given for every sequence (because different cameras were used for data capture, every sequence can have a different intrinsic matrix). Since the color and depth images are aligned, no extrinsic transformation is necessary.
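
As an illustration of how the intrinsic matrix and an aligned depth image fit together, the sketch below back-projects a depth map into camera-space 3D points. It assumes a standard pinhole model with the focal lengths and principal point stored in the usual positions of the upper-left 3x3 block (fx at [0, 0], fy at [1, 1], cx at [0, 2], cy at [1, 2]); that layout is an assumption and should be checked against the actual files. The function name `backproject` is hypothetical.

```python
import numpy as np


def backproject(depth_m, intrinsics):
    """Back-project a depth map in meters to an (H, W, 3) array of points.

    Assumes a pinhole camera whose parameters sit in the standard
    positions of the upper-left 3x3 block of the 4x4 matrix.
    """
    fx, fy = intrinsics[0, 0], intrinsics[1, 1]
    cx, cy = intrinsics[0, 2], intrinsics[1, 2]
    height, width = depth_m.shape
    # u indexes columns (x direction), v indexes rows (y direction).
    u, v = np.meshgrid(np.arange(width), np.arange(height))
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return np.stack([x, y, depth_m], axis=-1)


# Toy example with made-up intrinsics and a constant depth of 2 m.
K = np.array([[2.0, 0.0, 1.0, 0.0],
              [0.0, 2.0, 1.0, 0.0],
              [0.0, 0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0, 1.0]])
points = backproject(np.full((2, 2), 2.0), K)
print(points.shape)
```

Because color and depth are already aligned, the same pixel coordinates index both images, so no extrinsic transform appears in the computation.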

Optical Flow Data: Dense optical flow data is provided as a custom binary image of resolution 640x480 with extension .oflow. Every pixel contains two values, the flow in the x and y directions, in pixels. A helper script for loading binary flow images is provided here.

Scene Flow Data: Dense scene flow data is provided as a custom binary image of resolution 640x480 with extension .sflow. Every pixel contains three values, the flow in the x, y and z directions, in meters. A helper script for loading binary flow images is provided here.

Object Mask Data: A few frames per sequence also include a foreground dynamic object annotation. The mask is given as a 16-bit .png image (1 for object, 0 for background).

Sparse Match Annotations: We provide manual sparse match annotations for a few frame pairs for every sequence. They are stored in .json format, with paths to corresponding source and target RGB-D frames, as a list of source and target pixels.

Sparse Occlusion Annotations: We provide manual sparse occlusion annotations for a few frame pairs for every sequence. They are stored in .json format, with paths to corresponding source and target RGB-D frames, as a list of occluded source pixels.

Dataset Split

We provide a dataset split of 340 train, 30 validation and 30 test sequences. We made sure that there is no instance overlap between train, validation and test sequences. Labels for each subset are given as:

train_matches.json and val_matches.json:
Manually annotated sparse matches.
train_dense.json and val_dense.json:
Densely aligned optical and scene flow images, computed using the sparse matches as guidance.
train_selfsupervised.json and val_selfsupervised.json:
Densely aligned optical and scene flow images using self-supervision (DynamicFusion pipeline) for a few sequences.
train_masks.json and val_masks.json:
Dynamic object annotations for a few frames per sequence.

You can use all provided training and validation data in any combination you want.

Submission format

Results for a method must be uploaded as a single .zip or .7z file (.7z is preferred due to smaller file sizes). When extracted, the archive must contain the prediction files directly in its root. There must not be any additional files or folders in the archive except those specified below.
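
Before uploading, it can be useful to check the archive layout locally. The sketch below is one possible check for .zip archives only, using Python's standard `zipfile` module; a .7z archive would need a separate tool. The function name `find_archive_problems` is hypothetical, and this is not the check performed by the evaluation server.

```python
import io
import zipfile


def find_archive_problems(archive, allowed_suffix):
    """Return entry names that are nested in a folder or have the wrong extension."""
    problems = []
    with zipfile.ZipFile(archive) as zf:
        for name in zf.namelist():
            # A "/" means the entry is a folder or lives inside one.
            nested = "/" in name
            wrong_type = not name.endswith(allowed_suffix)
            if nested or wrong_type:
                problems.append(name)
    return problems


# Build a small in-memory archive to demonstrate the check.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("seq000_Adult_000000_000150.oflow", b"")
    zf.writestr("extra/readme.txt", b"")
buffer.seek(0)
print(find_archive_problems(buffer, ".oflow"))
```

An empty list means every entry sits in the archive root and carries the expected extension.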

Format for 2D Optical Flow Prediction

For the 2D optical flow prediction task, results must be provided in the custom binary format .oflow, as defined here. For each frame pair in test_frame_pairs.json, a dense optical flow has to be predicted and stored as SequenceID_ObjectID_SourceID_TargetID.oflow. For example, for a frame pair from sequence seq000 with object instance Adult, source frame 000000 and target frame 000150, the output file should be named seq000_Adult_000000_000150.oflow.

A submission must contain an optical flow prediction for each test frame pair, e.g.:

unzip_root/
|-- seq000_Adult_000000_000150.oflow
|-- seq000_Adult_000000_000300.oflow
|-- seq000_Adult_000000_000350.oflow
⋮
|-- seq029_adult_000900_001000.oflow
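
The naming convention above can be sketched as a small helper. The function names are hypothetical, and the IDs are passed as already formatted strings because the field layout of test_frame_pairs.json is not described here.

```python
def oflow_filename(sequence_id, object_id, source_id, target_id):
    """Compose SequenceID_ObjectID_SourceID_TargetID.oflow from string IDs."""
    return f"{sequence_id}_{object_id}_{source_id}_{target_id}.oflow"


def missing_predictions(expected_names, present_names):
    """Return the expected file names that are absent, in sorted order."""
    return sorted(set(expected_names) - set(present_names))


# Example using the frame pair from the text above.
name = oflow_filename("seq000", "Adult", "000000", "000150")
print(name)
print(missing_predictions([name], []))
```

Comparing the expected names against the contents of the output folder before archiving helps catch frame pairs that were skipped.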

Format for Non-rigid RGB-D Reconstruction

For the non-rigid RGB-D reconstruction task, results must be provided as a set of .ply pointclouds for every test sequence. In order to estimate deformation error, these pointclouds need to be consistent, i.e. the number of points and their order should not change along the sequence. However, since we want to evaluate non-rigid reconstruction algorithms even if they fail to reconstruct an entire sequence, we evaluate algorithm performance segment-wise. For every 100 frames, a consistent set of pointclouds needs to be provided, with one pointcloud for every 50th frame of the segment. Each pointcloud should be stored as SequenceID_SegmentID_FrameID.ply. Finally, a consistent pointcloud set should also be provided at the end of the sequence, again for every 50th frame. In order to know which dynamic object to reconstruct in each sequence, we provide canonical masks for every test sequence, i.e. an object mask image for the first frame of the sequence.

For instance, a submission should look like:

unzip_root/
|-- seq000_100_000000.ply   [-- CONSISTENT POINTS 0 --]
|-- seq000_100_000050.ply   [-- CONSISTENT POINTS 0 --]
|-- seq000_100_000100.ply   [-- CONSISTENT POINTS 0 --]
|-- seq000_200_000000.ply   [-- CONSISTENT POINTS 1 --]
|-- seq000_200_000050.ply   [-- CONSISTENT POINTS 1 --]
|-- seq000_200_000100.ply   [-- CONSISTENT POINTS 1 --]
|-- seq000_200_000150.ply   [-- CONSISTENT POINTS 1 --]
|-- seq000_200_000200.ply   [-- CONSISTENT POINTS 1 --]
⋮
|-- seq000_420_000000.ply
|-- seq000_420_000050.ply
⋮
|-- seq000_420_000350.ply
|-- seq000_420_000400.ply
|-- seq001_100_000000.ply
⋮
|-- seq029_1059_001050.ply
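
The listing above suggests a pattern that can be sketched in code: one segment per 100 frames plus a final segment at the end of the sequence, each covering every 50th frame from frame 0 up to the segment ID. This reading is inferred from the example listing rather than from an official specification, and the function name `expected_ply_names` and its parameters are hypothetical. The final segment ID (420 for seq000 in the example) is taken as an input, since how it is derived from the sequence is not stated here.

```python
def expected_ply_names(sequence_id, last_segment_id, segment_step=100, frame_step=50):
    """List .ply names following the pattern seen in the example listing.

    Assumption: each segment spans frames 0..segment_id in steps of
    frame_step, and the final segment uses last_segment_id as its ID.
    """
    segment_ids = list(range(segment_step, last_segment_id + 1, segment_step))
    # Add the end-of-sequence segment unless it coincides with a regular one.
    if not segment_ids or segment_ids[-1] != last_segment_id:
        segment_ids.append(last_segment_id)
    names = []
    for segment_id in segment_ids:
        for frame_id in range(0, segment_id + 1, frame_step):
            names.append(f"{sequence_id}_{segment_id}_{frame_id:06d}.ply")
    return names


names = expected_ply_names("seq000", 420)
print(names[0], names[-1])
```

Within one segment, all listed pointclouds must share the same number and order of points; the sketch only enumerates file names and does not verify that property.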