Submitted by Philipp Foth.

Submission data

Full name: ScanRefer with Transformer-based Object Detection
Description: ScanRefer method, with VoteNet replaced by the Group-Free Transformer-based 3D object detector.
Initialized from the Group-Free authors' provided weights for the 12-layer, double-width, 256-candidate model, and therefore uses only XYZ (without height) as input features.
Input Data Types: Uses XYZ coordinates
Programming language(s): Python
Hardware: GeForce RTX 2080 Ti, 11 GB RAM
Submission creation date: 8 Jul, 2021
Last edited: 8 Jul, 2021


Unique Unique Multiple Multiple Overall Overall