Submitted by Dave Zhenyu Chen.

Submission data

Full name: PointGroup + Transformer (w/o end-to-end fine-tuning)
Description: This is the pretrained version of D3Net - PointGroup + Transformer (w/o end-to-end fine-tuning)
Publication title: D3Net: A Unified Speaker-Listener Architecture for 3D Dense Captioning and Visual Grounding
Publication authors: Dave Zhenyu Chen, Qirui Wu, Matthias Niessner, Angel X. Chang
Publication venue: 17th European Conference on Computer Vision (ECCV), 2022
Publication URL: https://arxiv.org/abs/2112.01551
Input Data Types: Uses XYZ coordinates, Uses Multiview Image Features, Uses Normal Vectors
Programming language(s): Python
Hardware: RTX 3090Ti
Website: https://daveredrum.github.io/D3Net/
Source code or download URL: https://github.com/daveredrum/D3Net
Submission creation date: 27 Oct, 2021
Last edited: 23 Jul, 2022

Localization

Subset      acc@0.25IoU   acc@0.5IoU
Unique      0.7659        0.6579
Multiple    0.3619        0.2726
Overall     0.4525        0.3590
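
For readers unfamiliar with the acc@0.25IoU and acc@0.5IoU columns, the following is a minimal sketch of how an accuracy-at-IoU-threshold score can be computed for 3D boxes. It is not the benchmark's evaluation code. It assumes axis-aligned boxes stored as (xmin, ymin, zmin, xmax, ymax, zmax), one predicted box per query, and that a prediction counts as correct when its IoU with the ground-truth box is at least the threshold. The names `box_iou_3d` and `acc_at_iou` are hypothetical.

```python
from typing import Sequence

# Assumed box layout (hypothetical): (xmin, ymin, zmin, xmax, ymax, zmax)
Box = Sequence[float]


def box_iou_3d(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned 3D boxes."""
    inter = 1.0
    for axis in range(3):
        lo = max(a[axis], b[axis])
        hi = min(a[axis + 3], b[axis + 3])
        if hi <= lo:
            # No overlap along this axis, so the boxes do not intersect.
            return 0.0
        inter *= hi - lo
    vol_a = (a[3] - a[0]) * (a[4] - a[1]) * (a[5] - a[2])
    vol_b = (b[3] - b[0]) * (b[4] - b[1]) * (b[5] - b[2])
    return inter / (vol_a + vol_b - inter)


def acc_at_iou(preds: Sequence[Box], gts: Sequence[Box], threshold: float) -> float:
    """Fraction of predictions whose IoU with the paired ground truth is >= threshold."""
    if len(preds) != len(gts):
        raise ValueError("preds and gts must have the same length")
    if not preds:
        return 0.0
    hits = sum(box_iou_3d(p, g) >= threshold for p, g in zip(preds, gts))
    return hits / len(preds)


# Toy data, not taken from the benchmark: the three IoUs are 1.0, 1/3, and 0.0.
preds = [(0, 0, 0, 1, 1, 1), (0, 0, 0, 1, 1, 1), (0, 0, 0, 1, 1, 1)]
gts = [(0, 0, 0, 1, 1, 1), (0, 0, 0, 1, 1, 3), (5, 5, 5, 6, 6, 6)]
print(acc_at_iou(preds, gts, 0.25))
print(acc_at_iou(preds, gts, 0.5))
```

Under these assumptions, the stricter 0.5 threshold can never yield a higher score than the 0.25 threshold on the same predictions, which matches the pattern in each row of the results above.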