Submitted by Dave Zhenyu Chen.

Submission data

Full name: PointGroup + GRU
Publication title: D3Net: A Unified Speaker-Listener Architecture for 3D Dense Captioning and Visual Grounding
Publication authors: Dave Zhenyu Chen, Qirui Wu, Matthias Niessner, Angel X. Chang
Publication venue: 17th European Conference on Computer Vision (ECCV), 2022
Publication URL: https://arxiv.org/abs/2112.01551
Input data types: Uses XYZ coordinates, Uses Multiview Image Features, Uses Normal Vectors
Programming language(s): Python
Hardware: RTX 3090
Website: https://daveredrum.github.io/D3Net/
Source code or download URL: https://github.com/daveredrum/D3Net
Submission creation date: 25 Aug, 2022
Last edited: 25 Aug, 2022

Captioning

Group                Metric           Value
Captioning F1-Score  CIDEr@0.5IoU     0.2088
Captioning F1-Score  BLEU-4@0.5IoU    0.1335
Captioning F1-Score  Rouge-L@0.5IoU   0.2237
Captioning F1-Score  METEOR@0.5IoU    0.1022
Dense Captioning     DCmAP            0.1481
Object Detection     mAP@0.5          0.4198
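
The `@0.5IoU` suffix on the captioning metrics conventionally means that a predicted object only counts when its bounding box overlaps a ground-truth box with an intersection-over-union (IoU) of at least 0.5. The sketch below illustrates that threshold check for axis-aligned 3D boxes. It is a minimal illustration, not the benchmark's evaluation code: the box layout `(xmin, ymin, zmin, xmax, ymax, zmax)` and the function names `box_volume`, `box_iou_3d`, and `passes_iou_threshold` are assumptions made for this example.

```python
def box_volume(box):
    """Volume of an axis-aligned box (xmin, ymin, zmin, xmax, ymax, zmax).

    The six-value layout is an assumption for this sketch.
    """
    xmin, ymin, zmin, xmax, ymax, zmax = box
    return max(0.0, xmax - xmin) * max(0.0, ymax - ymin) * max(0.0, zmax - zmin)


def box_iou_3d(a, b):
    """Intersection-over-union of two axis-aligned 3D boxes."""
    # Overlap extent along each axis, clamped at zero when the boxes are disjoint.
    dx = max(0.0, min(a[3], b[3]) - max(a[0], b[0]))
    dy = max(0.0, min(a[4], b[4]) - max(a[1], b[1]))
    dz = max(0.0, min(a[5], b[5]) - max(a[2], b[2]))
    intersection = dx * dy * dz
    union = box_volume(a) + box_volume(b) - intersection
    return intersection / union if union > 0 else 0.0


def passes_iou_threshold(predicted, ground_truth, threshold=0.5):
    """True when the predicted box matches the ground truth at the given IoU."""
    return box_iou_3d(predicted, ground_truth) >= threshold


if __name__ == "__main__":
    ground_truth = (0.0, 0.0, 0.0, 2.0, 2.0, 2.0)
    shifted = (1.0, 0.0, 0.0, 3.0, 2.0, 2.0)  # overlaps half of the ground truth
    print(box_iou_3d(ground_truth, shifted))
    print(passes_iou_threshold(ground_truth, shifted))
```

In this example the shifted box shares half its volume with the ground truth, yet its IoU falls below 0.5, so a caption attached to it would not be credited under a 0.5 threshold.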