Submitted by Lichen Zhao.

Submission data

Full name: 3DJCG (Captioning) (VoteNet + Feature-Enhancement + Transformer-Based-Head)
Description: Joint training. We use the VoteNet backbone for detection.
Publication title: 3DJCG: A Unified Framework for Joint Dense Captioning and Visual Grounding on 3D Point Clouds
Publication authors: Daigang Cai, Lichen Zhao, Jing Zhang†, Lu Sheng, Dong Xu
Publication venue: CVPR 2022 Oral
Publication URL: https://openaccess.thecvf.com/content/CVPR2022/papers/Cai_3DJCG_A_Unified_Framework_for_Joint_Dense_Captioning_and_Visual_CVPR_2022_paper.pdf
Input data types: Uses XYZ coordinates, Uses Multiview Image Features, Uses Normal Vectors
Programming language(s): Python with CUDA
Hardware: GeForce RTX 2080 Ti, 11GB RAM
Source code or download URL: https://github.com/zlccccc/3DJCG
Submission creation date: 12 Sep, 2022
Last edited: 13 Sep, 2022

Captioning

Metric group         Metric          Score
Captioning F1-Score  CIDEr@0.5IoU    0.1918
Captioning F1-Score  BLEU-4@0.5IoU   0.1350
Captioning F1-Score  Rouge-L@0.5IoU  0.2207
Captioning F1-Score  METEOR@0.5IoU   0.1013
Dense Captioning     DCmAP           0.1506
Object Detection     mAP@0.5         0.3867