Submitted by Sijin Chen.

Submission data

Full name: Vote2Cap-DETR++
Description: Decoupled feature extraction and task decoding for 3D dense captioning.

Set-to-set training, followed by fine-tuning with SCST (CIDEr reward).
Publication title: Vote2Cap-DETR++: Decoupling Localization and Describing for End-to-End 3D Dense Captioning
Publication authors: Sijin Chen, Hongyuan Zhu, Mingsheng Li, Xin Chen, Peng Guo, Yinjie Lei, Gang Yu, Taihao Li, Tao Chen
Publication URL:
Input Data Types: Uses XYZ coordinates, Uses RGB values, Uses Normal Vectors
Programming language(s): Python
Source code or download URL:
Submission creation date: 16 Feb, 2024
Last edited: 19 Feb, 2024

