Submitted by Yang Jiao.

Submission data

Full name: MORE
Description: 3D dense captioning is a recently proposed task in which point clouds contain more geometric information than their 2D counterparts. However, it is also more challenging because of the higher complexity and wider variety of inter-object relations contained in point clouds. Existing methods treat such relations only as by-products of object feature learning in graphs, without specifically encoding them, which leads to sub-optimal results. In this paper, aiming to improve 3D dense captioning by capturing and utilizing the complex relations in the 3D scene, we propose MORE, a Multi-Order RElation mining model, to support generating more descriptive and comprehensive captions. Technically, MORE encodes object relations in a progressive manner, since complex relations can be deduced from a limited number of basic ones. We first devise a novel Spatial Layout Graph Convolution (SLGC), which semantically encodes several first-order relations as edges of a graph constructed over 3D object pro
Publication title: MORE: Multi-Order RElation Mining for Dense Captioning in 3D Scenes
Publication authors: Yang Jiao, Shaoxiang Chen, Zequn Jie, Jingjing Chen, Lin Ma, Yu-Gang Jiang
Publication venue: ECCV 2022
Publication URL
Input Data Types: Uses XYZ coordinates, Uses Normal Vectors
Programming language(s): PyTorch, CUDA
Source code or download URL
Submission creation date: 11 Sep, 2022
Last edited: 6 Oct, 2022


Captioning F1-Score Dense Captioning Object Detection