Full name | MORE |
Description | 3D dense captioning is a recently proposed task in which point clouds contain more geometric information than their 2D counterparts. However, it is also more challenging due to the higher complexity and wider variety of inter-object relations contained in point clouds. Existing methods treat such relations only as by-products of object feature learning in graphs, without specifically encoding them, which leads to sub-optimal results. In this paper, to improve 3D dense captioning by capturing and utilizing the complex relations in a 3D scene, we propose MORE, a Multi-Order RElation mining model, which supports generating more descriptive and comprehensive captions. Technically, MORE encodes object relations in a progressive manner, since complex relations can be deduced from a limited number of basic ones. We first devise a novel Spatial Layout Graph Convolution (SLGC), which semantically encodes several first-order relations as edges of a graph constructed over 3D object pro |
Publication title | MORE: Multi-Order RElation Mining for Dense Captioning in 3D Scenes |
Publication authors | Yang Jiao, Shaoxiang Chen, Zequn Jie, Jingjing Chen, Lin Ma, Yu-Gang Jiang |
Publication venue | ECCV 2022 |
Publication URL | https://arxiv.org/abs/2203.05203 |
Input Data Types | Uses XYZ coordinates, Uses Normal Vectors |
Programming language(s) | PyTorch, CUDA |
Hardware | RTX-2080 |
Website | https://github.com/SxJyJay/MORE |
Source code or download URL | https://github.com/SxJyJay/MORE |
Submission creation date | 11 Sep, 2022 |
Last edited | 6 Oct, 2022 |