Result details - ScanRefer Benchmark

Submitted by Yang Jiao.

Submission data

Full name	MORE
Description	3D dense captioning is a recently-proposed novel task, where point clouds contain more geometric information than the 2D counterpart. However, it is also more challenging due to the higher complexity and wider variety of inter-object relations contained in point clouds. Existing methods only treat such relations as by-products of object feature learning in graphs without specifically encoding them, which leads to sub-optimal results. In this paper, aiming at improving 3D dense captioning via capturing and utilizing the complex relations in the 3D scene, we propose MORE, a Multi-Order RElation mining model, to support generating more descriptive and comprehensive captions. Technically, our MORE encodes object relations in a progressive manner since complex relations can be deduced from a limited number of basic ones. We first devise a novel Spatial Layout Graph Convolution (SLGC), which semantically encodes several first-order relations as edges of a graph constructed over 3D object pro
Publication title	MORE: Multi-Order RElation Mining for Dense Captioning in 3D Scenes
Publication authors	Yang Jiao, Shaoxiang Chen, Zequn Jie, Jingjing Chen, Lin Ma, Yu-Gang Jiang
Publication venue	ECCV 2022
Publication URL	https://arxiv.org/abs/2203.05203
Input Data Types	Uses XYZ coordinates, Uses Normal Vectors
Programming language(s)	PyTorch, CUDA
Hardware	RTX-2080
Website	https://github.com/SxJyJay/MORE
Source code or download URL	https://github.com/SxJyJay/MORE
Submission creation date	11 Sep, 2022
Last edited	6 Oct, 2022

Captioning

Captioning F1-Score				Dense Captioning	Object Detection
CIDEr@0.5IoU	BLEU-4@0.5IoU	Rouge-L@0.5IoU	METEOR@0.5IoU	DCmAP	mAP@0.5
0.1239	0.0796	0.1362	0.0631	0.1116	0.2648
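
The captioning and detection scores above are all reported at a 0.5 IoU threshold, meaning a predicted box only counts when it overlaps a ground-truth box by at least that much. The sketch below illustrates that matching criterion. It is not the benchmark's evaluation code: it assumes axis-aligned boxes given as (xmin, ymin, zmin, xmax, ymax, zmax), and the function names box_iou_3d and is_match are hypothetical.

```python
def box_iou_3d(box_a, box_b):
    """IoU of two axis-aligned 3D boxes.

    Assumed box layout (not taken from the benchmark):
    (xmin, ymin, zmin, xmax, ymax, zmax).
    """
    intersection = 1.0
    for axis in range(3):
        lo = max(box_a[axis], box_b[axis])
        hi = min(box_a[axis + 3], box_b[axis + 3])
        if hi <= lo:
            # No overlap along this axis, so the boxes are disjoint.
            return 0.0
        intersection *= hi - lo

    def volume(box):
        return (box[3] - box[0]) * (box[4] - box[1]) * (box[5] - box[2])

    union = volume(box_a) + volume(box_b) - intersection
    return intersection / union


def is_match(pred_box, gt_box, threshold=0.5):
    """True when the predicted box reaches the IoU threshold."""
    return box_iou_3d(pred_box, gt_box) >= threshold


if __name__ == "__main__":
    gt = (0.0, 0.0, 0.0, 2.0, 2.0, 2.0)
    pred = (0.0, 0.0, 0.0, 2.0, 2.0, 1.0)  # covers the lower half of gt
    print(box_iou_3d(pred, gt), is_match(pred, gt))
```

Under this criterion, a caption attached to a box that falls below the threshold would not contribute to the @0.5IoU scores, however good the text is.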