Result details - ScanRefer Benchmark

Full name	PointGroup + Transformer (w/o end-to-end fine-tuning)
Description	Pretrained version of D3Net: PointGroup + Transformer, without end-to-end fine-tuning
Publication title	D3Net: A Unified Speaker-Listener Architecture for 3D Dense Captioning and Visual Grounding
Publication authors	Dave Zhenyu Chen, Qirui Wu, Matthias Niessner, Angel X. Chang
Publication venue	17th European Conference on Computer Vision (ECCV), 2022
Publication URL	https://arxiv.org/abs/2112.01551
Input Data Types	Uses XYZ coordinates, Uses Multiview Image Features, Uses Normal Vectors
Programming language(s)	Python
Hardware	RTX 3090 Ti
Website	https://daveredrum.github.io/D3Net/
Source code or download URL	https://github.com/daveredrum/D3Net
Submission creation date	27 Oct, 2021
Last edited	23 Jul, 2022

Unique acc@0.25IoU	Unique acc@0.5IoU	Multiple acc@0.25IoU	Multiple acc@0.5IoU	Overall acc@0.25IoU	Overall acc@0.5IoU
0.7659	0.6579	0.3619	0.2726	0.4525	0.3590
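
The acc@0.25IoU and acc@0.5IoU columns report the fraction of referring expressions for which the predicted 3D box overlaps the ground-truth box by at least the given IoU threshold. The following is a minimal sketch of that metric, not the benchmark's own evaluation code. It assumes axis-aligned boxes given as (xmin, ymin, zmin, xmax, ymax, zmax) and a ">=" comparison against the threshold; the names `box_iou_3d` and `acc_at_iou` are hypothetical.

```python
from typing import Sequence, Tuple

# Assumed box layout: (xmin, ymin, zmin, xmax, ymax, zmax), axis-aligned.
Box = Tuple[float, float, float, float, float, float]


def box_volume(box: Box) -> float:
    """Volume of an axis-aligned box."""
    return (box[3] - box[0]) * (box[4] - box[1]) * (box[5] - box[2])


def box_iou_3d(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned 3D boxes."""
    intersection = 1.0
    for axis in range(3):
        lower = max(a[axis], b[axis])
        upper = min(a[axis + 3], b[axis + 3])
        if upper <= lower:
            # No overlap along this axis, so the boxes do not intersect.
            return 0.0
        intersection *= upper - lower
    union = box_volume(a) + box_volume(b) - intersection
    return intersection / union if union > 0 else 0.0


def acc_at_iou(predictions: Sequence[Box],
               ground_truths: Sequence[Box],
               threshold: float) -> float:
    """Fraction of predictions whose IoU with the paired ground truth
    reaches the threshold (assumption: '>=' rather than '>')."""
    if len(predictions) != len(ground_truths):
        raise ValueError("predictions and ground_truths must have equal length")
    if not predictions:
        return 0.0
    hits = sum(
        box_iou_3d(pred, gt) >= threshold
        for pred, gt in zip(predictions, ground_truths)
    )
    return hits / len(predictions)


# Toy data (not benchmark data): one partial overlap, one exact match, one miss.
example_predictions = [(0, 0, 0, 2, 2, 2), (0, 0, 0, 1, 1, 1), (0, 0, 0, 1, 1, 1)]
example_ground_truths = [(1, 0, 0, 3, 2, 2), (0, 0, 0, 1, 1, 1), (5, 5, 5, 6, 6, 6)]
acc_025 = acc_at_iou(example_predictions, example_ground_truths, 0.25)
acc_050 = acc_at_iou(example_predictions, example_ground_truths, 0.5)
```

Under these assumptions, a value such as 0.4525 in the Overall acc@0.25IoU column would mean that about 45% of the evaluated expressions were localized with an IoU of at least 0.25.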

Results for D3Net - Pretrained