Submitted by Daniel Seichter.

Submission data

Full name: Efficient Multi-Task RGB-D Scene Analysis for Indoor Environments
Description: EMSANet is a lightweight neural network that enables real-time panoptic segmentation on an NVIDIA Jetson AGX Xavier. It comprises a fused RGB-D encoder with two lightweight ResNet34-NBt1D-based backbones, a decoder for semantic segmentation, and a decoder for class-agnostic instance segmentation. The results of both decoders are merged to derive a panoptic segmentation. Note that this model was trained after the publication listed below, for "PanopticNDT: Efficient and Robust Panoptic Mapping" (IROS 2023). Weights are publicly available at: https://github.com/TUI-NICR/panoptic-mapping
Publication title: EMSANet: Efficient Multi-Task RGB-D Scene Analysis for Indoor Environments
Publication authors: Seichter, Daniel and Fischedick, Söhnke and Köhler, Mona and Gross, Horst-Michael
Publication venue: IJCNN 2022
Publication URL: https://arxiv.org/abs/2207.04526
Input Data Types: Uses Color, Uses Geometry, Uses 2D
Programming language(s): PyTorch, CUDA
Hardware: NVIDIA Titan RTX
Source code or download URL: https://github.com/TUI-NICR/EMSANet
Submission creation date: 22 Feb, 2023
Last edited: 23 Jan, 2024
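
The description above says that the outputs of the semantic decoder and the class-agnostic instance decoder are merged into a panoptic segmentation. A minimal sketch of one common way to do such a merge (a per-instance majority vote over the semantic labels) is shown below. It is an illustration only, not the EMSANet code: the function name `merge_panoptic`, the nested-list inputs, and the convention that instance id 0 means "no instance" are all assumptions made for this example.

```python
from collections import Counter


def merge_panoptic(semantic, instance, thing_classes):
    """Merge semantic labels and class-agnostic instance ids (hypothetical helper).

    semantic:      2D list of per-pixel class ids
    instance:      2D list of per-pixel instance ids, 0 = no instance (assumed)
    thing_classes: set of class ids treated as countable objects
    Returns a 2D list of (class_id, instance_id) tuples; instance_id 0 marks
    pixels that are not assigned to any instance.
    """
    # Collect semantic votes per instance, counting only thing-class pixels.
    votes = {}
    for sem_row, inst_row in zip(semantic, instance):
        for cls, inst in zip(sem_row, inst_row):
            if inst != 0 and cls in thing_classes:
                votes.setdefault(inst, Counter())[cls] += 1

    # Each instance takes the class that most of its pixels voted for.
    majority = {inst: c.most_common(1)[0][0] for inst, c in votes.items()}

    panoptic = []
    for sem_row, inst_row in zip(semantic, instance):
        row = []
        for cls, inst in zip(sem_row, inst_row):
            if cls in thing_classes and inst in majority:
                row.append((majority[inst], inst))
            else:
                # Stuff pixels, and thing pixels without an instance, keep
                # their semantic class and get instance id 0.
                row.append((cls, 0))
        panoptic.append(row)
    return panoptic


# Toy input: classes 1 and 3 are things, class 2 is stuff.
semantic = [[1, 1, 2],
            [1, 3, 2]]
instance = [[1, 1, 0],
            [1, 1, 0]]
result = merge_panoptic(semantic, instance, thing_classes={1, 3})
print(result)
```

In the toy input, the single pixel labeled class 3 belongs to instance 1, whose other pixels are class 1, so the vote relabels it to class 1. This keeps the class consistent within each instance.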

2D semantic instance results



Class            AP 50%
avg              0.380
bathtub          0.549
bed              0.651
bookshelf        0.147
cabinet          0.397
chair            0.399
counter          0.167
curtain          0.437
desk             0.319
door             0.210
otherfurniture   0.301
picture          0.235
refrigerator     0.463
shower curtain   0.245
sink             0.372
sofa             0.511
table            0.296
toilet           0.876
window           0.268
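
As a quick consistency check on the results above, the reported average can be recomputed from the per-class values. The snippet below assumes that "avg ap 50%" is the unweighted mean over the 18 listed classes; the variable names are illustrative.

```python
# Per-class AP at 50% IoU, copied from the results above.
per_class_ap50 = {
    "bathtub": 0.549, "bed": 0.651, "bookshelf": 0.147, "cabinet": 0.397,
    "chair": 0.399, "counter": 0.167, "curtain": 0.437, "desk": 0.319,
    "door": 0.210, "otherfurniture": 0.301, "picture": 0.235,
    "refrigerator": 0.463, "shower curtain": 0.245, "sink": 0.372,
    "sofa": 0.511, "table": 0.296, "toilet": 0.876, "window": 0.268,
}

# Unweighted mean over all classes (assumed averaging scheme).
mean_ap50 = sum(per_class_ap50.values()) / len(per_class_ap50)
print(f"{mean_ap50:.3f}")  # prints 0.380
```

The recomputed value matches the reported average of 0.380, which is consistent with an unweighted per-class mean.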