Submitted by Daniel Seichter.

Submission data

Full name: Efficient Multi-Task RGB-D Scene Analysis for Indoor Environments
Description: EMSANet is a lightweight neural network that enables real-time panoptic segmentation on an NVIDIA Jetson AGX Xavier. It comprises a fused RGB-D encoder with two lightweight ResNet34-NBt1D-based backbones, a decoder for semantic segmentation, and a decoder for class-agnostic instance segmentation. The results of both decoders are merged to derive a panoptic segmentation. Note that this model was trained after the cited publication, for "PanopticNDT: Efficient and Robust Panoptic Mapping" (IROS 2023).
Weights are publicly available at: https://github.com/TUI-NICR/panoptic-mapping
Publication title: EMSANet: Efficient Multi-Task RGB-D Scene Analysis for Indoor Environments
Publication authors: Seichter, Daniel and Fischedick, Söhnke and Köhler, Mona and Gross, Horst-Michael
Publication venue: IJCNN 2022
Publication URL: https://arxiv.org/abs/2207.04526
Input Data Types: Uses Color, Uses Geometry, Uses 2D
Programming language(s): PyTorch CUDA
Hardware: Nvidia TitanRTX
Source code or download URL: https://github.com/TUI-NICR/EMSANet
Submission creation date: 22 Feb, 2023
Last edited: 23 Jan, 2024
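
The description states that the outputs of the semantic decoder and the class-agnostic instance decoder are merged into a panoptic segmentation. The following is a minimal sketch of one common way to do such a merge (majority vote of the semantic class within each instance), not necessarily the exact procedure used by EMSANet. The function name `merge_panoptic`, the `LABEL_DIVISOR` encoding, and the toy class ids are all assumptions made for illustration.

```python
from collections import Counter

# Assumed encoding (not taken from the submission):
# panoptic_id = class_id * LABEL_DIVISOR + instance_id
LABEL_DIVISOR = 1000


def merge_panoptic(semantic, instance, thing_classes, label_divisor=LABEL_DIVISOR):
    """Merge a semantic map and a class-agnostic instance map.

    Both inputs are 2D lists of ints with the same shape. Instance id 0
    means "no instance". Each instance receives the majority semantic
    class among its pixels (counting only thing classes); all other
    pixels keep their semantic class with instance id 0.
    """
    # Collect semantic votes for every instance id.
    votes = {}
    for sem_row, inst_row in zip(semantic, instance):
        for sem, inst in zip(sem_row, inst_row):
            if inst != 0 and sem in thing_classes:
                votes.setdefault(inst, Counter())[sem] += 1

    # Majority vote: one semantic class per instance.
    instance_class = {inst: c.most_common(1)[0][0] for inst, c in votes.items()}

    # Encode every pixel as a single panoptic id.
    panoptic = []
    for sem_row, inst_row in zip(semantic, instance):
        out_row = []
        for sem, inst in zip(sem_row, inst_row):
            if inst in instance_class:
                out_row.append(instance_class[inst] * label_divisor + inst)
            else:
                out_row.append(sem * label_divisor)
        panoptic.append(out_row)
    return panoptic


# Toy example with hypothetical class ids: 0 = wall (stuff), 1 = chair, 2 = table.
semantic = [
    [0, 1, 1, 2],
    [0, 1, 2, 2],
]
instance = [
    [0, 1, 1, 2],
    [0, 1, 1, 2],
]
panoptic = merge_panoptic(semantic, instance, thing_classes={1, 2})
for row in panoptic:
    print(row)
```

In this toy input, one pixel of instance 1 is labeled "table" by the semantic map but is outvoted by the three "chair" pixels, so the whole instance is assigned the chair class.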

2D semantic instance results



Class             AP
avg ap            0.241
bathtub           0.401
bed               0.439
bookshelf         0.085
cabinet           0.242
chair             0.220
counter           0.081
curtain           0.289
desk              0.117
door              0.121
otherfurniture    0.182
picture           0.126
refrigerator      0.346
shower curtain    0.181
sink              0.181
sofa              0.358
table             0.156
toilet            0.675
window            0.131
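
As a quick sanity check, the reported avg ap appears to match the unweighted mean of the 18 per-class scores listed above. The short sketch below recomputes it from the rounded values; the dictionary name `per_class_ap` is only illustrative, and the benchmark itself presumably averages unrounded scores.

```python
# Per-class AP values copied from the results listed above.
per_class_ap = {
    "bathtub": 0.401, "bed": 0.439, "bookshelf": 0.085, "cabinet": 0.242,
    "chair": 0.220, "counter": 0.081, "curtain": 0.289, "desk": 0.117,
    "door": 0.121, "otherfurniture": 0.182, "picture": 0.126,
    "refrigerator": 0.346, "shower curtain": 0.181, "sink": 0.181,
    "sofa": 0.358, "table": 0.156, "toilet": 0.675, "window": 0.131,
}

# Unweighted mean over all classes.
mean_ap = sum(per_class_ap.values()) / len(per_class_ap)
print(f"{mean_ap:.3f}")
```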