Submitted by Daniel Seichter.

Submission data

Full name: Efficient Multi-Task RGB-D Scene Analysis for Indoor Environments
Description: EMSANet is a lightweight neural network that enables real-time panoptic segmentation on an NVIDIA Jetson AGX Xavier. It comprises a fused RGB-D encoder with two lightweight ResNet34-NBt1D-based backbones, a decoder for semantic segmentation, and a decoder for class-agnostic instance segmentation. The results of both decoders are merged to derive a panoptic segmentation. Note that this model was trained after the publication listed below, for "PanopticNDT: Efficient and Robust Panoptic Mapping" (IROS 2023). Weights are publicly available at: https://github.com/TUI-NICR/panoptic-mapping
Publication title: EMSANet: Efficient Multi-Task RGB-D Scene Analysis for Indoor Environments
Publication authors: Seichter, Daniel and Fischedick, Söhnke and Köhler, Mona and Gross, Horst-Michael
Publication venue: IJCNN 2022
Publication URL: https://arxiv.org/abs/2207.04526
Input Data Types: Uses Color, Uses Geometry, Uses 2D
Programming language(s): PyTorch, CUDA
Hardware: NVIDIA Jetson AGX Xavier/Orin
Source code or download URL: https://github.com/TUI-NICR/EMSANet
Submission creation date: 5 Feb, 2023
Last edited: 23 Jan, 2024
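
The description states that the semantic and class-agnostic instance results are merged into a panoptic segmentation. The following is a minimal sketch of one common way to do such a merge, not the EMSANet implementation: each instance takes the majority semantic class among its pixels, and pixels outside any instance keep their semantic class. The function name `merge_panoptic`, the `MAX_INSTANCES` encoding, the convention that instance id 0 means "no instance", and the plain nested-list inputs are all assumptions made for illustration; real code would operate on tensors.

```python
from collections import Counter

# Assumed encoding (hypothetical): panoptic_id = class_id * MAX_INSTANCES + instance_id
MAX_INSTANCES = 1000


def merge_panoptic(semantic, instance, thing_classes):
    """Merge a semantic map and a class-agnostic instance map into panoptic ids.

    semantic: 2D list of class ids.
    instance: 2D list of instance ids, where 0 means "no instance" (assumption).
    thing_classes: set of class ids that can form instances.
    """
    # Collect, per instance, how often each "thing" class is predicted inside it.
    votes = {}
    for sem_row, ins_row in zip(semantic, instance):
        for cls, ins in zip(sem_row, ins_row):
            if ins != 0 and cls in thing_classes:
                votes.setdefault(ins, Counter())[cls] += 1

    # Each instance gets the class that wins the majority vote.
    instance_class = {ins: c.most_common(1)[0][0] for ins, c in votes.items()}

    panoptic = []
    for sem_row, ins_row in zip(semantic, instance):
        row = []
        for cls, ins in zip(sem_row, ins_row):
            if ins in instance_class:
                # Pixel belongs to a valid instance: class from the vote.
                row.append(instance_class[ins] * MAX_INSTANCES + ins)
            else:
                # No usable instance: keep the semantic class, instance id 0.
                row.append(cls * MAX_INSTANCES)
        panoptic.append(row)
    return panoptic


semantic = [[1, 1, 2], [1, 5, 2]]
instance = [[1, 1, 0], [1, 1, 0]]
print(merge_panoptic(semantic, instance, thing_classes={1, 5}))
# [[1001, 1001, 2000], [1001, 1001, 2000]]
```

In this toy input, the single pixel predicted as class 5 inside instance 1 is outvoted by the three class-1 pixels, so the whole instance is labeled with class 1.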

2D semantic label results

Class            IoU
avg iou          0.600
bathtub          0.716
bed              0.746
bookshelf        0.395
cabinet          0.614
chair            0.382
counter          0.523
curtain          0.713
desk             0.571
door             0.503
floor            0.922
otherfurniture   0.404
picture          0.397
refrigerator     0.655
shower curtain   0.400
sink             0.626
sofa             0.663
table            0.469
toilet           0.900
wall             0.827
window           0.577
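
As a quick consistency check on the results above, the snippet below recomputes the average from the 20 per-class values. It assumes that "avg iou" is the unweighted mean over the classes, which the reported numbers agree with; the variable names are illustrative only.

```python
# Per-class IoU values copied from the results above.
class_iou = {
    "bathtub": 0.716, "bed": 0.746, "bookshelf": 0.395, "cabinet": 0.614,
    "chair": 0.382, "counter": 0.523, "curtain": 0.713, "desk": 0.571,
    "door": 0.503, "floor": 0.922, "otherfurniture": 0.404, "picture": 0.397,
    "refrigerator": 0.655, "shower curtain": 0.400, "sink": 0.626,
    "sofa": 0.663, "table": 0.469, "toilet": 0.900, "wall": 0.827,
    "window": 0.577,
}

# Unweighted mean over all classes (assumed definition of "avg iou").
mean_iou = sum(class_iou.values()) / len(class_iou)
print(f"{mean_iou:.3f}")  # 0.600
```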