Submitted by Daniel Seichter.

Submission data

Full name: Efficient Multi-Task RGB-D Scene Analysis for Indoor Environments
Description: EMSANet is a lightweight neural network that enables real-time panoptic segmentation on an NVIDIA Jetson AGX Xavier. It comprises a fused RGB-D encoder with two lightweight ResNet34-NBt1D-based backbones, a decoder for semantic segmentation, and a decoder for class-agnostic instance segmentation. The results of both decoders are merged to derive a panoptic segmentation. Note: this model was trained after the publication listed below, for "PanopticNDT: Efficient and Robust Panoptic Mapping" (IROS 2023).
Weights are publicly available at:
Publication title: EMSANet: Efficient Multi-Task RGB-D Scene Analysis for Indoor Environments
Publication authors: Seichter, Daniel and Fischedick, Söhnke and Köhler, Mona and Gross, Horst-Michael
Publication venue: IJCNN 2022
Publication URL:
Input Data Types: Uses Color, Uses Geometry, Uses 2D
Programming language(s): PyTorch CUDA
Hardware: NVIDIA Jetson AGX Xavier/Orin
Source code or download URL:
Submission creation date: 5 Feb, 2023
Last edited: 23 Jan, 2024
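
The description states that the outputs of the semantic decoder and the class-agnostic instance decoder are merged into a panoptic segmentation. The sketch below illustrates one common way such a merge can be done and is not taken from the EMSANet code: each instance takes the majority semantic class of its pixels, and the panoptic id is encoded as class_id * label_divisor + instance_id. The function name `merge_panoptic`, the `thing_classes` set, the `label_divisor` value, and the convention that instance id 0 means "no instance" are all assumptions made for illustration.

```python
from collections import Counter


def merge_panoptic(semantic, instance, thing_classes, label_divisor=1000):
    """Merge per-pixel semantic labels with class-agnostic instance ids.

    semantic: 2D list of class ids.
    instance: 2D list of instance ids, where 0 means "no instance" (assumption).
    Returns a 2D list of panoptic ids: class_id * label_divisor + instance_id.
    """
    # Count semantic votes per instance, using only pixels labeled as a thing class.
    votes = {}
    for sem_row, inst_row in zip(semantic, instance):
        for cls, inst in zip(sem_row, inst_row):
            if inst != 0 and cls in thing_classes:
                votes.setdefault(inst, Counter())[cls] += 1

    # Each instance gets the class that most of its pixels voted for.
    instance_class = {inst: c.most_common(1)[0][0] for inst, c in votes.items()}

    panoptic = []
    for sem_row, inst_row in zip(semantic, instance):
        out_row = []
        for cls, inst in zip(sem_row, inst_row):
            if cls in thing_classes and inst in instance_class:
                # Thing pixel: majority class of its instance plus the instance id.
                out_row.append(instance_class[inst] * label_divisor + inst)
            else:
                # Stuff pixel (or thing pixel without an instance): class only.
                out_row.append(cls * label_divisor)
        panoptic.append(out_row)
    return panoptic


# Toy 3x3 example with hypothetical class ids: 0 = floor (stuff), 1 = chair, 2 = table.
semantic = [[0, 1, 1],
            [0, 1, 2],
            [0, 2, 2]]
instance = [[0, 1, 1],
            [0, 1, 1],
            [0, 2, 2]]
panoptic = merge_panoptic(semantic, instance, thing_classes={1, 2})
print(panoptic)
```

In the toy example, one pixel of instance 1 is semantically labeled as table, but the majority vote assigns the whole instance to chair, so the merged result stays consistent within each instance.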

2D semantic label results

Info | avg iou | bathtub | bed | bookshelf | cabinet | chair | counter | curtain | desk | door | floor | otherfurniture | picture | refrigerator | shower curtain | sink | sofa | table | toilet | wall | window