Fusing RGB and depth images in a complementary manner while suppressing task-irrelevant noise is crucial for accurate indoor RGB-D semantic segmentation. In this paper, we propose a novel deep model that leverages dual-modal non-local context to guide the aggregation of complementary features and the suppression of noise at multiple stages. Specifically, we introduce a dual-modal non-local context encoding (DNCE) module that learns a global representation for each modality at each stage; these representations are then used to facilitate cross-modal complementary clue aggregation (CCA), after which the enhanced features of the two modalities are merged. In addition, we propose a semantic guided feature rectification (SGFR) module that exploits the rich semantic clues in the top-level merged features to suppress noise in the lower-stage merged features. Both the DNCE-CCA and SGFR modules provide the dual-modal global views that are essential for effective RGB-D fusion. Experimental results on two public indoor datasets, NYU Depth V2 and SUN-RGBD, demonstrate that the proposed method outperforms state-of-the-art models of similar complexity.
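
To make the stage-wise fusion flow concrete, the following PyTorch-style sketch shows one possible instantiation of a single-stage DNCE-CCA block and an SGFR-style gate. The class names mirror the modules above, but the specific gating design, layer choices, and interfaces are illustrative assumptions for exposition rather than the exact architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DNCE_CCA(nn.Module):
    """Hypothetical single-stage fusion block.

    DNCE part: summarize each modality into a global context vector.
    CCA part: re-weight each modality's features using the other modality's
    global context, then merge the enhanced features by element-wise addition.
    """

    def __init__(self, channels):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)  # global (non-local) summary per modality
        self.rgb_gate = nn.Sequential(nn.Linear(channels, channels), nn.Sigmoid())
        self.depth_gate = nn.Sequential(nn.Linear(channels, channels), nn.Sigmoid())

    def forward(self, rgb_feat, depth_feat):
        b, c, _, _ = rgb_feat.shape
        rgb_ctx = self.pool(rgb_feat).view(b, c)      # global RGB representation
        depth_ctx = self.pool(depth_feat).view(b, c)  # global depth representation
        # Cross-modal aggregation: each stream is gated by the other stream's context.
        rgb_enh = rgb_feat * self.depth_gate(depth_ctx).view(b, c, 1, 1)
        depth_enh = depth_feat * self.rgb_gate(rgb_ctx).view(b, c, 1, 1)
        return rgb_enh + depth_enh                    # merged dual-modal feature


class SGFR(nn.Module):
    """Hypothetical semantic guided feature rectification.

    Top-level merged features provide a semantic gate that suppresses noisy
    responses in a lower-stage merged feature map.
    """

    def __init__(self, low_channels, top_channels):
        super().__init__()
        self.gate = nn.Sequential(nn.Conv2d(top_channels, low_channels, 1), nn.Sigmoid())

    def forward(self, low_feat, top_feat):
        # Upsample top-level semantics to the lower stage's spatial resolution.
        top_up = F.interpolate(top_feat, size=low_feat.shape[2:],
                               mode='bilinear', align_corners=False)
        return low_feat * self.gate(top_up)           # rectified lower-stage feature


if __name__ == "__main__":
    fuse = DNCE_CCA(channels=64)
    merged_low = fuse(torch.randn(2, 64, 60, 80), torch.randn(2, 64, 60, 80))
    rectify = SGFR(low_channels=64, top_channels=256)
    rectified = rectify(merged_low, torch.randn(2, 256, 15, 20))
    print(rectified.shape)  # torch.Size([2, 64, 60, 80])
```

In this sketch, both the cross-modal gating and the semantic gate operate on globally pooled or top-level information, reflecting the dual-modal global views emphasized above; the actual modules may differ in how the non-local context is computed and injected.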