Enhanced Multi-Scale Attention-Driven 3D Human Reconstruction from Single Image

Yong Ren, Mingquan Zhou, Pengbo Zhou, Shibo Wang, Yangyang Liu, Guohua Geng, Kang Li, Xin Cao

doi:10.3390/electronics13214264

Abstract

Reconstructing a 3D human mesh from a single image has long been a challenging task because a single viewpoint is inherently limited. While deep learning networks can approximate the shape of unseen sides, recovering the texture details of the non-visible side from just one image remains difficult. Traditional methods use Generative Adversarial Networks (GANs) to predict the normal maps of the non-visible side and, from them, infer detailed textures and wrinkles on the model’s surface. However, we have identified weaknesses in existing normal prediction networks when dealing with complex scenes, such as a lack of focus on local features and insufficient modeling of spatial relationships. To address these challenges, we introduce EMAR (Enhanced Multi-scale Attention-Driven Single-Image 3D Human Reconstruction). This approach incorporates a novel Enhanced Multi-Scale Attention (EMSA) mechanism that captures intricate features and global relationships in complex scenes. EMSA improves on traditional single-scale attention mechanisms by adaptively adjusting the weights between features, enabling the network to leverage information across scales more effectively. Furthermore, we have improved the feature fusion method to better integrate representations from different scales, which allows the network to understand both fine details and global structures within the image more comprehensively. Finally, we have designed a hybrid loss function tailored to the introduced attention mechanism and feature fusion method, which optimizes the network’s training process and improves the quality of the reconstruction results. Our network demonstrates significant performance improvements in 3D human model reconstruction. Experimental results show that our method is more robust to challenging poses than traditional single-scale approaches.

Journal: Electronics
Publication Date: Oct 30, 2024
License type: CC BY 4.0

Similar Papers

Advanced Generative Adversarial Network for Image Superresolution
Mei Jia ... Mingde Lu
01 Jan 2021

Conv-Swinformer: Integration of CNN and shift window attention for Alzheimer’s disease classification
Zhentao Hu ... Wei Hou
Computers in Biology and Medicine | VOL. 164
31 Jul 2023

Weakly Supervised Local-Global Attention Network for Facial Expression Recognition
Haifeng Zhang ... Wen Su
IEEE Access | VOL. 8
01 Jan 2020

A generative adversarial network to Reinhard stain normalization for histopathology image analysis
Afnan M Alhassan
Ain Shams Engineering Journal | VOL. 15
14 Jul 2024
