Hybrid Deep Learning for 3D Reconstruction of Multi-Mineral Porous Media: Integrating U-Net and GAN for Enhanced Segmentation and Texture Preservation
- Research Article
- 10.1007/s11548-025-03447-5
- Jun 19, 2025
- International Journal of Computer Assisted Radiology and Surgery
Neural radiance fields (NeRF) offer exceptional capabilities for 3D reconstruction and view synthesis, yet their reliance on extensive multi-view data limits their application in intraoperative surgical settings where only limited data are available. This work addresses this challenge by leveraging a single intraoperative image and preoperative data to train NeRF efficiently for surgical scenarios. We leverage preoperative MRI data to define the set of camera viewpoints and images needed for robust and unobstructed training. Intraoperatively, the appearance of the surgical image is transferred to the pre-constructed training set through neural style transfer, specifically combining WCT2 and STROTSS to prevent over-stylization. This process enables the creation of a dataset for instant and fast single-image NeRF training. The method is evaluated on four clinical neurosurgical cases. Quantitative comparisons to NeRF models trained on real surgical microscope images demonstrate strong synthesis agreement, with similarity metrics indicating high reconstruction fidelity and stylistic alignment. When compared with ground truth, our method demonstrates high structural similarity, confirming good reconstruction quality and texture preservation. Our approach demonstrates the feasibility of single-image NeRF training in surgical settings, overcoming the limitations of traditional multi-view methods. By eliminating the dependency on a large multi-view dataset, our method offers a faster, more adaptable solution for generating accurate 3D reconstructions in real-time surgical scenarios.
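As an illustrative aside, combining two style-transfer outputs to temper over-stylization (as the WCT2 + STROTSS pairing above aims to do) can be reduced to a convex blend of the two stylized images. The function below is a minimal sketch, not the paper's actual combination scheme; `alpha` is an assumed mixing weight.

```python
import numpy as np

def blend_stylizations(img_a: np.ndarray, img_b: np.ndarray,
                       alpha: float = 0.5) -> np.ndarray:
    """Convex combination of two stylized images in [0, 1].

    Hypothetical sketch: `img_a` and `img_b` are outputs of two
    different style-transfer methods; `alpha` weights the first.
    """
    if img_a.shape != img_b.shape:
        raise ValueError("stylized images must share a shape")
    return np.clip(alpha * img_a + (1.0 - alpha) * img_b, 0.0, 1.0)
```

In practice the blend could also be done per-region or in a feature space; the pixel-space version above only illustrates the idea of averaging away the over-stylized extremes of either method.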
- Research Article
- 10.1016/j.media.2024.103334
- Sep 3, 2024
- Medical Image Analysis
Deep learning has been extensively applied in medical image reconstruction, where Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs) represent the predominant paradigms, each possessing distinct advantages and inherent limitations: CNNs exhibit linear complexity with local sensitivity, whereas ViTs demonstrate quadratic complexity with global sensitivity. The emerging Mamba has shown superiority in learning visual representations, combining the advantages of linear scalability and global sensitivity. In this study, we introduce MambaMIR, an Arbitrary-Masked Mamba-based model with wavelet decomposition for joint medical image reconstruction and uncertainty estimation. A novel Arbitrary Scan Masking (ASM) mechanism “masks out” redundant information to introduce randomness for further uncertainty estimation. Compared to the commonly used Monte Carlo (MC) dropout, our proposed MC-ASM provides an uncertainty map without the need for hyperparameter tuning and mitigates the performance drop typically observed when applying dropout to low-level tasks. For further texture preservation and better perceptual quality, we incorporate the wavelet transformation into MambaMIR and explore its variant based on the Generative Adversarial Network, namely MambaMIR-GAN. Comprehensive experiments have been conducted on multiple representative medical image reconstruction tasks, demonstrating that the proposed MambaMIR and MambaMIR-GAN outperform other baseline and state-of-the-art methods across reconstruction tasks: MambaMIR achieves the best reconstruction fidelity, and MambaMIR-GAN has the best perceptual quality. In addition, our MC-ASM provides uncertainty maps as an additional tool for clinicians, while mitigating the typical performance drop caused by the commonly used dropout.
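The MC-ASM idea above, injecting randomness through masking rather than dropout and reading uncertainty from the spread of repeated reconstructions, can be sketched in a few lines. This is a hypothetical simplification, not the paper's implementation; `model` stands in for any reconstruction network, and the uniform random mask is an assumption in place of the paper's arbitrary-scan masking.

```python
import numpy as np

def mc_masked_uncertainty(x, model, n_samples=8, mask_ratio=0.25, seed=0):
    """Monte-Carlo uncertainty via input masking.

    Runs `model` on randomly masked copies of `x` and returns the
    per-pixel mean reconstruction and standard deviation (the
    uncertainty map). `model` is any callable array -> array.
    """
    rng = np.random.default_rng(seed)
    outs = []
    for _ in range(n_samples):
        mask = rng.random(x.shape) >= mask_ratio  # keep ~(1 - ratio) of entries
        outs.append(model(x * mask))
    outs = np.stack(outs)
    return outs.mean(axis=0), outs.std(axis=0)
```

The appeal noted in the abstract is visible here: unlike MC dropout, no dropout rate has to be tuned inside the network, and the randomness is confined to the input scan.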
- Research Article
- 10.1088/2631-8695/ae037e
- Sep 15, 2025
- Engineering Research Express
Synthetic Aperture Radar (SAR) images are extensively used for Earth observation because of their all-weather, day-and-night imaging capabilities. However, speckle noise in SAR images significantly reduces their usability in a variety of applications. Deep learning models developed for SAR despeckling exhibit promising noise reduction capabilities, but striking a balance between reducing graininess and preserving texture details remains challenging. In addition, supervised training of a robust deep learning model requires noisy images that capture the SAR speckle dynamics together with the corresponding speckle-free ground truth, which is generally not available. This study proposes the first hybrid CNN and halo-attention-based transformer model for SAR despeckling. CNN-based feature extraction modules provide multiscale, multidirectional, and large-scale feature maps. A halo-attention transformer block used in the skip connection aids in the better preservation of radiometric information in the despeckled SAR images. TransSARNet is trained in a supervised manner using a new synthetic SAR dataset that combines the Kylberg and UCMerced land-use datasets. This study also analyzed the effect of combining the Kylberg and UCMerced datasets on texture preservation in despeckled SAR images. Visual and qualitative metrics evaluated on Sentinel-1 Single Look Complex SAR data showed that the proposed TransSARNet approach outperformed the other models under consideration. TransSARNet achieves a harmonious balance between model complexity, despeckling ability, edge preservation, radiometric information preservation, and smoothing in homogeneous regions.
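Building synthetic training pairs from clean textures, as described above for the Kylberg/UCMerced combination, typically means corrupting a clean image with multiplicative speckle. Below is a minimal sketch under the common gamma-distributed, unit-mean speckle model; the number of looks is an illustrative assumption, not a dataset detail from the paper.

```python
import numpy as np

def add_speckle(img: np.ndarray, looks: int = 4, seed: int = 0) -> np.ndarray:
    """Simulate multiplicative SAR speckle on a clean intensity image.

    Multiplies each pixel by a gamma-distributed factor with mean 1
    and variance 1/looks, the standard multi-look speckle model.
    """
    rng = np.random.default_rng(seed)
    noise = rng.gamma(shape=looks, scale=1.0 / looks, size=img.shape)
    return img * noise
```

The clean image then serves as the speckle-free ground truth for supervised despeckling training, which is exactly the role the optical texture datasets play in the study.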
- Research Article
- 10.1186/s40494-024-01471-3
- Oct 9, 2024
- Heritage Science
The Mogao Grottoes in Dunhuang, a treasure of China's and the world's cultural heritage, contain rich historical and cultural deposits and preserve precious relics of the history of human art. Over the centuries, the Mogao Caves have been affected by natural and human factors, resulting in irreversible fading and discoloration of many murals. In recent years, deep learning technology has shown great potential in the field of virtual mural color restoration. Therefore, this paper proposes a mural image color restoration method based on a reversible neural network. The method first employs an automatic reference selection module based on structural and texture similarity to choose suitable reference mural images for the faded murals. Then, it utilizes a reversible residual network to extract deep features of the mural images without information loss. Next, a channel refinement module is used to eliminate redundant information in the network channels. Finally, an unbiased color transfer module restores the color of the faded mural images. Compared to other image color restoration methods, the proposed method achieves superior color restoration while effectively preserving the original structure and texture details of the mural images. Compared to baseline methods, the Structural Similarity Index (SSIM), Feature Similarity Index (FSIM), and Perception-based Image Quality Evaluator (PIQE) values are improved by 7.97%, 3.46%, and 13.98%, respectively. The color restoration of the Dunhuang murals holds significant historical, artistic, cultural, and economic value, and plays a positive role in the preservation and inheritance of Chinese culture, as well as in the promotion of cultural exchange and mutual understanding.
- Research Article
- 10.1142/s021800142452027x
- Nov 30, 2024
- International Journal of Pattern Recognition and Artificial Intelligence
This paper presents AMFANet, an advanced deep learning model engineered for high-quality image style transfer. AMFANet integrates cutting-edge techniques such as the Adaptive Multi-Scale Feature Fusion (AMSF) module and Hybrid Attention Mechanism (HAM) to significantly improve style consistency, content fidelity, and texture preservation. The model also utilizes Segmented Atrous Spatial Pyramid Pooling (SASPP) for effective multi-scale feature extraction. Comprehensive experimental evaluations demonstrate that AMFANet surpasses current state-of-the-art models like StyleGAN3, ChipGAN, ACL-GAN, and CycleGAN in generating high-fidelity stylized images while preserving intricate details and artistic essence. Future research will focus on optimizing computational efficiency, enabling multi-style transfer, enhancing user interaction, and exploring cross-domain applications. These findings highlight AMFANet’s potential as a robust solution for advanced image style transfer in both artistic and practical domains.
- Research Article
- 10.1109/jsen.2021.3077468
- Nov 15, 2021
- IEEE Sensors Journal
Accurately monitoring the condition of blast furnace bearings while guaranteeing equipment safety is a major challenge. To address this problem, this paper proposes a computer vision solution based on sensor data and a hybrid deep learning method. We use the Variational Mode Decomposition (VMD) algorithm, a time-frequency analysis method that decomposes a multi-component signal into multiple single-component amplitude-modulated signals in one pass, to process the bearing-fault sensor data and effectively separate the fault components from the original components. With this front end, features can be extracted quickly and accurately. By combining the advantages of deep learning, we improve the coupling mechanism and implement a hybrid deep-learning-based computer vision method that greatly improves the speed and accuracy of bearing fault diagnosis. The method connects directly to the VMD feature extraction stage, overcoming the problem that bearing fault components are easily submerged and difficult to extract under high temperature and strong noise. The results show that the parameters of the proposed method can be optimally selected by training on sensor data obtained from the experiments. The optimized hybrid deep-learning-based computer vision algorithm achieves a 97.4% bearing fault diagnosis hit rate, an advanced application of deep learning in the engineering field.
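VMD itself solves a variational optimization problem and is beyond a short snippet, but the core idea described above, stripping a single-component band out of a multi-component sensor signal, can be illustrated with a crude FFT band mask. This is a simplified stand-in for VMD, not the paper's method; the band edges are assumed for illustration.

```python
import numpy as np

def band_component(x: np.ndarray, fs: float,
                   f_lo: float, f_hi: float) -> np.ndarray:
    """Extract one frequency-band component of a real signal.

    Zeroes all FFT bins outside [f_lo, f_hi] Hz and inverts the
    transform — a crude analogue of isolating one VMD mode.
    """
    X = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    X[(freqs < f_lo) | (freqs > f_hi)] = 0.0
    return np.fft.irfft(X, n=len(x))
```

On a synthetic two-tone signal, masking the band around the higher tone recovers it cleanly, mimicking how a fault component is stripped from the raw vibration signal before feature extraction.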
- Research Article
- 10.7717/peerj-cs.1670
- Nov 27, 2023
- PeerJ Computer Science
Deep learning, a subset of artificial intelligence, enables analytical and physical tasks to be performed automatically, with little need for human intervention. Deep hybrid learning is a blended approach that combines machine learning with deep learning. A hybrid deep learning (HDL) model using a convolutional neural network (CNN), a residual network (ResNet), and long short-term memory (LSTM) is proposed for better course selection by candidates enrolled in an online learning platform. In this work, a hybrid framework that facilitates the analysis and design of a recommendation system for course selection is developed. A student's schedule for the next course should consist of classes in which the student has shown interest, and for universities to schedule classes optimally, they need to know which courses each student wants to take before each course begins. The proposed recommendation system selects the most appropriate course, encouraging students to base their selection on informed decision making. This system will enable learners to make the correct choices of courses to study.
- Research Article
- 10.1200/jco.2022.40.16_suppl.e16550
- Jun 1, 2022
- Journal of Clinical Oncology
e16550 Background: Improved computational power and modern algorithms have generated significant interest in radiomics for cancer diagnosis and staging. Here we assess the performance of deep learning (DL) models as a means for feature extraction, in combination with supervised machine learning (ML) algorithms, for accurate staging and chemotherapy response assessment of bladder cancer. Methods: Deidentified grayscale CT images from bladder cancer patients scheduled to undergo radical cystectomy were included in this retrospective study. These images were manually annotated with two regional masks (normal region and cancer region). Five DL models, namely AlexNet, GoogleNet, InceptionV3, ResNet-50, and XceptionNet, pre-trained on the public ImageNet dataset, were then fine-tuned on our bladder CT scan data to extract features. Through a feature selection process, a subset of the features was used to build ML classifiers for classification. The classification was performed using five different ML classifiers, namely k-Nearest Neighbor (KNN), Naïve-Bayes (NB), Support Vector Machine (SVM), Linear Discriminant Analysis (LDA), and Decision Tree (DT). The classification task was performed with 10-fold cross-validation, and each of the experiments contained a different but not mutually exclusive subset of samples. The evaluation metrics include accuracy, sensitivity, specificity, precision, and F1-score. Results: A total of 200 deidentified grayscale CT images of 100 patients with histologically proven bladder cancer were included in this study. For experiment (1), normal vs. cancer, the LDA classifier on XceptionNet-based features provided the best performance, with an accuracy of 86.07%, sensitivity of 96.75%, specificity of 69.65%, precision of 83.07%, and F1-score of 89.39%. For experiment (2), non-muscle-invasive bladder cancer (NMIBC) vs.
muscle-invasive bladder cancer (MIBC), the LDA classifier on XceptionNet-based features provided the best performance, with an accuracy of 79.72%, sensitivity of 66.62%, specificity of 87.39%, precision of 75.58%, and F1-score of 70.81%. For experiment (3), T0 lesion vs. MIBC, the LDA classifier on XceptionNet-based features provided the best performance, with an accuracy of 74.96%, sensitivity of 80.51%, specificity of 70.22%, precision of 69.78%, and F1-score of 74.73%. Conclusions: Our proposed model has shown good results in differentiating normal vs. cancer and promising performance in differentiating T0 vs. MIBC after chemotherapy treatment. We are expanding our dataset to further improve performance in differentiating T0 vs. MIBC. In addition, we will investigate the applicability of GANs for data augmentation to address the limited data. We believe the hybrid DL and ML framework may facilitate radiologists' decisions and clinical decision-making in patients with bladder cancer.
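The 10-fold cross-validation protocol described above can be sketched generically: partition the samples into ten folds, train on nine, score on the held-out one, and average. The `train_and_predict` callable below is a hypothetical stand-in for fitting any of the listed classifiers (KNN, SVM, LDA, ...); this is an illustrative sketch, not the study's pipeline.

```python
def kfold_indices(n: int, k: int = 10) -> list:
    """Split indices 0..n-1 into k contiguous folds of near-equal size."""
    base, extra = divmod(n, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def cross_val_accuracy(features, labels, train_and_predict, k: int = 10) -> float:
    """Mean accuracy over k folds.

    `train_and_predict(Xtr, ytr, Xte)` fits a classifier on the
    training split and returns predictions for the test split.
    """
    folds = kfold_indices(len(labels), k)
    accs = []
    for fold in folds:
        held_out = set(fold)
        Xtr = [f for i, f in enumerate(features) if i not in held_out]
        ytr = [y for i, y in enumerate(labels) if i not in held_out]
        Xte = [features[i] for i in fold]
        yte = [labels[i] for i in fold]
        pred = train_and_predict(Xtr, ytr, Xte)
        accs.append(sum(p == y for p, y in zip(pred, yte)) / len(fold))
    return sum(accs) / len(accs)
```

Note that the abstract's folds are "different but not mutually exclusive"; the sketch above uses the more common disjoint-fold variant for clarity.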
- Research Article
- 10.1002/mp.15810
- Aug 17, 2022
- Medical Physics
Objective: Accurate segmentation of the lung nodule in computed tomography images is a critical component of a computer-assisted lung cancer detection/diagnosis system. However, lung nodule segmentation is a challenging task due to the heterogeneity of nodules. The aim of this study is to develop a hybrid deep learning (H-DL) model for the segmentation of lung nodules with a wide variety of sizes, shapes, margins, and opacities. Materials and methods: A dataset collected from the Lung Image Database Consortium image collection, containing 847 cases with lung nodules of diameter greater than 7 mm and less than 45 mm manually annotated by at least two radiologists, was randomly split into 683 training/validation and 164 independent test cases. The 50% consensus consolidation of the radiologists' annotations was used as the reference standard for each nodule. We designed a new H-DL model combining two deep convolutional neural networks (DCNNs) with different structures as encoders to increase the learning capability for the segmentation of complex lung nodules. Leveraging the basic symmetric U-shaped architecture of U-Net, we redesigned two new U-shaped deep learning (U-DL) models expanded to six levels of convolutional layers. One U-DL model used a shallow DCNN structure containing 16 convolutional layers adapted from VGG-19 as the encoder, and the other used a deep DCNN structure containing 200 layers adapted from DenseNet-201 as the encoder; the same decoder, with only one convolutional layer at each level, was used in both U-DL models, and we refer to them as the shallow and deep U-DL models. Finally, an ensemble layer was used to combine the two U-DL models into the H-DL model. We compared the effectiveness of the H-DL, shallow U-DL, and deep U-DL models by deploying them separately on the test set. The accuracy of volume segmentation for each nodule was evaluated by the 3D Dice coefficient and Jaccard index (JI) relative to the reference standard.
For comparison, we calculated the median and minimum of the 3D Dice and JI over the individual radiologists who segmented each nodule, referred to as M-Dice, min-Dice, M-JI, and min-JI. Results: For the 164 test cases with 327 nodules, our H-DL model achieved an average 3D Dice coefficient of 0.750 ± 0.135 and an average JI of 0.617 ± 0.159. The radiologists' average M-Dice was 0.778 ± 0.102, and the average M-JI was 0.651 ± 0.127; both were significantly higher than those achieved by the H-DL model (p < 0.05). The radiologists' average min-Dice (0.685 ± 0.139) and average min-JI (0.537 ± 0.153) were significantly lower than those achieved by the H-DL model (p < 0.05). The results indicated that the H-DL model approached the average performance of the radiologists and was superior to the radiologist whose manual segmentation had the min-Dice and min-JI. Moreover, the average Dice and average JI achieved by the H-DL model were significantly higher than those achieved by the individual shallow U-DL model (Dice of 0.745 ± 0.139, JI of 0.611 ± 0.161; p < 0.05) or the individual deep U-DL model alone (Dice of 0.739 ± 0.145, JI of 0.604 ± 0.163; p < 0.05). Conclusion: Our newly developed H-DL model outperformed the individual shallow or deep U-DL models. The H-DL method, combining multilevel features learned by both the shallow and deep DCNNs, achieved segmentation accuracy comparable to the radiologists' segmentation for nodules with wide ranges of image characteristics.
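The two overlap metrics used above are straightforward to compute from binary masks; a minimal sketch:

```python
import numpy as np

def dice_jaccard(pred, ref):
    """3D Dice coefficient and Jaccard index between two binary masks.

    Dice = 2|A∩B| / (|A| + |B|);  JI = |A∩B| / |A∪B|.
    """
    pred = np.asarray(pred, dtype=bool)
    ref = np.asarray(ref, dtype=bool)
    inter = np.logical_and(pred, ref).sum()
    union = np.logical_or(pred, ref).sum()
    dice = 2.0 * inter / (pred.sum() + ref.sum())
    return float(dice), float(inter / union)
```

Averaging these per-nodule values over the test set gives the study's summary figures, e.g. the H-DL model's mean Dice of 0.750 and mean JI of 0.617.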
- Research Article
- 10.3390/diagnostics14171894
- Aug 28, 2024
- Diagnostics
Background: The risk of cardiovascular disease (CVD) has traditionally been predicted via the assessment of carotid plaques. In the proposed study, AtheroEdge™ 3.0HDL (AtheroPoint™, Roseville, CA, USA) was designed to demonstrate how well the features obtained from carotid plaques determine the risk of CVD. We hypothesize that hybrid deep learning (HDL) will outperform unidirectional deep learning, bidirectional deep learning, and machine learning (ML) paradigms. Methodology: 500 people who had undergone targeted carotid B-mode ultrasonography and coronary angiography were included in the proposed study. ML feature selection was carried out using three different methods, namely principal component analysis (PCA) pooling, the chi-square test (CST), and the random forest regression (RFR) test. The unidirectional and bidirectional deep learning models were trained, and then six types of novel HDL-based models were designed for CVD risk stratification. The AtheroEdge™ 3.0HDL was scientifically validated using seen and unseen datasets while the reliability and statistical tests were conducted using CST along with p-value significance. The performance of AtheroEdge™ 3.0HDL was evaluated by measuring the p-value and area-under-the-curve for both seen and unseen data. Results: The HDL system showed an improvement of 30.20% (0.954 vs. 0.702) over the ML system using the seen datasets. The ML feature extraction analysis showed 70% of common features among all three methods. The generalization of AtheroEdge™ 3.0HDL showed less than 1% (p-value < 0.001) difference between seen and unseen data, complying with regulatory standards. Conclusions: The hypothesis for AtheroEdge™ 3.0HDL was scientifically validated, and the model was tested for reliability and stability and is further adaptable clinically.
- Research Article
- 10.1016/j.energy.2023.127701
- May 10, 2023
- Energy
Short-term solar radiation forecasting using hybrid deep residual learning and gated LSTM recurrent network with differential covariance matrix adaptation evolution strategy
- Research Article
- 10.1016/j.compbiomed.2021.104803
- Aug 27, 2021
- Computers in Biology and Medicine
Artificial intelligence-based hybrid deep learning models for image classification: The first narrative review
- Research Article
- 10.3390/s23187740
- Sep 7, 2023
- Sensors
The rapid advancements in technology have paved the way for innovative solutions in the healthcare domain, aiming to improve scalability and security while enhancing patient care. This abstract introduces a cutting-edge approach, leveraging blockchain technology and hybrid deep learning techniques to revolutionize healthcare systems. Blockchain technology provides a decentralized and transparent framework, enabling secure data storage, sharing, and access control. By integrating blockchain into healthcare systems, data integrity, privacy, and interoperability can be ensured while eliminating the reliance on centralized authorities. In conjunction with blockchain, hybrid deep learning techniques offer powerful capabilities for data analysis and decision making in healthcare. Combining the strengths of deep learning algorithms with traditional machine learning approaches, hybrid deep learning enables accurate and efficient processing of complex healthcare data, including medical records, images, and sensor data. This research proposes a permissions-based blockchain framework for scalable and secure healthcare systems, integrating hybrid deep learning models. The framework ensures that only authorized entities can access and modify sensitive health information, preserving patient privacy while facilitating seamless data sharing and collaboration among healthcare providers. Additionally, the hybrid deep learning models enable real-time analysis of large-scale healthcare data, facilitating timely diagnosis, treatment recommendations, and disease prediction. The integration of blockchain and hybrid deep learning presents numerous benefits, including enhanced scalability, improved security, interoperability, and informed decision making in healthcare systems. However, challenges such as computational complexity, regulatory compliance, and ethical considerations need to be addressed for successful implementation. 
By harnessing the potential of blockchain and hybrid deep learning, healthcare systems can overcome traditional limitations, promoting efficient and secure data management, personalized patient care, and advancements in medical research. The proposed framework lays the foundation for a future healthcare ecosystem that prioritizes scalability, security, and improved patient outcomes.
- Research Article
- 10.3390/diagnostics11081405
- Aug 4, 2021
- Diagnostics
Background: COVID-19 lung segmentation using Computed Tomography (CT) scans is important for the diagnosis of lung severity. The process of automated lung segmentation is challenging due to (a) CT radiation dosage and (b) ground-glass opacities caused by COVID-19. The lung segmentation methodologies proposed in 2020 were semi- or fully automated but not reliable, accurate, and user-friendly. The proposed study presents a COVID Lung Image Analysis System (COVLIAS 1.0, AtheroPoint™, Roseville, CA, USA) consisting of hybrid deep learning (HDL) models for lung segmentation. Methodology: COVLIAS 1.0 consists of three methods based on solo deep learning (SDL) or hybrid deep learning (HDL). SegNet is proposed in the SDL category, while VGG-SegNet and ResNet-SegNet are designed under the HDL paradigm. The three proposed AI approaches were benchmarked against the National Institutes of Health (NIH)-based conventional segmentation model using fuzzy connectedness. A cross-validation protocol with a 40:60 ratio between training and testing was designed, with 10% validation data. The ground truth (GT) was manually traced by trained radiology personnel. For performance evaluation, nine different criteria were selected to evaluate the SDL or HDL lung segmentation regions and lung long axis against the GT. Results: Using a database of 5000 chest CT images (from 72 patients), COVLIAS 1.0 yielded AUCs of ~0.96, ~0.97, ~0.98, and ~0.96 (p-value < 0.001) for SegNet, VGG-SegNet, ResNet-SegNet, and NIH, respectively, within a 5% range of the GT area. The mean Figure of Merit using the four models (left and right lung) was above 94%. On benchmarking against the NIH segmentation method, the proposed model demonstrated a 58% and 44% improvement with ResNet-SegNet and a 52% and 36% improvement with VGG-SegNet for lung area and lung long axis, respectively. The PE statistics performance was in the following order: ResNet-SegNet > VGG-SegNet > NIH > SegNet.
The HDL runs in <1 s on test data per image. Conclusions: The COVLIAS 1.0 system can be applied in real-time for radiology-based clinical settings.
- Research Article
- 10.1016/j.compbiomed.2021.105131
- Dec 13, 2021
- Computers in Biology and Medicine
A hybrid deep learning paradigm for carotid plaque tissue characterization and its validation in multicenter cohorts using a supercomputer framework