Super-resolution of turbulent velocity fields in two-way coupled particle-laden flows
This paper introduces a deep learning-based super-resolution framework developed to accurately reconstruct high-resolution velocity fields in two-way coupled particle-laden turbulent flows. Leveraging conditional generative adversarial networks, the generator architecture is explicitly conditioned on physical parameters such as effective particle mass density and subgrid kinetic energy, while the discriminator is conditioned on the low-resolution data as well as the high-frequency content of the input data. High-fidelity direct numerical simulation datasets, covering a range of particle Stokes numbers, particle mass loadings, and carrier-gas turbulence regimes, including forced and decaying turbulence, serve as training and testing data. Extensive validation studies, including detailed analyses of energy spectra, probability density functions, vorticity distributions, and wavelet-based decompositions, demonstrate the model's accuracy and generalization across particle parameters. The results show that the network utilizes the particle data mainly to reconstruct the high-frequency details modulated by the particles. A systematic assessment of the model's performance on previously unseen flow regimes further validates its predictive capabilities.
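The conditioning scheme described in the abstract can be illustrated with a minimal sketch (hypothetical array shapes and function names, not the paper's implementation): per-grid-point physical parameters such as effective particle mass density are stacked with the low-resolution velocity field as extra input channels before being fed to the generator.

```python
import numpy as np

def build_generator_input(lr_velocity, particle_density, subgrid_ke):
    """Concatenate a low-resolution velocity field with physical-parameter
    fields as extra channels (channels-first layout).

    lr_velocity      : (3, H, W) low-resolution velocity components
    particle_density : (H, W) effective particle mass density field
    subgrid_ke       : (H, W) subgrid kinetic energy field
    """
    cond = np.stack([particle_density, subgrid_ke])     # (2, H, W)
    return np.concatenate([lr_velocity, cond], axis=0)  # (5, H, W)

# Toy example on a 32x32 grid.
lr = np.random.rand(3, 32, 32)
rho_p = np.random.rand(32, 32)
ke = np.random.rand(32, 32)
x = build_generator_input(lr, rho_p, ke)
print(x.shape)  # (5, 32, 32)
```

Channel concatenation is a common way to condition a convolutional generator on auxiliary fields; the paper's actual conditioning mechanism may differ in detail.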
197
- 10.1103/physrevfluids.6.050504
- May 12, 2021
- Physical Review Fluids
87
- 10.1007/s00162-023-00663-0
- Jun 16, 2023
- Theoretical and Computational Fluid Dynamics
67
- 10.1017/jfm.2014.62
- Mar 10, 2014
- Journal of Fluid Mechanics
45
- 10.1016/j.proci.2014.05.146
- Jun 27, 2014
- Proceedings of the Combustion Institute
95
- 10.1007/s00707-017-1803-x
- Feb 20, 2017
- Acta Mechanica
98
- 10.1063/1.2001691
- Jul 28, 2005
- Physics of Fluids
12
- 10.1016/j.combustflame.2019.05.032
- Jun 6, 2019
- Combustion and Flame
5
- 10.1007/s10494-019-00099-9
- Dec 30, 2019
- Flow, Turbulence and Combustion
66
- 10.1016/j.jcp.2022.111090
- Feb 25, 2022
- Journal of Computational Physics
275
- 10.1017/jfm.2018.770
- Nov 2, 2018
- Journal of Fluid Mechanics
- Research Article
6
- 10.1016/j.eclinm.2024.102772
- Jul 26, 2024
- eClinicalMedicine
Development and validation of a deep learning-based framework for automated lung CT segmentation and acute respiratory distress syndrome prediction: a multicenter cohort study
- Conference Article
3
- 10.1109/asru.2009.5373359
- Jan 1, 2009
From statistical learning theory, the generalization capability of a model is its ability to perform well on unseen test data drawn from the same distribution as the training data. This paper investigates how generalization capability can also improve robustness when testing and training data come from different distributions, in the context of speech recognition. Two discriminative training (DT) methods are used to train the hidden Markov model (HMM) for better generalization capability: minimum classification error (MCE) and soft-margin estimation (SME). Results on the Aurora-2 task show that both SME and MCE are effective in improving one measure of the acoustic model's generalization capability, namely the margin of the model, with SME being moderately more effective. In addition, better generalization capability translates into better robustness of speech recognition performance, even when there is significant mismatch between the training and testing data. We also apply mean and variance normalization (MVN) to preprocess the data and reduce the training-testing mismatch. After MVN, MCE and SME perform even better, as generalization capability is then more closely related to robustness. The best performance on Aurora-2 is obtained with SME, achieving about 28% relative error rate reduction over the MVN baseline system. Finally, we use SME to demonstrate the potential of better generalization capability for improving robustness on the more realistic noisy Aurora-3 task, where significant improvements are obtained.
- Research Article
41
- 10.1109/access.2019.2927726
- Jan 1, 2019
- IEEE Access
Detection of QRS complexes in the electrocardiogram (ECG) signal is crucial for automated cardiac diagnosis. Automated QRS detection has been a research topic for over three decades, and several traditional QRS detection methods show acceptable accuracy; however, their applicability beyond their study-specific databases has not been explored. The non-stationary nature of ECG and the signal variance of intra- and inter-patient recordings impose significant challenges on single QRS detectors. In practice, a promising QRS detector should achieve acceptable accuracy over diverse ECG recordings, so investigating a model's generalization capability is crucial. This paper investigates the generalization capability of convolutional neural network (CNN)-based models from intra-database (subject-wise leave-one-out and five-fold cross-validation) and inter-database (training with single and multiple databases) points of view over three publicly available ECG databases: MIT-BIH Arrhythmia, INCART, and QT. Leave-one-out tests report accuracies of 99.22%, 97.13%, and 96.25% on these databases, respectively, and inter-database tests report more than 90% accuracy with the single exception of INCART. The performance variation reveals that a CNN model's generalization capability does not increase simply by adding more training samples; rather, samples from a diverse range of subjects are necessary for reasonable QRS detection accuracy.
- Research Article
- 10.1038/s41598-025-21783-z
- Oct 29, 2025
- Scientific reports
Early skin disease detection significantly improves patient survival rates, yet limited access to dermatological expertise creates an urgent need for automated diagnostic systems. In this paper, we develop a dual-branch deep learning framework that simultaneously performs skin lesion segmentation and classification from dermoscopic images. The proposed segmentation branch uses a modified EfficientNet-B7 encoder with Atrous Spatial Pyramid Pooling (ASPP) for multi-scale feature extraction and transformer blocks for global context modeling. Attention gates and Squeeze-and-Excitation blocks enhance feature selection and boundary precision. The classification branch fuses DenseNet-121 visual features with morphological characteristics extracted from predicted segmentation masks, creating a hybrid appearance-morphology analysis approach. The proposed framework achieved strong and consistent segmentation performance across five benchmark datasets. On HAM10000, the highest Dice score (0.9568) and IoU (0.9242) were recorded, with an accuracy of 0.9708. PH2 achieved a Dice of 0.9250 and a sensitivity of 0.9734, while ISIC 2016 reached a Dice of 0.9298 and an IoU of 0.8811. For ISIC 2017 and ISIC 2018, Dice scores were 0.8972 and 0.9020, respectively. All datasets reported high specificity (> 0.93) and accuracy (> 0.95), confirming the model's robustness and generalization capability. Our dual-branch framework achieves state-of-the-art accuracy by effectively integrating visual appearance and structural morphological features for comprehensive skin lesion analysis. The consistent high performance across diverse datasets indicates strong potential for clinical deployment as a diagnostic support tool.
- Research Article
18
- 10.1007/s11042-020-09188-8
- Jul 2, 2020
- Multimedia Tools and Applications
Salient object detection for RGB-D images aims to automatically detect objects of human interest using color and depth information. In this paper, a generative adversarial network is adopted to improve detection performance through adversarial learning. The generator network takes RGB-D images as input and outputs synthetic saliency maps. It adopts a double-stream network to extract color and depth features individually and then fuses them progressively from deep to shallow layers. The discriminator network takes an RGB image paired with a synthetic saliency map (RGBS) or with the ground-truth saliency map (RGBY) as input, and outputs a label indicating whether the input is synthetic or ground truth. It consists of three convolution blocks and three fully connected layers. To capture long-range feature dependencies, a self-attention layer is inserted into both the generator and discriminator networks. Supervised by real labels and ground-truth saliency maps, the discriminator and generator networks are adversarially trained so that the generator learns to fool the discriminator while the discriminator learns to distinguish synthetic maps from ground truth. Experiments demonstrate that adversarial learning enhances the generator network, and that the RGBS/RGBY discriminator inputs and the self-attention layer play important roles in improving performance. Our method also outperforms state-of-the-art methods.
- Research Article
- 10.54097/bntk8836
- Dec 25, 2024
- Highlights in Science, Engineering and Technology
In the practical implementation of deep learning, challenges such as heavy reliance on large datasets, the high cost of data annotation, and significant computational resource consumption can lead to incomplete datasets and inaccurate annotations. These factors degrade a model's generalization capability and overall performance. To address these concerns, recent research has increasingly focused on adaptive techniques. Domain adaptation seeks to transfer abundant labeled information from a source domain to an unlabeled target domain, thereby tackling the decline in machine learning model performance under shifting data distributions. By fine-tuning and optimizing models, gaps between different domains or scenarios can be bridged effectively, allowing knowledge from the source domain to better align with the characteristics of the target domain. This approach reduces training and annotation costs while enhancing both accuracy and robustness in real-world applications. This review investigates the distinct contributions of two specific techniques: Joint Adversarial Domain Adaptation and Deep Joint Distribution Optimal Transport for Unsupervised Domain Adaptation. Through this examination, we aim to uncover strategies for addressing the challenges of limited data availability, high expense, and heavy computational demands in deep learning. By generating high-quality data that further minimizes the discrepancy between source and target domains, these techniques can markedly improve model adaptability and generalization while facilitating efficient deployment and performance enhancement of deep learning technologies across practical application scenarios.
- Conference Article
5
- 10.1109/ccdc.2019.8833176
- Jun 1, 2019
In optical remote sensing images, many ships have shapes and textures very similar to their backgrounds, making them hard to detect accurately. In this paper, we introduce generative adversarial networks (GANs) to address this hard ship detection problem. GANs consist of a generative network and a discriminator network. We take the state-of-the-art object (ship) detection network Faster R-CNN as the generative network, which outputs detection results as fake samples; the ground-truth ships in the input image serve as the real samples. The discriminator network is responsible for distinguishing fake samples from real samples, and the two networks are trained simultaneously. Through continuous adversarial training, the fake samples produced by the generative network become very similar to the real samples, until the discriminator can no longer reliably tell them apart. As a result, the ship detection network (the generative network) correctly recognizes hard-to-detect ships, producing satisfactory detection results. Moreover, the discriminator network is used only during training, so the proposed method improves detection accuracy without increasing computational cost at inference.
- Research Article
- 10.1016/j.knosys.2024.112649
- Oct 23, 2024
- Knowledge-Based Systems
Considering representation diversity and prediction consistency for domain generalization semantic segmentation
- Conference Article
11
- 10.1109/iscslp.2018.8706647
- Nov 1, 2018
Most conventional speech enhancement methods work poorly at low SNRs, and the speech enhancement method based on a generative adversarial network (SEGAN) yields limited speech quality despite the large number of parameters in its generator. To address these problems, we propose a speech enhancement method based on a new Wasserstein generative adversarial network architecture (SEWGAN), whose generator and discriminator networks are built on fully convolutional neural networks (FCNNs) and deep neural networks (DNNs), respectively. Multiple noise types and signal-to-noise ratios (SNRs) are used in training to improve the method's generalization capability. Experimental results show that the proposed method outperforms SEGAN and the minimum mean square error estimator based on the magnitude-squared spectrum (MMSE-MSS) in terms of both short-time objective intelligibility (STOI) and perceptual evaluation of speech quality (PESQ). As expected, the work also demonstrates that the proposed method has strong generalization capability in a real-world scenario.
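The Wasserstein objective underlying SEWGAN can be sketched generically (this is the standard WGAN formulation, not the authors' code): the critic is trained to maximize the gap between its mean scores on clean and enhanced speech, while the generator pushes its enhanced outputs toward higher critic scores.

```python
import numpy as np

def wasserstein_critic_loss(scores_real, scores_fake):
    """Critic loss for a WGAN: minimizing (mean fake - mean real)
    maximizes the estimated Wasserstein distance between the
    clean-speech and enhanced-speech score distributions."""
    return np.mean(scores_fake) - np.mean(scores_real)

def wasserstein_generator_loss(scores_fake):
    """Generator (enhancer) loss: push critic scores on enhanced
    speech upward."""
    return -np.mean(scores_fake)

# Toy scores: a critic that rates clean speech higher than enhanced speech.
real = np.array([1.0, 1.2, 0.9])
fake = np.array([0.1, -0.2, 0.3])
print(wasserstein_critic_loss(real, fake))   # negative: critic separates well
print(wasserstein_generator_loss(fake))
```

A full WGAN additionally constrains the critic to be approximately 1-Lipschitz (via weight clipping or a gradient penalty); that detail is omitted here.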
- Research Article
1
- 10.3390/s25071959
- Mar 21, 2025
- Sensors (Basel, Switzerland)
This study addresses the challenges of vehicle detection in scenarios with fixed camera angles, where precision is often traded off for cost control and real-time performance, by proposing an enhanced model, YOLOv8-PEL. We refine the YOLOv8n model by introducing the C2F-PPA module within the feature-fusion stage, strengthening the adaptability and integration of features across scales. We further propose ELA-FPN, which improves the model's multi-scale feature fusion and generalization capability. The model also incorporates the Wise-IoUv3 loss function to mitigate the harmful gradients caused by extreme examples in vehicle detection samples, yielding more precise detection. We trained on the COCO-Vehicle dataset (a subset of COCO containing only images and labels of cars, buses, and trucks) and the VisDrone2019 dataset. Experimental results show that YOLOv8-PEL achieves a mAP@0.5 of 66.9% on COCO-Vehicle with only 2.23 M parameters, 7.0 GFLOPs, a 4.5 MB model size, and 176.8 FPS, up from the original YOLOv8n's 165.7 FPS. Despite a marginal 0.2% decrease in accuracy relative to YOLOv8n, the parameters, GFLOPs, and model size are reduced by 25%, 13%, and 25%, respectively. YOLOv8-PEL excels in detection precision, computational efficiency, and generalization capability, making it well suited to real-time and resource-constrained applications.
- Research Article
- 10.1016/j.yofte.2024.103793
- Apr 13, 2024
- Optical Fiber Technology
High generalization identification method based on MI-SI distributed optical fiber sensor and video signals
- Research Article
- 10.1016/j.compbiomed.2025.110992
- Oct 1, 2025
- Computers in biology and medicine
Language-guided multimodal domain generalization for outcome prediction of head and neck cancer.
- Research Article
- 10.1016/j.neunet.2025.107871
- Jul 1, 2025
- Neural networks : the official journal of the International Neural Network Society
A triple-branch hybrid dynamic-static alignment strategy for vision-language tasks.
- Research Article
- 10.1109/jbhi.2025.3540894
- Jul 1, 2025
- IEEE journal of biomedical and health informatics
Autism Spectrum Disorder (ASD) is a complex neurodevelopmental disorder, and precise prediction using imaging or other biological information is of great significance. However, predicting ASD in individuals presents the following challenges: first, there is extensive heterogeneity among subjects; second, existing models fail to fully utilize rs-fMRI and non-imaging information, resulting in less accurate classification results. Therefore, this paper proposes a novel framework, named HE-MF, which consists of a Hierarchical Feature Extraction Module and a Multimodal Deep Feature Integration Module. The Hierarchical Feature Extraction Module aims to achieve multi-level, fine-grained feature extraction and enhance the model's discriminative ability by progressively extracting the most discriminative functional connectivity features at both the intra-group and overall subject levels. The Multimodal Deep Feature Integration Module extracts common and distinctive features based on rs-fMRI and non-imaging information through two separate channels, and utilizes an attention mechanism for dynamic weight allocation, thereby achieving deep feature fusion and significantly improving the model's predictive performance. Experimental results on the ABIDE public dataset show that the HE-MF model achieves an accuracy of 95.17% in the ASD identification task, significantly outperforming existing state-of-the-art methods, demonstrating its effectiveness and superiority. To verify the model's generalization capability, we successfully applied it to relevant tasks in the ADNI dataset, further demonstrating the HE-MF model's outstanding performance in feature learning and generalization capabilities.
- Research Article
23
- 10.1109/tasl.2009.2031236
- Jan 1, 2010
- IEEE Transactions on Audio, Speech, and Language Processing
In this paper, we explore the generalization capability of the acoustic model for improving speech recognition robustness against noise distortions. While generalization in statistical learning theory originally refers to a model's ability to perform well on unseen testing data drawn from the same distribution as the training data, we show that good generalization capability is also desirable for mismatched cases. One way to obtain such general models is to use a margin-based training method, e.g., soft-margin estimation (SME), which tolerates acoustic mismatches without detailed knowledge of the distortion mechanisms by enlarging the margins between competing models. Experimental results on the Aurora-2 and Aurora-3 connected digit string recognition tasks demonstrate that, by improving the model's generalization capability through SME training, speech recognition performance can be significantly improved in both matched and low-to-medium mismatched testing cases with no language model constraints. Recognition results show that SME performs better with than without mean and variance normalization, and therefore provides a complementary benefit to conventional feature normalization techniques, so the two can be combined to further improve system performance. Although this study focuses on noisy speech recognition, we believe the proposed margin-based learning framework can be extended to other types of distortion and to robustness issues in other machine learning applications.
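The margin notion that SME enlarges can be illustrated with a simple hinge-style separation measure (a generic sketch with a hypothetical margin parameter rho, not the SME objective itself): a sample is penalized only when the score of the correct model fails to beat its best competitor by at least the margin.

```python
import numpy as np

def soft_margin_loss(score_correct, score_competitor, rho=1.0):
    """Hinge-style margin loss: zero when the correct model beats its
    best competitor by at least rho, linear penalty otherwise."""
    separation = score_correct - score_competitor
    return np.maximum(0.0, rho - separation)

print(soft_margin_loss(3.0, 1.5))  # 0.0 (separation 1.5 >= rho)
print(soft_margin_loss(2.0, 1.8))  # 0.8 (separation 0.2 < rho)
```

Training to drive this loss to zero pushes competing models apart, which is what gives the acoustic model tolerance to moderate train/test mismatch.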
- Research Article
- 10.1063/5.0291030
- Nov 1, 2025
- Physics of Fluids
- Research Article
- 10.1063/5.0288726
- Nov 1, 2025
- Physics of Fluids
- Research Article
- 10.1063/5.0297258
- Nov 1, 2025
- Physics of Fluids
- Research Article
- 10.1063/5.0293498
- Nov 1, 2025
- Physics of Fluids
- Research Article
- 10.1063/5.0301528
- Nov 1, 2025
- Physics of Fluids
- Research Article
- 10.1063/5.0296579
- Nov 1, 2025
- Physics of Fluids
- Research Article
- 10.1063/5.0294580
- Nov 1, 2025
- Physics of Fluids
- Research Article
- 10.1063/5.0293788
- Nov 1, 2025
- Physics of Fluids
- Research Article
- 10.1063/5.0299738
- Nov 1, 2025
- Physics of Fluids
- Research Article
- 10.1063/5.0297575
- Nov 1, 2025
- Physics of Fluids