Articles published on Dimension Reduction
35,699 search results
- New
- Research Article
- 10.1016/j.est.2026.120842
- Apr 1, 2026
- Journal of Energy Storage
- Thanaphon Mathuravech + 1 more
Optimal scheduling of hydropower and pumped storage hydropower for high renewable energy share in Thailand: A novel hybrid optimization approach with dimensionality reduction
- New
- Research Article
- 10.1016/j.ijar.2026.109625
- Apr 1, 2026
- International Journal of Approximate Reasoning
- Linzi Yin + 3 more
Parallel attribute reduction algorithm based on simplified neighborhood matrix with Apache Spark
- New
- Research Article
- 10.1016/j.aap.2026.108406
- Apr 1, 2026
- Accident Analysis and Prevention
- Zeinab Bayati + 1 more
Beyond the norm: Identifying rare and high-risk pedestrian crash patterns using unsupervised learning.
- New
- Research Article
- 10.1016/j.ijar.2026.109635
- Apr 1, 2026
- International Journal of Approximate Reasoning
- Mohd Aquib + 1 more
Fuzzy neighborhood components analysis: Supervised dimensionality reduction under uncertain labels
- New
- Research Article
- 10.1016/j.jlp.2025.105868
- Apr 1, 2026
- Journal of Loss Prevention in the Process Industries
- Ziqi Han + 5 more
Real-time prediction of gas leakage and diffusion for buried natural gas pipelines by deep learning and dimensionality reduction methods
- New
- Research Article
- 10.1016/j.chemgeo.2026.123323
- Apr 1, 2026
- Chemical Geology
- Elise B Laupland + 4 more
Archived and curated major and trace element and isotope ratio geochemical datasets are growing at a rapid pace, commonly comprising data for many thousands of samples. The multi-dimensional nature of such vast datasets, listing information across the periodic table, is challenging to interrogate and visualize. A common pitfall is over-simplification of the analysis by focusing on ‘key’ geochemical ratios, projecting data into fields with arbitrarily set boundaries, or assigning binary classifiers. This situation invites the evaluation of available dimensionality reduction techniques. In this study, the relatively rapid non-linear dimensionality reduction technique ‘Uniform Manifold Approximation and Projection’ (UMAP) was used to simplify large tabulated (many thousands of datapoints) datasets of zircon and apatite geochemistry. The resulting two-dimensional plot reveals both local and global structure in the data. When datapoints are colour-coded for individual element concentrations, element ratios, mineral ages, or independent classification (e.g., host granite type), the geological relevance of the UMAP embedding can be evaluated. As a porphyry copper fertility prospecting tool, UMAP not only makes the multi-dimensional data visible, but the projection also serves as a simplifying pre-processing step for machine learning classification beyond strictly classifying data into ‘barren’ or ‘fertile’. Namely, the UMAP projection and point cloud structure can reveal trends with magmatic age, fertility potential vectors, and dissimilarities between regional datasets in the context of the global data structure. Beyond magmatic copper fertility, UMAP was found to group zircon according to petrologically established geochemical features, such as S-type magmatic zircon dominated by the xenotime substitution mechanism. Applied to apatite geochemistry, UMAP was able to group analyses from grains with known provenance and form data regions related to petrogenetic similarity (e.g., sedimentary and high-grade metapelite metamorphic). The overall UMAP approach of reducing dimensionality via projection into two-dimensional space and pre-processing the emerging data structure for machine learning classification can potentially be applied to a diverse range of large geochemical datasets. In doing so, care must be taken to balance local structure against global dataset coherence, whilst testing the viability of the projection and grouping through colour-coding datapoints for known geochemical parameters, age or tectonic setting.
• UMAP dimensionality reduction is an effective method of visualizing large geochemical datasets, balancing local and global data structure.
• Petrogenetically informed UMAP embedding highlights a zircon magmatic fertility gradient from fertile to barren.
• Dataset dimensionality reduction may decrease the complexity of machine learning models needed to classify grains.
• UMAP of zircon elemental compositions distinguished zircon from S-type magmas.
• When applied to apatite, UMAP forms petrogenetically significant clusters.
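A minimal sketch of the workflow the abstract describes, using the umap-learn package in Python; the file name, element list, and UMAP hyperparameters below are illustrative assumptions, not the authors' settings.

```python
# Hypothetical example: embed a zircon trace-element table with UMAP and
# colour-code the points by one element to judge geological relevance.
import numpy as np
import pandas as pd
import umap                      # umap-learn
import matplotlib.pyplot as plt

# assumed table: one row per zircon analysis, columns are element concentrations (ppm)
df = pd.read_csv("zircon_trace_elements.csv")
elements = ["Ti", "Y", "Nb", "Ce", "Eu", "Hf", "Th", "U"]
X = np.log10(df[elements].clip(lower=1e-3))        # log-transform to tame the dynamic range

# n_neighbors trades local detail against global structure, as the abstract cautions
reducer = umap.UMAP(n_neighbors=30, min_dist=0.1, random_state=42)
emb = reducer.fit_transform(X)

plt.scatter(emb[:, 0], emb[:, 1], c=df["Eu"], s=4, cmap="viridis")
plt.colorbar(label="Eu (ppm)")
plt.xlabel("UMAP 1"); plt.ylabel("UMAP 2")
plt.show()
```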
- New
- Research Article
- 10.1016/j.patcog.2025.112557
- Apr 1, 2026
- Pattern Recognition
- Yingjie Cai + 3 more
Multi-subspace graph clustering joint dimensionality reduction and feature selection
- New
- Research Article
- 10.1016/j.jcrc.2025.155365
- Apr 1, 2026
- Journal of Critical Care
- Manel Santafé + 6 more
Subphenotyping aneurysmal subarachnoid hemorrhage using clinical and biological data clustering.
- New
- Research Article
- 10.1016/j.oceaneng.2026.124623
- Apr 1, 2026
- Ocean Engineering
- Nhu Son Doan
Surrogate-based dimension reduction for reliability analysis and LRFD calibration: Breakwater foundations in depth-varying soils
- New
- Research Article
- 10.30574/wjaets.2026.18.3.0129
- Mar 31, 2026
- World Journal of Advanced Engineering Technology and Sciences
- Mayowa Samuel, Alade + 5 more
The rapid growth of interconnected devices in small office and home networks has introduced heightened cybersecurity risks, yet traditional Intrusion Detection Systems (IDS) often demand extensive computational resources, making them unsuitable for deployment in resource-constrained environments. This study presents the design, implementation, and evaluation of a lightweight machine learning-based IDS optimized for small networks with limited processing power and memory. The research employed the CICIDS2017 dataset as the primary benchmark, subjecting it to comprehensive preprocessing, including data cleaning, normalization, encoding, feature scaling, and dimensionality reduction through Principal Component Analysis (PCA). Multiple classical machine learning algorithms, including Decision Tree, Random Forest (pruned), Naïve Bayes, K-Nearest Neighbors, and Ridge Classifier, were implemented and comparatively evaluated. Performance metrics such as accuracy, precision, recall, F1-score, CPU utilization, memory usage, and latency were used for assessment. Results indicated that Random Forest achieved the best balance between accuracy and efficiency, with low false-positive rates and minimal computational requirements suited to lightweight environments. The Random Forest model was then integrated into a Flask-based RESTful API and a Streamlit dashboard. By bridging machine learning techniques with practical deployment frameworks, the study contributes a resource-efficient, scalable, and user-friendly security solution tailored to small enterprises and personal network environments.
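A hedged sketch of the preprocessing-plus-classifier pipeline described above (scaling, PCA, Random Forest) with scikit-learn; the file name, label column, retained-variance threshold, and hyperparameters are assumptions for illustration, not the authors' configuration.

```python
import pandas as pd
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

df = pd.read_csv("cicids2017_cleaned.csv")            # hypothetical pre-cleaned CICIDS2017 export
X, y = df.drop(columns=["Label"]), df["Label"]
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, stratify=y, random_state=0)

ids_model = Pipeline([
    ("scale", StandardScaler()),
    ("pca", PCA(n_components=0.95)),                   # keep components explaining ~95% of variance
    ("rf", RandomForestClassifier(n_estimators=100, max_depth=12, n_jobs=-1, random_state=0)),
])
ids_model.fit(X_tr, y_tr)
print(classification_report(y_te, ids_model.predict(X_te)))
```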
- New
- Research Article
- 10.62622/teiee.026.4.1.45-59
- Mar 30, 2026
- Trends in Ecological and Indoor Environmental Engineering
- Tomokazu Konishi
Background: Seismology has accumulated extensive observational data, yet modern statistical methodologies have rarely been applied comprehensively to seismic datasets. Consequently, several long-standing interpretations, including magnitude distributions and aftershock decay laws, may reflect analytical constraints rather than physical processes. Re-examining earthquake behaviour using contemporary statistical tools provides an opportunity to reassess empirical relationships and clarify persistent ambiguities in seismic patterns.
Objectives: This study applies modern statistical methods to complete seismic datasets to reassess magnitude distributions, aftershock decay, and three-dimensional active-zone structure, testing whether systematic temporal variations, including precursory changes before major earthquakes, can be objectively identified.
Methods: Earthquake data were obtained from the Japan Meteorological Agency and analysed without any filtering, using a 1° latitude–longitude grid. The grid with the highest count in each year was examined, except for 2011, which focused on the Tohoku earthquake hypocentre. All calculations were performed in R, with three-dimensional visualizations generated using the rgl package. Hypocentre distributions were projected onto plate boundaries using principal component analysis (PCA) in two stages: 3D-to-2D dimensionality reduction and boundary-specific projection. Maps were verified and appropriately transformed to align with square-based latitude–longitude diagrams for quantitative analysis.
Results: Analysis of hypocentre distributions in Japan identifies two major subducting boundaries, the East and Southwest, connected by the shallow Seto structure. Three-dimensional visualisations and Principal Component Analysis reveal these boundaries as planar yet gently curved, with deeper earthquakes concentrated along the Sanriku plate. The Pacific Plate subducts beneath surrounding plates, influencing lateral displacement of the Philippine Sea Plate and creating complex stress patterns. Hypocentre counts increase prior to major events, while deeper earthquakes tend to exhibit higher magnitudes. Aftershock decay follows a half-life process, with energy release distributed heterogeneously across regions, indicating that seismic activity is controlled by interactions between plate geometry, depth, and elastic properties. These findings provide a more robust framework for interpreting seismicity, revising plate boundary models, and informing risk assessment in Japan.
Conclusion: Modern statistical analyses clarified Japan's plate boundaries, revised Pacific and Philippine Sea Plate configurations, and updated aftershock decay models, revealing temporal magnitude decay. These findings enhance understanding of earthquake mechanisms, improve seismic hazard assessment, and highlight the need for continued monitoring to address potential false negatives.
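The first PCA stage in the Methods (projecting three-dimensional hypocentre coordinates onto a best-fit plane) can be illustrated with a short Python sketch; the paper's own analysis is in R, and the synthetic coordinates below are purely hypothetical.

```python
import numpy as np
from sklearn.decomposition import PCA

# hypothetical hypocentres: easting (km), northing (km), depth (km)
rng = np.random.default_rng(1)
hypocentres = rng.normal(size=(500, 3)) * np.array([80.0, 200.0, 30.0])

pca = PCA(n_components=2)
plane_coords = pca.fit_transform(hypocentres)        # 2D in-plane coordinates of each event
normal = np.cross(pca.components_[0], pca.components_[1])

# dip of the best-fit plane = angle between its normal and the vertical axis
dip = np.degrees(np.arccos(abs(normal[2]) / np.linalg.norm(normal)))
print(f"variance captured: {pca.explained_variance_ratio_.sum():.2f}, plane dip ~ {dip:.1f} deg")
```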
- New
- Research Article
- 10.70121/001c.154601
- Mar 15, 2026
- Scholarly Review Journal
- Tianle Liang + 1 more
This paper proposes and develops a robust, cross-benchmarked machine learning framework for predicting SAT benchmark percentages across unified school districts in Indiana. The study uses state-provided datasets from the Indiana Department of Education for 300+ districts and creates indices representing early-stage ethnic and gender disparity, average early-stage performance, enrollment, and a variety of funding mechanisms. After dimensionality reduction through principal component analysis (capturing 83.35% of the original variance) and quantile-binning of the SAT label, random forest (RF) and XGBoost (XGB) ensemble learning models were trained independently, reaching 0.71 and 0.72 classification accuracy, respectively. A composite model was developed by blending the probability distributions from RF and XGB, reaching an optimal accuracy of 0.73. Macro-averaged precision, recall, and F1 scores similarly increased to 0.74, 0.69, and 0.71, respectively, improving class balance overall. Subsequently, XGBoost gain calculations and Shapley Additive Explanations (SHAP) plots were used to examine specific feature relationships with the SAT classes. Notable variables, including ethnic disparity and enrollment, were mapped geospatially to identify key geographical patterns for future mitigation planning. The study concludes by recommending baseline policies encompassing small-district funding prioritization, corporate partnerships, and early-stage intervention.
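The blending step can be sketched as follows: quantile-bin the SAT label, reduce the standardized features with PCA, train Random Forest and XGBoost independently, and average their class probabilities. The file name, target column, bin count, and equal blend weights are assumptions made for illustration.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from xgboost import XGBClassifier

df = pd.read_csv("indiana_districts.csv")                     # hypothetical district-level table
y = pd.qcut(df["sat_benchmark_pct"], q=4, labels=False)       # quantile-binned target classes
X = StandardScaler().fit_transform(df.drop(columns=["sat_benchmark_pct"]))
X = PCA(n_components=0.85).fit_transform(X)                   # retain ~85% of the variance

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, stratify=y, random_state=0)
rf = RandomForestClassifier(n_estimators=300, random_state=0).fit(X_tr, y_tr)
xgb = XGBClassifier(n_estimators=300, learning_rate=0.1, eval_metric="mlogloss").fit(X_tr, y_tr)

# composite model: average the two probability distributions, then take the argmax class
blend = 0.5 * rf.predict_proba(X_te) + 0.5 * xgb.predict_proba(X_te)
print("blended accuracy:", accuracy_score(y_te, blend.argmax(axis=1)))
```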
- New
- Research Article
- 10.1093/evolut/qpag044
- Mar 14, 2026
- Evolution: International Journal of Organic Evolution
- Daniel S Caetano + 1 more
Principal Component Analysis (PCA) is one of the most widely used approaches for analyzing multivariate datasets. Biologists use PCA to visualize data, identify patterns in large datasets, determine independent axes of variation, and reduce dimensionality for further statistical analyses. Phylogenetic PCA is an extension of regular PCA that seeks to identify the major axes of variation independent of the phylogeny. We extend these methods by estimating PCA parameters using an explicit probability modeling framework. We implement multiple models of trait evolution (Brownian motion, Ornstein-Uhlenbeck, Early Burst, and Pagel's λ) and use the Akaike Information Criterion (AIC) for model selection. We also introduce a probabilistic approach to select the number of principal components to retain from a PCA. We demonstrate the advantages of probabilistic PCA, such as incorporating the error, or noise, arising from dimensionality reduction, which is ignored in regular PCA. We use extensive simulations and an empirical dataset with 35 traits to show the method's performance. We implemented the new approach in the R package "do3PCA" available from the CRAN repository.
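The idea of choosing how many components to retain by likelihood rather than by eye can be illustrated in a generic, non-phylogenetic setting with scikit-learn, whose PCA.score() returns the average log-likelihood under the probabilistic PCA model; this sketch is not the do3PCA package, only the underlying principle.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score

# synthetic trait matrix: 200 taxa x 35 traits with roughly 5 real axes of variation
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 5))
X = latent @ rng.normal(size=(5, 35)) + 0.3 * rng.normal(size=(200, 35))

scores = []
for k in range(1, 11):
    ll = cross_val_score(PCA(n_components=k), X, cv=5).mean()  # mean held-out log-likelihood
    scores.append((k, ll))

best_k = max(scores, key=lambda kv: kv[1])[0]
print("components to retain:", best_k)
```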
- New
- Research Article
- 10.1080/13682199.2026.2641892
- Mar 14, 2026
- The Imaging Science Journal
- B Pravallika + 2 more
Most of the existing image forgery detection methods are limited to binary classification or the detection of a single forgery type, making them unsuitable for comprehensive analysis. This article proposes a novel multi-class image forgery detection strategy capable of categorizing images into five classes: authentic, retouched, geometrically manipulated, copy-move forged, and spliced. The framework employs hybrid features comprising elliptical upper and lower ternary pattern histograms, SURF bins, and Network-Based Dimensionality Reduction (NBDR), combined with an Artificial Neural Network (ANN) classifier optimized using the Red-Tailed Hawk Algorithm (RTHA). Experimental evaluation on a dataset of 2,500 authentic and forged images demonstrated the superior performance of the proposed method, achieving 94.32% accuracy, 98.58% precision, 94.32% recall, 94.36% F1-score, and an AUC of 0.965. These results confirm that the integration of hybrid features with RTHA-optimized ANN classification offers a robust and reliable solution for multi-class digital image forgery detection.
- New
- Research Article
- 10.1002/advs.202524261
- Mar 14, 2026
- Advanced Science
- Hongze Li + 8 more
To resolve acoustic-mechanical conflicts and address research gaps in underwater coatings, and inspired by the biomechanics of jumping spiders and human bones, we design an underwater composite structure that operates under hydrostatic pressure. The design rests on two mechanisms: weak energy entanglement driven by damping and wave-mode conversion driven by impedance mismatch. Through a synergistic combination of theoretical modeling, numerical simulation, and experimental validation, the structure achieves low-intensity diffuse reflection below 0.8 kHz and broadband low-frequency sound attenuation at 0.8-2.5 kHz (insulation > 26 dB, absorption > 0.8). Notably, the structure maintains an absorption coefficient exceeding 0.8 below 4 kHz even under 3 MPa of hydrostatic pressure. The sound attenuation performance decreases by an average of only 4.5% per 1 MPa increase in pressure, and the deformation recovers nearly 100% after unloading. By integrating an acoustic-electrical analogy model for component dimensionality reduction with a convolutional neural network for visual quality evaluation, we establish an integrated design-evaluation framework. This strategy provides a scalable approach for next-generation underwater acoustic skins.
- New
- Research Article
- 10.1038/s41467-026-70343-0
- Mar 13, 2026
- Nature Communications
- Dongsheng Mao + 15 more
Digital medicine leverages digital biomarkers by algebraically integrating multiple biomarkers to reflect disease status. Colorimetric analysis offers an intuitive readout, but colorimetric-based digital medicine remains underexplored. Here we show an Enzymatic Colorimetric Encoding-based Digital Medicine platform (EnCODE). By harnessing enzyme-catalyzed multicolor encoding in tandem with the programmability of DNA technology, EnCODE converts multidimensional miRNA information into recognizable optical signals. We demonstrate that these signals are decodable and can be interpreted by visual inspection or spectral analysis, facilitating dimensionality reduction and visualization of disease states. Additionally, EnCODE integrates a continuous weighting mechanism that enables accurate mapping of digital biomarkers. In a cohort of 163 pancreatic cancer clinical samples, EnCODE achieves 96% detection sensitivity and 90% overall accuracy, comparable to the 96% sensitivity and 91% overall accuracy achieved with conventional molecular diagnostic methods. We increase data density through three-dimensional color encoding and hyperspectral imaging-based analysis, enabling an intuitive color-coded molecular readout.
- New
- Research Article
- 10.1080/19452829.2026.2642024
- Mar 13, 2026
- Journal of Human Development and Capabilities
- Octaviano Rojas Luiz + 3 more
Martha Nussbaum's central human capabilities provide a crucial theoretical foundation within the capability approach, yet their empirical operationalisation often relies on predefined theoretical groupings. This research adopts an empirically driven strategy to explore the dimensional reduction of human capabilities based on Nussbaum’s list and the identification of respondent profiles through clustering, using primary data collected in a specific socio-economic context. Using Principal Component Analysis, the capabilities were aggregated into four empirical dimensions: relational autonomy, social respect, physical and mental health, and environment. Additionally, the sample was clustered into five groups based on their capability profiles: Capable, Dependent, Threatened, Vulnerable and Debilitated. This study contributes to the literature on capability measurement by empirically exploring how indicators derived from Nussbaum’s framework relate to one another within a specific socio-economic context. It demonstrates that empirical dimensions can deviate significantly from Nussbaum's theoretical proposal. Furthermore, the findings support the use of dimension reduction techniques to develop capability models better suited for structural equation modelling applications.
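A minimal sketch of the two-step analysis described above: standardize the survey items, reduce them to four principal components, and cluster respondents into five profiles. The data file and item columns are hypothetical, and k-means stands in for whatever clustering procedure the authors used.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

survey = pd.read_csv("capability_survey.csv")     # hypothetical respondent-by-item table (numeric items only)
Z = StandardScaler().fit_transform(survey)

pca = PCA(n_components=4)                         # four empirical dimensions, as in the study
scores = pca.fit_transform(Z)

profiles = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(scores)
survey["profile"] = profiles                      # e.g. Capable, Dependent, Threatened, Vulnerable, Debilitated
print(pd.Series(profiles).value_counts())
```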
- New
- Research Article
- 10.2196/87237
- Mar 12, 2026
- JMIR Medical Informatics
- Chengdong Peng + 5 more
In traditional Chinese medicine (TCM), diagnosing a patient's physical constitution from tongue images is a process of collecting clinical information, reasoning, and combining tongue-image features with questioning. Simulating how TCM practitioners recognize pathological information in tongue images and conduct professional dialogue based on those features helps to develop an intelligent interactive system for TCM diagnosis. This study aimed to develop and validate a vertical model for the TCM domain with the ability to understand and reason about tongue images. A TongueVLM multimodal large model is designed, comprising a visual encoder module, a modal fusion module, and a language decoder module. First, a visual encoder based on the CLIP-ViT (Contrastive Language-Image Pre-Training With Vision Transformer) pretrained model performs image patching, dimensionality reduction, and transfer learning, mapping high-dimensional tongue features into low-dimensional language encoding vectors. Next, a modal fusion module with a residual architecture maps the visual features into a natural-language word-embedding space, aligning the visual encoding with TCM terminology. Finally, visual instruction fine-tuning is performed on LLaMA (large language model meta artificial intelligence), yielding a TCM-domain large language model with 7B parameters. The constructed multimodal dataset includes 3 test sets, and experiments are conducted with 3000 samples from each. Experimental results indicate that the TongueVLM model outperforms general-purpose large models on all 3 tasks. On the multimodal test sets, TongueVLM achieved accuracy rates of 79.8%, 78.6%, and 60.7% on the respective evaluation tasks, which is 9.1%, 8.4%, and 1.1% higher than LLaVA-OneVision and 7.5%, 7%, and 5.9% higher than Qwen2.5-VL-7B, with text generation running at around 24 tokens per second. TongueVLM, which performs tongue-image description generation and physical constitution reasoning in TCM, is suitable for use in an intelligent TCM diagnosis system.
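As an illustration of a residual modal-fusion projector of the kind described (mapping visual patch features into a language model's word-embedding space), here is a hypothetical PyTorch sketch; the dimensions and layer layout are assumptions, not the TongueVLM architecture.

```python
import torch
import torch.nn as nn

class ResidualFusion(nn.Module):
    """Hypothetical residual fusion module: project visual features to the
    word-embedding width, then refine them with a residual MLP."""
    def __init__(self, vis_dim=1024, txt_dim=4096):
        super().__init__()
        self.proj = nn.Linear(vis_dim, txt_dim)        # visual patch features -> word-embedding space
        self.mlp = nn.Sequential(
            nn.Linear(txt_dim, txt_dim),
            nn.GELU(),
            nn.Linear(txt_dim, txt_dim),
        )

    def forward(self, vis_feats):                      # vis_feats: (batch, patches, vis_dim)
        x = self.proj(vis_feats)
        return x + self.mlp(x)                         # residual connection preserves the linear projection path

# toy usage: 196 ViT patch features of width 1024 mapped to a 4096-d LLM embedding space
feats = torch.randn(2, 196, 1024)
print(ResidualFusion()(feats).shape)                   # torch.Size([2, 196, 4096])
```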
- New
- Research Article
- 10.1097/cmr.0000000000001087
- Mar 12, 2026
- Melanoma Research
- Ahmad A Tarhini + 11 more
Host genetic ancestry plays an important role in shaping somatic mutation landscapes and may influence therapeutic outcomes as well as the risk of developing treatment-related adverse events. Because genetic ancestry has been associated with differential susceptibility to melanoma subtypes, distinct somatic mutation frequencies, and variable responses to immune checkpoint inhibitors, its role warrants further investigation. This study investigated the genetic ancestry of a North American melanoma population using banked biospecimens from 744 patients enrolled in the ECOG-ACRIN E1609 phase III clinical trial (stages IIIB, IIIC, M1a, or M1b). Peripheral blood samples were genotyped using the Illumina Infinium Global Screening Array v3.0 + Multi-Disease BeadChip, followed by quality control, integration with a reference dataset, and linkage disequilibrium pruning (198,064 single nucleotide polymorphisms). Dimensionality reduction was performed with Uniform Manifold Approximation and Projection analysis, and genetic ancestry was inferred using unsupervised ADMIXTURE models. Most patients (728 of 744; 97.8%) had predominant European (EUR) ancestry, followed by minor representation from admixed American (12 of 744; 1.6%) and East Asian (4 of 744; 0.5%) populations. Moreover, based on the ADMIXTURE model (K = 5), 96.9% of participants had an estimated EUR ancestry proportion exceeding 80%. Self-reported race and ethnicity demonstrated strong concordance with genetically inferred ancestry, although a small subset of participants exhibited discordant ancestry components. Participants who self-identified as Hispanic exhibited mixed EUR-Admixed American ancestry components. Most patients represented predominant EUR ancestry, with limited representation of non-EUR populations. Integrating ancestry-informed genomic analyses will enhance understanding of melanoma susceptibility, improve prediction of immune-related adverse events, and support the development of tailored immunotherapy strategies.
- New
- Research Article
- 10.1080/00207160.2026.2643403
- Mar 12, 2026
- International Journal of Computer Mathematics
- Ming Zheng + 2 more
In this paper, we explore the quadratic immersed finite element approximation, and a dimensional reduction is employed to solve second-order elliptic interface problems on circular and spherical domains. Specifically, by utilizing the orthogonality properties of Fourier series and spherical harmonics, we reformulate the original problem as a sequence of decoupled one-dimensional second-order interface problems, which are solely related to the radial direction. In addition, we introduce the essential polar condition and the corresponding weighted Sobolev space, which address the variable coefficients and singularities introduced by coordinate transformations. The well-posedness of the weak solution and its approximate solution, as well as the error estimates between them, are theoretically analysed and proven. Lastly, numerous numerical experiments provide evidence for the algorithm's efficiency and convergence.
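For readers unfamiliar with the dimensional reduction step, a generic constant-coefficient version of the Fourier decoupling on the unit disk is sketched below; the paper treats the more general variable-coefficient interface problem, but the mechanism is the same.

```latex
% Sketch: -\Delta u = f on the unit disk, u = 0 on the boundary.
% Expand solution and data in angular Fourier modes,
\[
  u(r,\theta) = \sum_{k\in\mathbb{Z}} \hat u_k(r)\, e^{ik\theta}, \qquad
  f(r,\theta) = \sum_{k\in\mathbb{Z}} \hat f_k(r)\, e^{ik\theta}.
\]
% Orthogonality of the modes decouples the two-dimensional problem into
% one-dimensional radial problems, one per frequency k:
\[
  -\frac{1}{r}\,\bigl(r\,\hat u_k'(r)\bigr)' + \frac{k^2}{r^2}\,\hat u_k(r) = \hat f_k(r),
  \qquad 0 < r < 1, \qquad \hat u_k(1) = 0,
\]
% with the essential polar condition \hat u_k(0) = 0 for k \neq 0
% (and \hat u_0'(0) = 0 for k = 0) removing the coordinate singularity at r = 0.
```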