Most existing techniques for handling imbalanced data may be invalid in the presence of missing data, since they assume that the data are complete. To bridge this gap, a novel synthetic minority oversampling technique (SMOTE), namely the Non-negative latent factor analysis-incorporated and Switching triple-weight-SMOTE (NSS), is proposed. The main idea of NSS is four-fold: 1) a Lagrange non-negative matrix factorization (LNMF) method is put forward to impute the missing values with guaranteed non-negativity, consistent with the original distribution owing to its consideration of global feature information; 2) by mapping the imputed, complete data into an empirical feature space (EFS), a more separable dataset is obtained that rigidly preserves the geometrical structure of the original data while efficiently reducing redundant features, thereby enhancing model generalization and computational efficiency; 3) after fuzzy <inline-formula xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"> <tex-math notation="LaTeX">$c$</tex-math> </inline-formula> -means (FCM) clustering, the inter-cluster distance, the capacity of each minority cluster, and its sparsity are comprehensively taken into account to develop a triple-weight assignment strategy, which allocates an appropriate number of synthetic samples to each cluster; 4) a switching oversampling strategy is provided to handle clusters with different distributions (i.e., either Gaussian or uniform). Moreover, a posterior check is used to verify the correctness of the synthetic samples.
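The LNMF imputation step described in 1) is not specified in detail here; as a rough illustration of the underlying idea, the following is a minimal sketch of missing-value imputation via masked non-negative matrix factorization with standard multiplicative updates restricted to the observed entries. The function name, rank, and iteration count are illustrative assumptions, not the paper's actual algorithm.

```python
import numpy as np

def nmf_impute(X, rank=2, iters=500, eps=1e-9):
    """Impute NaNs in a non-negative matrix via masked NMF.

    Multiplicative updates are restricted to observed entries, so the
    reconstruction (and hence each imputed value) stays non-negative.
    This is a generic sketch, not the paper's LNMF method.
    """
    M = ~np.isnan(X)                 # mask of observed entries
    Xf = np.where(M, X, 0.0)         # zero-fill NaNs for arithmetic
    rng = np.random.default_rng(0)
    n, m = X.shape
    W = rng.random((n, rank)) + eps  # positive initialization
    H = rng.random((rank, m)) + eps
    for _ in range(iters):
        R = W @ H
        W *= ((Xf * M) @ H.T) / (((R * M) @ H.T) + eps)
        R = W @ H
        H *= (W.T @ (Xf * M)) / ((W.T @ (R * M)) + eps)
    Xhat = W @ H                     # non-negative reconstruction
    return np.where(M, X, Xhat)      # keep observed values as-is
```

Because both factors remain non-negative under multiplicative updates, the imputed entries are guaranteed non-negative, mirroring the non-negativity guarantee claimed for LNMF.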
Finally, experiments on a real dataset and <inline-formula xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"> <tex-math notation="LaTeX">$12$</tex-math> </inline-formula> public datasets show that the proposed NSS outperforms <inline-formula xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"> <tex-math notation="LaTeX">$11$</tex-math> </inline-formula> other state-of-the-art methods. <italic xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">Note to Practitioners</italic> —Data classification is an important task that has been successfully applied in many domains, including but not limited to medicine, finance, and manufacturing. However, a classifier faces two major challenges when handling real-world data: imbalanced classes and missing values. More specifically, model performance is likely to degrade because of the missing information and the bias of classifiers toward the majority class. To surmount this problem, a natural idea is to use the LNMF model to obtain the desired recovery. Then, on the imputed, complete data, the empirical-feature-space-based switching triple-weight-SMOTE is applied to synthesize safe and correct samples (i.e., samples lying solidly in the minority-class region) to achieve balance. This working principle yields the novel NSS strategy, which has two obvious merits: 1) the imputation guarantees similarity to the original dataset; and 2) new synthetic data are generated safely, with adequate consideration of the information and distribution of the dataset. Thus, the proposed NSS can greatly improve the classification accuracy on real-world datasets.
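For readers unfamiliar with the SMOTE family underlying this work, the following is a minimal sketch of classic SMOTE-style interpolation: each synthetic point is placed on the segment between a minority sample and one of its nearest minority neighbors, so it stays inside the minority region. The function name and parameters are illustrative assumptions; NSS's triple-weight allocation and switching strategy are not reproduced here.

```python
import numpy as np

def smote_oversample(X_min, n_new, k=3, seed=0):
    """Generate n_new synthetic minority samples by SMOTE-style
    interpolation between minority points and their k nearest
    minority neighbors. Generic sketch, not the NSS algorithm."""
    rng = np.random.default_rng(seed)
    n = len(X_min)
    # pairwise distances among minority samples
    d = np.linalg.norm(X_min[:, None] - X_min[None, :], axis=-1)
    np.fill_diagonal(d, np.inf)          # exclude self-matches
    nbrs = np.argsort(d, axis=1)[:, :k]  # k nearest minority neighbors
    out = []
    for _ in range(n_new):
        i = rng.integers(n)              # pick a minority sample
        j = nbrs[i, rng.integers(k)]     # pick one of its neighbors
        lam = rng.random()               # interpolation coefficient
        out.append(X_min[i] + lam * (X_min[j] - X_min[i]))
    return np.array(out)
```

Interpolating only between minority samples keeps synthetic points within the minority region's convex hull, which is the "safe" generation property the abstract refers to; NSS additionally weights how many samples each minority cluster receives and switches the generation rule by cluster distribution.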