Data In Low-dimensional Space Research Articles

Background: Recent developments in single-cell RNA sequencing have opened up a multitude of possibilities to study tissues at the level of cellular populations. However, the heterogeneity in single-cell sequencing data necessitates appropriate procedures to adjust for technological limitations and various sources of noise when integrating datasets from different studies. While many analysis procedures employ various preprocessing steps, they often overlook the importance of selecting and optimizing the data transformation methods they employ.

Results: This work investigates data transformation approaches used in single-cell clustering analysis tools and their effects on batch integration analysis. In particular, we compare 16 transformations and their impact on low-dimensional representations, aiming to reduce the batch effect and integrate multiple single-cell sequencing datasets. Our results show that data transformations strongly influence the results of single-cell clustering in low-dimensional space, such as the spaces generated by UMAP or PCA. Moreover, these changes in low-dimensional space also significantly affect trajectory analysis using multiple datasets. However, the performance of the data transformations varies greatly across datasets, and the optimal method was different for each dataset. Additionally, we explored how data transformation impacts the analysis of deep feature encodings using deep neural network-based models, including autoencoder-based models and prototypical networks. Data transformation also strongly affects the outcome of deep neural network models.

Conclusions: Our findings suggest that the batch effect and noise in integrative analysis are highly influenced by data transformation. Low-dimensional features can integrate different batches well when proper data transformation is applied. Furthermore, we found that the batch mixing score in low-dimensional space can guide the selection of the optimal data transformation. In conclusion, data preprocessing is one of the most crucial analysis steps and needs to be carefully considered in the integrative analysis of multiple scRNA-seq datasets.
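
As a rough illustration of how a transformation can change batch mixing, the sketch below compares raw counts with library-size-normalized, log-transformed counts on a toy dataset. Everything in it is an assumption made for illustration: the toy counts, the `log_normalize` helper, and the k-nearest-neighbour definition of the mixing score are not taken from the article, which does not define its score in the abstract. The toy data has only two genes, so no dimensionality reduction step is applied.

```python
import math

def log_normalize(counts, scale=10_000):
    """Divide each cell by its total count, rescale, then apply log1p."""
    out = []
    for cell in counts:
        total = sum(cell)
        out.append([math.log1p(scale * c / total) for c in cell])
    return out

def batch_mixing_score(points, batches, k=2):
    """Mean fraction of each cell's k nearest neighbours from another batch.

    Hypothetical score for illustration: 0 means batches are fully
    separated, higher values mean neighbours come from other batches.
    """
    fractions = []
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        others.sort(key=lambda j: math.dist(p, points[j]))
        foreign = sum(batches[j] != batches[i] for j in others[:k])
        fractions.append(foreign / k)
    return sum(fractions) / len(fractions)

# Toy data (assumed): two cell types, two genes; batch 1 was sequenced
# roughly ten times deeper than batch 0.
counts = [
    [10, 1], [11, 1], [9, 1], [1, 10], [1, 11], [1, 9],              # batch 0
    [100, 11], [108, 10], [92, 10], [11, 100], [10, 108], [10, 92],  # batch 1
]
batches = [0] * 6 + [1] * 6

raw_score = batch_mixing_score(counts, batches)
norm_score = batch_mixing_score(log_normalize(counts), batches)
print(f"raw: {raw_score:.2f}  log-normalized: {norm_score:.2f}")
```

On this toy data the raw counts separate by sequencing depth, while the normalized values group cells by type across batches, so the second score is higher. Real analyses would compute such a score on a PCA or UMAP embedding and across many candidate transformations.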


Graph representation learning aims to learn representations of graph-structured data in low-dimensional space and has a wide range of applications in graph analysis tasks. Real-world networks are generally heterogeneous and dynamic: they contain multiple types of nodes and edges, and the graph may evolve rapidly over time. The complex heterogeneous properties and rapidly evolving graph structures make it difficult to learn high-quality graph representations for dynamic heterogeneous graphs. Currently, studies on representation learning for temporal heterogeneous networks are insufficient. Existing methods either rely on meta-paths, where the embedding quality depends heavily on experts’ selection, or use network snapshots, where fine-grained temporal information cannot be captured. In this paper, we propose a novel graph neural network model, the node signature based Temporal Heterogeneous Graph Attention Network (THGAT), for learning the representations of dynamic heterogeneous networks. THGAT improves the way neighborhood information is aggregated and weights the importance of neighbor nodes using both the heterogeneous information and the temporal information in the network. We also propose three node signature methods for encoding the heterogeneous information of the nodes and use a time encoding technique suitable for real-time networks to represent the temporal information directly, so as to overcome the limitations of existing methods. We conduct experiments on four real-world datasets, and the results demonstrate that THGAT significantly improves representation learning quality in link prediction, node classification, and node clustering compared to state-of-the-art methods. To make the work more complete, we also analyze experimentally the scenarios to which each of the three node signature methods applies.
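
The abstract does not say which time encoding THGAT uses. As a hedged illustration of the general idea of representing a timestamp directly as a vector, the sketch below maps a time gap to sinusoidal features with fixed frequencies. The function name `time_encoding`, the `dim` and `max_period` parameters, and the geometric frequency schedule are all assumptions for illustration, not the authors' method.

```python
import math

def time_encoding(delta_t, dim=8, max_period=10_000.0):
    """Map a time gap to `dim` features: a (cos, sin) pair per frequency.

    Assumed scheme: frequencies decay geometrically from 1 down towards
    1 / max_period, so short and long time gaps are both distinguishable.
    """
    if dim % 2 != 0:
        raise ValueError("dim must be even")
    half = dim // 2
    features = []
    for i in range(half):
        freq = 1.0 / (max_period ** (i / half))
        features.append(math.cos(freq * delta_t))
        features.append(math.sin(freq * delta_t))
    return features

# Encode the gap between an interaction and the current time (toy values).
event_time, now = 120.0, 125.0
print(time_encoding(now - event_time))
```

A vector like this could be concatenated with a neighbor's features before attention weights are computed, which avoids splitting the graph into coarse snapshots. In learned variants the frequencies are trainable parameters rather than fixed.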



Articles published on Data In Low-dimensional Space

The effect of data transformation on low-dimensional integration of single-cell RNA-seq

Laplacian-based Cluster-Contractive t-SNE for High-Dimensional Data Visualization

Dynamic heterogeneous graph representation learning with neighborhood type modeling

Visualizing hierarchies in scRNA-seq data using a density tree-biased autoencoder

From Anomaly Detection to Novel Fault Discrimination for Wind Turbine Gearboxes With a Sparse Isolation Encoding Forest

Deep Learning and Machine Learning for Early Detection of Stroke and Haemorrhage

Manifold regularization ensemble clustering with many objectives using unsupervised extreme learning machines

Dimensionality Reduction for Tensor Data Based on Local Decision Margin Maximization

Root cause analysis approach based on reverse cascading decomposition in QFD and fuzzy weight ARM for quality accidents

Toward Making Unsupervised Graph Hashing Discriminative

Projection Analysis Optimization for Human Transition Motion Estimation

A New Method for Emulating Self-Organizing Maps for Visualization of Datasets

Feature selection based dual-graph sparse non-negative matrix factorization for local discriminative clustering

Gene selection for microarray data classification via subspace learning and manifold regularization

Band selection for hyperspectral image classification with spatial–spectral regularized sparse graph

An application of PART to the Football Manager data for players clusters analyses to inform club team formation

Non-negative Matrix Factorization with Pairwise Constraints and Graph Laplacian

Extensional Neighborhood Preserving Embedding

Performance Analysis of Hybrid (supervised and unsupervised) method for multiclass data set

Graph-based local concept coordinate factorization

