Latent Representation Research Articles

The conventional method of computing personality scores through extensive questionnaire-based surveys poses practical challenges in real-world scenarios. An alternate route is to predict personality scores from user reviews by analyzing various linguistic features such as writing style, word choices, and specific phrases. However, the reviews are domain-dependent and classification models trained on one domain cannot be readily applied to other domains. To mitigate this challenge, we propose a cross-domain recommendation framework called PEMF-CD which leverages a novel mixing strategy to integrate user reviews from multiple domains with common joint embedding space and predict user personality scores using a transformer model. By capturing the underlying semantics and latent representations within the textual data, the transformer architecture can effectively model the linguistic cues to infer users’ personality traits, and the learning is transferred across domains. To further enhance the recommendation process, our model integrates personality-wise and rating pattern-based similarities of users into a probabilistic matrix factorization method that fosters user neighborhoods based on similarity scores among users. Comprehensive experiments were conducted using five real-world datasets from TripAdvisor and Amazon with varied numbers of users, items, and reviews of up to 44187, 26386, and 426791, respectively. The performance has been benchmarked against thirteen baseline algorithms and the experimental results demonstrate a significant improvements of up to 24.72%, 64.28%, 48.79%, and 61% in RMSE, and 55.9%, 76.7%, 67.6%, and 71.5% in MAE for a 90:10 train-test split with Digital Music, Fashion, Magazine Subscriptions and Video Games datasets from Amazon, respectively. Similar results have been observed for the 80:20 train-test split.

Read full abstract

Effective molecular feature representation is crucial for drug property prediction. Recent years have seen increased attention on graph neural networks (GNNs) that are pre-trained using self-supervised learning techniques, aiming to overcome the scarcity of labeled data in molecular property prediction. Traditional GNNs in self-supervised molecular property prediction typically perform a single masking operation on the nodes and edges of the input molecular graph, masking only local information and insufficient for thorough self-supervised training. Hence, we propose a model for molecular property prediction based on generative double-masking self-supervised learning, termed as GDMol. This integrates generative learning into the self-supervised learning framework for latent representation, and applies a second round of masking to these latent representations, enabling the model to better capture global information and semantic knowledge of the molecules for a richer, more informative representation, thereby achieving more accurate and robust molecular property prediction. Our experiments on 5 datasets demonstrated superior performance of GDMol in predicting molecular properties across different domains. Moreover, we used the masking operation to traverse through the gradient changes of each node, the magnitude and sign of which reflect the positive and negative contribution respectively of the local structure in the molecule to the prediction outcome. This in-depth interpretative analysis not only enhances the model's interpretability, but also provides more targeted insights and direction for optimizing drug molecules. In summary, this research offers novel insights on improving molecular property prediction tasks, and paves the way for further research on the application of generative learning and self-supervised learning in the field of chemistry.

Read full abstract

Latent Representation Research Articles

Related Topics

Articles published on Latent Representation

From Envelope Spectra to Bearing Remaining Useful Life: An Intelligent Vibration-Based Prediction Model with Quantified Uncertainty

Bimodal PET/MRI generative reconstruction based on VAE architectures.

How Resilient Are Kolmogorov–Arnold Networks in Classification Tasks? A Robustness Investigation

Megavariate Methods Capture Complex Genotype-by-Environment Interactions.

Unsupervised representation learning of Kohn–Sham states and consequences for downstream predictions of many-body effects

Exploring Inherent Consistency for Semi-Supervised Anatomical Structure Segmentation in Medical Imaging.

BrainMass: Advancing Brain Network Analysis for Diagnosis With Large-Scale Self-Supervised Learning.

Enhancing cross-domain recommendations: Leveraging personality-based transfer learning with probabilistic matrix factorization

Roodmus: a toolkit for benchmarking heterogeneous electron cryo-microscopy reconstructions.

An Effective Network With Discrete Latent Representation Designed for Massive MIMO CSI Feedback

TRust Your GENerator (TRYGEN): Enhancing Out-of-Model Scope Detection

Partial Quantile Tensor Regression

Diffusion-driven Incomplete Multimodal Learning for Air Quality Prediction

GDMol: Generative Double-Masking Self-Supervised Learning for Molecular Property Prediction.

Point-MPP: Point Cloud Self-Supervised Learning From Masked Position Prediction.

Pre-training with a rational approach for antibody sequence representation.

Functional Principal Component Analysis for Continuous Non-Gaussian, Truncated, and Discrete Functional Data.

Prior cocaine use diminishes encoding of latent information by orbitofrontal, but not medial, prefrontal ensembles.

FedATA: Adaptive attention aggregation for federated self-supervised medical image segmentation

A novel enhancing method for terahertz imaging of integrated circuits flaw detection

Lead the way for us

Editage

Paperpal

R Discovery

Mind the Graph

Latent Representation Research Articles

Related Topics

Articles published on Latent Representation

From Envelope Spectra to Bearing Remaining Useful Life: An Intelligent Vibration-Based Prediction Model with Quantified Uncertainty

Bimodal PET/MRI generative reconstruction based on VAE architectures.

How Resilient Are Kolmogorov–Arnold Networks in Classification Tasks? A Robustness Investigation

Megavariate Methods Capture Complex Genotype-by-Environment Interactions.

Unsupervised representation learning of Kohn–Sham states and consequences for downstream predictions of many-body effects

Exploring Inherent Consistency for Semi-Supervised Anatomical Structure Segmentation in Medical Imaging.

BrainMass: Advancing Brain Network Analysis for Diagnosis With Large-Scale Self-Supervised Learning.

Enhancing cross-domain recommendations: Leveraging personality-based transfer learning with probabilistic matrix factorization

Roodmus: a toolkit for benchmarking heterogeneous electron cryo-microscopy reconstructions.

An Effective Network With Discrete Latent Representation Designed for Massive MIMO CSI Feedback

TRust Your GENerator (TRYGEN): Enhancing Out-of-Model Scope Detection

Partial Quantile Tensor Regression

Diffusion-driven Incomplete Multimodal Learning for Air Quality Prediction

GDMol: Generative Double-Masking Self-Supervised Learning for Molecular Property Prediction.

Point-MPP: Point Cloud Self-Supervised Learning From Masked Position Prediction.

Pre-training with a rational approach for antibody sequence representation.

Functional Principal Component Analysis for Continuous Non-Gaussian, Truncated, and Discrete Functional Data.

Prior cocaine use diminishes encoding of latent information by orbitofrontal, but not medial, prefrontal ensembles.

FedATA: Adaptive attention aggregation for federated self-supervised medical image segmentation

A novel enhancing method for terahertz imaging of integrated circuits flaw detection