Probabilistic Generative Model Research Articles

With the rapid proliferation of social networking sites (SNS), automatic topic extraction from text messages posted on SNS is becoming an important source of information for understanding current social trends or needs. Latent Dirichlet Allocation (LDA), a probabilistic generative model, is one of the popular topic models in Natural Language Processing (NLP) and has been widely used in information retrieval, topic extraction, and document analysis. Unlike long texts from formal documents, messages on SNS are generally short. Traditional topic models such as LDA or pLSA (probabilistic latent semantic analysis) suffer performance degradation in short-text analysis because each short text carries little word co-occurrence information. To cope with this problem, various techniques are evolving for interpretable topic modeling of short texts; combining topic models with word embeddings pretrained on an external corpus is one of them. Owing to recent developments in deep neural networks (DNN) and deep generative models, neural-topic models (NTM) are emerging to achieve flexibility and high performance in topic modeling. However, there are very few research works on neural-topic models with pretrained word embedding for generating high-quality topics from short texts. In this work, in addition to pretrained word embedding, a fine-tuning stage with the original corpus is proposed for training neural-topic models in order to generate semantically coherent, corpus-specific topics. An extensive study with eight neural-topic models has been carried out to check the effectiveness of additional fine-tuning and pretrained word embedding in generating interpretable topics, using simulation experiments with several benchmark datasets. The extracted topics are evaluated with different metrics of topic coherence and topic diversity. We have also studied the performance of the models in classification and clustering tasks. Our study concludes that although auxiliary word embedding with a large external corpus improves the topic coherence of short texts, an additional fine-tuning stage is needed for generating more corpus-specific topics from short-text data.
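The abstract does not say which topic diversity metric is used. As a minimal sketch, assuming one common definition (the fraction of unique words among the top-k words of all topics), the computation could look like the following. The function name `topic_diversity`, the parameter `top_k`, and the example topics are hypothetical and are not taken from the article.

```python
from typing import List


def topic_diversity(topics: List[List[str]], top_k: int = 10) -> float:
    """Fraction of unique words among the top_k words of every topic.

    Assumed definition for illustration only: 1.0 means no word is shared
    between topics, values near 0 mean the topics are highly redundant.
    Each topic is a list of words sorted from most to least probable.
    """
    # Collect the top_k words of each topic into one flat list.
    top_words = [word for topic in topics for word in topic[:top_k]]
    if not top_words:
        return 0.0
    # Unique words divided by total words considered.
    return len(set(top_words)) / len(top_words)


if __name__ == "__main__":
    # Hypothetical topics: "game" appears in both, so 3 of 4 words are unique.
    example_topics = [["game", "team"], ["game", "movie"]]
    print(topic_diversity(example_topics, top_k=2))
```

A score like this is usually read alongside a coherence score, since a model can produce diverse topics that are not individually interpretable.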

Probabilistic generative models are attractive for scientific modeling because their inferred parameters can be used to generate hypotheses and design experiments. This requires that the learned model provide an accurate representation of the input data and yield a latent space that effectively predicts outcomes relevant to the scientific question. Supervised Variational Autoencoders (SVAEs) have previously been used for this purpose, as a carefully designed decoder can be used as an interpretable generative model of the data, while the supervised objective ensures a predictive latent representation. Unfortunately, the supervised objective forces the encoder to learn a biased approximation to the generative posterior distribution, which renders the generative parameters unreliable when used in scientific models. This issue has remained undetected because the reconstruction losses commonly used to evaluate model performance do not detect bias in the encoder. We address this previously unreported issue by developing a second-order supervision framework (SOS-VAE) that updates the decoder parameters, rather than the encoder, to induce a predictive latent representation. This ensures that the encoder maintains a reliable posterior approximation and that the decoder parameters can be effectively interpreted. We extend this technique to allow the user to trade off bias in the generative parameters for improved predictive performance, acting as an intermediate option between SVAEs and our new SOS-VAE. We also use this methodology to address missing data issues that often arise when combining recordings from multiple scientific experiments. We demonstrate the effectiveness of these developments using synthetic data and electrophysiological recordings, with an emphasis on how our learned representations can be used to design scientific experiments.
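For orientation, a supervised VAE objective of the kind the abstract describes is often written as an evidence lower bound plus a weighted supervised term on the latent code. The notation below (encoder $q_\phi$, decoder $p_\theta$, predictor $p_\psi$, weight $\lambda$) is assumed for illustration and is not taken from the article:

```latex
\mathcal{L}_{\text{SVAE}}(\theta, \phi, \psi)
  = \mathbb{E}_{q_\phi(z \mid x)}\!\left[\log p_\theta(x \mid z)\right]
  - \mathrm{KL}\!\left(q_\phi(z \mid x)\,\|\,p(z)\right)
  + \lambda\, \mathbb{E}_{q_\phi(z \mid x)}\!\left[\log p_\psi(y \mid z)\right]
```

In this generic form the supervised term is an expectation under $q_\phi$, so its gradient reaches the encoder parameters $\phi$. That is consistent with the abstract's point that supervision can pull the encoder away from the true generative posterior. The abstract states that SOS-VAE instead induces predictiveness by updating the decoder parameters, but it does not give the form of that update, so it is not sketched here.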

Articles published on Probabilistic Generative Model

Regularized Bayesian calibration and scoring of the WD-FAB IRT model improves predictive performance over marginal maximum likelihood.

Fault Diagnosis of Machines Using Deep Convolutional Beta-Variational Autoencoder

Get out of the BAG! Silos in AI Ethics Education: Unsupervised Topic Modeling Analysis of Global AI Curricula

Non-volume preserving-based fusion to group-level emotion recognition on crowd videos

A whole brain probabilistic generative model: Toward realizing cognitive architectures for developmental robots

Neural networks with upper and lower bound constraints and its application on industrial soft sensing modeling with missing values

The cultural evolution of love in literary history

AIDA: An Active Inference-Based Design Agent for Audio Processing Algorithms

Resilient Cyber-Physical Energy Systems Using Prior Information Based on Gaussian Process

Reciprocity, community detection, and link prediction in dynamic networks

Map completion from partial observation using the global structure of multiple environmental maps

Multiagent multimodal categorization for symbol emergence: emergent communication via interpersonal cross-modal inference

A 30 year topic analysis of Veterinary Medicine literature

Investigating the Efficient Use of Word Embedding with Neural-Topic Models for Interpretable Topics from Short Texts.

Machine learning methods for generating high dimensional discrete datasets

Modeling and Detecting Communities in Node Attributed Networks

Probabilistic Generative Model for Hyperspectral Unmixing Accounting for Endmember Variability

Supervising the Decoder of Variational Autoencoders to Improve Scientific Utility.

Dynamic Self-Supervised Teacher-Student Network Learning.

Generative Text Convolutional Neural Network for Hierarchical Document Representation Learning.
