Text Data Analysis Using Generalized Linear Mixed Model and Bayesian Visualization

Sunghae Jun

doi:10.3390/axioms11120674

Abstract

Much of big data, including web documents, online posts, papers, patents, and articles, is in text form, so analyzing text data is an important task in the big data domain. Many methods based on statistics or machine learning algorithms have been studied for text data analysis, and most of them are based on the generalized linear model (GLM). In the GLM, text data analysis rests on the assumption that the error in the given data follows a Gaussian distribution. However, the GLM has shown limitations in analyzing text data because of data sparseness: preprocessed text data has a zero-inflation problem. To solve this problem, we propose a method of text data analysis using the generalized linear mixed model (GLMM) and Bayesian visualization. The objective of our study is thus to use the GLMM to overcome the limitations of the conventional GLM in the analysis of zero-inflated text data. The GLMM allows various probability distributions besides the Gaussian for the error terms and accounts for differences between observations by clustering. We also use Bayesian visualization to find meaningful associations between keywords. Finally, we analyze text data retrieved from real domains and report the results to show the performance and validity of the proposed method.
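The zero-inflation problem mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's method or data: the toy corpus and the helper names `document_term_matrix` and `zero_fractions` are hypothetical, chosen only for illustration. The sketch builds a document-term count matrix and compares the observed share of zero cells with the share that a single Poisson distribution with the same mean would predict.

```python
import math
from collections import Counter

# Hypothetical toy corpus (an assumption; not the documents used in the paper).
docs = [
    "text text text data data model",
    "patent patent patent patent claim claim",
    "bayesian bayesian keyword keyword keyword network",
    "cluster cluster cluster random random effect",
]

def document_term_matrix(docs):
    """Return (vocab, matrix), where matrix[i][j] counts term j in document i."""
    counts = [Counter(d.split()) for d in docs]
    vocab = sorted(set().union(*counts))
    matrix = [[c[t] for t in vocab] for c in counts]
    return vocab, matrix

def zero_fractions(matrix):
    """Observed share of zero cells and the share expected under Poisson(mean)."""
    cells = [x for row in matrix for x in row]
    mean = sum(cells) / len(cells)
    observed = sum(1 for x in cells if x == 0) / len(cells)
    expected = math.exp(-mean)  # P(X = 0) for a Poisson with this mean
    return observed, expected

vocab, dtm = document_term_matrix(docs)
observed, expected = zero_fractions(dtm)
print(f"observed zero share: {observed:.2f}, Poisson-expected: {expected:.2f}")
```

In this toy matrix, three quarters of the cells are zero, noticeably more than the Poisson benchmark predicts. That kind of excess is what a zero-inflated count model is meant to absorb.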

Journal: Axioms
Publication Date: Nov 26, 2022
Citations: 1
License type: CC BY 4.0
