Mining Clusters in XML Corpora Based on Bayesian Generative Topic Modeling

Gianni Costa, Riccardo Ortale

doi:10.1109/icmla.2015.148

Publication Date: Dec 1, 2015

Citations: 32

Affiliation: National Research Council

#XML Corpora #Approximate Posterior Inference

Abstract

We study the partitioning of XML corpora via unsupervised topic modeling. A new mixed-membership Bayesian generative model of the latent topics in XML corpora is proposed. Approximate posterior inference and parameter estimation are derived for this XML topic model and implemented with a Gibbs sampling algorithm, which is used to infer the topic distributions of the input XML documents. These distributions are in turn used to partition the whole XML corpus by latent-topic similarity. Experiments on real-world XML corpora show superior effectiveness compared with several state-of-the-art competitors.
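As a rough illustration of the pipeline the abstract outlines (Gibbs sampling to infer per-document topic distributions, then partitioning by those distributions), the sketch below runs collapsed Gibbs sampling for plain latent Dirichlet allocation. It is not the authors' XML topic model, whose structure the abstract does not specify. The function names, the hyperparameters `alpha` and `beta`, the integer token encoding (for instance, ids of tag-path/term pairs) and the dominant-topic assignment rule are all assumptions made for illustration only.

```python
import random


def gibbs_lda(docs, num_topics, vocab_size, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for standard LDA (a stand-in, NOT the paper's model).

    docs: list of documents, each a list of integer token ids in [0, vocab_size).
    Returns theta, the estimated topic distribution of every document.
    """
    rng = random.Random(seed)
    doc_topic = [[0] * num_topics for _ in docs]                # tokens of doc d in topic k
    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # tokens w in topic k
    topic_total = [0] * num_topics                              # all tokens in topic k
    assign = []                                                 # topic of every token

    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assign.append(z_d)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assign[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Unnormalized full conditional over topics for this token.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                # Resample the topic and restore the counts.
                k = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    # Smoothed per-document topic distributions.
    return [
        [(c + alpha) / (len(doc) + num_topics * alpha) for c in counts]
        for counts, doc in zip(doc_topic, docs)
    ]


def partition_by_dominant_topic(theta):
    """Assign each document to its most probable topic (one simple partitioning rule)."""
    return [max(range(len(row)), key=row.__getitem__) for row in theta]
```

Calling `gibbs_lda` on a list of token-id lists and passing the result to `partition_by_dominant_topic` yields one cluster label per document. Assigning each document to its dominant topic is only one simple choice; the inferred distributions could equally be fed to any standard clustering method.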
