Abstract

A probabilistic topic model [6] is an algorithm for discovering thematic information in a large archive of documents and annotating the documents with it. The relationships between topics in a terminological ontology can be generated from previously generated topics. This research implemented a probabilistic topic model, Latent Dirichlet Allocation (LDA), to discover topics about terrorism in Indonesian-language articles. The ontology is developed from the topics produced by LDA. The research examined 2032 articles from anti-terrorism portals to generate the terminology, with an implementation in Spark and the Java programming language. In the LDA results, the highest accuracy, 70%, was obtained with 5 topics and 100 iterations. The Global Similarity Hierarchy Learning (GSHL) algorithm was then applied to define “broader” and “related” relationships among the topics.
