Bringing Shape to Textual Data – A Feasible Demonstration

Anoud Shaikh, Mukhtiar Ali Unar, Naeem Ahmed Mahoto

doi:10.22581/muet1982.1904.04

Open Access

https://doi.org/10.22581/muet1982.1904.04

Abstract

The Internet has revolutionized the communication paradigm. This has led to an immense amount of unstructured data (i.e. textual data), which is a major source of useful knowledge about people in several application domains. TM (Text Mining) extracts high-quality information to discover knowledge by drawing out patterns and relationships in textual data. The field has attracted great attention from the research community. As a result, several attempts have been made to propose, introduce and refine techniques for uncovering knowledge from text data. This study aims at: (1) presenting existing TM techniques in the scientific literature, (2) reporting challenges/issues and gaps that still need attention, and (3) proposing a framework to bring shape to textual data. A prototype has been developed to demonstrate the effectiveness and potential worth of the proposed approach by showing how unstructured data (i.e. news articles in this study) is brought into a shape that represents interesting knowledge. The proposed framework implements basic NLP (Natural Language Processing) functions in combination with AYLIEN API (Application Programming Interface) functions. The results reveal how events, celebrities and popular news items have been covered in the electronic media, and they also represent the subjectivity of topical news events. The news coverage trends highlight the significance of daily news events, which may assist in gaining insight into the media groups.
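
As a rough illustration of the kind of basic NLP step such a framework might rely on, the following Python sketch tokenizes a few news headlines, drops stop words, and counts in how many articles each term appears. The sample headlines, the stop-word list, and the function names `tokenize` and `coverage_counts` are hypothetical and are not taken from the prototype; the AYLIEN API calls used in the study are deliberately not reproduced here.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list for illustration only.
STOP_WORDS = {"a", "an", "and", "as", "at", "for", "in", "of", "on", "the", "to", "with"}


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens without stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def coverage_counts(articles):
    """Count in how many articles each term appears (document frequency)."""
    counts = Counter()
    for article in articles:
        # set() ensures a term is counted at most once per article.
        counts.update(set(tokenize(article)))
    return counts


# Made-up sample headlines standing in for news articles.
articles = [
    "Election results announced in the capital",
    "Cricket team wins the final",
    "Election commission reviews the results",
]

if __name__ == "__main__":
    for term, n in coverage_counts(articles).most_common(3):
        print(term, n)
```

Document frequency is used here instead of raw term counts so that one long article does not dominate the coverage trend; that choice is an assumption of this sketch, not a detail stated in the abstract.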

Highlights

The Internet has revolutionized the communication paradigm
The TM techniques that aim at providing such insights include text categorization, clustering, summarization, concept extraction, topic detection, information retrieval and prediction
This study reports recent developments in knowledge discovery from textual data