Abstract
The volume of data produced grows every day, textual data in particular, and at this scale it becomes difficult to analyze unstructured text or to extract the information needed to discover its patterns. Text mining exploits the large amount of knowledge contained in text by automatically deriving high-quality information from it, which saves effort and time. Text mining is considered a subset of data mining, which is the more general field. This paper proposes a methodology for mining text, with a case study on published papers. Several text mining approaches are introduced for mining these papers using machine learning (ML) and natural language processing (NLP) techniques. The methodology has three phases: the first is keyword extraction using NLP techniques, the second is named entity recognition, and the last is document classification. The last two phases use ML techniques. A case study is then built to simulate the system phases, showing the input and output of each phase.
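To make the flow of input and output concrete, the following is a minimal Python sketch of the first and last phases. The abstract does not name specific algorithms, so everything here is an assumption made for illustration: keyword extraction is approximated by term frequency after stopword removal, and document classification by a small multinomial naive Bayes model. The names `STOPWORDS`, `tokenize`, `extract_keywords`, and `NaiveBayesClassifier` are hypothetical, and named entity recognition is omitted because it normally relies on a trained sequence model.

```python
import math
import re
from collections import Counter, defaultdict

# Tiny illustrative stopword list (an assumption; real systems use a fuller one).
STOPWORDS = {"a", "an", "and", "are", "as", "for", "from", "in", "is",
             "of", "on", "that", "the", "this", "to", "using", "we", "with"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def extract_keywords(text, top_k=3):
    """Phase 1 stand-in: rank tokens by raw frequency."""
    counts = Counter(tokenize(text))
    # Sort by descending count, then alphabetically, so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:top_k]]


class NaiveBayesClassifier:
    """Phase 3 stand-in: multinomial naive Bayes with add-one smoothing."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)  # label -> token counts
        self.doc_counts = Counter(labels)        # label -> number of documents
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        tokens = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label in sorted(self.doc_counts):
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(self.doc_counts[label] / total_docs)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for tok in tokens:
                score += math.log((self.word_counts[label][tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy input standing in for a publication abstract (made-up text).
sample = "Text mining extracts knowledge from text. Text mining uses machine learning."
keywords = extract_keywords(sample)

# Toy training set with made-up labels.
train_texts = [
    "neural network training for image recognition",
    "deep learning model training",
    "parsing sentences and grammar of language",
    "language grammar and syntax parsing",
]
train_labels = ["ml", "ml", "nlp", "nlp"]
classifier = NaiveBayesClassifier().fit(train_texts, train_labels)
predicted = classifier.predict("grammar parsing of sentences")
```

In this sketch each phase takes raw text as input; phase 1 returns a ranked list of keywords and phase 3 returns a single category label. A real system would likely replace both stand-ins with stronger methods and insert the named entity recognition phase between them.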