Abstract

The Internet connects people and information worldwide and simplifies many everyday tasks. It has also driven the digitization of information and its open distribution to the public. Online news articles not only provide readers with useful and reliable information and reports, they also make it easier to extract and gather text for research, especially in Natural Language Processing (NLP) and Machine Learning (ML). The South China Sea has recently become a prominent news topic because of rising conflicts among several countries over their claims to islands in the sea. Gathering data from online sources is easy, but manually processing a large amount of data to identify only the useful information is time-consuming. Important features can be extracted from a text document using one feature extraction method or a combination of them. For news about the South China Sea conflicts, relevant information needs to be extracted and the articles need to be classified. This paper proposes a model that uses Named Entity Recognition (NER) to search for and classify important information about the conflicts. A combination of Part-of-Speech (POS) tagging and NER is used to extract the type of conflict from the news. The study also aims to classify news articles using the Conditional Random Field (CRF) algorithm and Multinomial Naïve Bayes (MNB) as classification methods, training and testing them on the data.
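To make the classification step concrete, the following is a minimal sketch of Multinomial Naïve Bayes with Laplace smoothing, written with only the Python standard library. It is not the authors' implementation: the function names `train_mnb` and `predict_mnb`, the labels `conflict` and `other`, and the four toy headlines are all hypothetical and were invented for illustration. Tokenization is a plain lowercase whitespace split, which is an assumption rather than the preprocessing used in the study.

```python
import math
from collections import Counter, defaultdict


def train_mnb(docs, labels, alpha=1.0):
    """Collect the counts needed for multinomial Naive Bayes.

    docs   -- list of raw text strings (hypothetical toy headlines here)
    labels -- list of class labels, one per document
    alpha  -- Laplace smoothing constant
    """
    class_counts = Counter(labels)      # number of documents per class
    word_counts = defaultdict(Counter)  # per-class token frequencies
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = text.lower().split()   # assumed tokenization: whitespace split
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return {
        "class_counts": class_counts,
        "word_counts": word_counts,
        "vocab": vocab,
        "alpha": alpha,
        "n_docs": len(docs),
    }


def predict_mnb(model, text):
    """Return the class with the highest log posterior for the given text."""
    # Tokens never seen in training are ignored.
    tokens = [t for t in text.lower().split() if t in model["vocab"]]
    vocab_size = len(model["vocab"])
    alpha = model["alpha"]
    best_label, best_score = None, -math.inf
    for label, n_class_docs in model["class_counts"].items():
        counts = model["word_counts"][label]
        total = sum(counts.values())
        # log prior + sum of smoothed log likelihoods
        score = math.log(n_class_docs / model["n_docs"])
        for t in tokens:
            score += math.log((counts[t] + alpha) / (total + alpha * vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data, invented for illustration only.
train_docs = [
    "navy vessels clash near disputed reef",
    "coast guard blocks fishing boats near disputed island",
    "ministers sign trade agreement at summit",
    "tourism grows after new ferry route opens",
]
train_labels = ["conflict", "conflict", "other", "other"]

model = train_mnb(train_docs, train_labels)
print(predict_mnb(model, "vessels clash near island"))  # conflict
print(predict_mnb(model, "trade summit"))               # other
```

In a fuller pipeline, the token counts could be replaced or supplemented by features derived from POS tags and named entities, which is the kind of combination the abstract describes.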
