A Text Clustering Preprocessing Technique for Mixed Bisaya and English Short Message Service (SMS) Messages for Higher Education Institutions (HEIs) Enrolment-Related Inquiries

Michelle Bao-Torayno

doi:10.17485/ijst/2020/v13i06/149363

Abstract

Objectives: This study aims to develop a text preprocessing technique for mixed Bisaya and English short message service (SMS) messages. The technique extracts significant keywords for an SMS message clustering procedure, which serves as the basis for automated SMS responses to enrollment-related inquiries at Higher Education Institutions (HEIs).

Methods/statistical analysis: A text clustering preprocessing technique is introduced and developed for mixed Bisaya and English SMS messages for HEI enrollment-related inquiries. The technique is a relatively new approach to extracting significant keywords while addressing key challenges posed by the morphological complexity of mixed Bisaya and English SMS messages. The method has seven (7) stages: tokenization, language tagging, stop-word removal, stemming, Soundex, final tagging, and language translation. A term frequency co-occurrence clustering approach is applied to evaluate the precision and effectiveness of the text preprocessing technique.

Findings: Test results revealed that the method yields an accuracy rate of approximately 73%–83% on text processing and 87%–90% when the text preprocessing is applied to clustering.

Application/improvements: The results of this study may help academic institutions handle more enrollment-related inquiries via SMS as an alternative communication medium for reaching their target market. The approach also promotes technological advancement for the institution, as it uses an ICT-enhanced marketing approach through mobile technology.

Keywords: Text Preprocessing, Text Clustering, SMS Messaging, Stemming Algorithm, Enrollment-related Inquiries
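Of the seven stages named in the abstract, Soundex is the one with a widely published definition, so it can be illustrated on its own. The sketch below is a minimal implementation of standard American Soundex in Python, plus a hypothetical `normalize` helper that maps an SMS spelling variant to a known keyword with the same code. The helper, the sample vocabulary, and the idea that the Soundex stage is used for spelling normalization are assumptions made for illustration; the abstract does not describe how the stage is actually implemented.

```python
# Standard American Soundex: consonant groups map to digits, vowels are dropped.
SOUNDEX_CODES = {
    **dict.fromkeys("bfpv", "1"),
    **dict.fromkeys("cgjkqsxz", "2"),
    **dict.fromkeys("dt", "3"),
    "l": "4",
    **dict.fromkeys("mn", "5"),
    "r": "6",
}


def soundex(word):
    """Return the 4-character Soundex code of a word ("" if it has no letters)."""
    letters = [ch for ch in word.lower() if ch.isalpha()]
    if not letters:
        return ""
    code = letters[0].upper()
    prev = SOUNDEX_CODES.get(letters[0], "")
    for ch in letters[1:]:
        digit = SOUNDEX_CODES.get(ch, "")
        # Emit a digit only when it differs from the previous coded letter.
        if digit and digit != prev:
            code += digit
        # 'h' and 'w' do not separate equal codes; vowels do.
        if ch not in "hw":
            prev = digit
    return (code + "000")[:4]


def normalize(token, vocabulary):
    """Hypothetical helper: return the first vocabulary word that shares
    the token's Soundex code, or the token itself if none matches."""
    code = soundex(token)
    for word in vocabulary:
        if soundex(word) == code:
            return word
    return token


# Illustrative vocabulary (an assumption, not taken from the paper).
keywords = ["enroll", "tuition"]
print(soundex("enrl"), soundex("enroll"))   # E564 E564
print(normalize("tuitn", keywords))         # tuition
```

Because Soundex was designed for English names, a production system for Bisaya text would likely need adjusted letter groups; the sketch leaves the standard table unchanged.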

Highlights

Document or text clustering is an unsupervised classification of text collections into distinct groups of similar documents, where similarity is defined as some function on documents.
To overcome the shortcomings of existing preprocessing techniques for short messages, and at the same time provide a suitable approach for the Bisaya dialect, this study developed a text preprocessing technique for mixed Bisaya and English short messaging service (SMS) messages.
The experimental results show that using a database lookup as the part-of-speech (POS) tagger does not reduce processing time; instead, it makes processing take longer.

Summary

Introduction

Document or text clustering is an unsupervised classification of text collections into distinct groups of similar documents, where similarity is defined as some function on documents. A text clustering algorithm partitions a document collection based on topic similarity, so that documents which discuss the same topic are assigned to a single cluster [1]. Recent developments in Internet and mobile technologies have resulted in an overwhelming growth of multilingual documents on the web and of short messaging service (SMS) messages. These documents are written in many different languages and on diverse topics, and organizing them has become a critical problem. Because methods are needed that deal with text collections in several languages simultaneously, there is an increased demand for robust multilingual document clustering algorithms.
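To make "similarity is defined as some function on documents" concrete, the following is a minimal Python sketch that groups short messages by the overlap of their keyword sets. It uses Jaccard similarity and a greedy, single-pass threshold rule; both are stand-ins chosen for brevity and are not the term frequency co-occurrence approach evaluated in the paper. The function names, the threshold value, and the sample keyword sets are assumptions.

```python
from __future__ import annotations


def jaccard(a: set[str], b: set[str]) -> float:
    """Similarity function on documents: shared keywords / all keywords."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster(docs: list[set[str]], threshold: float = 0.3) -> list[list[int]]:
    """Greedy clustering: assign each document to the most similar existing
    cluster (compared against that cluster's first member), or start a new
    cluster when no similarity reaches the threshold."""
    clusters: list[list[int]] = []
    for i, doc in enumerate(docs):
        best, best_sim = None, threshold
        for members in clusters:
            sim = jaccard(doc, docs[members[0]])
            if sim >= best_sim:
                best, best_sim = members, sim
        if best is None:
            clusters.append([i])
        else:
            best.append(i)
    return clusters


# Hypothetical keyword sets, as they might look after preprocessing.
messages = [
    {"tuition", "fee", "amount"},
    {"tuition", "fee"},
    {"enrollment", "schedule"},
    {"enrollment", "schedule", "when"},
]
print(cluster(messages))  # [[0, 1], [2, 3]]
```

Messages about the same topic share most of their keywords and so land in the same cluster, which is why the quality of the extracted keywords directly affects the clustering result.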



Journal: Indian Journal of Science and Technology	Publication Date: Feb 14, 2020
License type: cc-by

