Improving Data Extraction System to Parse Data from Scraped Job Advertisements

Claudia Nathasia Jason

doi:10.9744/jirae.5.1.19-22

Abstract

Extracting information from an online job advertisement can be tricky. The relevant information is wrapped in redundant content, called boilerplate, that is unrelated to the job. The information also needs to be segmented and classified into the right classes or groups. Once the information has been classified, it is easier to find the features (e.g., required skills and required education), which makes later processing faster.
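The segment-and-classify step can be illustrated with a minimal Python sketch. It assumes the boilerplate has already been removed and the text has already been split into segments. The keyword lists in `CLASS_KEYWORDS` and the function names `classify_segment` and `classify_segments` are hypothetical; the source does not state which classifier was used, so this rule-based version is only an illustration.

```python
import re

# Hypothetical keyword lists. The class names mirror the examples given in the
# abstract (required skills, required education); the keywords are assumptions.
CLASS_KEYWORDS = {
    "required_skills": {"skills", "experience", "proficient", "knowledge"},
    "required_education": {"degree", "bachelor", "master", "diploma"},
}


def classify_segment(segment: str) -> str:
    """Return the class whose keywords occur most often in the segment."""
    words = re.findall(r"[a-z]+", segment.lower())
    scores = {
        label: sum(word in keywords for word in words)
        for label, keywords in CLASS_KEYWORDS.items()
    }
    best_label = max(scores, key=scores.get)
    # Segments without any keyword hit fall into a catch-all class.
    return best_label if scores[best_label] > 0 else "other"


def classify_segments(segments):
    """Group segments by their predicted class."""
    grouped = {}
    for segment in segments:
        grouped.setdefault(classify_segment(segment), []).append(segment)
    return grouped
```

Once segments are grouped this way, later processing only needs to look inside the relevant group (for example, `required_skills`) instead of scanning the whole advertisement.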

Highlights

The Internet provides a vast amount of information, including job advertisements
A job advertisement on the internet is wrapped with redundant information that needs to be removed before the data is matched to the job seeker’s profile
Good results are obtained after the research and several tests

Summary

Introduction

The Internet provides a vast amount of information, including job advertisements. Job advertisements are updated almost every day, and it is almost impossible to keep track of every advertisement that has been uploaded. As a result, job seekers spend more time gathering information while searching for the ideal job. This process can be shortened by parsing the job advertisement and matching the features of the job to the skills of the job seeker. This is where feature extraction is needed. A job advertisement on the internet is wrapped with redundant information that needs to be removed before the data is matched to the job seeker’s profile. The research aims to find the best method for cleaning job advertisements, since most job advertisement websites include unrelated advertisements in their pages.
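As a rough illustration of the cleaning step, the following Python sketch strips likely boilerplate from a scraped page using only the standard library. The tag list `SKIP_TAGS`, the `min_words` cutoff, and the names `JobTextExtractor` and `extract_job_text` are assumptions made for this example; they are not the method evaluated in the research, which the source does not describe in detail.

```python
from html.parser import HTMLParser

# Tags whose content is assumed to be boilerplate (an assumption for this
# sketch, not a rule set taken from the paper).
SKIP_TAGS = {"script", "style", "nav", "header", "footer", "aside"}


class JobTextExtractor(HTMLParser):
    """Collect visible text blocks, skipping tags that usually hold boilerplate."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0  # greater than 0 while inside a skipped tag
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        text = " ".join(data.split())  # normalize whitespace
        if text and self.skip_depth == 0:
            self.blocks.append(text)


def extract_job_text(html: str, min_words: int = 4):
    """Return text blocks that are outside skipped tags and long enough to keep."""
    parser = JobTextExtractor()
    parser.feed(html)
    parser.close()
    # Very short blocks (buttons, menu labels) are treated as boilerplate.
    return [block for block in parser.blocks if len(block.split()) >= min_words]
```

A fixed tag list and a word-count cutoff are deliberately simple choices; real job sites often place unrelated advertisements inside ordinary `div` elements, so a production cleaner would need stronger signals than this sketch uses.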

Research

The research is done by using the Development Oriented

Result and Discussion

TextTiling

The TextTiling method can be found within the Natural

Results in many short segments

Conclusion


Journal: International Journal of Industrial Research and Applied Engineering	Publication Date: Aug 26, 2021
License type: cc-by

