Abstract

Extracting information from an online job advertisement can be tricky. The relevant content is wrapped in redundant material, called boilerplate, that is unrelated to the job itself. The content also needs to be segmented and classified into the right classes or groups. Once it has been classified, features such as required skills and required education are easier to find, which makes later processing faster.

Highlights

  • The Internet hosts a vast amount of information, including job advertisements

  • A job advertisement on the internet is wrapped with redundant information that needs to be removed before the data is matched to the job seeker’s profile

  • Good results are obtained after the research and several tests

Summary

Introduction

The Internet hosts a vast amount of information, including job advertisements. Job advertisements are updated almost every day, and it is almost impossible to keep track of every one that is uploaded. As a result, job seekers spend a lot of time gathering information while searching for the ideal job. This process can be shortened by parsing job advertisements and matching the features of each job to the skills of the job seeker. This is where feature extraction is needed. A job advertisement on the internet is wrapped with redundant information that needs to be removed before the data is matched to the job seeker’s profile. The aim of this research is to find the best method for cleaning job advertisements, since most job advertisement websites include unrelated advertisements in their pages.
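To make the idea of cleaning and matching concrete, the following is a minimal sketch and not the method evaluated in this research. It assumes that boilerplate sits inside a hypothetical set of HTML tags (`BOILERPLATE_TAGS`) and that a job seeker's skills can be matched as whole words; the names `JobTextExtractor`, `extract_job_text`, and `match_skills` are illustrative only.

```python
import re
from html.parser import HTMLParser

# Assumption: these tags hold boilerplate. A real site needs its own list.
BOILERPLATE_TAGS = {"nav", "footer", "aside", "script", "style"}


class JobTextExtractor(HTMLParser):
    """Collect text that is not nested inside a boilerplate tag."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0  # greater than 0 while inside a boilerplate tag
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in BOILERPLATE_TAGS:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in BOILERPLATE_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self.skip_depth == 0:
            self.chunks.append(text)


def extract_job_text(html):
    """Return the page text with the assumed boilerplate regions dropped."""
    parser = JobTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def match_skills(job_text, seeker_skills):
    """Return the seeker's skills that appear as whole words in the job text."""
    words = set(re.findall(r"[a-z+#]+", job_text.lower()))
    return sorted(s for s in seeker_skills if s.lower() in words)


page = (
    "<nav>Home | Login</nav>"
    "<div><h1>Data Engineer</h1><p>Required skills: Python, SQL.</p></div>"
    "<aside>Buy cheap flights</aside><footer>Copyright</footer>"
)
job_text = extract_job_text(page)
matched = match_skills(job_text, ["Python", "Java", "SQL"])
```

A tag-based filter like this only works when a site marks up its boilerplate consistently; pages that mix unrelated advertisements into the main content would need a content-based approach instead.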

Research The research is done by using the Development Oriented
Result and Discussion
Texttiling The texttiling method can be found within the Natural
Results in many short segments
Conclusion
