Abstract

Websites are often regarded as sources of limitless information that anyone can access. Advances in Internet technology have changed the way businesses are run and managed and have given rise to a proliferation of e-commerce websites. These sites have made the activities of marketers/vendors, retailers and consumers (collectively referred to as users in this paper) easier by providing convenient platforms for selling and ordering items over the Internet. These benefits come with drawbacks, however: users must spend considerable time and effort searching e-commerce websites for the best product deals, product updates and offers. They must also filter and compare search results themselves, which is time-consuming and can produce ambiguous results. In this paper, we apply web crawling and scraping methods to an e-commerce website to obtain HTML data for identifying product updates based on the current time. The HTML data are preprocessed to extract product details such as name, price, and post date and time, which serve as useful information for users.
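To illustrate what identifying product updates based on the current time could look like, the following is a minimal Python sketch and not the implementation used in the paper. It assumes the scraped records are already available as dictionaries; the field name `posted`, the timestamp layout in `TIME_FORMAT`, the function name `recent_updates` and the 24-hour window are all hypothetical choices made for this example.

```python
from datetime import datetime, timedelta

# Assumed timestamp layout; a real site may format post dates differently.
TIME_FORMAT = "%Y-%m-%d %H:%M"


def recent_updates(products, now, window=timedelta(hours=24)):
    """Return the products whose post time lies within `window` before `now`."""
    recent = []
    for product in products:
        posted = datetime.strptime(product["posted"], TIME_FORMAT)
        age = now - posted
        # Skip records that are too old or that carry a timestamp in the future.
        if timedelta(0) <= age <= window:
            recent.append(product)
    return recent
```

In practice `now` would be `datetime.now()`; taking it as a parameter keeps the function easy to check against fixed sample data.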

Highlights

  • The advancement of internet technology has enabled the fast growth of e-commerce websites for marketers/vendors and consumers

  • Web scraping, described in this paper as a natural language processing (NLP) technique, is the use of a program to extract data from HyperText Markup Language (HTML) files on the Internet; the extracted data can be stored in text files or in databases for further analysis

  • Unlike the traditional approach of copying and pasting data by hand, web scraping is automated: a program written in a language such as Python is given a few parameters and retrieves the data in far less time
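As a rough illustration of automated extraction from HTML, the sketch below uses only Python's standard-library `html.parser` module to pull the name, price and post time out of a small page fragment. The markup in `SAMPLE_HTML`, the class names (`product`, `name`, `price`, `posted`) and the helper `extract_products` are assumptions invented for this example; a real e-commerce site will use its own tags and class names, and the paper may rely on different libraries.

```python
from html.parser import HTMLParser

# Hypothetical markup, used only to make the example self-contained.
SAMPLE_HTML = """
<div class="product">
  <span class="name">USB cable</span>
  <span class="price">4.50</span>
  <span class="posted">2021-03-01 09:30</span>
</div>
<div class="product">
  <span class="name">Phone case</span>
  <span class="price">7.25</span>
  <span class="posted">2021-03-02 14:05</span>
</div>
"""

FIELDS = {"name", "price", "posted"}


class ProductParser(HTMLParser):
    """Collect one dictionary per <div class="product"> block."""

    def __init__(self):
        super().__init__()
        self.products = []
        self._current = None  # product currently being filled in
        self._field = None    # field whose text is currently being read

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if tag == "div" and css_class == "product":
            self._current = {}
        elif tag == "span" and css_class in FIELDS and self._current is not None:
            self._field = css_class
            self._current[css_class] = ""

    def handle_data(self, data):
        # Only keep text that appears inside one of the tracked spans.
        if self._field is not None:
            self._current[self._field] += data

    def handle_endtag(self, tag):
        if tag == "span" and self._field is not None:
            self._current[self._field] = self._current[self._field].strip()
            self._field = None
        elif tag == "div" and self._current is not None:
            self.products.append(self._current)
            self._current = None


def extract_products(html):
    parser = ProductParser()
    parser.feed(html)
    parser.close()
    return parser.products
```

Calling `extract_products(SAMPLE_HTML)` returns a list of two dictionaries, which could then be written to a text file or a database table for further analysis.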


Summary

INTRODUCTION

The advancement of Internet technology has enabled the fast growth of e-commerce websites for marketers/vendors and consumers (collectively referred to as users in this paper). Many web scraping techniques are used to gather data from the Internet: (1) traditional copy and paste, (2) text grepping and regular expressions, (3) Hypertext Transfer Protocol (HTTP) programming, (4) HyperText Markup Language (HTML) parsing, (5) Document Object Model (DOM) parsing, (6) web scraping software, (7) vertical aggregation platforms, (8) semantic annotation recognition, and (9) computer vision web page analyzers. These techniques can provide valuable opportunities in the search for product updates by making searches of multiple websites (or multiple pages of one website) more resource-efficient [3]. The scraping scripts are written using Python libraries, and web crawling works on HTML tags.
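The remark that web crawling works on HTML tags can be illustrated with a short sketch: a crawler typically discovers further pages by reading the `href` attribute of `<a>` tags and resolving each one against the URL of the current page. The example below is a minimal version under stated assumptions, not the paper's crawler. The names `LinkCollector` and `collect_links`, the sample fragment `PAGE` and the host `shop.example` are hypothetical, and no network request is made.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Gather absolute URLs from the <a href="..."> tags of one page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:
            # Resolve relative links against the URL of the page being parsed.
            url = urljoin(self.base_url, href)
            if url not in self.links:  # avoid queueing the same page twice
                self.links.append(url)


def collect_links(html, base_url):
    collector = LinkCollector(base_url)
    collector.feed(html)
    collector.close()
    return collector.links


# Hypothetical page fragment with one duplicated link.
PAGE = (
    '<a href="/phones?page=2">Next</a> '
    '<a href="item/17">Item</a> '
    '<a href="/phones?page=2">Next again</a>'
)
```

A crawler would fetch each returned URL in turn, for example over HTTP (technique 3 above), and pass the downloaded HTML to a parser that extracts the product details.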

RELATED WORK
Data Collection
Mapping Selected Web Pages
Developing Web Scraper
Process Scraped Data
RESULTS
CONCLUSION