Client-Server System for Parsing Data from Web Pages

Artur Britvin, Jawad Hammad Alrawashdeh, Rostyslav Tkachuck

doi:10.23939/acps2022.01.008

Abstract

The basic principles and approaches for extracting and processing information from web pages have been reviewed. Based on this analysis of data parsing, a methodology has been created for developing a client-server system built on Selenium, a tool for automating work in web browsers. A third-party API has been used as the user interface, which simplifies and speeds up system development and gives users access without downloading additional software. Data from web pages have been received and processed. Following this methodology, a custom client-server system has been developed for parsing and collecting the information presented on web pages. Cloud technology services have been analyzed for further deployment of the data collection system. The viability of the system, deployed in a cloud service and running autonomously, has been assessed and analyzed over long-term operation.
