Abstract

The Internet is the largest source of information created by humanity. It contains a wide variety of material available in formats such as text, audio, and video. Web scraping is one way to gather this information: a set of techniques for extracting data from a website instead of copying it manually. Many web-based data extraction methods are designed to solve specific problems and operate on ad hoc domains. Various tools and technologies have been developed to facilitate web scraping; unfortunately, the appropriateness and ethics of using these tools are often overlooked. Hundreds of web scraping packages are available today, most of them written for Java, Python, and Ruby, and both open-source and commercial software exist. Web-based tools such as Yahoo! Pipes, Google Web Scraper, and the OutWit extension for Firefox are good starting points for beginners. Web extraction essentially replaces the manual extraction-and-editing process, providing an easier and better way to collect data from a web page, convert it into the desired format, and save it to a local directory or archive. In this study, among the various kinds of scraping, we focus on techniques that extract the content of a web page. In particular, we apply scraping techniques to a variety of diseases, together with their symptoms and precautions.
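To make the content-extraction idea concrete, the sketch below parses a page and pulls out a disease name and its symptom list using only Python's standard library. The HTML snippet and its markup (the `<h2>` heading and the `symptoms` class) are invented for illustration; real sites will use different structure, and the paper itself does not specify one.

```python
from html.parser import HTMLParser

# Hypothetical sample page; the markup conventions are an assumption,
# not the structure of any real medical site.
SAMPLE_HTML = """
<html><body>
  <h2>Influenza</h2>
  <ul class="symptoms">
    <li>fever</li>
    <li>cough</li>
    <li>fatigue</li>
  </ul>
</body></html>
"""

class SymptomScraper(HTMLParser):
    """Collects the disease name (first <h2>) and the items of the
    <ul class="symptoms"> list."""

    def __init__(self):
        super().__init__()
        self.disease = None
        self.symptoms = []
        self._in_h2 = False
        self._in_symptom_list = False
        self._in_li = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "h2" and self.disease is None:
            self._in_h2 = True
        elif tag == "ul" and attrs.get("class") == "symptoms":
            self._in_symptom_list = True
        elif tag == "li" and self._in_symptom_list:
            self._in_li = True

    def handle_endtag(self, tag):
        if tag == "h2":
            self._in_h2 = False
        elif tag == "ul":
            self._in_symptom_list = False
        elif tag == "li":
            self._in_li = False

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self._in_h2:
            self.disease = text
        elif self._in_li:
            self.symptoms.append(text)

scraper = SymptomScraper()
scraper.feed(SAMPLE_HTML)
print(scraper.disease, scraper.symptoms)
```

In practice the HTML would be fetched over the network (e.g. with `urllib.request`) rather than embedded as a string, and dedicated libraries such as BeautifulSoup make the tag-matching logic far less verbose; the event-driven parser above just shows the extraction step without external dependencies.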
