IMPLEMENTATION OF WEB SCRAPING ON ONLINE NEWS PORTALS

Y A Hafiz, Endah Sudarmilah

doi:10.59344/inisiasi.v12i1.120

Abstract

This report discusses the use of web scraping on several well-known online news portals in Indonesia: CNBC Indonesia, CNN Indonesia, Kompas, Merdeka, Suara, Jawapos, JPNN, Republika, and Inews. Web scraping is a method for collecting data from web pages automatically and efficiently. Here, the scraping is carried out in Python with the BeautifulSoup library, using Visual Studio Code as the editor. The method proceeds in several stages: opening the scraping template, choosing the website from which data will be collected, exploring and navigating the site to identify the elements to be retrieved, and finally running the scraper to generate the desired data. Each scraped record contains the media name, title, subtitle, news URL, content, news date, editor, journalist, and location. From this implementation, the research draws three conclusions. First, web scraping simplifies the process of collecting data from various news sources and combining them in one easily accessible place. Second, with web scraping, data can be prepared for analysis more quickly and efficiently than with manual collection. Finally, the scraped results can be used to observe trends and patterns and to analyze public sentiment towards news and particular topics, providing valuable insight into how current issues are understood by the public.
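The extraction stage described above can be illustrated with a minimal sketch. It is not the authors' template: the HTML snippet, the CSS selectors, and the names `SAMPLE_HTML`, `SELECTORS`, `text_or_none`, and `extract_article` are all assumptions made for illustration, and each real portal would need its own selectors found by inspecting its pages. The sketch parses an inline string rather than a live site so that it runs without network access.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for a downloaded article page.
# Real portals use different tags and class names.
SAMPLE_HTML = """
<html><body>
  <h1 class="title">Sample headline</h1>
  <h2 class="subtitle">Sample subtitle</h2>
  <span class="date">2023-01-15</span>
  <span class="journalist">Reporter A</span>
  <span class="editor">Editor B</span>
  <span class="location">Jakarta</span>
  <div class="content"><p>First paragraph.</p><p>Second paragraph.</p></div>
</body></html>
"""

# Assumed mapping from output field to CSS selector (one such mapping per site).
SELECTORS = {
    "title": "h1.title",
    "subtitle": "h2.subtitle",
    "date": "span.date",
    "journalist": "span.journalist",
    "editor": "span.editor",
    "location": "span.location",
}


def text_or_none(soup, selector):
    """Return the stripped text of the first match, or None if nothing matches."""
    node = soup.select_one(selector)
    return node.get_text(strip=True) if node is not None else None


def extract_article(html, media, url, selectors=SELECTORS,
                    content_selector="div.content p"):
    """Build one record with the fields listed in the abstract."""
    soup = BeautifulSoup(html, "html.parser")
    record = {"media": media, "url": url}
    for field, selector in selectors.items():
        record[field] = text_or_none(soup, selector)
    # Join all body paragraphs into a single content string.
    paragraphs = soup.select(content_selector)
    record["content"] = " ".join(p.get_text(strip=True) for p in paragraphs)
    return record


record = extract_article(SAMPLE_HTML, "Example Media", "https://example.com/news/1")
print(record)
```

In practice the `html` argument would come from an HTTP request to the article URL, and the records from all portals could be appended to one list and written to a single file, which corresponds to the "one easily accessible place" mentioned above. Fields missing from a page are returned as `None` rather than raising an error, so one malformed article does not stop a batch run.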

Full Text