Abstract
The rapid growth of information has increased the demand for online news. Online news attracts readers by presenting news from many fields conveniently and quickly. However, the large amount of online news (volume), the short time in which it spreads (velocity), and the public's need to consume news from a range of sources (variety) can affect people's lives. The government, as regulator, and news agencies therefore need to monitor the online news in circulation. To address this problem, the researcher proposes a data lake architecture that is suited to online news and can run in real time. A data lake can address the three main challenges of Big Data (volume, velocity, variety). To develop the proposed architecture, the researcher conducted a literature study and analyzed how the flow of a data lake architecture maps onto online news. The researcher will then use this architecture to combine and unify the data structures of online news from several news channels and stream the unified data in real time to fill the data lake. The resulting data will be stored in MongoDB, which serves as the database holding all data for both the short and the long term. Finally, this data lake will provide a means to accommodate, explore, and analyze the online news data in circulation.
Keywords: Data Lake, Online News, Real-Time.
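As an illustration of the unification step described in the abstract, the following is a minimal Python sketch, not the researcher's implementation. The channel names (`channel_a`, `channel_b`), the field names, and the sample records are all hypothetical, and an in-memory list stands in for the MongoDB collection so that the example stays self-contained.

```python
from datetime import datetime, timezone

# Hypothetical mappings: unified field name -> field name used by each channel.
FIELD_MAPS = {
    "channel_a": {"title": "headline", "body": "content",
                  "published_at": "pub_date", "url": "link"},
    "channel_b": {"title": "title", "body": "text",
                  "published_at": "timestamp", "url": "url"},
}


def normalize(channel, raw):
    """Map one raw record from a channel onto the unified structure."""
    mapping = FIELD_MAPS[channel]
    doc = {field: raw.get(source_field) for field, source_field in mapping.items()}
    doc["source"] = channel
    doc["ingested_at"] = datetime.now(timezone.utc).isoformat()
    doc["raw"] = raw  # keep the original record alongside the unified fields
    return doc


def stream_into_lake(records, sink):
    """Normalize records one at a time and append each to the sink."""
    for channel, raw in records:
        sink.append(normalize(channel, raw))


# Hypothetical incoming records from two channels with different structures.
incoming = [
    ("channel_a", {"headline": "Flood warning issued", "content": "Sample text A",
                   "pub_date": "2021-03-01T08:00:00Z",
                   "link": "https://example.com/a/1"}),
    ("channel_b", {"title": "Market update", "text": "Sample text B",
                   "timestamp": "2021-03-01T08:05:00Z",
                   "url": "https://example.com/b/7"}),
]

lake = []  # stand-in for a MongoDB collection (assumption for this sketch)
stream_into_lake(incoming, lake)
```

After this runs, every document in `lake` has the same set of keys regardless of its originating channel, which is what allows records from different channels to be queried together. In a real deployment the `sink.append` call would be replaced by a write to the database.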