Abstract
Certain applications require a scalable, cost-effective storage and execution system that can store data and later analyze it at its finest level of granularity, which improves the quality and accuracy of the resulting analysis. Wireless sensor network (WSN) nodes deployed for data-intensive applications such as surveillance and war-zone monitoring generate massive amounts of raw data. This data must be stored in its native format so that it remains available for analytics as future requirements emerge. This work presents a data lake implemented on Amazon Web Services (AWS) that stores data in its original form for future reference. The data lake service is used to store data generated in large volume, at high velocity, and in wide variety. Data in the lake is organized into three zones: raw, reformed, and curated. The paper proposes an efficient method of storing structured, semi-structured, and unstructured data in the data lake for later retrieval and analytics. The results are presented comprehensively and highlight the advantages of using a data lake in place of a data warehouse.
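As an illustration of the three-zone organization, the following minimal sketch shows one way object keys could be laid out so that the same item can be traced from the raw zone through to the curated zone. Only the zone names come from the abstract; the key layout (zone, source, date, file name), the function names `zone_key` and `promote`, and the example node name are assumptions made for illustration and are not the authors' implementation.

```python
from datetime import date
from pathlib import PurePosixPath

# Zone names are taken from the abstract; the key layout below is hypothetical.
ZONES = ("raw", "reformed", "curated")


def zone_key(zone: str, source: str, day: date, filename: str) -> str:
    """Build an object key of the form zone/source/YYYY/MM/DD/filename."""
    if zone not in ZONES:
        raise ValueError(f"unknown zone {zone!r}; expected one of {ZONES}")
    return str(PurePosixPath(zone, source, f"{day:%Y/%m/%d}", filename))


def promote(key: str, target_zone: str) -> str:
    """Return the key the same object would have in another zone."""
    if target_zone not in ZONES:
        raise ValueError(f"unknown zone {target_zone!r}; expected one of {ZONES}")
    # Drop the leading zone segment and keep the rest of the path unchanged.
    _, rest = key.split("/", 1)
    return f"{target_zone}/{rest}"


if __name__ == "__main__":
    raw = zone_key("raw", "wsn-node-17", date(2021, 3, 5), "readings.json")
    print(raw)
    print(promote(raw, "curated"))
```

Keeping the path below the zone prefix identical in every zone is one possible design choice: it leaves the raw copy untouched in its native format while letting a processed version be located from the raw key alone.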