Abstract

As Internet technology continues to develop, the ability to analyze massive volumes of data efficiently and in real time, and to extract valuable information from them, has become especially important for enterprises. A relatively common practice at present is to build the data analysis system on Hive in a Hadoop environment. That approach, however, is better suited to batch processing of large data sets on clusters, and is poorly suited to the real-time processing requirements that arise as business needs change. This paper presents a real-time data analysis system based on Impala, which can serve as a good complement to the Hive-based approach. It explains the ideas and methods behind building such a system from three angles: system selection, system architecture, and practice.

