Machine-to-Machine (M2M) communication relies on the physical objects (e.g., satellites, sensors, and so forth) interconnected with each other, creating mesh of machines producing massive volume of data about large geographical area (e.g., living and non-living environment). Thus, the M2M is an ideal example of Big Data. On the contrary, the M2M platforms that handle Big Data might perform poorly or not according to the goals of their operator (in term of cost, database utilization, data quality, processing and computational efficiency, analysis and feature extraction applications). Therefore, to address the aforementioned needs, we propose a new effective, memory and processing efficient system architecture for Big Data in M2M, which, unlike other previous proposals, does not require whole set of data to be processed (including raw data sets), and to be kept in the main memory. Our designed system architecture exploits divide-and-conquer approach and data block-wise vertical representation of the database follows a particular petitionary strategy, which formalizes the problem of feature extraction applications. The architecture goes from physical objects to the processing servers, where Big Data set is first transformed into a several data blocks that can be quickly processed, then it classifies and reorganizes these data blocks from the same source. In addition, the data blocks are aggregated in a sequential manner based on a machine ID, and equally partitions the data using fusion algorithm. Finally, the results are stored in a server that helps the users in making decision. The feasibility and efficiency of the proposed system architecture are implemented on Hadoop single node setup on UBUNTU 14.04 LTS core™i5 machine with 3.2GHz processor and 4GB memory. The results show that the proposed system architecture efficiently extract various features (such as River) from the massive volume of satellite data.
Read full abstract