Supporting efficient, geographically distributed, autonomous data management is one of the critical challenges in opening up the big data era, which motivates the investigation of such systems in big data environments. This paper puts forward a distributed autonomous data management system with the following features: (1) a distributed architecture designed to meet the requirements of autonomous data management by allowing interconnection, intercommunication, and interoperation of multiple sites over the Internet; (2) an autonomous, multi-level, unstructured data storage system that meets high-efficiency storage needs, drawing on distributed heterogeneous data storage theories; and (3) a distributed autonomous data indexing and retrieval system that supports metadata and full-text search, fast loading, remote access, and a unified view. Experimental and industrial application results demonstrate that the proposed system has high potential to reduce access time and improve storage efficiency, while maintaining satisfactory availability and scalability.