Geological big data is a kind of spatio-temporal big data whose characteristics show both similarities to and marked differences from general big data. Geological objects span huge spaces, evolve over long periods, and are shaped by various geological processes, so their development is complicated. Because geological bodies are buried deep underground, geological data carry incomplete information about parameters, structure, relationships and evolution, and they are characterized by high computational complexity, high dimensionality and significant uncertainty. To address these challenges, the theories, methods and techniques of big data should be introduced to integrate and utilize geological science big data. A unified spatial reference system for geological spatio-temporal big data needs to be set up so that data can be processed consistently and fused with respect to spatial benchmark, temporal state, scale and semantics. Various integrated storage and management methods for static geological exploration data and dynamic distributed geological observation data need to be developed. On the one hand, current big data techniques can be adapted, such as real-time data storage and management, big data storage and management architectures, and analysis and processing architectures. On the other hand, specialized big data techniques should be developed to suit the actual situation, such as distributed parallel spatio-temporal indexing for geological data and pre-dispatch methods that consider spatio-temporal relationships and geological semantics. The main criteria for judging whether a traditional algorithm is suitable for transformation to a big data environment are the decomposability of the algorithm's tasks, the decomposability of its data, and the segmented relevance of its data flow. 
Hence, traditional geological data processing methods should be transformed along these three aspects: exploiting task parallelism and geometric decomposability, and reducing the segmented relevance of the data flow as far as possible, so that the methods suit the distributed and high-performance computing of a big data environment. Big data techniques, which may bring a breakthrough in geological science, can mine knowledge directly from massive spatio-temporal and text data despite the limitations of geological sampling (such as the randomness and sparsity of the sample space), of judgments that rely on inadequate observational data and fixed modes, and of traditional data analysis methods. Many mathematical geology and spatial data mining methods in frequent use today can be applied to geological spatio-temporal big data mining once they have been adapted. The key scientific and technological issues for the development of geological science in the big data era include: the integrated storage, management and handling of structured, semi-structured and unstructured data, of big and small data, of hybrid and precision data, of models and data, and of static exploration models and dynamic monitoring models; the combination of data mining and data analysis; the unification of correlation and causality; and the deep mining and visualization of geological science big data. The utilization and analysis of geological big data are studied using metallogenic prediction for metallic minerals as an example. From a global perspective, Glass Earth is the underground extension of Digital Earth. Glass Earth is a 3D visualized virtual model of the shallow crust: an aggregation of geological and geographical information stored on a computer network, accessible to multiple users, that provides analysis and decision-support services for research and applications in geology, resources and the environment. 
Glass Earth is an effective vehicle for geological science big data, and its construction is one of the best ways to solve all the aforementioned scientific and technical problems.