Abstract

Big data have the 4V characteristics of volume, variety, velocity, and veracity, which genuinely call for big data analytics. However, what are the dominant characteristics of big data analysis? Here, analytics refers to the entire methodology rather than to any individual analysis. In this paper, six techniques of big data analytics are proposed: (1) ensemble analysis, related to a large volume of data; (2) association analysis, related to unknown data sampling; (3) high-dimensional analysis, related to a variety of data; (4) deep analysis, related to the veracity of data; (5) precision analysis, related to the veracity of data; and (6) divide-and-conquer analysis, related to the velocity of data. The essence of big data analytics is the structural analysis of big data under an optimal criterion of physics, computation, and human cognition. Fundamentally, two theoretical challenges are posed, i.e., the violation of the independent and identical distribution assumption and the extension of general set theory. In particular, we illustrate three kinds of association in geographical big data, i.e., geometrical associations in space and time, spatiotemporal correlations in statistics, and space-time relations in semantics. Furthermore, we illustrate three kinds of spatiotemporal data analysis, i.e., measurement (observation) adjustment of geometrical quantities, human spatial behavior analysis with trajectories, and data assimilation of physical models and various observations, from which spatiotemporal big data analysis may be largely derived.

Highlights

  • Big data have the 4V characteristics of volume, variety, velocity, and veracity, which genuinely call for big data analytics

  • Topographic observation and geographical phenomena sensing are digitally recorded in the computer, and geometrical relation analysis is emphasized in geographical big data analytics

  • With the rapid development of computer and communication technologies, the human-machine-environment system is increasingly observed by space-, air-, and ground-based digital sensor networks

Summary

Big data and its 4V characteristics

There are two common sources of big data, i.e., collective gathering and individual generation. Computer software development is concerned with three aspects of computation, i.e., problem computability, algorithm complexity, and distributed intelligence. In 1936, Alan Turing proved that a general algorithm to solve the halting problem for all possible program-input pairs cannot exist. In a computing environment with networked, high-density storage, the space complexity of algorithms can be greatly reduced. In summary, big data have the 4V characteristics of volume, variety, velocity, and veracity (Barwick 2012; Hilbert 2015). Veracity, roughly termed data value or data usability, seems especially important in practice.

Revisiting two mathematical theories for big data analysis
Independent and identical distribution
Set theory
Six techniques of big data analysis
Ensemble analysis
Association analysis
High-dimensional analysis
Deep analysis
Divide-and-conquer analysis
Geographical big data analysis
Conclusions
Notes on contributor