Abstract

Large volumes of data are generated continuously by billions of human data producers, sensors, surveillance systems, communication devices and networks (e.g., the Internet). Proper analysis of this data can lead to new scientific insights, new products and services, more creative outputs (e.g., new recipes, music scores, fashion styles), improved performance of business and civic organizations, and better-informed government and non-government organizations. In other words, deriving information from these large volumes of data can lead to, among other things, smarter individuals capable of making scientific breakthroughs, producing innovative products, and making effective decisions. Data can be well structured or not. We have observed that the term “semistructured” data is also used in cases where the structure of the data is not yet known or is overly complex (e.g., the structure of natural language). One of the challenges facing the big data community relates to inferring the structure behind the data when it is not known beforehand. In terms of modeling, this challenge relates to inferring a data model from a set of data. The challenge arises because there may be more than one way to structure data, only some of which may be based on inherent data properties. Deriving structures that facilitate human understanding or appropriate computer manipulation may require considerations beyond inherent data properties. For example, in Grounded Theory, analysis involves mapping data to “ideas” and “ideas to ideas”. In the context of object-oriented modeling we might ask: What is the class diagram that adequately models a given set
