Abstract

With so many things around us continuously producing and processing data, be it mobile phones, sensors attached to devices, or satellites sitting thousands of kilometres above our heads, data is becoming increasingly heterogeneous. Scientists inevitably face data challenges, often described as the 4 V’s of data: volume, variety, velocity and veracity. In this paper, we address the issue of data variety. The task of integrating and querying such heterogeneous data is further compounded if the data is in unstructured form. We hence propose an approach that uses Semantic Web and Natural Language Processing techniques to resolve the heterogeneity arising in data formats, bring together structured and unstructured data, and provide a unified data model for querying disparate data sets.

Highlights

  • Recent advances in technology have led to an explosion in the availability of data, often referred to as ‘big data’

  • Can we naturally extend the above solution to unify both structured and unstructured data sources through an approach based on Natural Language Processing?

  • The linked data model was set up by integrating different data sets together on GraphDB. These data sets were represented principally by the information extracted from the Section19 report, the Communities at Risk data set and the metadata about flood concepts provided through the flood ontologies

Summary

Introduction

Recent advances in technology have led to an explosion in the availability of data, often referred to as ‘big data’. At present, this data is often siloed, which is increasingly a major problem for the field. Siloing can be overcome by bringing the data together in one place, for example by exploiting the potential of cloud computing. We can add a layer of metadata above the ‘raw’ data by enriching each data atom with an ontological concept that has a definition. This metadata layer makes it possible to abstract over disparate data sources and so to integrate them. For example, if bird feathers are labelled as Feather and BirdFeather in two separate datasets, both can still be annotated with the same ontological concept for bird feathers. This abstraction allows the two datasets to be brought together.
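The bird-feather example can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not the paper's implementation: the concept IRI, the dataset names "a" and "b", the record fields and the `integrate` helper are all hypothetical, and plain dictionaries stand in for a real ontology and triple store.

```python
from collections import defaultdict

# Hypothetical concept IRI, used for illustration only; it is not taken
# from any ontology named in the paper.
BIRD_FEATHER = "http://example.org/ontology#BirdFeather"

# Two datasets that use different local labels for the same kind of thing.
dataset_a = [{"label": "Feather", "length_cm": 12.0}]
dataset_b = [{"label": "BirdFeather", "length_cm": 9.5}]

# Metadata layer: (dataset name, local label) -> shared ontological concept.
annotations = {
    ("a", "Feather"): BIRD_FEATHER,
    ("b", "BirdFeather"): BIRD_FEATHER,
}


def integrate(sources, annotations):
    """Group records from all sources by their annotated concept."""
    merged = defaultdict(list)
    for name, records in sources.items():
        for record in records:
            concept = annotations.get((name, record["label"]))
            if concept is None:
                continue  # records without an annotation are skipped here
            merged[concept].append({"source": name, **record})
    return dict(merged)


merged = integrate({"a": dataset_a, "b": dataset_b}, annotations)
for record in merged[BIRD_FEATHER]:
    print(record["source"], record["label"], record["length_cm"])
```

Because both local labels resolve to the same concept, a single lookup on that concept returns records from both datasets, which is the abstraction the metadata layer is meant to provide.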

