Abstract

As growing volumes of heterogeneous data are stored in relational databases, file systems, and cloud environments, the data needs to be easily accessible and semantically connected for further data analytics. The potential of data federation remains largely untapped. This paper presents an interactive data federation system (https://vimeo.com/319473546) that applies large-scale techniques, including heterogeneous data federation, natural language processing (NLP), association rules, and the semantic web, to perform data retrieval and analytics on social network data. The system first creates a Virtual Database (VDB) to virtually integrate data from multiple data sources. Next, an RDF generator is built to unify the data and, together with SPARQL queries, to support semantic search over text data processed by NLP. Association rule analysis is then used to discover patterns and identify the most important co-occurrences of variables across multiple data sources. The demonstration shows how the system facilitates interactive data analytics in different application scenarios (e.g., sentiment analysis, privacy-concern analysis, community detection).
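To make the RDF unification step more concrete, here is a minimal Python sketch of how rows from two federated sources might be flattened into subject-predicate-object triples and then filtered with a single triple pattern, loosely mirroring one SPARQL pattern. The record fields, the `http://example.org/` namespace, and the helper names `to_triples` and `match` are all assumptions made for illustration; none of them come from the system itself.

```python
# Hypothetical records from two federated sources; field names are illustrative only.
profiles = [
    {"user_id": "u1", "age": 24},
    {"user_id": "u2", "age": 31},
]
posts = [
    {"user_id": "u1", "sentiment": "positive"},
    {"user_id": "u2", "sentiment": "negative"},
    {"user_id": "u1", "sentiment": "negative"},
]

BASE = "http://example.org/"  # placeholder namespace (assumption)


def to_triples(rows, key, base=BASE):
    """Turn each row into (subject, predicate, object) triples keyed by `key`."""
    triples = []
    for row in rows:
        subject = f"{base}user/{row[key]}"
        for field, value in row.items():
            if field != key:
                triples.append((subject, f"{base}prop/{field}", str(value)))
    return triples


def match(triples, s=None, p=None, o=None):
    """Return triples matching a pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# Both sources share the same subject IRIs, so their triples form one graph.
graph = to_triples(profiles, "user_id") + to_triples(posts, "user_id")

# Users with at least one negative post, found via a single triple pattern.
negative_users = sorted(
    {t[0] for t in match(graph, p=BASE + "prop/sentiment", o="negative")}
)
print(negative_users)
```

Because both sources use the same subject identifier for a user, the triples link up without an explicit join. A real deployment would more likely use an RDF library and a SPARQL engine than hand-rolled matching.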

Highlights

  • We propose an RDF-supported data visualization framework over federated databases, enriched with data mining and natural language processing (NLP) techniques

  • We integrate multiple data sources into a unified federated database interface, where users can choose data sources in their area of interest; we provide data analytics, including data exploration and search, which let users explore the data via data mining algorithms and search by queries that lead to advanced analytics; and we implement interactive visualization, which allows users to plot results in different formats, such as tables, graphs, and scatter plots, and to download the results

  • The system is scalable: other data sources can be added, other data mining algorithms applied, and other data analytics scenarios targeted

Summary

INTRODUCTION

Advanced data analytics across federated data has been largely ignored. In this demo paper, we propose an RDF-supported data visualization framework over federated databases, enriched with data mining (e.g., association rules) and NLP techniques (e.g., sentiment analysis). It can efficiently federate and analyze large heterogeneous data sources for general or specific analysis needs (e.g., community detection). We integrate multiple data sources into a unified federated database interface, where users can choose data sources in their area of interest; we provide data analytics, including data exploration and search, which let users explore the data via data mining algorithms (i.e., association rules) and search by queries that lead to advanced analytics. The system is scalable: other data sources can be added, other data mining algorithms applied, and other data analytics scenarios targeted.

Footnotes: 1 www.w3.org/2001/11/IsaViz; 2 https://www.salzburgresearch.at/publikation/rdf-gravity-3/; 3 https://www.psychometrics.cam.ac.uk/productsservices/mypersonality
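As an illustration of the association rule analysis mentioned above, the following Python sketch computes support and confidence for rules between pairs of attributes. It is a simplified, pairwise-only stand-in and not the mining algorithm used by the system; the transactions, attribute labels, thresholds, and function names are hypothetical.

```python
from itertools import combinations

# Hypothetical transactions: each set holds attributes observed together for one user.
transactions = [
    {"sentiment=negative", "privacy=high", "age=young"},
    {"sentiment=negative", "privacy=high"},
    {"sentiment=positive", "privacy=low", "age=young"},
    {"sentiment=negative", "privacy=high", "age=old"},
    {"sentiment=positive", "privacy=high"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def pairwise_rules(transactions, min_support=0.4, min_confidence=0.7):
    """Return (lhs, rhs, support, confidence) for single-item rules lhs -> rhs."""
    items = sorted(set().union(*transactions))
    rules = []
    for a, b in combinations(items, 2):
        pair_support = support({a, b}, transactions)
        if pair_support < min_support:
            continue  # prune infrequent pairs before computing confidence
        for lhs, rhs in ((a, b), (b, a)):
            # confidence(lhs -> rhs) = support(lhs and rhs) / support(lhs)
            confidence = pair_support / support({lhs}, transactions)
            if confidence >= min_confidence:
                rules.append((lhs, rhs, pair_support, confidence))
    return rules


rules = pairwise_rules(transactions)
for lhs, rhs, s, c in rules:
    print(f"{lhs} -> {rhs} (support={s:.2f}, confidence={c:.2f})")
```

On this toy data, only the co-occurrence of high privacy concern and negative sentiment passes both thresholds. This is the kind of cross-source pattern the analysis is meant to surface. Larger itemsets would call for a full Apriori or FP-Growth implementation.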

SYSTEM DESCRIPTION
Data Federation and Data Linkage
Data Analytics
DEMONSTRATION OF THE SYSTEM
Case-study 1: association rules based sentiment analysis
Case-study 2: personality and sentiment analysis
Case-study 3: privacy-concern analysis
Implementation and demo environment
Findings
CONCLUSION
