Generating Actionable Knowledge from Big Data

Xiu Susie Fang

doi:10.1145/2744680.2744687

Abstract

The last few years have seen a rapid increase in the sheer amount of data produced and communicated over the Internet and the Web. While it is widely believed that the availability of such "Big Data" holds the potential to revolutionize many aspects of modern society (e.g., intelligent transportation, environmental monitoring, and energy saving), many challenges need to be addressed before this potential can be realized. This PhD project focuses on one critical challenge, namely extracting actionable knowledge from Big Data. Tremendous effort has been devoted to mining large-scale data on the Web and constructing comprehensive knowledge bases (KBs). However, existing knowledge extraction systems retrieve data from only a limited range of Web source types. In addition, data fusion approaches take little account of the noise introduced by those knowledge extraction systems. Consequently, the constructed KBs are far from comprehensive and accurate. In this paper, we present our initial design of a framework for extracting machine-readable data with high precision and recall from four types of data sources, namely Web texts, Document Object Model (DOM) trees, existing KBs, and query streams. Confidence scores are attached to the extracted knowledge and can be used to further improve the knowledge fusion results.
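
To make the role of confidence scores concrete, the following is a minimal Python sketch, not the framework's actual design. It assumes extracted knowledge is represented as (subject, predicate, object) triples, that each extraction carries a confidence in [0, 1], and that extractions can be treated as independent so their scores combine by noisy-OR. The names `Extraction`, `SOURCE_TYPES`, and `fuse` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass

# The four source types named in the abstract (the labels are our own).
SOURCE_TYPES = {"web_text", "dom_tree", "existing_kb", "query_stream"}


@dataclass(frozen=True)
class Extraction:
    """One extracted triple with the extractor's confidence (hypothetical schema)."""
    subject: str
    predicate: str
    obj: str
    source_type: str
    confidence: float


def fuse(extractions):
    """Return {(subject, predicate): (best_object, fused_score)}.

    Assumption: extractions are independent, so the scores supporting the
    same triple are combined with noisy-OR: 1 - prod(1 - c_i).
    """
    miss = defaultdict(lambda: 1.0)  # probability that every supporting extraction is wrong
    for e in extractions:
        if e.source_type not in SOURCE_TYPES:
            raise ValueError(f"unknown source type: {e.source_type}")
        if not 0.0 <= e.confidence <= 1.0:
            raise ValueError("confidence must lie in [0, 1]")
        miss[(e.subject, e.predicate, e.obj)] *= 1.0 - e.confidence

    best = {}
    for (s, p, o), m in miss.items():
        score = 1.0 - m
        # Keep the highest-scoring candidate value for each (subject, predicate).
        if (s, p) not in best or score > best[(s, p)][1]:
            best[(s, p)] = (o, score)
    return best


if __name__ == "__main__":
    sample = [
        Extraction("Australia", "capital", "Canberra", "web_text", 0.6),
        Extraction("Australia", "capital", "Canberra", "dom_tree", 0.5),
        Extraction("Australia", "capital", "Sydney", "query_stream", 0.7),
    ]
    print(fuse(sample))
```

In this toy example, two moderately confident extractions agreeing on one value outweigh a single, more confident extraction of a conflicting value. The independence assumption is a simplification: a fusion method that models extractor noise explicitly would replace the noisy-OR step.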

Full Text