Abstract

Broad-coverage knowledge bases (KBs) such as Wikipedia, Freebase, Microsoft's Satori, and Google's Knowledge Graph contain structured data describing real-world entities. These data sources have become increasingly important for a wide range of intelligent systems, from information retrieval and question answering to Facebook's Graph Search, IBM's Watson, and more. Previous work on learning to populate knowledge bases from text has, for the most part, made the simplifying assumption that facts remain constant over time. But this assumption is inaccurate: we live in a rapidly changing world. Knowledge should not be viewed as a static snapshot, but instead as a rapidly evolving set of facts that must change as the world changes. In this paper we demonstrate the feasibility of accurately identifying, from real-time news and social media text streams, entity transition events that drive changes to a knowledge base. We use Wikipedia's edit history as distant supervision to learn event extractors, and evaluate the extractors based on their ability to predict online updates. Our weakly supervised event extractors are able to predict 10 KB revisions per month at 0.8 precision. By lowering our confidence threshold, we can suggest 34.3 correct edits per month at 0.4 precision. Of the predicted edits, 64% were detected before they were added to Wikipedia. The average lead time of our forecasted knowledge revisions over Wikipedia's editors is 40 days, demonstrating the utility of our method for suggesting edits that can be quickly verified and added to the knowledge graph.
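
As a rough illustration of the distant-supervision setup described in the abstract, the sketch below aligns time-stamped entity mentions from a text stream with later Wikipedia infobox edits to produce heuristic training labels, and applies a confidence threshold at prediction time to trade recall for precision. The data classes, function names (InfoboxEdit, label_examples, predict_edits), and the 30-day alignment window are hypothetical choices for the example, not the paper's implementation.

    # Hypothetical sketch, not the authors' released code.
    from dataclasses import dataclass
    from datetime import datetime, timedelta

    @dataclass
    class InfoboxEdit:
        entity: str          # e.g. the page title of the edited entity
        attribute: str       # e.g. "spouse" or "employer"
        new_value: str       # the value written into the infobox
        timestamp: datetime  # when the Wikipedia revision was made

    @dataclass
    class Mention:
        entity: str          # entity the news/social-media sentence mentions
        text: str            # the sentence itself
        timestamp: datetime  # publication time of the sentence

    def label_examples(edits, mentions, window_days=30):
        """Distant supervision: mark a mention positive if it names an edited
        entity and appears within `window_days` before that Wikipedia edit."""
        labeled = []
        for m in mentions:
            positive = any(
                m.entity == e.entity
                and timedelta(0) <= e.timestamp - m.timestamp <= timedelta(days=window_days)
                for e in edits
            )
            labeled.append((m, positive))
        return labeled

    def predict_edits(scored_mentions, threshold=0.8):
        """Keep predictions whose classifier confidence clears the threshold;
        raising the threshold yields fewer but more precise suggested edits."""
        return [(m, score) for m, score in scored_mentions if score >= threshold]

The labeled examples would then feed a standard supervised extractor; the threshold in predict_edits corresponds to the precision/recall trade-off reported in the abstract (fewer, higher-precision suggestions at a high threshold; more suggestions at lower precision when it is relaxed).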
