Abstract

We propose a method for scalable first-order rule learning on large-scale Twitter data. The learned rules enable probabilistic inference queries that reason over the data to ascertain its veracity. To overcome the problem of slow structure learning at this scale, our method combines a divide-and-conquer strategy, graph-based modeling, and data-parallel processing on a commodity cluster. The first-order ground predicates (constructed from the posts) are first partitioned in a balanced way by pivoting around users, which reduces the chance of missing relevant rules: we construct a weighted graph and apply graph partitioning to create balanced partitions of the ground predicates. Each partition is then processed with an existing structure learning approach to obtain that partition's set of rules. A preliminary evaluation shows that our method offers a promising solution for scalable first-order rule learning on Twitter data.
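The partition-then-learn pipeline described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, ground predicates are modeled simply as (predicate string, list of mentioned users) pairs, and a toy greedy heuristic with a hard size cap stands in for a real balanced graph partitioner such as METIS.

```python
import math
from collections import defaultdict
from itertools import combinations

def build_user_graph(ground_predicates):
    """Weighted user graph: edge weight = number of ground predicates
    mentioning both endpoint users. Each predicate is (text, [users])."""
    weights = defaultdict(int)
    for _pred, users in ground_predicates:
        for u, v in combinations(sorted(set(users)), 2):
            weights[(u, v)] += 1
    return dict(weights)

def greedy_balanced_partition(users, weights, k):
    """Toy stand-in for a real balanced graph partitioner (e.g. METIS):
    place each user, heaviest-degree first, into the eligible partition
    holding the most incident edge weight; a hard size cap keeps the
    k partitions balanced."""
    cap = math.ceil(len(users) / k)
    parts = [set() for _ in range(k)]
    degree = defaultdict(int)
    for (u, v), w in weights.items():
        degree[u] += w
        degree[v] += w

    def gain(user, part):
        # total weight of edges from `user` into an existing partition
        return sum(w for (a, b), w in weights.items()
                   if (a == user and b in part) or (b == user and a in part))

    for user in sorted(users, key=lambda u: -degree[u]):
        eligible = [i for i in range(k) if len(parts[i]) < cap]
        best = max(eligible, key=lambda i: (gain(user, parts[i]), -len(parts[i])))
        parts[best].add(user)
    return parts

def partition_predicates(ground_predicates, parts):
    """Route each ground predicate to every partition containing one of
    its users, so a partition is not missing predicates relevant to the
    rules it will learn."""
    buckets = [[] for _ in parts]
    for pred, users in ground_predicates:
        for i, part in enumerate(parts):
            if any(u in part for u in users):
                buckets[i].append(pred)
    return buckets
```

Each resulting bucket would then be handed to an existing structure learner, with the buckets processed in parallel across the nodes of the cluster.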
